Dataset

class formulation_bench.dataset.Dataset(root)

FormulationBench dataset.

Parameters:
root : str or pathlib.Path

Path to the root directory containing the FormulationBench dataset. See Dataset Schema for the expected directory structure.

Attributes:
root : pathlib.Path

Resolved absolute path to the dataset root.

problems : dict[int, Problem]

Mapping from integer problem ID (e.g., 1 for p1) to Problem.

reformulations : list[Reformulation]

List of all labelled reformulation pairs in the dataset.
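
The ID convention and the resolved root described above can be illustrated with a minimal sketch. The helper `problem_id_from_name` is hypothetical and not part of formulation_bench; it only assumes that problem labels have the form `p` followed by a decimal number, as in the `p1` example.

```python
from pathlib import Path


def problem_id_from_name(name: str) -> int:
    """Hypothetical helper: map a problem label such as "p1" to its integer ID."""
    # Assumption: labels are the letter "p" followed by decimal digits only.
    if not name.startswith("p") or not name[1:].isdecimal():
        raise ValueError(f"not a problem label: {name!r}")
    return int(name[1:])


# Path.resolve() returns an absolute path, matching the description of `root`.
root = Path("./dataset").resolve()

# Build an ID -> label mapping, analogous in shape to `problems`.
labels = ["p1", "p2", "p10"]
ids = {problem_id_from_name(label): label for label in labels}
print(sorted(ids))
```

Keying by integer rather than by label means that sorting the keys yields numeric order (1, 2, 10), where sorting the labels as strings would place `p10` before `p2`.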

Examples

Load the dataset from a local ./dataset directory:

>>> from formulation_bench import Dataset
>>> ds = Dataset("./dataset")
>>> ds
Dataset(root=..., n_problems=20, n_reformulations=96)

Access a specific problem and one of its formulations:

>>> p1 = ds.problems[1]
>>> p1.formulations["a"].valid
True

Iterate over labelled reformulations:

>>> pos = [r for r in ds.reformulations if r.is_reformulation]
>>> neg = [r for r in ds.reformulations if not r.is_reformulation]
>>> len(pos), len(neg)
(70, 26)
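
The positive/negative split above can also be done in a single pass. The sketch below runs without the dataset installed by using a stand-in `Reformulation` class; only the `is_reformulation` attribute is taken from the example above, and the `pair` field and the `split_by_label` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reformulation:
    """Stand-in for illustration; only `is_reformulation` mirrors the real class."""
    pair: str  # hypothetical field identifying the two formulations
    is_reformulation: bool


def split_by_label(reformulations):
    """Partition labelled pairs into (positives, negatives) in one pass."""
    pos, neg = [], []
    for r in reformulations:
        (pos if r.is_reformulation else neg).append(r)
    return pos, neg


sample = [
    Reformulation("a-b", True),
    Reformulation("a-c", False),
    Reformulation("b-c", True),
]
pos, neg = split_by_label(sample)
print(len(pos), len(neg))  # 2 1
```

With the real dataset, passing `ds.reformulations` to the same helper should give the two lists built by the comprehensions above, provided each element exposes `is_reformulation`.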

classmethod load(version=None, cache_dir=None, force=False)

Load the FormulationBench dataset, downloading it if necessary.

Thin wrapper around formulation_bench.download_dataset() that downloads the dataset and constructs a Dataset. See that function for versioning and caching semantics.

Parameters:
version, cache_dir, force

Passed through to formulation_bench.download_dataset().

Returns:
dataset : Dataset

The loaded dataset.

Examples

Download the default version of the dataset (or load from cache):

>>> from formulation_bench import Dataset
>>> ds = Dataset.load()
>>> sorted(ds.problems)[:5]
[1, 2, 3, 4, 5]
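
The "thin wrapper" pattern described for load() can be sketched as follows. This is not the library's implementation: `download_dataset` here is a local stub, `DatasetSketch` is a hypothetical stand-in for Dataset, and the sketch assumes that the real download function returns the path of the dataset root.

```python
import tempfile
from pathlib import Path


def download_dataset(version=None, cache_dir=None, force=False):
    """Stub standing in for formulation_bench.download_dataset().

    Assumption: the real function returns the path of the dataset root.
    This stub only creates an empty directory and ignores `force`.
    """
    base = Path(cache_dir) if cache_dir is not None else Path(tempfile.gettempdir())
    root = base / f"formulation_bench-{version or 'default'}"
    root.mkdir(parents=True, exist_ok=True)
    return root


class DatasetSketch:
    """Hypothetical stand-in for Dataset, keeping only the resolved root."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    @classmethod
    def load(cls, version=None, cache_dir=None, force=False):
        # Forward all three arguments unchanged, then construct the instance.
        return cls(download_dataset(version=version, cache_dir=cache_dir, force=force))


with tempfile.TemporaryDirectory() as tmp:
    ds = DatasetSketch.load(cache_dir=tmp)
    loaded_root = ds.root
    print(loaded_root.name)  # formulation_bench-default
```

Because the classmethod only forwards its arguments, the versioning and caching behaviour stays defined in one place, the download function, which matches the pass-through parameter description above.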