download

formulation_bench.download.ASSET_NAME = 'dataset.tar.gz'

Name of the dataset tarball asset in the releases.

formulation_bench.download.DEFAULT_DATASET_VERSION = 'dataset-v0.2.0'

The snapshot version of the dataset this package was built against. The package is compatible with all dataset versions sharing the same major version.
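The compatibility rule above can be checked mechanically. The following is a minimal sketch, assuming release tags follow the `dataset-vMAJOR.MINOR.PATCH` pattern seen in DEFAULT_DATASET_VERSION; `major_version` and `is_compatible` are hypothetical helpers for illustration, not part of formulation_bench.

```python
import re

# Assumed tag pattern, inferred from the default tag 'dataset-v0.2.0'.
_TAG_RE = re.compile(r"^dataset-v(\d+)\.(\d+)\.(\d+)$")

# Value copied from the documented constant.
DEFAULT_DATASET_VERSION = "dataset-v0.2.0"


def major_version(tag: str) -> int:
    """Return the major component of a dataset release tag (hypothetical helper)."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognized dataset tag: {tag!r}")
    return int(match.group(1))


def is_compatible(tag: str, built_against: str = DEFAULT_DATASET_VERSION) -> bool:
    """Return True when both tags share the same major version."""
    return major_version(tag) == major_version(built_against)
```

Under this assumption, `is_compatible("dataset-v0.3.1")` holds for a package built against `dataset-v0.2.0`, while `is_compatible("dataset-v1.0.0")` does not.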

formulation_bench.download.REPO = 'henryrobbins/flare'

GitHub repo containing the dataset releases.
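Taken together, the three constants are enough to locate a release asset. The sketch below assumes the standard GitHub release-asset URL layout (`/releases/download/<tag>/<asset>`); whether download_dataset builds the URL this way or goes through the GitHub API is not stated here, and `release_asset_url` is a hypothetical helper.

```python
# Values copied from the documented module constants.
REPO = "henryrobbins/flare"
ASSET_NAME = "dataset.tar.gz"
DEFAULT_DATASET_VERSION = "dataset-v0.2.0"


def release_asset_url(
    version: str = DEFAULT_DATASET_VERSION,
    repo: str = REPO,
    asset: str = ASSET_NAME,
) -> str:
    """Build a release-asset URL (hypothetical helper, assumed URL layout)."""
    return f"https://github.com/{repo}/releases/download/{version}/{asset}"
```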

formulation_bench.download.download_dataset(version=None, cache_dir=None, force=False)

Download the FormulationBench dataset.

The tarball is fetched from the GitHub release whose tag is version and extracted under <cache_dir>/<version>/. Subsequent calls with the same version reuse the cached copy unless force=True. See also Downloading the dataset.
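The caching behavior described above can be pictured with a small sketch. It is not the package's implementation: `cached_or_fetch` and its `fetch` callback (a stand-in for the download-and-extract step) are hypothetical names, and the sketch returns the version directory itself rather than the dataset root inside it.

```python
import shutil
from pathlib import Path
from typing import Callable


def cached_or_fetch(
    version: str,
    cache_dir: Path,
    fetch: Callable[[Path], None],
    force: bool = False,
) -> Path:
    """Reuse <cache_dir>/<version>/ if present, otherwise populate it.

    `fetch` is a hypothetical callback that fills the target directory,
    standing in for downloading and extracting the tarball.
    """
    target = Path(cache_dir) / version
    if target.exists() and not force:
        # Cache hit: nothing is downloaded.
        return target
    if target.exists():
        # force=True: discard the cached copy before fetching again.
        shutil.rmtree(target)
    target.mkdir(parents=True)
    fetch(target)
    return target
```

With this shape, a second call with the same version never invokes `fetch` unless force=True is passed.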

Parameters:
version : str, optional

Release tag, e.g. "dataset-v0.2.0". Defaults to DEFAULT_DATASET_VERSION, the snapshot version this package was built against.

cache_dir : str or pathlib.Path, optional

Cache root. Defaults to $FORMULATION_BENCH_CACHE if set, otherwise to $XDG_CACHE_HOME/formulation_bench (typically ~/.cache/formulation_bench).

force : bool, default False

Re-download and overwrite the cached copy.

Returns:
root : pathlib.Path

Path to the extracted dataset root. Load the dataset with Dataset(root).

Examples

Download the default version of the dataset (or load from cache):

>>> from formulation_bench import download_dataset
>>> path = download_dataset()
>>> path
PosixPath('.../.cache/formulation_bench/dataset-v0.2.0/dataset')
>>> from formulation_bench import Dataset
>>> ds = Dataset(path)
>>> sorted(ds.problems)[:5]
[1, 2, 3, 4, 5]

Reload the dataset from cache:

>>> path = download_dataset()
>>> path
PosixPath('.../.cache/formulation_bench/dataset-v0.2.0/dataset')

Force re-download and overwrite the cached copy:

>>> path = download_dataset(force=True)
>>> path
PosixPath('.../.cache/formulation_bench/dataset-v0.2.0/dataset')

Provide a custom cache directory:

>>> path = download_dataset(cache_dir="./custom_cache")
>>> path
PosixPath('custom_cache/dataset-v0.2.0/dataset')
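When cache_dir is omitted, the cache root comes from the environment, as described under Parameters. A minimal sketch of that lookup follows, assuming $FORMULATION_BENCH_CACHE takes precedence over $XDG_CACHE_HOME and that ~/.cache is the fallback when neither is set; `default_cache_root` is a hypothetical helper, not a documented function of the package.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def default_cache_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the default cache root (hypothetical helper, assumed precedence)."""
    env = os.environ if env is None else env

    # An explicit override wins when it is set and non-empty.
    explicit = env.get("FORMULATION_BENCH_CACHE")
    if explicit:
        return Path(explicit)

    # Otherwise fall back to the XDG cache directory, or ~/.cache if unset.
    xdg = env.get("XDG_CACHE_HOME")
    base = Path(xdg) if xdg else Path.home() / ".cache"
    return base / "formulation_bench"
```

Passing an explicit mapping as `env` makes the lookup easy to exercise without touching the real environment.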