fiftyone.zoo.datasets

Module contents

The FiftyOne Dataset Zoo.

This package defines a collection of open source datasets made available for download via FiftyOne.

Copyright 2017-2025, Voxel51, Inc.
voxel51.com

Functions:

`list_zoo_datasets`([tags, source, license])	Lists the available datasets in the FiftyOne Dataset Zoo.
`list_zoo_dataset_sources`()	Returns the list of available zoo dataset sources.
`list_downloaded_zoo_datasets`()	Returns information about the zoo datasets that have been downloaded.
`download_zoo_dataset`(name_or_url[, split, …])	Downloads the specified dataset from the FiftyOne Dataset Zoo.
`load_zoo_dataset`(name_or_url[, split, …])	Loads the specified dataset from the FiftyOne Dataset Zoo.
`find_zoo_dataset`(name_or_url[, split])	Returns the directory containing the given zoo dataset.
`load_zoo_dataset_info`(name_or_url)	Loads the `ZooDatasetInfo` for the specified zoo dataset.
`get_zoo_dataset`(name_or_url[, overwrite])	Returns the `ZooDataset` instance for the given dataset.
`delete_zoo_dataset`(name_or_url[, split])	Deletes the zoo dataset from local disk, if necessary.

Classes:

`ZooDatasetInfo`(zoo_dataset, dataset_type, …)	Class containing info about a dataset in the FiftyOne Dataset Zoo.
`ZooDatasetSplitInfo`(split, num_samples)	Class containing info about a split of a dataset in the FiftyOne Dataset Zoo.
`ZooDataset`()	Base class for datasets made available in the FiftyOne Dataset Zoo.
`RemoteZooDataset`(dataset_dir[, url])	Class for working with remotely-sourced datasets that are compatible with the FiftyOne Dataset Zoo.
`DeprecatedZooDataset`()	Class representing a zoo dataset that no longer exists in the FiftyOne Dataset Zoo.

fiftyone.zoo.datasets.list_zoo_datasets(tags=None, source=None, license=None)¶

Lists the available datasets in the FiftyOne Dataset Zoo.

Also includes any remotely-sourced zoo datasets that you’ve downloaded.

Example usage:

import fiftyone as fo
import fiftyone.zoo as foz

#
# List all zoo datasets
#

names = foz.list_zoo_datasets()
print(names)

#
# List all zoo datasets with (both of) the specified tags
#

names = foz.list_zoo_datasets(tags=["image", "detection"])
print(names)

#
# List all zoo datasets available via the given source
#

names = foz.list_zoo_datasets(source="torch")
print(names)

Parameters

tags (None) – only include datasets that have the specified tag or list of tags
source (None) – only include datasets available via the given source or list of sources
license (None) – only include datasets that are distributed under the specified license or any of the specified list of licenses. Run `fiftyone zoo datasets list` to see the available licenses

Returns

a sorted list of dataset names

fiftyone.zoo.datasets.list_zoo_dataset_sources()¶

Returns the list of available zoo dataset sources.

Returns: a list of sources
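
As a hedged illustration, the list of sources can drive one `list_zoo_datasets()` call per source. The helper name `datasets_by_source` and its two optional hooks are hypothetical; the hooks exist only so the grouping logic can be exercised without FiftyOne installed.

```python
def datasets_by_source(list_sources=None, list_datasets=None):
    """Maps each zoo dataset source to the dataset names it provides."""
    if list_sources is None or list_datasets is None:
        # Lazy import: FiftyOne is only needed when no hooks are supplied
        import fiftyone.zoo.datasets as fozd

        list_sources = list_sources or fozd.list_zoo_dataset_sources
        list_datasets = list_datasets or fozd.list_zoo_datasets

    # One filtered listing per source
    return {source: list_datasets(source=source) for source in list_sources()}
```

Calling `datasets_by_source()` with no arguments uses the real zoo functions documented in this module.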

fiftyone.zoo.datasets.list_downloaded_zoo_datasets()¶

Returns information about the zoo datasets that have been downloaded.

Returns: a dict mapping dataset names to (dataset_dir, ZooDatasetInfo) tuples
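
A minimal sketch of consuming the returned dict, assuming only the documented `(dataset_dir, ZooDatasetInfo)` tuple layout. The helper name `describe_downloads` is hypothetical.

```python
def describe_downloads(downloaded=None):
    """Returns one "name -> dataset_dir" line per downloaded zoo dataset."""
    if downloaded is None:
        # Lazy import: FiftyOne is only needed when no dict is supplied
        import fiftyone.zoo.datasets as fozd

        downloaded = fozd.list_downloaded_zoo_datasets()

    lines = []
    for name in sorted(downloaded):
        # Each value is a (dataset_dir, ZooDatasetInfo) tuple
        dataset_dir, _info = downloaded[name]
        lines.append(f"{name} -> {dataset_dir}")

    return lines
```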

fiftyone.zoo.datasets.download_zoo_dataset(name_or_url, split=None, splits=None, overwrite=False, cleanup=True, **kwargs)¶

Downloads the specified dataset from the FiftyOne Dataset Zoo.

Any dataset splits that have already been downloaded are not re-downloaded, unless overwrite == True is specified.

Note

To download from a private GitHub repository that you have access to, provide your GitHub personal access token by setting the GITHUB_TOKEN environment variable.

Parameters

name_or_url –
the name of the zoo dataset to download, or the remote source to download it from, which can be:
- a GitHub repo URL like https://github.com/<user>/<repo>
- a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
- a GitHub ref string like <user>/<repo>[/<ref>]
- a publicly accessible URL of an archive (eg zip or tar) file
split (None) – a split to download, if applicable. Typical values are ("train", "validation", "test"). If neither split nor splits are provided, all available splits are downloaded. Consult the documentation for the ZooDataset you specified to see the supported splits
splits (None) – a list of splits to download, if applicable. Typical values are ("train", "validation", "test"). If neither split nor splits are provided, all available splits are downloaded. Consult the documentation for the ZooDataset you specified to see the supported splits
overwrite (False) – whether to overwrite any existing files
cleanup (True) – whether to clean up any temporary files generated during download
**kwargs – optional arguments for the ZooDataset constructor or the remote dataset’s download_and_prepare() method

Returns

a tuple of

info: the ZooDatasetInfo for the dataset
dataset_dir: the directory containing the dataset
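
The accepted forms of `name_or_url` can be told apart with a few patterns. The sketch below is a rough heuristic written for illustration only; it is not FiftyOne's own resolution logic, and the function name `classify_source` and its return labels are hypothetical.

```python
import re

# https://github.com/<user>/<repo>/tree/<branch> or .../commit/<commit>
_GITHUB_REF_URL = re.compile(r"^https://github\.com/[^/]+/[^/]+/(tree|commit)/.+$")
# https://github.com/<user>/<repo>
_GITHUB_REPO_URL = re.compile(r"^https://github\.com/[^/]+/[^/]+/?$")
# <user>/<repo>[/<ref>]
_GITHUB_REF_STRING = re.compile(r"^[^/:\s]+/[^/:\s]+(/[^:\s]+)?$")


def classify_source(name_or_url):
    """Guesses which documented form a `name_or_url` value takes."""
    if _GITHUB_REF_URL.match(name_or_url):
        return "github-ref-url"
    if _GITHUB_REPO_URL.match(name_or_url):
        return "github-repo-url"
    if name_or_url.startswith(("http://", "https://")):
        return "archive-url"
    if _GITHUB_REF_STRING.match(name_or_url):
        return "github-ref-string"
    # Anything else is treated as a plain zoo dataset name
    return "zoo-name"
```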

fiftyone.zoo.datasets.load_zoo_dataset(name_or_url, split=None, splits=None, label_field=None, dataset_name=None, download_if_necessary=True, drop_existing_dataset=False, persistent=False, overwrite=False, cleanup=True, progress=None, **kwargs)¶

Loads the specified dataset from the FiftyOne Dataset Zoo.

By default, the dataset will be downloaded if necessary.

Note

To download from a private GitHub repository that you have access to, provide your GitHub personal access token by setting the GITHUB_TOKEN environment variable.

If you do not specify a custom dataset_name and you have previously loaded the same zoo dataset and split(s) into FiftyOne, the existing dataset will be returned.

Parameters

name_or_url –
the name of the zoo dataset to load, or the remote source to load it from, which can be:
- a GitHub repo URL like https://github.com/<user>/<repo>
- a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
- a GitHub ref string like <user>/<repo>[/<ref>]
- a publicly accessible URL of an archive (eg zip or tar) file
split (None) – a split to load, if applicable. Typical values are ("train", "validation", "test"). If neither split nor splits are provided, all available splits are loaded. Consult the documentation for the ZooDataset you specified to see the supported splits
splits (None) – a list of splits to load, if applicable. Typical values are ("train", "validation", "test"). If neither split nor splits are provided, all available splits are loaded. Consult the documentation for the ZooDataset you specified to see the supported splits
label_field (None) – the label field (or prefix, if the dataset contains multiple label fields) in which to store the dataset’s labels. By default, this is "ground_truth" if the dataset contains a single label field. If the dataset contains multiple label fields and this value is not provided, the labels will be stored under dataset-specific field names
dataset_name (None) – an optional name to give the returned fiftyone.core.dataset.Dataset. By default, a name will be constructed based on the dataset and split(s) you are loading
download_if_necessary (True) – whether to download the dataset if it is not found in the specified dataset directory
drop_existing_dataset (False) – whether to drop an existing dataset with the same name if it exists
persistent (False) – whether the dataset should persist in the database after the session terminates
overwrite (False) – whether to overwrite any existing files if the dataset is to be downloaded
cleanup (True) – whether to clean up any temporary files generated during download
progress (None) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs – optional arguments to pass to the fiftyone.utils.data.importers.DatasetImporter constructor or the remote dataset’s load_dataset() method. If download_if_necessary == True, then kwargs can also contain arguments for download_zoo_dataset()

Returns

a fiftyone.core.dataset.Dataset
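
Because a previously loaded dataset is returned when no custom name is given, one way to force a fresh load is to pass both `dataset_name` and `drop_existing_dataset=True`. The wrapper below is a sketch under that reading; `load_fresh_copy` and its `loader` hook are hypothetical names, and the hook is there only so the call can be inspected without FiftyOne installed.

```python
def load_fresh_copy(name, split, dataset_name, loader=None):
    """Loads a zoo split under a custom name, replacing any dataset with that name."""
    if loader is None:
        # Lazy import: FiftyOne is only needed when no loader is supplied
        import fiftyone.zoo as foz

        loader = foz.load_zoo_dataset

    return loader(
        name,
        split=split,
        dataset_name=dataset_name,
        drop_existing_dataset=True,  # drop any existing dataset with this name
    )
```

For example, `load_fresh_copy("cifar10", "test", "my-cifar10-test")` would reload the split even if a dataset with that name already exists.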

fiftyone.zoo.datasets.find_zoo_dataset(name_or_url, split=None)¶

Returns the directory containing the given zoo dataset.

If a split is provided, the path to the dataset split is returned; otherwise, the path to the root directory is returned.

The dataset must be downloaded. Use download_zoo_dataset() to download datasets.

Parameters

name_or_url –
the name of the zoo dataset or its remote source, which can be:
- a GitHub repo URL like https://github.com/<user>/<repo>
- a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
- a GitHub ref string like <user>/<repo>[/<ref>]
- a publicly accessible URL of an archive (eg zip or tar) file
split (None) – a specific split to locate

Returns

the directory containing the dataset or split

Raises

ValueError – if the dataset or split does not exist or has not been downloaded
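
Since a missing dataset or split raises `ValueError`, callers that only want to probe for a download can catch it. This is a small sketch; `find_split_or_none` and its `finder` hook are hypothetical names introduced for illustration.

```python
def find_split_or_none(name, split=None, finder=None):
    """Returns the dataset (or split) directory, or None if it is not downloaded."""
    if finder is None:
        # Lazy import: FiftyOne is only needed when no finder is supplied
        import fiftyone.zoo.datasets as fozd

        finder = fozd.find_zoo_dataset

    try:
        return finder(name, split=split)
    except ValueError:
        # Raised when the dataset or split does not exist or is not downloaded
        return None
```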

fiftyone.zoo.datasets.load_zoo_dataset_info(name_or_url)¶

Loads the ZooDatasetInfo for the specified zoo dataset.

The dataset must be downloaded. Use download_zoo_dataset() to download datasets.

Parameters

name_or_url –
the name of the zoo dataset or its remote source, which can be:
- a GitHub repo URL like https://github.com/<user>/<repo>
- a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
- a GitHub ref string like <user>/<repo>[/<ref>]
- a publicly accessible URL of an archive (eg zip or tar) file

Returns

the ZooDatasetInfo for the dataset

Raises

ValueError – if the dataset has not been downloaded
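
Once a `ZooDatasetInfo` is in hand, its documented `supported_splits` property and `is_split_downloaded()` method are enough to report download status per split. The helper name `split_status` is hypothetical.

```python
def split_status(info):
    """Maps each supported split to whether it has been downloaded."""
    splits = info.supported_splits  # None when the dataset has no splits
    if splits is None:
        return {}

    return {split: info.is_split_downloaded(split) for split in splits}


# With FiftyOne installed and the dataset downloaded, usage might look like:
#   import fiftyone.zoo.datasets as fozd
#   print(split_status(fozd.load_zoo_dataset_info("cifar10")))
```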

fiftyone.zoo.datasets.get_zoo_dataset(name_or_url, overwrite=False, **kwargs)¶

Returns the ZooDataset instance for the given dataset.

If the dataset is available from multiple sources, the default source is used.

Parameters

name_or_url –
the name of the zoo dataset, or its remote source, which can be:
- a GitHub repo URL like https://github.com/<user>/<repo>
- a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
- a GitHub ref string like <user>/<repo>[/<ref>]
- a publicly accessible URL of an archive (eg zip or tar) file
overwrite (False) – whether to overwrite existing metadata if it has already been downloaded. Only applicable when name_or_url is a remote source
**kwargs – optional arguments for ZooDataset

Returns

the ZooDataset instance
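
The returned instance exposes the `ZooDataset` properties documented in this module. A minimal sketch that gathers a few of them into a plain dict follows; the helper name `describe_zoo_dataset` and the dict keys are hypothetical.

```python
def describe_zoo_dataset(zoo_dataset):
    """Collects a few documented ZooDataset properties into a plain dict."""
    return {
        "name": zoo_dataset.name,
        # Guard with has_tags/has_splits, since the dataset may have neither
        "tags": list(zoo_dataset.tags) if zoo_dataset.has_tags else [],
        "splits": list(zoo_dataset.supported_splits) if zoo_dataset.has_splits else [],
        "manual_download": zoo_dataset.requires_manual_download,
    }


# With FiftyOne installed, an instance could be obtained like so:
#   import fiftyone.zoo.datasets as fozd
#   print(describe_zoo_dataset(fozd.get_zoo_dataset("quickstart")))
```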

fiftyone.zoo.datasets.delete_zoo_dataset(name_or_url, split=None)¶

Deletes the zoo dataset from local disk, if necessary.

If a split is provided, only that split is deleted.

Parameters

name_or_url –
the name of the zoo dataset, or its remote source, which can be:
- a GitHub repo URL like https://github.com/<user>/<repo>
- a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
- a GitHub ref string like <user>/<repo>[/<ref>]
- a publicly accessible URL of an archive (eg zip or tar) file
split (None) – a specific split to delete

class fiftyone.zoo.datasets.ZooDatasetInfo(zoo_dataset, dataset_type, num_samples, downloaded_splits=None, parameters=None, classes=None)¶

Bases: eta.core.serial.Serializable

Class containing info about a dataset in the FiftyOne Dataset Zoo.

Parameters

zoo_dataset – the ZooDataset instance for the dataset
dataset_type – the fiftyone.types.Dataset type of the dataset
num_samples – the total number of samples in all downloaded splits of the dataset
downloaded_splits (None) – a dict of ZooDatasetSplitInfo instances describing the downloaded splits of the dataset, if applicable
parameters (None) – a dict of parameters for the dataset
classes (None) – a list of class label strings

Attributes:

`name`	The name of the dataset.
`zoo_dataset`	The fully-qualified class string for the `ZooDataset` of the dataset.
`dataset_type`	The fully-qualified class string of the `fiftyone.types.Dataset` type, if any.
`supported_splits`	A tuple of supported splits for the dataset, or None if the dataset does not have splits.
`url`	The dataset’s URL, or None if it is not remotely-sourced.

Methods:

`get_zoo_dataset`()	Returns the `ZooDataset` instance for the dataset.
`get_dataset_type`()	Returns the `fiftyone.types.Dataset` type instance for the dataset.
`is_split_downloaded`(split)	Whether the given dataset split is downloaded.
`add_split`(split_info)	Adds the split to the dataset.
`remove_split`(split)	Removes the split from the dataset.
`attributes`()	Returns a list of class attributes to be serialized.
`from_dict`(d)	Loads a `ZooDatasetInfo` from a JSON dictionary.
`from_json`(json_path[, zoo_dataset, upgrade, …])	Loads a `ZooDatasetInfo` from a JSON file on disk.
`copy`()	Returns a deep copy of the object.
`custom_attributes`([dynamic, private])	Returns a customizable list of class attributes.
`from_str`(s, *args, **kwargs)	Constructs a Serializable object from a JSON string.
`get_class_name`()	Returns the fully-qualified class name string of this object.
`serialize`([reflective])	Serializes the object into a dictionary.
`to_str`([pretty_print])	Returns a string representation of this object.
`write_json`(path[, pretty_print])	Serializes the object and writes it to disk.

property name¶: The name of the dataset.

property zoo_dataset¶: The fully-qualified class string for the ZooDataset of the dataset.

property dataset_type¶: The fully-qualified class string of the fiftyone.types.Dataset type, if any.

property supported_splits¶: A tuple of supported splits for the dataset, or None if the dataset does not have splits.

property url¶: The dataset’s URL, or None if it is not remotely-sourced.

get_zoo_dataset()¶

Returns the ZooDataset instance for the dataset.

Returns: a ZooDataset instance

get_dataset_type()¶

Returns the fiftyone.types.Dataset type instance for the dataset.

Returns: a fiftyone.types.Dataset instance

is_split_downloaded(split)¶

Whether the given dataset split is downloaded.

Parameters: split – the dataset split
Returns: True/False

add_split(split_info)¶

Adds the split to the dataset.

Parameters: split_info – a ZooDatasetSplitInfo

remove_split(split)¶

Removes the split from the dataset.

Parameters: split – the name of the split

attributes()¶

Returns a list of class attributes to be serialized.

Returns: a list of class attributes

classmethod from_dict(d)¶

Loads a ZooDatasetInfo from a JSON dictionary.

Parameters: d – a JSON dictionary
Returns: a ZooDatasetInfo

classmethod from_json(json_path, zoo_dataset=None, upgrade=False, warn_deprecated=False)¶

Loads a ZooDatasetInfo from a JSON file on disk.

Parameters

json_path – path to JSON file
zoo_dataset (None) – an existing ZooDataset instance
upgrade (False) – whether to upgrade the JSON file on disk if any migrations were necessary
warn_deprecated (False) – whether to issue a warning if the dataset has a deprecated format

Returns

a ZooDatasetInfo

copy()¶

Returns a deep copy of the object.

Returns: a Serializable instance

custom_attributes(dynamic=False, private=False)¶

Returns a customizable list of class attributes.

By default, all attributes in vars(self) are returned, minus private attributes (those starting with “_”).

Parameters

dynamic – whether to include dynamic properties, e.g., those defined by getter/setter methods or the @property decorator. By default, this is False
private – whether to include private properties, i.e., those starting with “_”. By default, this is False

Returns

a list of class attributes

classmethod from_str(s, *args, **kwargs)¶

Constructs a Serializable object from a JSON string.

Subclasses may override this method, but, by default, this method simply parses the string and calls from_dict(), which subclasses must implement.

Parameters

s – a JSON string representation of a Serializable object
*args – optional positional arguments for self.from_dict()
**kwargs – optional keyword arguments for self.from_dict()

Returns

an instance of the Serializable class

classmethod get_class_name()¶: Returns the fully-qualified class name string of this object.

serialize(reflective=False)¶

Serializes the object into a dictionary.

Serialization is applied recursively to all attributes in the object, including element-wise serialization of lists and dictionary values.

Parameters: reflective – whether to include reflective attributes when serializing the object. By default, this is False
Returns: a JSON dictionary representation of the object

to_str(pretty_print=True, **kwargs)¶

Returns a string representation of this object.

Parameters

pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is True
**kwargs – optional keyword arguments for self.serialize()

Returns

a string representation of the object

write_json(path, pretty_print=False, **kwargs)¶

Serializes the object and writes it to disk.

Parameters

path – the output path
pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is False
**kwargs – optional keyword arguments for self.serialize()

class fiftyone.zoo.datasets.ZooDatasetSplitInfo(split, num_samples)¶

Bases: eta.core.serial.Serializable

Class containing info about a split of a dataset in the FiftyOne Dataset Zoo.

Parameters

split – the name of the split
num_samples – the number of samples in the split

Methods:

`attributes`()	Returns a list of class attributes to be serialized.
`from_dict`(d)	Loads a `ZooDatasetSplitInfo` from a JSON dictionary.
`copy`()	Returns a deep copy of the object.
`custom_attributes`([dynamic, private])	Returns a customizable list of class attributes.
`from_json`(path, *args, **kwargs)	Constructs a Serializable object from a JSON file.
`from_str`(s, *args, **kwargs)	Constructs a Serializable object from a JSON string.
`get_class_name`()	Returns the fully-qualified class name string of this object.
`serialize`([reflective])	Serializes the object into a dictionary.
`to_str`([pretty_print])	Returns a string representation of this object.
`write_json`(path[, pretty_print])	Serializes the object and writes it to disk.

attributes()¶

Returns a list of class attributes to be serialized.

Returns: a list of class attributes

classmethod from_dict(d)¶

Loads a ZooDatasetSplitInfo from a JSON dictionary.

Parameters: d – a JSON dictionary
Returns: a ZooDatasetSplitInfo

copy()¶

Returns a deep copy of the object.

Returns: a Serializable instance

custom_attributes(dynamic=False, private=False)¶

Returns a customizable list of class attributes.

By default, all attributes in vars(self) are returned, minus private attributes (those starting with “_”).

Parameters

dynamic – whether to include dynamic properties, e.g., those defined by getter/setter methods or the @property decorator. By default, this is False
private – whether to include private properties, i.e., those starting with “_”. By default, this is False

Returns

a list of class attributes

classmethod from_json(path, *args, **kwargs)¶

Constructs a Serializable object from a JSON file.

Subclasses may override this method, but, by default, this method simply reads the JSON and calls from_dict(), which subclasses must implement.

Parameters

path – the path to the JSON file on disk
*args – optional positional arguments for self.from_dict()
**kwargs – optional keyword arguments for self.from_dict()

Returns

an instance of the Serializable class

classmethod from_str(s, *args, **kwargs)¶

Constructs a Serializable object from a JSON string.

Subclasses may override this method, but, by default, this method simply parses the string and calls from_dict(), which subclasses must implement.

Parameters

s – a JSON string representation of a Serializable object
*args – optional positional arguments for self.from_dict()
**kwargs – optional keyword arguments for self.from_dict()

Returns

an instance of the Serializable class
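
The contract described here (`attributes()` names what is serialized, and `from_str()` parses JSON before deferring to `from_dict()`) can be illustrated with a toy class. `SplitRecord` is a hypothetical stand-in written for this sketch; it is not the `eta.core.serial.Serializable` implementation and omits options such as `reflective`.

```python
import json


class SplitRecord:
    """Toy stand-in that mimics the documented serialization contract."""

    def __init__(self, split, num_samples):
        self.split = split
        self.num_samples = num_samples

    def attributes(self):
        # Names of the attributes to include when serializing
        return ["split", "num_samples"]

    def serialize(self):
        return {name: getattr(self, name) for name in self.attributes()}

    @classmethod
    def from_dict(cls, d):
        return cls(d["split"], d["num_samples"])

    @classmethod
    def from_str(cls, s, *args, **kwargs):
        # Parse the JSON string, then defer to from_dict()
        return cls.from_dict(json.loads(s), *args, **kwargs)

    def to_str(self, pretty_print=True):
        return json.dumps(self.serialize(), indent=4 if pretty_print else None)
```

A round trip such as `SplitRecord.from_str(SplitRecord("train", 100).to_str())` yields an equivalent record.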

classmethod get_class_name()¶: Returns the fully-qualified class name string of this object.

serialize(reflective=False)¶

Serializes the object into a dictionary.

Serialization is applied recursively to all attributes in the object, including element-wise serialization of lists and dictionary values.

Parameters: reflective – whether to include reflective attributes when serializing the object. By default, this is False
Returns: a JSON dictionary representation of the object

to_str(pretty_print=True, **kwargs)¶

Returns a string representation of this object.

Parameters

pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is True
**kwargs – optional keyword arguments for self.serialize()

Returns

a string representation of the object

write_json(path, pretty_print=False, **kwargs)¶

Serializes the object and writes it to disk.

Parameters

path – the output path
pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is False
**kwargs – optional keyword arguments for self.serialize()

class fiftyone.zoo.datasets.ZooDataset¶

Bases: object

Base class for datasets made available in the FiftyOne Dataset Zoo.

Attributes:

`name`	The name of the dataset.
`is_remote`	Whether the dataset is remotely-sourced.
`license`	The license or list of licenses under which the dataset is distributed, or None if unknown.
`tags`	A tuple of tags for the dataset.
`has_tags`	Whether the dataset has tags.
`parameters`	An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
`supported_splits`	A tuple of supported splits for the dataset, or None if the dataset does not have splits.
`has_splits`	Whether the dataset has splits.
`has_patches`	Whether the dataset has patches that may need to be applied to already downloaded files.
`supports_partial_downloads`	Whether the dataset supports downloading partial subsets of its splits.
`requires_manual_download`	Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
`importer_kwargs`	A dict of default kwargs to pass to this dataset’s `fiftyone.utils.data.importers.DatasetImporter`.

Methods:

`has_tag`(tag)	Whether the dataset has the given tag.
`has_split`(split)	Whether the dataset has the given split.
`get_split_dir`(dataset_dir, split)	Returns the directory for the given split of the dataset.
`has_info`(dataset_dir)	Determines whether the directory contains `ZooDatasetInfo`.
`load_info`(dataset_dir[, upgrade, …])	Loads the `ZooDatasetInfo` from the given dataset directory.
`get_info_path`(dataset_dir)	Returns the path to the `ZooDatasetInfo` for the dataset.
`download_and_prepare`(dataset_dir[, split, …])	Downloads the dataset and prepares it for use.

property name¶: The name of the dataset.

property is_remote¶: Whether the dataset is remotely-sourced.

property license¶: The license or list of licenses under which the dataset is distributed, or None if unknown.

property tags¶: A tuple of tags for the dataset.

property has_tags¶: Whether the dataset has tags.

property parameters¶: An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.

property supported_splits¶: A tuple of supported splits for the dataset, or None if the dataset does not have splits.

property has_splits¶: Whether the dataset has splits.

property has_patches¶: Whether the dataset has patches that may need to be applied to already downloaded files.

property supports_partial_downloads¶: Whether the dataset supports downloading partial subsets of its splits.

property requires_manual_download¶: Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.

property importer_kwargs¶: A dict of default kwargs to pass to this dataset’s fiftyone.utils.data.importers.DatasetImporter.

has_tag(tag)¶

Whether the dataset has the given tag.

Parameters: tag – the tag
Returns: True/False

has_split(split)¶

Whether the dataset has the given split.

Parameters: split – the dataset split
Returns: True/False

get_split_dir(dataset_dir, split)¶

Returns the directory for the given split of the dataset.

Parameters

dataset_dir – the dataset directory
split – the dataset split

Returns

the directory that will/does hold the specified split

static has_info(dataset_dir)¶

Determines whether the directory contains ZooDatasetInfo.

Parameters: dataset_dir – the dataset directory
Returns: True/False

static load_info(dataset_dir, upgrade=True, warn_deprecated=False)¶

Loads the ZooDatasetInfo from the given dataset directory.

Parameters

dataset_dir – the directory in which to construct the dataset
upgrade (True) – whether to upgrade the JSON file on disk if any migrations were necessary
warn_deprecated (False) – whether to issue a warning if the dataset has a deprecated format

Returns

the ZooDatasetInfo for the dataset
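
A hedged sketch of combining `has_info()` with `load_info()` so that a directory without info yields `None`. Passing `upgrade=False` is a choice made here to keep the inspection read-only; `load_info_if_present` and its `zoo_cls` hook are hypothetical names.

```python
def load_info_if_present(dataset_dir, zoo_cls=None):
    """Returns the ZooDatasetInfo in `dataset_dir`, or None if there is none."""
    if zoo_cls is None:
        # Lazy import: FiftyOne is only needed when no class is supplied
        from fiftyone.zoo.datasets import ZooDataset as zoo_cls

    if not zoo_cls.has_info(dataset_dir):
        return None

    # upgrade=False avoids rewriting the JSON file on disk
    return zoo_cls.load_info(dataset_dir, upgrade=False)
```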

static get_info_path(dataset_dir)¶

Returns the path to the ZooDatasetInfo for the dataset.

Parameters: dataset_dir – the dataset directory
Returns: the path to the ZooDatasetInfo

download_and_prepare(dataset_dir, split=None, splits=None, cleanup=True)¶

Downloads the dataset and prepares it for use.

If the requested splits have already been downloaded, they are not re-downloaded.

Parameters

dataset_dir – the directory in which to construct the dataset
split (None) – a split to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
splits (None) – a list of splits to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
cleanup (True) – whether to clean up any temporary files generated during download

Returns

the ZooDatasetInfo for the dataset
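
The documented behavior (no `split`/`splits` means everything, and already-downloaded splits are skipped) can be modeled in a few lines. This is an illustrative model only, not the library's code; `splits_to_download` is a hypothetical name, and merging `split` with `splits` when both are given is an assumption.

```python
def splits_to_download(split=None, splits=None, supported=(), downloaded=()):
    """Returns the splits that would still need downloading, in request order."""
    requested = []
    if split is not None:
        requested.append(split)

    # Assumption: when both arguments are given, they are merged without duplicates
    for name in splits or []:
        if name not in requested:
            requested.append(name)

    # Neither argument provided: fall back to every supported split
    if not requested:
        requested = list(supported)

    # Splits that are already on disk are not downloaded again
    return [name for name in requested if name not in downloaded]
```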

class fiftyone.zoo.datasets.RemoteZooDataset(dataset_dir, url=None, **kwargs)¶

Bases: fiftyone.zoo.datasets.ZooDataset

Class for working with remotely-sourced datasets that are compatible with the FiftyOne Dataset Zoo.

Parameters

dataset_dir – the dataset’s local directory, which must contain a valid dataset YAML file
url (None) –
the dataset’s remote source, which can be:
- a GitHub repo URL like https://github.com/<user>/<repo>
- a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
- a GitHub ref string like <user>/<repo>[/<ref>]
- a publicly accessible URL of an archive (eg zip or tar) file
This is explicitly provided rather than relying on the YAML file’s url property in case the caller has specified a particular branch or commit
**kwargs – optional keyword arguments for the dataset’s download_and_prepare() and/or load_dataset() methods

Attributes:

`metadata`
`name`	The name of the dataset.
`url`
`is_remote`	Whether the dataset is remotely-sourced.
`author`
`version`
`source`
`license`	The license or list of licenses under which the dataset is distributed, or None if unknown.
`description`
`fiftyone_version`
`tags`	A tuple of tags for the dataset.
`supported_splits`	A tuple of supported splits for the dataset, or None if the dataset does not have splits.
`supports_partial_downloads`	Whether the dataset supports downloading partial subsets of its splits.
`size_samples`
`has_patches`	Whether the dataset has patches that may need to be applied to already downloaded files.
`has_splits`	Whether the dataset has splits.
`has_tags`	Whether the dataset has tags.
`importer_kwargs`	A dict of default kwargs to pass to this dataset’s `fiftyone.utils.data.importers.DatasetImporter`.
`parameters`	An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
`requires_manual_download`	Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.

Methods:

`download_and_prepare`(dataset_dir[, split, …])	Downloads the dataset and prepares it for use.
`get_info_path`(dataset_dir)	Returns the path to the `ZooDatasetInfo` for the dataset.
`get_split_dir`(dataset_dir, split)	Returns the directory for the given split of the dataset.
`has_info`(dataset_dir)	Determines whether the directory contains `ZooDatasetInfo`.
`has_split`(split)	Whether the dataset has the given split.
`has_tag`(tag)	Whether the dataset has the given tag.
`load_info`(dataset_dir[, upgrade, …])	Loads the `ZooDatasetInfo` from the given dataset directory.

property metadata¶

property name¶: The name of the dataset.

property url¶

property is_remote¶: Whether the dataset is remotely-sourced.

property author¶

property version¶

property source¶

property license¶: The license or list of licenses under which the dataset is distributed, or None if unknown.

property description¶

property fiftyone_version¶

property tags¶: A tuple of tags for the dataset.

property supported_splits¶: A tuple of supported splits for the dataset, or None if the dataset does not have splits.

property supports_partial_downloads¶: Whether the dataset supports downloading partial subsets of its splits.

property size_samples¶

download_and_prepare(dataset_dir, split=None, splits=None, cleanup=True)¶

Downloads the dataset and prepares it for use.

If the requested splits have already been downloaded, they are not re-downloaded.

Parameters

dataset_dir – the directory in which to construct the dataset
split (None) – a split to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
splits (None) – a list of splits to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
cleanup (True) – whether to clean up any temporary files generated during download

Returns

the ZooDatasetInfo for the dataset

static get_info_path(dataset_dir)¶

Returns the path to the ZooDatasetInfo for the dataset.

Parameters: dataset_dir – the dataset directory
Returns: the path to the ZooDatasetInfo

get_split_dir(dataset_dir, split)¶

Returns the directory for the given split of the dataset.

Parameters

dataset_dir – the dataset directory
split – the dataset split

Returns

the directory that will/does hold the specified split

static has_info(dataset_dir)¶

Determines whether the directory contains ZooDatasetInfo.

Parameters: dataset_dir – the dataset directory
Returns: True/False

property has_patches¶: Whether the dataset has patches that may need to be applied to already downloaded files.

has_split(split)¶

Whether the dataset has the given split.

Parameters: split – the dataset split
Returns: True/False

property has_splits¶: Whether the dataset has splits.

has_tag(tag)¶

Whether the dataset has the given tag.

Parameters: tag – the tag
Returns: True/False

property has_tags¶: Whether the dataset has tags.

property importer_kwargs¶: A dict of default kwargs to pass to this dataset’s fiftyone.utils.data.importers.DatasetImporter.

static load_info(dataset_dir, upgrade=True, warn_deprecated=False)¶

Loads the ZooDatasetInfo from the given dataset directory.

Parameters

dataset_dir – the directory in which to construct the dataset
upgrade (True) – whether to upgrade the JSON file on disk if any migrations were necessary
warn_deprecated (False) – whether to issue a warning if the dataset has a deprecated format

Returns

the ZooDatasetInfo for the dataset

property parameters¶: An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.

property requires_manual_download¶: Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.

class fiftyone.zoo.datasets.DeprecatedZooDataset¶

Bases: fiftyone.zoo.datasets.ZooDataset

Class representing a zoo dataset that no longer exists in the FiftyOne Dataset Zoo.

Attributes:

`name`	The name of the dataset.
`supported_splits`	A tuple of supported splits for the dataset, or None if the dataset does not have splits.
`has_patches`	Whether the dataset has patches that may need to be applied to already downloaded files.
`has_splits`	Whether the dataset has splits.
`has_tags`	Whether the dataset has tags.
`importer_kwargs`	A dict of default kwargs to pass to this dataset’s `fiftyone.utils.data.importers.DatasetImporter`.
`is_remote`	Whether the dataset is remotely-sourced.
`license`	The license, or comma-separated list of licenses, under which the dataset is distributed, or None if unknown.
`parameters`	An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
`requires_manual_download`	Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
`supports_partial_downloads`	Whether the dataset supports downloading partial subsets of its splits.
`tags`	A tuple of tags for the dataset.

Methods:

`download_and_prepare`(dataset_dir[, split, …])	Downloads the dataset and prepares it for use.
`get_info_path`(dataset_dir)	Returns the path to the `ZooDatasetInfo` for the dataset.
`get_split_dir`(dataset_dir, split)	Returns the directory for the given split of the dataset.
`has_info`(dataset_dir)	Determines whether the directory contains `ZooDatasetInfo`.
`has_split`(split)	Whether the dataset has the given split.
`has_tag`(tag)	Whether the dataset has the given tag.
`load_info`(dataset_dir[, upgrade, …])	Loads the `ZooDatasetInfo` from the given dataset directory.

property name¶: The name of the dataset.

property supported_splits¶: A tuple of supported splits for the dataset, or None if the dataset does not have splits.

download_and_prepare(dataset_dir, split=None, splits=None, cleanup=True)¶

Downloads the dataset and prepares it for use.

If the requested splits have already been downloaded, they are not re-downloaded.

Parameters

dataset_dir – the directory in which to construct the dataset
split (None) – a split to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
splits (None) – a list of splits to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
cleanup (True) – whether to clean up any temporary files generated during download

Returns

the ZooDatasetInfo for the dataset
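
The interaction between split and splits described above can be sketched in plain Python. This is only an illustration of the documented behavior, not FiftyOne's implementation: merge_split_args is a hypothetical helper, and returning None to mean "download the full dataset" is an assumption of this sketch.

```python
from typing import Iterable, List, Optional, Union


def merge_split_args(
    split: Optional[str] = None,
    splits: Union[str, Iterable[str], None] = None,
) -> Optional[List[str]]:
    """Hypothetical helper: combine `split` and `splits` into one list.

    Returns None when neither argument is given, which stands for
    "download the full dataset" in this sketch.
    """
    if split is None and splits is None:
        return None

    merged: List[str] = []
    if split is not None:
        merged.append(split)
    if splits is not None:
        # Accept a bare string as a single split name
        merged.extend([splits] if isinstance(splits, str) else splits)

    # Drop duplicates while preserving the requested order
    return list(dict.fromkeys(merged))
```

Under these assumptions, merge_split_args(split="train", splits=["train", "test"]) yields ["train", "test"], and calling it with no arguments yields None.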

static get_info_path(dataset_dir)¶

Returns the path to the ZooDatasetInfo for the dataset.

Parameters: dataset_dir – the dataset directory
Returns: the path to the ZooDatasetInfo

get_split_dir(dataset_dir, split)¶

Returns the directory for the given split of the dataset.

Parameters

dataset_dir – the dataset directory
split – the dataset split

Returns

the directory that holds, or will hold, the specified split

static has_info(dataset_dir)¶

Determines whether the directory contains ZooDatasetInfo.

Parameters: dataset_dir – the dataset directory
Returns: True/False

property has_patches¶: Whether the dataset has patches that may need to be applied to already downloaded files.

has_split(split)¶

Whether the dataset has the given split.

Parameters: split – the dataset split
Returns: True/False
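
One possible use of has_split() is to validate a list of requested splits before attempting a download. The sketch below is duck-typed: partition_splits is a hypothetical helper, and zoo_dataset is assumed to be any object exposing the has_split(split) method documented above, such as the instance returned by get_zoo_dataset().

```python
from typing import Any, Iterable, List, Tuple


def partition_splits(
    zoo_dataset: Any, requested: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: separate `requested` into splits that the
    dataset reports via `has_split()` and splits that it does not.
    """
    supported: List[str] = []
    unsupported: List[str] = []
    for split in requested:
        if zoo_dataset.has_split(split):
            supported.append(split)
        else:
            unsupported.append(split)

    return supported, unsupported
```

Returning both lists, rather than raising on the first unknown split, lets the caller decide whether an unsupported split is an error or can simply be skipped.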

property has_splits¶: Whether the dataset has splits.

has_tag(tag)¶

Whether the dataset has the given tag.

Parameters: tag – the tag
Returns: True/False

property has_tags¶: Whether the dataset has tags.

property importer_kwargs¶: A dict of default kwargs to pass to this dataset’s fiftyone.utils.data.importers.DatasetImporter.

property is_remote¶: Whether the dataset is remotely-sourced.

property license¶: The license, or comma-separated list of licenses, under which the dataset is distributed, or None if unknown.

static load_info(dataset_dir, upgrade=True, warn_deprecated=False)¶

Loads the ZooDatasetInfo from the given dataset directory.

Parameters

dataset_dir – the dataset directory
upgrade (True) – whether to upgrade the JSON file on disk if any migrations were necessary
warn_deprecated (False) – whether to issue a warning if the dataset has a deprecated format

Returns

the ZooDatasetInfo for the dataset
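
Because has_info() and load_info() are both static methods that take a dataset directory, they can be combined into a guarded load. The following is a minimal sketch, assuming zoo_dataset_cls is any class that exposes those two static methods with the signatures documented above; load_info_if_present is a hypothetical name, not part of FiftyOne.

```python
from typing import Any, Optional


def load_info_if_present(zoo_dataset_cls: Any, dataset_dir: str) -> Optional[Any]:
    """Hypothetical helper: return the info stored in `dataset_dir`, or
    None if the directory contains no info.
    """
    if not zoo_dataset_cls.has_info(dataset_dir):
        return None

    # upgrade=False asks that the JSON file on disk not be rewritten;
    # warn_deprecated=True requests a warning for deprecated formats
    return zoo_dataset_cls.load_info(
        dataset_dir, upgrade=False, warn_deprecated=True
    )
```

With FiftyOne installed, the class passed in could be fiftyone.zoo.datasets.ZooDataset. Passing upgrade=False is a choice made for this sketch so that inspecting a directory has no side effects on disk; the documented default is upgrade=True.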

property parameters¶: An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.

property requires_manual_download¶: Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.

property supports_partial_downloads¶: Whether the dataset supports downloading partial subsets of its splits.

property tags¶: A tuple of tags for the dataset.