fiftyone.zoo.datasets¶
Module contents¶
The FiftyOne Dataset Zoo.
This package defines a collection of open source datasets made available for download via FiftyOne.
Functions:
list_zoo_datasets – Lists the available datasets in the FiftyOne Dataset Zoo.
list_zoo_dataset_sources – Returns the list of available zoo dataset sources.
list_downloaded_zoo_datasets – Returns information about the zoo datasets that have been downloaded.
download_zoo_dataset – Downloads the specified dataset from the FiftyOne Dataset Zoo.
load_zoo_dataset – Loads the specified dataset from the FiftyOne Dataset Zoo.
find_zoo_dataset – Returns the directory containing the given zoo dataset.
load_zoo_dataset_info – Loads the ZooDatasetInfo for the specified zoo dataset.
get_zoo_dataset – Returns the ZooDataset instance for the given dataset.
delete_zoo_dataset – Deletes the zoo dataset from local disk, if necessary.
Classes:
ZooDatasetInfo – Class containing info about a dataset in the FiftyOne Dataset Zoo.
ZooDatasetSplitInfo – Class containing info about a split of a dataset in the FiftyOne Dataset Zoo.
ZooDataset – Base class for datasets made available in the FiftyOne Dataset Zoo.
RemoteZooDataset – Class for working with remotely-sourced datasets that are compatible with the FiftyOne Dataset Zoo.
DeprecatedZooDataset – Class representing a zoo dataset that no longer exists in the FiftyOne Dataset Zoo.
-
fiftyone.zoo.datasets.
list_zoo_datasets
(tags=None, source=None)¶ Lists the available datasets in the FiftyOne Dataset Zoo.
Also includes any remotely-sourced zoo datasets that you’ve downloaded.
Example usage:
    import fiftyone as fo
    import fiftyone.zoo as foz

    # List all zoo datasets
    names = foz.list_zoo_datasets()
    print(names)

    # List all zoo datasets with (both of) the specified tags
    names = foz.list_zoo_datasets(tags=["image", "detection"])
    print(names)

    # List all zoo datasets available via the given source
    names = foz.list_zoo_datasets(source="torch")
    print(names)
- Parameters
tags (None) – only include datasets that have the specified tag or list of tags
source (None) – only include datasets available via the given source or list of sources
- Returns
a sorted list of dataset names
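The tag filter in the example above matches datasets that carry every requested tag. The sketch below models that behavior in plain Python so the semantics are easy to check. The catalog, its dataset names, and the `filter_by_tags` helper are hypothetical and are not part of FiftyOne.

```python
# Hypothetical catalog mapping dataset names to their tags (illustrative only).
CATALOG = {
    "dataset-a": {"image", "classification"},
    "dataset-b": {"image", "detection", "segmentation"},
    "dataset-c": {"image", "detection"},
    "dataset-d": {"video", "action-recognition"},
}


def filter_by_tags(catalog, tags=None):
    """Returns the sorted names whose tag set contains every requested tag."""
    if tags is None:
        return sorted(catalog)

    if isinstance(tags, str):
        tags = [tags]  # accept a single tag or a list of tags

    required = set(tags)
    return sorted(name for name, have in catalog.items() if required <= have)


print(filter_by_tags(CATALOG, tags=["image", "detection"]))
```

Only datasets tagged with both `image` and `detection` are returned, and the result is sorted, mirroring the documented return value.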
-
fiftyone.zoo.datasets.
list_zoo_dataset_sources
()¶ Returns the list of available zoo dataset sources.
- Returns
a list of sources
-
fiftyone.zoo.datasets.
list_downloaded_zoo_datasets
()¶ Returns information about the zoo datasets that have been downloaded.
- Returns
a dict mapping dataset names to (dataset_dir, ZooDatasetInfo) tuples
-
fiftyone.zoo.datasets.
download_zoo_dataset
(name_or_url, split=None, splits=None, overwrite=False, cleanup=True, **kwargs)¶ Downloads the specified dataset from the FiftyOne Dataset Zoo.
Any dataset splits that have already been downloaded are not re-downloaded, unless overwrite == True is specified.
Note
To download from a private GitHub repository that you have access to, provide your GitHub personal access token by setting the GITHUB_TOKEN environment variable.
- Parameters
name_or_url – the name of the zoo dataset to download, or the remote source to download it from, which can be:
a GitHub repo URL like https://github.com/<user>/<repo>
a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
a GitHub ref string like <user>/<repo>[/<ref>]
a publicly accessible URL of an archive (e.g., zip or tar) file
split (None) – a split to download, if applicable. Typical values are ("train", "validation", "test"). If neither split nor splits are provided, all available splits are downloaded. Consult the documentation for the ZooDataset you specified to see the supported splits
splits (None) – a list of splits to download, if applicable. Typical values are ("train", "validation", "test"). If neither split nor splits are provided, all available splits are downloaded. Consult the documentation for the ZooDataset you specified to see the supported splits
overwrite (False) – whether to overwrite any existing files
cleanup (True) – whether to clean up any temporary files generated during download
**kwargs – optional arguments for the ZooDataset constructor or the remote dataset’s download_and_prepare() method
- Returns
a tuple of
info: the ZooDatasetInfo for the dataset
dataset_dir: the directory containing the dataset
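The name_or_url argument accepts several distinct forms. As a rough illustration of how they differ, the sketch below sorts a string into one of those forms with regular expressions. The `classify_source` helper and its return labels are hypothetical, and this is not how FiftyOne parses the argument internally. It also assumes that zoo dataset names never contain a slash.

```python
import re

# Matches https://github.com/<user>/<repo>, optionally followed by
# /tree/<branch> or /commit/<commit>
_GITHUB_URL = re.compile(
    r"^https://github\.com/(?P<user>[^/]+)/(?P<repo>[^/]+)"
    r"(?:/(?:tree|commit)/(?P<ref>.+))?/?$"
)

# Matches <user>/<repo>[/<ref>]
_REF_STRING = re.compile(r"^[\w.-]+/[\w.-]+(?:/.+)?$")


def classify_source(name_or_url):
    """Returns a hypothetical label describing the form of ``name_or_url``."""
    match = _GITHUB_URL.match(name_or_url)
    if match:
        return "github_ref_url" if match.group("ref") else "github_repo_url"

    if re.match(r"^https?://", name_or_url):
        return "archive_url"  # any other URL is treated as an archive

    if _REF_STRING.match(name_or_url):
        return "github_ref_string"

    return "zoo_name"  # assumption: plain names contain no "/"


for value in ("my-dataset", "user/repo/main", "https://github.com/user/repo"):
    print(value, "->", classify_source(value))
```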
-
fiftyone.zoo.datasets.
load_zoo_dataset
(name_or_url, split=None, splits=None, label_field=None, dataset_name=None, download_if_necessary=True, drop_existing_dataset=False, persistent=False, overwrite=False, cleanup=True, progress=None, **kwargs)¶ Loads the specified dataset from the FiftyOne Dataset Zoo.
By default, the dataset will be downloaded if necessary.
Note
To download from a private GitHub repository that you have access to, provide your GitHub personal access token by setting the GITHUB_TOKEN environment variable.
If you do not specify a custom dataset_name and you have previously loaded the same zoo dataset and split(s) into FiftyOne, the existing dataset will be returned.
- Parameters
name_or_url – the name of the zoo dataset to load, or the remote source to load it from, which can be:
a GitHub repo URL like https://github.com/<user>/<repo>
a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
a GitHub ref string like <user>/<repo>[/<ref>]
a publicly accessible URL of an archive (e.g., zip or tar) file
split (None) – a split to load, if applicable. Typical values are ("train", "validation", "test"). If neither split nor splits are provided, all available splits are loaded. Consult the documentation for the ZooDataset you specified to see the supported splits
splits (None) – a list of splits to load, if applicable. Typical values are ("train", "validation", "test"). If neither split nor splits are provided, all available splits are loaded. Consult the documentation for the ZooDataset you specified to see the supported splits
label_field (None) – the label field (or prefix, if the dataset contains multiple label fields) in which to store the dataset’s labels. By default, this is "ground_truth" if the dataset contains a single label field. If the dataset contains multiple label fields and this value is not provided, the labels will be stored under dataset-specific field names
dataset_name (None) – an optional name to give the returned fiftyone.core.dataset.Dataset. By default, a name will be constructed based on the dataset and split(s) you are loading
download_if_necessary (True) – whether to download the dataset if it is not found in the specified dataset directory
drop_existing_dataset (False) – whether to drop an existing dataset with the same name if it exists
persistent (False) – whether the dataset should persist in the database after the session terminates
overwrite (False) – whether to overwrite any existing files if the dataset is to be downloaded
cleanup (True) – whether to clean up any temporary files generated during download
progress (None) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs – optional arguments to pass to the fiftyone.utils.data.importers.DatasetImporter constructor or the remote dataset’s load_dataset() method. If download_if_necessary == True, then kwargs can also contain arguments for download_zoo_dataset()
- Returns
a fiftyone.core.dataset.Dataset
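The split and splits parameters interact: when neither is given, every available split is used. The sketch below is a minimal model of that rule. The `resolve_splits` helper is hypothetical. Two of its behaviors are assumptions rather than documented behavior: that split and splits are merged when both are passed, and that an unsupported split raises a ValueError.

```python
def resolve_splits(split=None, splits=None, supported=None):
    """Returns the list of splits to use, or None for an unsplit dataset."""
    requested = []
    if split is not None:
        requested.append(split)

    if splits is not None:
        if isinstance(splits, str):
            splits = [splits]
        for name in splits:
            if name not in requested:  # avoid duplicates, keep order
                requested.append(name)

    if not requested:
        # Neither `split` nor `splits` given: use all available splits
        return list(supported) if supported else None

    if supported is not None:
        unknown = [name for name in requested if name not in supported]
        if unknown:
            raise ValueError(
                "Unsupported splits %s; supported splits are %s"
                % (unknown, list(supported))
            )

    return requested


print(resolve_splits(supported=("train", "validation", "test")))
print(resolve_splits(split="train", splits=["test"], supported=("train", "test")))
```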
-
fiftyone.zoo.datasets.
find_zoo_dataset
(name_or_url, split=None)¶ Returns the directory containing the given zoo dataset.
If a split is provided, the path to the dataset split is returned; otherwise, the path to the root directory is returned.
The dataset must be downloaded. Use download_zoo_dataset() to download datasets.
- Parameters
name_or_url – the name of the zoo dataset or its remote source, which can be:
a GitHub repo URL like https://github.com/<user>/<repo>
a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
a GitHub ref string like <user>/<repo>[/<ref>]
a publicly accessible URL of an archive (e.g., zip or tar) file
split (None) – a specific split to locate
- Returns
the directory containing the dataset or split
- Raises
ValueError – if the dataset or split does not exist or has not been downloaded
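The lookup rule described for find_zoo_dataset() (return the split directory when a split is given, otherwise the root directory, and raise ValueError when either is missing) can be sketched with the standard library alone. The `find_dataset_dir` helper is hypothetical, and the `<root>/<name>/<split>` layout is an assumption made for illustration.

```python
import os
import tempfile


def find_dataset_dir(root_dir, name, split=None):
    """Returns the directory of a downloaded dataset or one of its splits.

    Assumes a hypothetical ``<root_dir>/<name>/<split>`` layout.
    """
    dataset_dir = os.path.join(root_dir, name)
    if not os.path.isdir(dataset_dir):
        raise ValueError("Dataset '%s' has not been downloaded" % name)

    if split is None:
        return dataset_dir

    split_dir = os.path.join(dataset_dir, split)
    if not os.path.isdir(split_dir):
        raise ValueError(
            "Split '%s' of dataset '%s' has not been downloaded" % (split, name)
        )

    return split_dir


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "my-dataset", "train"))
    print(find_dataset_dir(root, "my-dataset"))
    print(find_dataset_dir(root, "my-dataset", split="train"))
```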
-
fiftyone.zoo.datasets.
load_zoo_dataset_info
(name_or_url)¶ Loads the ZooDatasetInfo for the specified zoo dataset.
The dataset must be downloaded. Use download_zoo_dataset() to download datasets.
- Parameters
name_or_url – the name of the zoo dataset or its remote source, which can be:
a GitHub repo URL like https://github.com/<user>/<repo>
a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
a GitHub ref string like <user>/<repo>[/<ref>]
a publicly accessible URL of an archive (e.g., zip or tar) file
- Returns
the ZooDatasetInfo for the dataset
- Raises
ValueError – if the dataset has not been downloaded
-
fiftyone.zoo.datasets.
get_zoo_dataset
(name_or_url, overwrite=False, **kwargs)¶ Returns the ZooDataset instance for the given dataset.
If the dataset is available from multiple sources, the default source is used.
- Parameters
name_or_url – the name of the zoo dataset, or its remote source, which can be:
a GitHub repo URL like https://github.com/<user>/<repo>
a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
a GitHub ref string like <user>/<repo>[/<ref>]
a publicly accessible URL of an archive (e.g., zip or tar) file
overwrite (False) – whether to overwrite existing metadata if it has already been downloaded. Only applicable when name_or_url is a remote source
**kwargs – optional arguments for ZooDataset
- Returns
the ZooDataset instance
-
fiftyone.zoo.datasets.
delete_zoo_dataset
(name_or_url, split=None)¶ Deletes the zoo dataset from local disk, if necessary.
If a split is provided, only that split is deleted.
- Parameters
name_or_url – the name of the zoo dataset, or its remote source, which can be:
a GitHub repo URL like https://github.com/<user>/<repo>
a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
a GitHub ref string like <user>/<repo>[/<ref>]
a publicly accessible URL of an archive (e.g., zip or tar) file
split (None) – a specific split to delete
-
class
fiftyone.zoo.datasets.
ZooDatasetInfo
(zoo_dataset, dataset_type, num_samples, downloaded_splits=None, parameters=None, classes=None)¶ Bases: eta.core.serial.Serializable
Class containing info about a dataset in the FiftyOne Dataset Zoo.
- Parameters
zoo_dataset – the ZooDataset instance for the dataset
dataset_type – the fiftyone.types.Dataset type of the dataset
num_samples – the total number of samples in all downloaded splits of the dataset
downloaded_splits (None) – a dict of ZooDatasetSplitInfo instances describing the downloaded splits of the dataset, if applicable
parameters (None) – a dict of parameters for the dataset
classes (None) – a list of class label strings
Attributes:
name – The name of the dataset.
zoo_dataset – The fully-qualified class string for the ZooDataset of the dataset.
dataset_type – The fully-qualified class string of the fiftyone.types.Dataset type, if any.
supported_splits – A tuple of supported splits for the dataset, or None if the dataset does not have splits.
url – The dataset’s URL, or None if it is not remotely-sourced.
Methods:
get_zoo_dataset() – Returns the ZooDataset instance for the dataset.
get_dataset_type() – Returns the fiftyone.types.Dataset type instance for the dataset.
is_split_downloaded(split) – Whether the given dataset split is downloaded.
add_split(split_info) – Adds the split to the dataset.
remove_split(split) – Removes the split from the dataset.
attributes() – Returns a list of class attributes to be serialized.
from_dict(d) – Loads a ZooDatasetInfo from a JSON dictionary.
from_json(json_path[, zoo_dataset, upgrade, …]) – Loads a ZooDatasetInfo from a JSON file on disk.
copy() – Returns a deep copy of the object.
custom_attributes([dynamic, private]) – Returns a customizable list of class attributes.
from_str(s, *args, **kwargs) – Constructs a Serializable object from a JSON string.
get_class_name() – Returns the fully-qualified class name string of this object.
serialize([reflective]) – Serializes the object into a dictionary.
to_str([pretty_print]) – Returns a string representation of this object.
write_json(path[, pretty_print]) – Serializes the object and writes it to disk.
-
property
name
¶ The name of the dataset.
-
property
zoo_dataset
¶ The fully-qualified class string for the ZooDataset of the dataset.
-
property
dataset_type
¶ The fully-qualified class string of the fiftyone.types.Dataset type, if any.
-
property
supported_splits
¶ A tuple of supported splits for the dataset, or None if the dataset does not have splits.
-
property
url
¶ The dataset’s URL, or None if it is not remotely-sourced.
-
get_zoo_dataset
()¶ Returns the ZooDataset instance for the dataset.
- Returns
a ZooDataset instance
-
get_dataset_type
()¶ Returns the fiftyone.types.Dataset type instance for the dataset.
- Returns
a fiftyone.types.Dataset instance
-
is_split_downloaded
(split)¶ Whether the given dataset split is downloaded.
- Parameters
split – the dataset split
- Returns
True/False
-
add_split
(split_info)¶ Adds the split to the dataset.
- Parameters
split_info – a ZooDatasetSplitInfo
-
remove_split
(split)¶ Removes the split from the dataset.
- Parameters
split – the name of the split
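The split bookkeeping methods above (is_split_downloaded(), add_split(), remove_split()) can be pictured with a small stand-in. The `ToySplitInfo` and `ToyDatasetInfo` classes below are hypothetical simplifications, not FiftyOne's classes. In particular, deriving num_samples as the sum over downloaded splits is an assumption based on the parameter description ("the total number of samples in all downloaded splits").

```python
class ToySplitInfo:
    """Hypothetical stand-in holding a split name and its sample count."""

    def __init__(self, split, num_samples):
        self.split = split
        self.num_samples = num_samples


class ToyDatasetInfo:
    """Hypothetical stand-in that tracks which splits are downloaded."""

    def __init__(self, name, downloaded_splits=None):
        self.name = name
        self.downloaded_splits = dict(downloaded_splits or {})

    @property
    def num_samples(self):
        # Assumption: the total is the sum over all downloaded splits
        return sum(s.num_samples for s in self.downloaded_splits.values())

    def is_split_downloaded(self, split):
        return split in self.downloaded_splits

    def add_split(self, split_info):
        self.downloaded_splits[split_info.split] = split_info

    def remove_split(self, split):
        self.downloaded_splits.pop(split)


info = ToyDatasetInfo("my-dataset")
info.add_split(ToySplitInfo("train", 100))
info.add_split(ToySplitInfo("test", 20))
print(info.num_samples, info.is_split_downloaded("test"))

info.remove_split("test")
print(info.num_samples, info.is_split_downloaded("test"))
```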
-
attributes
()¶ Returns a list of class attributes to be serialized.
- Returns
a list of class attributes
-
classmethod
from_dict
(d)¶ Loads a ZooDatasetInfo from a JSON dictionary.
- Parameters
d – a JSON dictionary
- Returns
a ZooDatasetInfo
-
classmethod
from_json
(json_path, zoo_dataset=None, upgrade=False, warn_deprecated=False)¶ Loads a ZooDatasetInfo from a JSON file on disk.
- Parameters
json_path – path to JSON file
zoo_dataset (None) – an existing ZooDataset instance
upgrade (False) – whether to upgrade the JSON file on disk if any migrations were necessary
warn_deprecated (False) – whether to issue a warning if the dataset has a deprecated format
- Returns
a ZooDatasetInfo
-
copy
()¶ Returns a deep copy of the object.
- Returns
a Serializable instance
-
custom_attributes
(dynamic=False, private=False)¶ Returns a customizable list of class attributes.
By default, all attributes in vars(self) are returned, minus private attributes (those starting with “_”).
- Parameters
dynamic – whether to include dynamic properties, e.g., those defined by getter/setter methods or the @property decorator. By default, this is False
private – whether to include private properties, i.e., those starting with “_”. By default, this is False
- Returns
a list of class attributes
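The default behavior described above (everything in vars(self), minus names starting with an underscore) fits in a few lines. The `Example` class below is a hypothetical illustration and omits the dynamic option; it is not the eta implementation.

```python
class Example:
    """Hypothetical class used to illustrate attribute filtering."""

    def __init__(self):
        self.name = "my-dataset"
        self.num_samples = 3
        self._cache = {}  # private: excluded by default

    def custom_attributes(self, private=False):
        attrs = list(vars(self))
        if not private:
            attrs = [a for a in attrs if not a.startswith("_")]
        return attrs


obj = Example()
print(obj.custom_attributes())
print(obj.custom_attributes(private=True))
```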
-
classmethod
from_str
(s, *args, **kwargs)¶ Constructs a Serializable object from a JSON string.
Subclasses may override this method, but, by default, this method simply parses the string and calls from_dict(), which subclasses must implement.
- Parameters
s – a JSON string representation of a Serializable object
*args – optional positional arguments for self.from_dict()
**kwargs – optional keyword arguments for self.from_dict()
- Returns
an instance of the Serializable class
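The relationship between from_str() and from_dict() can be shown with a simplified stand-in for the serialization pattern. The `MiniSerializable` and `MiniSplitInfo` classes are hypothetical and only mimic the documented contract: from_str() parses the JSON string and delegates to from_dict(), which each subclass implements.

```python
import json


class MiniSerializable:
    """Hypothetical, simplified stand-in for a serializable base class."""

    def attributes(self):
        return [a for a in vars(self) if not a.startswith("_")]

    def serialize(self):
        return {a: getattr(self, a) for a in self.attributes()}

    def to_str(self, pretty_print=True):
        indent = 4 if pretty_print else None
        return json.dumps(self.serialize(), indent=indent)

    @classmethod
    def from_dict(cls, d):
        raise NotImplementedError("subclasses must implement from_dict()")

    @classmethod
    def from_str(cls, s, *args, **kwargs):
        # Parse the string, then delegate to the subclass's from_dict()
        return cls.from_dict(json.loads(s), *args, **kwargs)


class MiniSplitInfo(MiniSerializable):
    def __init__(self, split, num_samples):
        self.split = split
        self.num_samples = num_samples

    @classmethod
    def from_dict(cls, d):
        return cls(d["split"], d["num_samples"])


original = MiniSplitInfo("train", 100)
restored = MiniSplitInfo.from_str(original.to_str())
print(restored.serialize())
```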
-
classmethod
get_class_name
()¶ Returns the fully-qualified class name string of this object.
-
serialize
(reflective=False)¶ Serializes the object into a dictionary.
Serialization is applied recursively to all attributes in the object, including element-wise serialization of lists and dictionary values.
- Parameters
reflective – whether to include reflective attributes when serializing the object. By default, this is False
- Returns
a JSON dictionary representation of the object
-
to_str
(pretty_print=True, **kwargs)¶ Returns a string representation of this object.
- Parameters
pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is True
**kwargs – optional keyword arguments for self.serialize()
- Returns
a string representation of the object
-
write_json
(path, pretty_print=False, **kwargs)¶ Serializes the object and writes it to disk.
- Parameters
path – the output path
pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is False
**kwargs – optional keyword arguments for self.serialize()
-
class
fiftyone.zoo.datasets.
ZooDatasetSplitInfo
(split, num_samples)¶ Bases:
eta.core.serial.Serializable
Class containing info about a split of a dataset in the FiftyOne Dataset Zoo.
- Parameters
split – the name of the split
num_samples – the number of samples in the split
Methods:
attributes() – Returns a list of class attributes to be serialized.
from_dict(d) – Loads a ZooDatasetSplitInfo from a JSON dictionary.
copy() – Returns a deep copy of the object.
custom_attributes([dynamic, private]) – Returns a customizable list of class attributes.
from_json(path, *args, **kwargs) – Constructs a Serializable object from a JSON file.
from_str(s, *args, **kwargs) – Constructs a Serializable object from a JSON string.
get_class_name() – Returns the fully-qualified class name string of this object.
serialize([reflective]) – Serializes the object into a dictionary.
to_str([pretty_print]) – Returns a string representation of this object.
write_json(path[, pretty_print]) – Serializes the object and writes it to disk.
-
attributes
()¶ Returns a list of class attributes to be serialized.
- Returns
a list of class attributes
-
classmethod
from_dict
(d)¶ Loads a ZooDatasetSplitInfo from a JSON dictionary.
- Parameters
d – a JSON dictionary
- Returns
a ZooDatasetSplitInfo
-
copy
()¶ Returns a deep copy of the object.
- Returns
a Serializable instance
-
custom_attributes
(dynamic=False, private=False)¶ Returns a customizable list of class attributes.
By default, all attributes in vars(self) are returned, minus private attributes (those starting with “_”).
- Parameters
dynamic – whether to include dynamic properties, e.g., those defined by getter/setter methods or the @property decorator. By default, this is False
private – whether to include private properties, i.e., those starting with “_”. By default, this is False
- Returns
a list of class attributes
-
classmethod
from_json
(path, *args, **kwargs)¶ Constructs a Serializable object from a JSON file.
Subclasses may override this method, but, by default, this method simply reads the JSON and calls from_dict(), which subclasses must implement.
- Parameters
path – the path to the JSON file on disk
*args – optional positional arguments for self.from_dict()
**kwargs – optional keyword arguments for self.from_dict()
- Returns
an instance of the Serializable class
-
classmethod
from_str
(s, *args, **kwargs)¶ Constructs a Serializable object from a JSON string.
Subclasses may override this method, but, by default, this method simply parses the string and calls from_dict(), which subclasses must implement.
- Parameters
s – a JSON string representation of a Serializable object
*args – optional positional arguments for self.from_dict()
**kwargs – optional keyword arguments for self.from_dict()
- Returns
an instance of the Serializable class
-
classmethod
get_class_name
()¶ Returns the fully-qualified class name string of this object.
-
serialize
(reflective=False)¶ Serializes the object into a dictionary.
Serialization is applied recursively to all attributes in the object, including element-wise serialization of lists and dictionary values.
- Parameters
reflective – whether to include reflective attributes when serializing the object. By default, this is False
- Returns
a JSON dictionary representation of the object
-
to_str
(pretty_print=True, **kwargs)¶ Returns a string representation of this object.
- Parameters
pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is True
**kwargs – optional keyword arguments for self.serialize()
- Returns
a string representation of the object
-
write_json
(path, pretty_print=False, **kwargs)¶ Serializes the object and writes it to disk.
- Parameters
path – the output path
pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is False
**kwargs – optional keyword arguments for self.serialize()
-
class
fiftyone.zoo.datasets.
ZooDataset
¶ Bases:
object
Base class for datasets made available in the FiftyOne Dataset Zoo.
Attributes:
name – The name of the dataset.
is_remote – Whether the dataset is remotely-sourced.
tags – A tuple of tags for the dataset.
has_tags – Whether the dataset has tags.
parameters – An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
supported_splits – A tuple of supported splits for the dataset, or None if the dataset does not have splits.
has_splits – Whether the dataset has splits.
has_patches – Whether the dataset has patches that may need to be applied to already downloaded files.
supports_partial_downloads – Whether the dataset supports downloading partial subsets of its splits.
requires_manual_download – Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
importer_kwargs – A dict of default kwargs to pass to this dataset’s fiftyone.utils.data.importers.DatasetImporter.
Methods:
has_tag(tag) – Whether the dataset has the given tag.
has_split(split) – Whether the dataset has the given split.
get_split_dir(dataset_dir, split) – Returns the directory for the given split of the dataset.
has_info(dataset_dir) – Determines whether the directory contains ZooDatasetInfo.
load_info(dataset_dir[, upgrade, …]) – Loads the ZooDatasetInfo from the given dataset directory.
get_info_path(dataset_dir) – Returns the path to the ZooDatasetInfo for the dataset.
download_and_prepare(dataset_dir[, split, …]) – Downloads the dataset and prepares it for use.
-
property
name
¶ The name of the dataset.
-
property
is_remote
¶ Whether the dataset is remotely-sourced.
-
property
tags
¶ A tuple of tags for the dataset.
-
property
has_tags
¶ Whether the dataset has tags.
-
property
parameters
¶ An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
-
property
supported_splits
¶ A tuple of supported splits for the dataset, or None if the dataset does not have splits.
-
property
has_splits
¶ Whether the dataset has splits.
-
property
has_patches
¶ Whether the dataset has patches that may need to be applied to already downloaded files.
-
property
supports_partial_downloads
¶ Whether the dataset supports downloading partial subsets of its splits.
-
property
requires_manual_download
¶ Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
-
property
importer_kwargs
¶ A dict of default kwargs to pass to this dataset’s fiftyone.utils.data.importers.DatasetImporter.
-
has_tag
(tag)¶ Whether the dataset has the given tag.
- Parameters
tag – the tag
- Returns
True/False
-
has_split
(split)¶ Whether the dataset has the given split.
- Parameters
split – the dataset split
- Returns
True/False
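The boolean helpers above follow directly from the tags and supported_splits properties. The sketch below shows one way a subclass might supply those properties and inherit the helpers. `ToyZooDataset` is a hypothetical stand-in, not FiftyOne's base class. Treating a value of None as "no tags" or "no splits" is an assumption drawn from the property descriptions.

```python
class ToyZooDataset:
    """Hypothetical, simplified stand-in for a zoo dataset base class."""

    @property
    def tags(self):
        return None  # subclasses may return a tuple of tags

    @property
    def supported_splits(self):
        return None  # subclasses may return a tuple of split names

    @property
    def has_tags(self):
        return self.tags is not None

    @property
    def has_splits(self):
        return self.supported_splits is not None

    def has_tag(self, tag):
        return self.has_tags and tag in self.tags

    def has_split(self, split):
        return self.has_splits and split in self.supported_splits


class SplitDataset(ToyZooDataset):
    @property
    def tags(self):
        return ("image", "classification")

    @property
    def supported_splits(self):
        return ("train", "test")


class UnsplitDataset(ToyZooDataset):
    pass


print(SplitDataset().has_split("train"), SplitDataset().has_split("validation"))
print(UnsplitDataset().has_splits, UnsplitDataset().has_split("train"))
```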
-
get_split_dir
(dataset_dir, split)¶ Returns the directory for the given split of the dataset.
- Parameters
dataset_dir – the dataset directory
split – the dataset split
- Returns
the directory that will/does hold the specified split
-
static
has_info
(dataset_dir)¶ Determines whether the directory contains ZooDatasetInfo.
- Parameters
dataset_dir – the dataset directory
- Returns
True/False
-
static
load_info
(dataset_dir, upgrade=True, warn_deprecated=False)¶ Loads the ZooDatasetInfo from the given dataset directory.
- Parameters
dataset_dir – the directory in which to construct the dataset
upgrade (True) – whether to upgrade the JSON file on disk if any migrations were necessary
warn_deprecated (False) – whether to issue a warning if the dataset has a deprecated format
- Returns
the ZooDatasetInfo for the dataset
-
static
get_info_path
(dataset_dir)¶ Returns the path to the ZooDatasetInfo for the dataset.
- Parameters
dataset_dir – the dataset directory
- Returns
the path to the ZooDatasetInfo
-
download_and_prepare
(dataset_dir, split=None, splits=None, cleanup=True)¶ Downloads the dataset and prepares it for use.
If the requested splits have already been downloaded, they are not re-downloaded.
- Parameters
dataset_dir – the directory in which to construct the dataset
split (None) – a split to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
splits (None) – a list of splits to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
cleanup (True) – whether to clean up any temporary files generated during download
- Returns
the ZooDatasetInfo for the dataset
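The skip rule ("if the requested splits have already been downloaded, they are not re-downloaded") comes down to filtering the request against what is already on disk. The `splits_to_download` helper below is a hypothetical sketch. Its overwrite flag models the overwrite argument of download_zoo_dataset(), not a parameter of download_and_prepare().

```python
def splits_to_download(requested, downloaded, overwrite=False):
    """Returns the requested splits that still need to be downloaded."""
    if overwrite:
        return list(requested)  # re-download everything that was requested

    return [split for split in requested if split not in downloaded]


print(splits_to_download(["train", "test"], downloaded={"train"}))
print(splits_to_download(["train", "test"], downloaded={"train"}, overwrite=True))
```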
-
class
fiftyone.zoo.datasets.
RemoteZooDataset
(dataset_dir, url=None, **kwargs)¶ Bases: fiftyone.zoo.datasets.ZooDataset
Class for working with remotely-sourced datasets that are compatible with the FiftyOne Dataset Zoo.
- Parameters
dataset_dir – the dataset’s local directory, which must contain a valid dataset YAML file
url (None) – the dataset’s remote source, which can be:
a GitHub repo URL like https://github.com/<user>/<repo>
a GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
a GitHub ref string like <user>/<repo>[/<ref>]
a publicly accessible URL of an archive (e.g., zip or tar) file
This is explicitly provided rather than relying on the YAML file’s url property in case the caller has specified a particular branch or commit
**kwargs – optional keyword arguments for the dataset’s download_and_prepare() and/or load_dataset() methods
Attributes:
name – The name of the dataset.
is_remote – Whether the dataset is remotely-sourced.
tags – A tuple of tags for the dataset.
supported_splits – A tuple of supported splits for the dataset, or None if the dataset does not have splits.
supports_partial_downloads – Whether the dataset supports downloading partial subsets of its splits.
has_patches – Whether the dataset has patches that may need to be applied to already downloaded files.
has_splits – Whether the dataset has splits.
has_tags – Whether the dataset has tags.
importer_kwargs – A dict of default kwargs to pass to this dataset’s fiftyone.utils.data.importers.DatasetImporter.
parameters – An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
requires_manual_download – Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
Methods:
download_and_prepare(dataset_dir[, split, …]) – Downloads the dataset and prepares it for use.
get_info_path(dataset_dir) – Returns the path to the ZooDatasetInfo for the dataset.
get_split_dir(dataset_dir, split) – Returns the directory for the given split of the dataset.
has_info(dataset_dir) – Determines whether the directory contains ZooDatasetInfo.
has_split(split) – Whether the dataset has the given split.
has_tag(tag) – Whether the dataset has the given tag.
load_info(dataset_dir[, upgrade, …]) – Loads the ZooDatasetInfo from the given dataset directory.
-
property
metadata
¶
-
property
name
¶ The name of the dataset.
-
property
url
¶
-
property
is_remote
¶ Whether the dataset is remotely-sourced.
-
property
version
¶
-
property
source
¶
-
property
license
¶
-
property
description
¶
-
property
fiftyone_version
¶
-
property
tags
¶ A tuple of tags for the dataset.
-
property
supported_splits
¶ A tuple of supported splits for the dataset, or None if the dataset does not have splits.
-
property
supports_partial_downloads
¶ Whether the dataset supports downloading partial subsets of its splits.
-
property
size_samples
¶
-
download_and_prepare
(dataset_dir, split=None, splits=None, cleanup=True)¶ Downloads the dataset and prepares it for use.
If the requested splits have already been downloaded, they are not re-downloaded.
- Parameters
dataset_dir – the directory in which to construct the dataset
split (None) – a split to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
splits (None) – a list of splits to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
cleanup (True) – whether to clean up any temporary files generated during download
- Returns
the ZooDatasetInfo for the dataset
-
static
get_info_path
(dataset_dir)¶ Returns the path to the ZooDatasetInfo for the dataset.
- Parameters
dataset_dir – the dataset directory
- Returns
the path to the ZooDatasetInfo
-
get_split_dir
(dataset_dir, split)¶ Returns the directory for the given split of the dataset.
- Parameters
dataset_dir – the dataset directory
split – the dataset split
- Returns
the directory that will/does hold the specified split
-
static
has_info
(dataset_dir)¶ Determines whether the directory contains ZooDatasetInfo.
- Parameters
dataset_dir – the dataset directory
- Returns
True/False
-
property
has_patches
¶ Whether the dataset has patches that may need to be applied to already downloaded files.
-
has_split
(split)¶ Whether the dataset has the given split.
- Parameters
split – the dataset split
- Returns
True/False
-
property
has_splits
¶ Whether the dataset has splits.
-
has_tag
(tag)¶ Whether the dataset has the given tag.
- Parameters
tag – the tag
- Returns
True/False
-
property
has_tags
¶ Whether the dataset has tags.
-
property
importer_kwargs
¶ A dict of default kwargs to pass to this dataset’s fiftyone.utils.data.importers.DatasetImporter.
-
static
load_info
(dataset_dir, upgrade=True, warn_deprecated=False)¶ Loads the ZooDatasetInfo from the given dataset directory.
- Parameters
dataset_dir – the directory in which to construct the dataset
upgrade (True) – whether to upgrade the JSON file on disk if any migrations were necessary
warn_deprecated (False) – whether to issue a warning if the dataset has a deprecated format
- Returns
the ZooDatasetInfo for the dataset
-
property
parameters
¶ An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
-
property
requires_manual_download
¶ Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
-
class
fiftyone.zoo.datasets.
DeprecatedZooDataset
¶ Bases:
fiftyone.zoo.datasets.ZooDataset
Class representing a zoo dataset that no longer exists in the FiftyOne Dataset Zoo.
Attributes:
name – The name of the dataset.
supported_splits – A tuple of supported splits for the dataset, or None if the dataset does not have splits.
has_patches – Whether the dataset has patches that may need to be applied to already downloaded files.
has_splits – Whether the dataset has splits.
has_tags – Whether the dataset has tags.
importer_kwargs – A dict of default kwargs to pass to this dataset’s fiftyone.utils.data.importers.DatasetImporter.
is_remote – Whether the dataset is remotely-sourced.
parameters – An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
requires_manual_download – Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
supports_partial_downloads – Whether the dataset supports downloading partial subsets of its splits.
tags – A tuple of tags for the dataset.
Methods:
download_and_prepare(dataset_dir[, split, …]) – Downloads the dataset and prepares it for use.
get_info_path(dataset_dir) – Returns the path to the ZooDatasetInfo for the dataset.
get_split_dir(dataset_dir, split) – Returns the directory for the given split of the dataset.
has_info(dataset_dir) – Determines whether the directory contains ZooDatasetInfo.
has_split(split) – Whether the dataset has the given split.
has_tag(tag) – Whether the dataset has the given tag.
load_info(dataset_dir[, upgrade, …]) – Loads the ZooDatasetInfo from the given dataset directory.
-
property
name
¶ The name of the dataset.
-
property
supported_splits
¶ A tuple of supported splits for the dataset, or None if the dataset does not have splits.
-
download_and_prepare
(dataset_dir, split=None, splits=None, cleanup=True)¶ Downloads the dataset and prepares it for use.
If the requested splits have already been downloaded, they are not re-downloaded.
- Parameters
dataset_dir – the directory in which to construct the dataset
split (None) – a split to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
splits (None) – a list of splits to download, if applicable. If neither split nor splits are provided, the full dataset is downloaded
cleanup (True) – whether to clean up any temporary files generated during download
- Returns
the ZooDatasetInfo for the dataset
-
static
get_info_path
(dataset_dir)¶ Returns the path to the ZooDatasetInfo for the dataset.
- Parameters
dataset_dir – the dataset directory
- Returns
the path to the ZooDatasetInfo
-
get_split_dir
(dataset_dir, split)¶ Returns the directory for the given split of the dataset.
- Parameters
dataset_dir – the dataset directory
split – the dataset split
- Returns
the directory that will/does hold the specified split
-
static
has_info
(dataset_dir)¶ Determines whether the directory contains ZooDatasetInfo.
- Parameters
dataset_dir – the dataset directory
- Returns
True/False
-
property
has_patches
¶ Whether the dataset has patches that may need to be applied to already downloaded files.
-
has_split
(split)¶ Whether the dataset has the given split.
- Parameters
split – the dataset split
- Returns
True/False
-
property
has_splits
¶ Whether the dataset has splits.
-
has_tag
(tag)¶ Whether the dataset has the given tag.
- Parameters
tag – the tag
- Returns
True/False
-
property
has_tags
¶ Whether the dataset has tags.
-
property
importer_kwargs
¶ A dict of default kwargs to pass to this dataset’s fiftyone.utils.data.importers.DatasetImporter.
-
property
is_remote
¶ Whether the dataset is remotely-sourced.
-
static
load_info
(dataset_dir, upgrade=True, warn_deprecated=False)¶ Loads the ZooDatasetInfo from the given dataset directory.
- Parameters
dataset_dir – the directory in which to construct the dataset
upgrade (True) – whether to upgrade the JSON file on disk if any migrations were necessary
warn_deprecated (False) – whether to issue a warning if the dataset has a deprecated format
- Returns
the ZooDatasetInfo for the dataset
-
property
parameters
¶ An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
-
property
requires_manual_download
¶ Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
-
property
supports_partial_downloads
¶ Whether the dataset supports downloading partial subsets of its splits.
-
property
tags
¶ A tuple of tags for the dataset.