fiftyone.utils.openimages
Utilities for working with the Open Images <https://storage.googleapis.com/openimages/web/index.html> dataset.
Classes:
OpenImagesDatasetImporter – Base class for importing datasets in Open Images format.
OpenImagesV6DatasetImporter – Base class for importing datasets in Open Images V6 format.
OpenImagesV7DatasetImporter – Base class for importing datasets in Open Images V7 format.
Functions:
get_attributes – Gets the list of relationship attributes in the Open Images dataset.
get_classes – Gets the boxable classes that exist in classifications, detections, points, and relationships in the Open Images dataset.
get_segmentation_classes – Gets the list of classes (350) that are labeled with segmentations in the Open Images V6/V7 dataset.
get_point_classes – Gets the list of classes that are labeled with points in the Open Images V7 dataset.
download_open_images_split – Utility that downloads full or partial splits of the Open Images dataset.
class fiftyone.utils.openimages.OpenImagesDatasetImporter(dataset_dir, label_types=None, classes=None, attrs=None, image_ids=None, include_id=True, only_matching=False, load_hierarchy=True, shuffle=False, seed=None, max_samples=None)
Bases: fiftyone.utils.data.importers.LabeledImageDatasetImporter
Base class for importing datasets in Open Images format.
See fiftyone.types.OpenImagesDataset for format details.
Parameters
dataset_dir – the dataset directory
label_types (None) – a label type or list of label types to load. The supported values are ("detections", "classifications", "points", "relationships", "segmentations"). “points” are only supported for open-images-v7. By default, all label types supported by the dataset version are loaded
classes (None) – a string or list of strings specifying required classes to load. If provided, only samples containing at least one instance of a specified class will be loaded
attrs (None) – a string or list of strings specifying required relationship attributes to load. Only applicable when label_types includes “relationships”. If provided, only samples containing at least one instance of a specified attribute will be loaded
image_ids (None) – an optional list of specific image IDs to load. Can be provided in any of the following formats:
- a list of <image-id> strings
- a list of <split>/<image-id> strings
- the path to a text (newline-separated), JSON, or CSV file containing the list of image IDs to load in either of the first two formats
include_id (True) – whether to load the Open Images ID for each sample along with the labels
only_matching (False) – whether to only load labels that match the classes or attrs requirements that you provide (True), or to load all labels for samples that match the requirements (False)
load_hierarchy (True) – whether to load the classes hierarchy and add it to the dataset’s info dictionary
shuffle (False) – whether to randomly shuffle the order in which the samples are imported
seed (None) – a random seed to use when shuffling
max_samples (None) – a maximum number of samples to load. If label_types, classes, and/or attrs are also specified, first priority will be given to samples that contain all of the specified label types, classes, and/or attributes, followed by samples that contain at least one of the specified label types or classes. The actual number of samples loaded may be less than this maximum value if the dataset does not contain sufficient samples matching your requirements. By default, all matching samples are loaded
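The prioritization described for max_samples can be illustrated with a small standalone sketch. This is not FiftyOne's implementation: the function name prioritize_samples and the {image ID: set of class names} input are hypothetical, and only the classes requirement is modeled.

```python
def prioritize_samples(sample_classes, classes, max_samples=None):
    """Order image IDs so that samples containing all requested classes come
    first, followed by samples containing at least one of them.

    sample_classes is a hypothetical dict mapping image ID -> set of class
    names present in that image.
    """
    required = set(classes)
    all_match = []
    any_match = []
    for image_id, present in sample_classes.items():
        if required <= present:
            all_match.append(image_id)  # contains every requested class
        elif required & present:
            any_match.append(image_id)  # contains at least one requested class
    ordered = all_match + any_match
    if max_samples is not None:
        ordered = ordered[:max_samples]  # may return fewer than max_samples
    return ordered


# Made-up toy data, not real Open Images IDs
samples = {
    "a": {"Cat", "Dog"},
    "b": {"Cat"},
    "c": {"Bird"},
    "d": {"Dog", "Cat", "Bird"},
}
selected = prioritize_samples(samples, ["Cat", "Dog"], max_samples=3)
print(selected)
```

Samples that match none of the requested classes (here "c") are never selected, which is why fewer than max_samples samples can be returned.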
Attributes:
has_dataset_info – Whether this importer produces a dataset info dictionary.
has_image_metadata – Whether this importer produces fiftyone.core.metadata.ImageMetadata instances for each image.
label_cls – The fiftyone.core.labels.Label class(es) returned by this importer.
Methods:
setup() – Performs any necessary setup before importing the first sample in the dataset.
get_dataset_info() – Returns the dataset info for the dataset.
close(*args) – Performs any necessary actions after the last sample has been imported.
property has_dataset_info
Whether this importer produces a dataset info dictionary.
property has_image_metadata
Whether this importer produces fiftyone.core.metadata.ImageMetadata instances for each image.
property label_cls
The fiftyone.core.labels.Label class(es) returned by this importer.
This can be any of the following:
- a fiftyone.core.labels.Label class. In this case, the importer is guaranteed to return labels of this type
- a list or tuple of fiftyone.core.labels.Label classes. In this case, the importer can produce a single label field of any of these types
- a dict mapping keys to fiftyone.core.labels.Label classes. In this case, the importer will return label dictionaries with keys and value-types specified by this dictionary. Not all keys need be present in the imported labels
- None. In this case, the importer makes no guarantees about the labels that it may return
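The four forms that label_cls can take can be distinguished as in the sketch below. The helper describe_label_cls and the two placeholder classes are hypothetical stand-ins so that the example runs without FiftyOne installed; they are not part of the library.

```python
class Detections:  # hypothetical stand-in for a Label subclass
    pass


class Classifications:  # hypothetical stand-in for a Label subclass
    pass


def describe_label_cls(label_cls):
    """Summarize what a given label_cls value promises about imported labels."""
    if label_cls is None:
        return "no guarantees"
    if isinstance(label_cls, dict):
        # Label dictionaries; not every key has to be present on each sample
        return "label dict with optional keys: " + ", ".join(sorted(label_cls))
    if isinstance(label_cls, (list, tuple)):
        # A single label field whose type is one of the listed classes
        return "single field, one of: " + ", ".join(c.__name__ for c in label_cls)
    # A single class: every returned label is of this type
    return "always " + label_cls.__name__


print(describe_label_cls(Detections))
print(describe_label_cls([Detections, Classifications]))
print(describe_label_cls({"detections": Detections, "positive_labels": Classifications}))
print(describe_label_cls(None))
```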
setup()
Performs any necessary setup before importing the first sample in the dataset.
This method is called when the importer’s context manager interface is entered, via DatasetImporter.__enter__().
get_dataset_info()
Returns the dataset info for the dataset.
By convention, this method should be called after all samples in the dataset have been imported.
Returns
a dict of dataset info
close(*args)
Performs any necessary actions after the last sample has been imported.
This method is called when the importer’s context manager interface is exited, via DatasetImporter.__exit__().
Parameters
*args – the arguments to DatasetImporter.__exit__()
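The relationship between setup(), close(), and the context manager interface can be sketched with a toy class. ToyImporter below is a hypothetical stand-in, not a FiftyOne class; it only mirrors the call order described above.

```python
class ToyImporter:
    """Toy stand-in that mimics the setup/close lifecycle of an importer."""

    def __init__(self, items):
        self._items = items
        self.events = []

    def __enter__(self):
        self.setup()  # setup runs when the context is entered
        return self

    def __exit__(self, *args):
        self.close(*args)  # close receives the arguments passed to __exit__

    def setup(self):
        self.events.append("setup")

    def close(self, *args):
        self.events.append("close")

    def get_dataset_info(self):
        return {"num_items": len(self._items)}

    def __iter__(self):
        return iter(self._items)


importer = ToyImporter(["img1", "img2"])
with importer:
    imported = list(importer)

# By convention, dataset info is requested after all samples are imported
info = importer.get_dataset_info()
print(importer.events, imported, info)
```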
class fiftyone.utils.openimages.OpenImagesV6DatasetImporter(dataset_dir, label_types=None, classes=None, attrs=None, image_ids=None, include_id=True, only_matching=False, load_hierarchy=True, shuffle=False, seed=None, max_samples=None)
Bases: fiftyone.utils.openimages.OpenImagesDatasetImporter
Base class for importing datasets in Open Images V6 format.
See fiftyone.types.OpenImagesDataset for format details.
Parameters
dataset_dir – the dataset directory
label_types (None) – a label type or list of label types to load. The supported values are ("detections", "classifications", "relationships", "segmentations"). By default, all label types supported by this version are loaded
classes (None) – a string or list of strings specifying required classes to load. If provided, only samples containing at least one instance of a specified class will be loaded
attrs (None) – a string or list of strings specifying required relationship attributes to load. Only applicable when label_types includes “relationships”. If provided, only samples containing at least one instance of a specified attribute will be loaded
image_ids (None) – an optional list of specific image IDs to load. Can be provided in any of the following formats:
- a list of <image-id> strings
- a list of <split>/<image-id> strings
- the path to a text (newline-separated), JSON, or CSV file containing the list of image IDs to load in either of the first two formats
include_id (True) – whether to load the Open Images ID for each sample along with the labels
only_matching (False) – whether to only load labels that match the classes or attrs requirements that you provide (True), or to load all labels for samples that match the requirements (False)
load_hierarchy (True) – whether to load the classes hierarchy and add it to the dataset’s info dictionary
shuffle (False) – whether to randomly shuffle the order in which the samples are imported
seed (None) – a random seed to use when shuffling
max_samples (None) – a maximum number of samples to load. If label_types, classes, and/or attrs are also specified, first priority will be given to samples that contain all of the specified label types, classes, and/or attributes, followed by samples that contain at least one of the specified label types or classes. The actual number of samples loaded may be less than this maximum value if the dataset does not contain sufficient samples matching your requirements. By default, all matching samples are loaded
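The accepted image_ids formats can be normalized as in the sketch below. This is an illustration only, not the library's parser: the helper names are hypothetical, the file type is inferred from the extension, a JSON file is assumed to hold a flat list of strings, a CSV file is assumed to hold the IDs in its first column, and the IDs shown are made up.

```python
import csv
import json
import os
import tempfile


def read_ids_file(path):
    """Read raw ID strings from a text, JSON, or CSV file."""
    ext = os.path.splitext(path)[1].lower()
    with open(path, newline="") as f:
        if ext == ".json":
            return list(json.load(f))  # assumes a flat JSON list of strings
        if ext == ".csv":
            return [row[0] for row in csv.reader(f) if row]  # first column
        return f.read().splitlines()  # newline-separated text


def parse_image_ids(image_ids):
    """Return (split, image_id) pairs; split is None when none is given.

    A plain string argument is treated as a file path in this sketch.
    """
    if isinstance(image_ids, str):
        image_ids = read_ids_file(image_ids)
    pairs = []
    for item in image_ids:
        item = item.strip()
        if not item:
            continue
        split, _, image_id = item.rpartition("/")
        pairs.append((split or None, image_id))
    return pairs


from_list = parse_image_ids(["id0001", "validation/id0002"])

with tempfile.TemporaryDirectory() as tmp_dir:
    ids_path = os.path.join(tmp_dir, "ids.txt")
    with open(ids_path, "w") as f:
        f.write("id0001\nvalidation/id0002\n")
    from_file = parse_image_ids(ids_path)

print(from_list)
print(from_file)
```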
Methods:
close(*args) – Performs any necessary actions after the last sample has been imported.
get_dataset_info() – Returns the dataset info for the dataset.
setup() – Performs any necessary setup before importing the first sample in the dataset.
Attributes:
has_dataset_info – Whether this importer produces a dataset info dictionary.
has_image_metadata – Whether this importer produces fiftyone.core.metadata.ImageMetadata instances for each image.
label_cls – The fiftyone.core.labels.Label class(es) returned by this importer.
close(*args)
Performs any necessary actions after the last sample has been imported.
This method is called when the importer’s context manager interface is exited, via DatasetImporter.__exit__().
Parameters
*args – the arguments to DatasetImporter.__exit__()
get_dataset_info()
Returns the dataset info for the dataset.
By convention, this method should be called after all samples in the dataset have been imported.
Returns
a dict of dataset info
property has_dataset_info
Whether this importer produces a dataset info dictionary.
property has_image_metadata
Whether this importer produces fiftyone.core.metadata.ImageMetadata instances for each image.
property label_cls
The fiftyone.core.labels.Label class(es) returned by this importer.
This can be any of the following:
- a fiftyone.core.labels.Label class. In this case, the importer is guaranteed to return labels of this type
- a list or tuple of fiftyone.core.labels.Label classes. In this case, the importer can produce a single label field of any of these types
- a dict mapping keys to fiftyone.core.labels.Label classes. In this case, the importer will return label dictionaries with keys and value-types specified by this dictionary. Not all keys need be present in the imported labels
- None. In this case, the importer makes no guarantees about the labels that it may return
setup()
Performs any necessary setup before importing the first sample in the dataset.
This method is called when the importer’s context manager interface is entered, via DatasetImporter.__enter__().
class fiftyone.utils.openimages.OpenImagesV7DatasetImporter(dataset_dir, label_types=None, classes=None, attrs=None, image_ids=None, include_id=True, only_matching=False, load_hierarchy=True, shuffle=False, seed=None, max_samples=None)
Bases: fiftyone.utils.openimages.OpenImagesDatasetImporter
Base class for importing datasets in Open Images V7 format.
See fiftyone.types.OpenImagesDataset for format details.
Parameters
dataset_dir – the dataset directory
label_types (None) – a label type or list of label types to load. The supported values are ("detections", "classifications", "points", "relationships", "segmentations"). By default, all label types supported by this version are loaded
classes (None) – a string or list of strings specifying required classes to load. If provided, only samples containing at least one instance of a specified class will be loaded
attrs (None) – a string or list of strings specifying required relationship attributes to load. Only applicable when label_types includes “relationships”. If provided, only samples containing at least one instance of a specified attribute will be loaded
image_ids (None) – an optional list of specific image IDs to load. Can be provided in any of the following formats:
- a list of <image-id> strings
- a list of <split>/<image-id> strings
- the path to a text (newline-separated), JSON, or CSV file containing the list of image IDs to load in either of the first two formats
include_id (True) – whether to load the Open Images ID for each sample along with the labels
only_matching (False) – whether to only load labels that match the classes or attrs requirements that you provide (True), or to load all labels for samples that match the requirements (False)
load_hierarchy (True) – whether to load the classes hierarchy and add it to the dataset’s info dictionary
shuffle (False) – whether to randomly shuffle the order in which the samples are imported
seed (None) – a random seed to use when shuffling
max_samples (None) – a maximum number of samples to load. If label_types, classes, and/or attrs are also specified, first priority will be given to samples that contain all of the specified label types, classes, and/or attributes, followed by samples that contain at least one of the specified label types or classes. The actual number of samples loaded may be less than this maximum value if the dataset does not contain sufficient samples matching your requirements. By default, all matching samples are loaded
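The difference between only_matching=False and only_matching=True can be shown with a standalone sketch. The function filter_labels and its {image ID: list of class names} input are hypothetical simplifications; real labels are FiftyOne label objects, and this is not the importer's code.

```python
def filter_labels(samples, classes, only_matching=False):
    """Keep samples that contain at least one required class.

    With only_matching=False, every label on a matching sample is kept.
    With only_matching=True, only labels of the required classes are kept.
    """
    required = set(classes)
    result = {}
    for image_id, labels in samples.items():
        if not required.intersection(labels):
            continue  # sample contains none of the required classes
        if only_matching:
            labels = [label for label in labels if label in required]
        result[image_id] = list(labels)
    return result


# Made-up toy data
samples = {"a": ["Cat", "Tree"], "b": ["Tree"]}
all_labels = filter_labels(samples, ["Cat"])
matching_labels = filter_labels(samples, ["Cat"], only_matching=True)
print(all_labels)
print(matching_labels)
```

In both cases sample "b" is dropped because it contains no required class; the flag only changes which labels survive on the samples that are kept.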
Methods:
close(*args) – Performs any necessary actions after the last sample has been imported.
get_dataset_info() – Returns the dataset info for the dataset.
setup() – Performs any necessary setup before importing the first sample in the dataset.
Attributes:
has_dataset_info – Whether this importer produces a dataset info dictionary.
has_image_metadata – Whether this importer produces fiftyone.core.metadata.ImageMetadata instances for each image.
label_cls – The fiftyone.core.labels.Label class(es) returned by this importer.
close(*args)
Performs any necessary actions after the last sample has been imported.
This method is called when the importer’s context manager interface is exited, via DatasetImporter.__exit__().
Parameters
*args – the arguments to DatasetImporter.__exit__()
get_dataset_info()
Returns the dataset info for the dataset.
By convention, this method should be called after all samples in the dataset have been imported.
Returns
a dict of dataset info
property has_dataset_info
Whether this importer produces a dataset info dictionary.
property has_image_metadata
Whether this importer produces fiftyone.core.metadata.ImageMetadata instances for each image.
property label_cls
The fiftyone.core.labels.Label class(es) returned by this importer.
This can be any of the following:
- a fiftyone.core.labels.Label class. In this case, the importer is guaranteed to return labels of this type
- a list or tuple of fiftyone.core.labels.Label classes. In this case, the importer can produce a single label field of any of these types
- a dict mapping keys to fiftyone.core.labels.Label classes. In this case, the importer will return label dictionaries with keys and value-types specified by this dictionary. Not all keys need be present in the imported labels
- None. In this case, the importer makes no guarantees about the labels that it may return
setup()
Performs any necessary setup before importing the first sample in the dataset.
This method is called when the importer’s context manager interface is entered, via DatasetImporter.__enter__().
fiftyone.utils.openimages.get_attributes(version='v7', dataset_dir=None)
Gets the list of relationship attributes in the Open Images dataset.
Parameters
version ("v7") – the version of the Open Images dataset. Supported values are ("v6", "v7")
dataset_dir (None) – an optional root directory in which the dataset is downloaded
Returns
a sorted list of attribute names
fiftyone.utils.openimages.get_classes(version='v7', dataset_dir=None)
Gets the boxable classes that exist in classifications, detections, points, and relationships in the Open Images dataset.
This method can be called in isolation without downloading the dataset.
Parameters
version ("v7") – the version of the Open Images dataset. Supported values are ("v6", "v7")
dataset_dir (None) – an optional root directory in which the dataset is downloaded
Returns
a sorted list of class name strings
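One use for get_classes is checking requested class names before starting a download. The sketch below assumes FiftyOne is installed for the check_against_open_images wrapper, which is a hypothetical helper that is defined but not called here; the pure helper find_unknown_classes is also hypothetical and is demonstrated on a hand-written list rather than on real library output.

```python
def find_unknown_classes(requested, available):
    """Return the requested class names that are not in the available list."""
    available = set(available)
    return [name for name in requested if name not in available]


def check_against_open_images(requested, version="v7"):
    """Compare requested names with the classes reported by get_classes().

    Requires FiftyOne; the import is local so the rest of the sketch runs
    without it.
    """
    import fiftyone.utils.openimages as fouo

    return find_unknown_classes(requested, fouo.get_classes(version=version))


# Hand-written stand-in for the class list, for illustration only
unknown = find_unknown_classes(["Cat", "Catt"], ["Cat", "Dog"])
print(unknown)
```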
fiftyone.utils.openimages.get_segmentation_classes(version='v6', dataset_dir=None)
Gets the list of classes (350) that are labeled with segmentations in the Open Images V6/V7 dataset.
This method can be called in isolation without downloading the dataset.
Parameters
version ("v6") – the version of the Open Images dataset. Supported values are ("v6")
dataset_dir (None) – an optional root directory in which the dataset is downloaded
Returns
a sorted list of segmentation class name strings
fiftyone.utils.openimages.get_point_classes(version='v7', dataset_dir=None)
Gets the list of classes that are labeled with points in the Open Images V7 dataset.
This method can be called in isolation without downloading the dataset.
Parameters
version ("v7") – the version of the Open Images dataset. Supported values are ("v7")
dataset_dir (None) – an optional root directory in which the dataset is downloaded
Returns
a sorted list of point class name strings
fiftyone.utils.openimages.download_open_images_split(dataset_dir, split, version='v6', label_types=None, classes=None, attrs=None, image_ids=None, num_workers=None, shuffle=None, seed=None, max_samples=None)
Utility that downloads full or partial splits of the Open Images dataset.
See fiftyone.types.OpenImagesDataset for the format in which dataset_dir will be arranged.
Any existing files are not re-downloaded.
This method specifically downloads the subsets of annotations corresponding to the 600 boxable classes of Open Images. See here for other download options.
Parameters
dataset_dir – the directory in which to download the dataset
split – the split to download. Supported values are ("train", "validation", "test")
version ("v6") – the version of the Open Images dataset to download. Supported values are ("v6", "v7")
label_types (None) – a label type or list of label types to load. The supported values are ("detections", "classifications", "relationships", "segmentations") for "v6" and ("detections", "classifications", "points", "relationships", "segmentations") for "v7". By default, all label types are loaded
classes (None) – a string or list of strings specifying required classes to load. If provided, only samples containing at least one instance of a specified class will be loaded
attrs (None) – a string or list of strings specifying required relationship attributes to load. Only applicable when label_types includes “relationships”. If provided, only samples containing at least one instance of a specified attribute will be loaded
image_ids (None) – an optional list of specific image IDs to load. Can be provided in any of the following formats:
- a list of <image-id> strings
- a list of <split>/<image-id> strings
- the path to a text (newline-separated), JSON, or CSV file containing the list of image IDs to load in either of the first two formats
num_workers (None) – a suggested number of threads to use when downloading individual images
shuffle (False) – whether to randomly shuffle the order in which samples are chosen for partial downloads
seed (None) – a random seed to use when shuffling
max_samples (None) – a maximum number of samples to load per split. If label_types, classes, and/or attrs are also specified, first priority will be given to samples that contain all of the specified label types, classes, and/or attributes, followed by samples that contain at least one of the specified label types or classes. The actual number of samples loaded may be less than this maximum value if the dataset does not contain sufficient samples matching your requirements. By default, all matching samples are loaded
Returns
a tuple of:
- num_samples: the total number of downloaded images, or None if everything was already downloaded
- classes: the list of all classes, or None if everything was already downloaded
- did_download: whether any content was downloaded (True) or if all necessary files were already downloaded (False)
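A partial download might be prepared as in the sketch below. The helper resolve_label_types and the wrapper download_validation_subset are hypothetical; the supported label types per version are copied from the parameter description above, and the wrapper (which requires FiftyOne and network access) is defined but not called here.

```python
SUPPORTED_LABEL_TYPES = {
    "v6": ("detections", "classifications", "relationships", "segmentations"),
    "v7": (
        "detections",
        "classifications",
        "points",
        "relationships",
        "segmentations",
    ),
}


def resolve_label_types(label_types=None, version="v7"):
    """Return a validated list of label types for the given version."""
    supported = SUPPORTED_LABEL_TYPES[version]
    if label_types is None:
        return list(supported)  # default: all label types for the version
    if isinstance(label_types, str):
        label_types = [label_types]
    unsupported = [t for t in label_types if t not in supported]
    if unsupported:
        raise ValueError(
            "Unsupported label types for %s: %s" % (version, unsupported)
        )
    return list(label_types)


def download_validation_subset(dataset_dir, classes, max_samples, version="v7"):
    """Download a small validation subset; requires FiftyOne and network access."""
    import fiftyone.utils.openimages as fouo

    num_samples, all_classes, did_download = fouo.download_open_images_split(
        dataset_dir,
        "validation",
        version=version,
        label_types=resolve_label_types(["detections"], version),
        classes=classes,
        max_samples=max_samples,
    )
    return num_samples, all_classes, did_download


print(resolve_label_types("points", "v7"))
print(resolve_label_types(None, "v6"))
```

Passing version explicitly avoids relying on the default, which differs between the signature and the parameter description in some releases.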