fiftyone.utils.coco

Utilities for working with datasets in COCO format.

Copyright 2017-2025, Voxel51, Inc.

Functions:

add_coco_labels(sample_collection, ...[, ...])

Adds the given COCO labels to the collection.

load_coco_detection_annotations(json_path[, ...])

Loads the COCO annotations from the given JSON file.

parse_coco_categories(categories)

Parses the COCO categories list.

download_coco_dataset_split(dataset_dir, split)

Utility that downloads full or partial splits of the COCO dataset.

Classes:

COCODetectionDatasetImporter([dataset_dir, ...])

Importer for COCO detection datasets stored on disk.

COCODetectionDatasetExporter([export_dir, ...])

Exporter that writes COCO detection datasets to disk.

COCOObject([id, image_id, category_id, ...])

An object in COCO format.

fiftyone.utils.coco.add_coco_labels(sample_collection, label_field, labels_or_path, categories, label_type='detections', coco_id_field=None, include_annotation_id=False, extra_attrs=True, use_polylines=False, tolerance=None)

Adds the given COCO labels to the collection.

The labels_or_path argument can be any of the following:

  • a list of COCO annotations in the format below

  • the path to a JSON file containing a list of COCO annotations

  • the path to a JSON file whose "annotations" key contains a list of COCO annotations

When label_type="detections", the labels should have format:

[
    {
        "id": 1,
        "image_id": 1,
        "category_id": 1,
        "bbox": [260, 177, 231, 199],

        # optional
        "score": 0.95,
        "area": 45969,
        "iscrowd": 0,

        # extra attrs
        ...
    },
    ...
]

When label_type="segmentations", the labels should have format:

[
    {
        "id": 1,
        "image_id": 1,
        "category_id": 1,
        "bbox": [260, 177, 231, 199],
        "segmentation": [...],

        # optional
        "score": 0.95,
        "area": 45969,
        "iscrowd": 0,

        # extra attrs
        ...
    },
    ...
]

When label_type="keypoints", the labels should have format:

[
    {
        "id": 1,
        "image_id": 1,
        "category_id": 1,
        "keypoints": [224, 226, 2, ...],
        "num_keypoints": 10,

        # extra attrs
        ...
    },
    ...
]
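As a rough illustration of the keypoints layout above: in the COCO convention, "keypoints" is a flat list of (x, y, v) triplets, where v is 0 for an unlabeled point, 1 for a labeled but not visible point, and 2 for a labeled and visible point, and "num_keypoints" counts the labeled points. The helpers `split_keypoints` and `count_labeled` below are hypothetical names used only for this sketch; they are not part of fiftyone.

```python
def split_keypoints(flat):
    """Group a flat COCO keypoints list into (x, y, visibility) triplets."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def count_labeled(flat):
    """Count keypoints whose visibility flag is nonzero (i.e. labeled)."""
    return sum(1 for _, _, v in split_keypoints(flat) if v > 0)


# Illustrative annotation with three keypoints, one of them unlabeled
annotation = {
    "id": 1,
    "image_id": 1,
    "category_id": 1,
    "keypoints": [224, 226, 2, 0, 0, 0, 240, 230, 1],
}
annotation["num_keypoints"] = count_labeled(annotation["keypoints"])

triplets = split_keypoints(annotation["keypoints"])
print(triplets)
print(annotation["num_keypoints"])
```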

See this page for more information about the COCO data format.

Parameters:
  • sample_collection – a fiftyone.core.collections.SampleCollection

  • label_field – the label field in which to store the labels. The field will be created if necessary

  • labels_or_path – a list of COCO annotations or the path to a JSON file containing such data on disk

  • categories –

    can be any of the following:

    • a list of category dicts in the format of parse_coco_categories() specifying the classes and their category IDs

    • a dict mapping class IDs to class labels

    • a list of class labels whose 1-based ordering is assumed to correspond to the category IDs in the provided COCO labels

  • label_type ("detections") – the type of labels to load. Supported values are ("detections", "segmentations", "keypoints")

  • coco_id_field (None) –

    this parameter determines how to map the predictions onto samples in sample_collection. The supported values are:

    • None (default): in this case, the image_id values of the predictions are assumed to be the 1-based positional indexes of samples in sample_collection

    • the name of a field of sample_collection containing the COCO IDs for the samples that correspond to the image_id of the predictions

  • include_annotation_id (False) – whether to include the COCO ID of each annotation in the loaded labels

  • extra_attrs (True) –

    whether to load extra annotation attributes onto the imported labels. Supported values are:

    • True: load all extra attributes found

    • False: do not load extra attributes

    • a name or list of names of specific attributes to load

  • use_polylines (False) – whether to represent segmentations as fiftyone.core.labels.Polylines instances rather than fiftyone.core.labels.Detections with dense masks

  • tolerance (None) – a tolerance, in pixels, when generating approximate polylines for instance masks. Typical values are 1-3 pixels
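The three accepted forms of labels_or_path, and the default coco_id_field=None mapping, can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation: `read_coco_annotations` and `group_by_position` are hypothetical helpers, and the sample annotations are made up.

```python
import json
import os
import tempfile


def read_coco_annotations(labels_or_path):
    """Return a list of annotation dicts from any of the three accepted forms.

    Hypothetical helper for illustration; not part of fiftyone.
    """
    if isinstance(labels_or_path, str):
        with open(labels_or_path, "r") as f:
            data = json.load(f)
        # A JSON file may hold a bare list or a dict with an "annotations" key
        return data["annotations"] if isinstance(data, dict) else data
    return list(labels_or_path)


def group_by_position(annotations, num_samples):
    """Group annotations per sample, treating image_id as a 1-based index."""
    per_sample = [[] for _ in range(num_samples)]
    for anno in annotations:
        idx = anno["image_id"] - 1  # 1-based image_id -> 0-based position
        if 0 <= idx < num_samples:
            per_sample[idx].append(anno)
    return per_sample


predictions = [
    {"id": 1, "image_id": 1, "category_id": 1, "bbox": [260, 177, 231, 199], "score": 0.95},
    {"id": 2, "image_id": 2, "category_id": 2, "bbox": [10, 20, 30, 40], "score": 0.50},
    {"id": 3, "image_id": 2, "category_id": 1, "bbox": [5, 5, 50, 60], "score": 0.75},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    list_path = os.path.join(tmp_dir, "list.json")
    dict_path = os.path.join(tmp_dir, "dict.json")
    with open(list_path, "w") as f:
        json.dump(predictions, f)
    with open(dict_path, "w") as f:
        json.dump({"annotations": predictions}, f)

    from_list = read_coco_annotations(predictions)
    from_list_file = read_coco_annotations(list_path)
    from_dict_file = read_coco_annotations(dict_path)

grouped = group_by_position(from_list, num_samples=2)
counts = [len(g) for g in grouped]
print(counts)
```

With fiftyone installed, the same `predictions` list would be passed positionally following the signature above, i.e. as the labels_or_path argument after sample_collection and label_field, together with a categories value in one of the listed forms.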

class fiftyone.utils.coco.COCODetectionDatasetImporter(dataset_dir=None, data_path=None, labels_path=None, label_types=None, classes=None, image_ids=None, include_id=False, include_annotation_id=False, include_license=False, extra_attrs=True, only_matching=False, use_polylines=False, tolerance=None, shuffle=False, seed=None, max_samples=None)

Bases: LabeledImageDatasetImporter, ImportPathsMixin

Importer for COCO detection datasets stored on disk.

See this page for format details.

Parameters:
  • dataset_dir (None) – the dataset directory. If omitted, data_path and/or labels_path must be provided

  • data_path (None) –

    an optional parameter that enables explicit control over the location of the media. Can be any of the following:

    • a folder name like "data" or "data/" specifying a subfolder of dataset_dir where the media files reside

    • an absolute directory path where the media files reside. In this case, the dataset_dir has no effect on the location of the data

    • a filename like "data.json" specifying the filename of the JSON data manifest file in dataset_dir

    • an absolute filepath specifying the location of the JSON data manifest. In this case, dataset_dir has no effect on the location of the data

    • a dict mapping filenames to absolute filepaths

    If None, this parameter will default to whichever of data/ or data.json exists in the dataset directory

  • labels_path (None) –

    an optional parameter that enables explicit control over the location of the labels. Can be any of the following:

    • a filename like "labels.json" specifying the location of the labels in dataset_dir

    • an absolute filepath to the labels. In this case, dataset_dir has no effect on the location of the labels

    If None, the parameter will default to labels.json

  • label_types (None) – a label type or list of label types to load. The supported values are ("detections", "segmentations", "keypoints"). By default, all label types are loaded

  • classes (None) – a string or list of strings specifying required classes to load. Only samples containing at least one instance of a specified class will be loaded

  • image_ids (None) –

    an optional list of specific image IDs to load. Can be provided in any of the following formats:

    • a list of <image-id> ints or strings

    • a list of <split>/<image-id> strings

    • the path to a text (newline-separated), JSON, or CSV file containing the list of image IDs to load in either of the first two formats

  • include_id (False) – whether to include the COCO ID of each sample in the loaded labels

  • include_annotation_id (False) – whether to include the COCO ID of each annotation in the loaded labels

  • include_license (False) –

    whether to include the license ID of each sample in the loaded labels, if available. Supported values are:

    • False: don’t load the license

    • True/"name": store the string license name

    • "id": store the integer license ID

    • "url": store the license URL

    Note that the license descriptions (if available) are always loaded into dataset.info["licenses"] and can be used to convert between ID, name, and URL later

  • extra_attrs (True) –

    whether to load extra annotation attributes onto the imported labels. Supported values are:

    • True: load all extra attributes found

    • False: do not load extra attributes

    • a name or list of names of specific attributes to load

  • only_matching (False) – whether to only load labels that match the classes requirement that you provide (True), or to load all labels for samples that match the requirements (False)

  • use_polylines (False) – whether to represent segmentations as fiftyone.core.labels.Polylines instances rather than fiftyone.core.labels.Detections with dense masks

  • tolerance (None) – a tolerance, in pixels, when generating approximate polylines for instance masks. Typical values are 1-3 pixels

  • shuffle (False) – whether to randomly shuffle the order in which the samples are imported

  • seed (None) – a random seed to use when shuffling

  • max_samples (None) – a maximum number of samples to load. If label_types and/or classes are also specified, first priority will be given to samples that contain all of the specified label types and/or classes, followed by samples that contain at least one of the specified label types or classes. The actual number of samples loaded may be less than this maximum value if the dataset does not contain sufficient samples matching your requirements. By default, all matching samples are loaded
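The interaction of classes, only_matching, and max_samples described above can be sketched as follows. This is a simplified model of the documented priority rule (samples with all requested classes first, then samples with at least one), not the importer's actual code; `select_samples` and `filter_labels` are hypothetical helpers and the file names are invented.

```python
def select_samples(samples, classes, max_samples=None):
    """Order candidate samples by how well they match `classes`.

    samples: dict mapping a sample name to the set of class labels it contains
    """
    required = set(classes)
    # First priority: samples containing every requested class
    all_match = [n for n, labels in samples.items() if required <= labels]
    # Second priority: samples containing at least one requested class
    some_match = [
        n for n, labels in samples.items()
        if labels & required and not required <= labels
    ]
    ordered = all_match + some_match
    return ordered if max_samples is None else ordered[:max_samples]


def filter_labels(labels, classes, only_matching):
    """Keep only labels in `classes` when only_matching is True."""
    if not only_matching:
        return list(labels)
    wanted = set(classes)
    return [label for label in labels if label in wanted]


samples = {
    "a.jpg": {"cat"},
    "b.jpg": {"cat", "dog"},
    "c.jpg": {"bird"},
    "d.jpg": {"dog", "bird"},
}

chosen = select_samples(samples, ["cat", "dog"], max_samples=2)
print(chosen)
print(filter_labels(["cat", "bird", "dog"], ["cat"], only_matching=True))
print(filter_labels(["cat", "bird", "dog"], ["cat"], only_matching=False))
```

Note that "c.jpg" is never selected because it contains none of the requested classes, which mirrors the statement that fewer than max_samples samples may be loaded.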

Attributes:

has_dataset_info

Whether this importer produces a dataset info dictionary.

has_image_metadata

Whether this importer produces fiftyone.core.metadata.ImageMetadata instances for each image.

label_cls

The fiftyone.core.labels.Label class(es) returned by this importer.

Methods:

setup()

Performs any necessary setup before importing the first sample in the dataset.

get_dataset_info()

Returns the dataset info for the dataset.

close(*args)

Performs any necessary actions after the last sample has been imported.

property has_dataset_info

Whether this importer produces a dataset info dictionary.

property has_image_metadata

Whether this importer produces fiftyone.core.metadata.ImageMetadata instances for each image.

property label_cls

The fiftyone.core.labels.Label class(es) returned by this importer.

This can be any of the following:

  • a fiftyone.core.labels.Label class. In this case, the importer is guaranteed to return labels of this type

  • a list or tuple of fiftyone.core.labels.Label classes. In this case, the importer can produce a single label field of any of these types

  • a dict mapping keys to fiftyone.core.labels.Label classes. In this case, the importer will return label dictionaries with keys and value-types specified by this dictionary. Not all keys need be present in the imported labels

  • None. In this case, the importer makes no guarantees about the labels that it may return

setup()

Performs any necessary setup before importing the first sample in the dataset.

This method is called when the importer’s context manager interface is entered, DatasetImporter.__enter__().

get_dataset_info()

Returns the dataset info for the dataset.

By convention, this method should be called after all samples in the dataset have been imported.

Returns:

a dict of dataset info

close(*args)

Performs any necessary actions after the last sample has been imported.

This method is called when the importer’s context manager interface is exited, DatasetImporter.__exit__().

Parameters:

*args – the arguments to DatasetImporter.__exit__()

class fiftyone.utils.coco.COCODetectionDatasetExporter(export_dir=None, data_path=None, labels_path=None, export_media=None, rel_dir=None, abs_paths=False, image_format=None, classes=None, categories=None, info=None, extra_attrs=True, coco_id=None, annotation_id=None, iscrowd='iscrowd', num_decimals=None, tolerance=None)

Bases: LabeledImageDatasetExporter, ExportPathsMixin

Exporter that writes COCO detection datasets to disk.

See this page for format details.

Parameters:
  • export_dir (None) – the directory to write the export. This has no effect if data_path and labels_path are absolute paths

  • data_path (None) –

    an optional parameter that enables explicit control over the location of the exported media. Can be any of the following:

    • a folder name like "data" or "data/" specifying a subfolder of export_dir in which to export the media

    • an absolute directory path in which to export the media. In this case, the export_dir has no effect on the location of the data

    • a JSON filename like "data.json" specifying the filename of the manifest file in export_dir generated when export_media is "manifest"

    • an absolute filepath specifying the location to write the JSON manifest file when export_media is "manifest". In this case, export_dir has no effect on the location of the data

    If None, the default value of this parameter will be chosen based on the value of the export_media parameter

  • labels_path (None) –

    an optional parameter that enables explicit control over the location of the exported labels. Can be any of the following:

    • a filename like "labels.json" specifying the location in export_dir in which to export the labels

    • an absolute filepath to which to export the labels. In this case, the export_dir has no effect on the location of the labels

    If None, the labels will be exported into export_dir using the default filename

  • export_media (None) –

    controls how to export the raw media. The supported values are:

    • True: copy all media files into the output directory

    • False: don’t export media

    • "move": move all media files into the output directory

    • "symlink": create symlinks to the media files in the output directory

    • "manifest": create a data.json in the output directory that maps UUIDs used in the labels files to the filepaths of the source media, rather than exporting the actual media

    If None, the default value of this parameter will be chosen based on the value of the data_path parameter

  • rel_dir (None) – an optional relative directory to strip from each input filepath to generate a unique identifier for each image. When exporting media, this identifier is joined with data_path to generate an output path for each exported image. This argument allows for populating nested subdirectories that match the shape of the input paths. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path()

  • abs_paths (False) – whether to store absolute paths to the images in the exported labels

  • image_format (None) – the image format to use when writing in-memory images to disk. By default, fiftyone.config.default_image_ext is used

  • classes (None) – the list of possible class labels

  • categories (None) – a list of category dicts in the format of parse_coco_categories() specifying the classes and their category IDs

  • info (None) – a dict of info as returned by load_coco_detection_annotations() to include in the exported JSON. If not provided, this info will be extracted when log_collection() is called, if possible

  • extra_attrs (True) –

    whether to include extra object attributes in the exported labels. Supported values are:

    • True: export all extra attributes found

    • False: do not export extra attributes

    • a name or list of names of specific attributes to export

  • coco_id (None) – the name of a sample field containing the COCO IDs of each image

  • annotation_id (None) – the name of a label field containing the COCO annotation ID of each label

  • iscrowd ("iscrowd") – the name of a detection attribute that indicates whether an object is a crowd (the value is automatically set to 0 if the attribute is not present)

  • num_decimals (None) – an optional number of decimal places at which to round bounding box pixel coordinates. By default, no rounding is done

  • tolerance (None) – a tolerance, in pixels, when generating approximate polylines for instance masks. Typical values are 1-3 pixels
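The effect of rel_dir on output paths can be sketched with plain path arithmetic: the prefix is stripped from each input filepath and the remainder is joined with data_path. This is a minimal illustration under assumed POSIX-style paths; `output_path` is a hypothetical helper, the directories are invented, and the exporter's own path handling (via fiftyone.core.storage.normalize_path()) may differ in detail.

```python
import posixpath


def output_path(filepath, rel_dir, data_path):
    """Strip rel_dir from filepath and join the remainder with data_path."""
    identifier = posixpath.relpath(filepath, rel_dir)
    return posixpath.join(data_path, identifier)


inputs = [
    "/datasets/raw/train/cats/001.jpg",
    "/datasets/raw/train/dogs/001.jpg",
]

outputs = [output_path(p, "/datasets/raw", "/exports/coco/data") for p in inputs]
for path in outputs:
    print(path)

# Without rel_dir, only the basenames would be available as identifiers,
# and these two inputs would collide on "001.jpg"
basenames = {posixpath.basename(p) for p in inputs}
print(len(basenames))
```

The nested subdirectories of the inputs are reproduced under data_path, which is what keeps same-named files from different folders distinct.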

Attributes:

requires_image_metadata

Whether this exporter requires fiftyone.core.metadata.ImageMetadata instances for each sample being exported.

label_cls

The fiftyone.core.labels.Label class(es) exported by this exporter.

Methods:

setup()

Performs any necessary setup before exporting the first sample in the dataset.

log_collection(sample_collection)

Logs any relevant information about the fiftyone.core.collections.SampleCollection whose samples will be exported.

export_sample(image_or_path, label[, metadata])

Exports the given sample to the dataset.

close(*args)

Performs any necessary actions after the last sample has been exported.

property requires_image_metadata

Whether this exporter requires fiftyone.core.metadata.ImageMetadata instances for each sample being exported.

property label_cls

The fiftyone.core.labels.Label class(es) exported by this exporter.

This can be any of the following:

  • a fiftyone.core.labels.Label class. In this case, the exporter directly exports labels of this type

  • a list or tuple of fiftyone.core.labels.Label classes. In this case, the exporter can export a single label field of any of these types

  • a dict mapping keys to fiftyone.core.labels.Label classes. In this case, the exporter can handle label dictionaries with value-types specified by this dictionary. Not all keys need be present in the exported label dicts

  • None. In this case, the exporter makes no guarantees about the labels that it can export

setup()

Performs any necessary setup before exporting the first sample in the dataset.

This method is called when the exporter’s context manager interface is entered, DatasetExporter.__enter__().

log_collection(sample_collection)

Logs any relevant information about the fiftyone.core.collections.SampleCollection whose samples will be exported.

Subclasses can optionally implement this method if their export format can record information such as the fiftyone.core.collections.SampleCollection.info() of the collection being exported.

By convention, this method must be optional; i.e., if it is not called before the first call to export_sample(), then the exporter must make do without any information about the fiftyone.core.collections.SampleCollection (which may not be available, for example, if the samples being exported are not stored in a collection).

Parameters:

sample_collection – the fiftyone.core.collections.SampleCollection whose samples will be exported

export_sample(image_or_path, label, metadata=None)

Exports the given sample to the dataset.

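Parameters:
  • image_or_path – an image or the path to the image on disk

  • label – an instance of label_cls(), or a dictionary mapping field names to fiftyone.core.labels.Label instances, or None if the sample is unlabeled

  • metadata (None) – a fiftyone.core.metadata.ImageMetadata instance for the sample. Only required when requires_image_metadata() is True

close(*args)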

Performs any necessary actions after the last sample has been exported.

This method is called when the exporter’s context manager interface is exited, DatasetExporter.__exit__().

Parameters:

*args – the arguments to DatasetExporter.__exit__()

class fiftyone.utils.coco.COCOObject(id=None, image_id=None, category_id=None, bbox=None, segmentation=None, keypoints=None, score=None, area=None, iscrowd=None, **attributes)

Bases: object

An object in COCO format.

Parameters:
  • id (None) – the ID of the annotation

  • image_id (None) – the ID of the image in which the annotation appears

  • category_id (None) – the category ID of the object

  • bbox (None) – a bounding box for the object in [xmin, ymin, width, height] format

  • segmentation (None) – the segmentation data for the object

  • keypoints (None) – the keypoints data for the object

  • score (None) – a confidence score for the object

  • area (None) – the area of the bounding box, in pixels

  • iscrowd (None) – whether the object is a crowd

  • **attributes – additional custom attributes

Methods:

to_polyline(frame_size[, classes_map, ...])

Returns a fiftyone.core.labels.Polyline representation of the object.

to_keypoints(frame_size[, classes_map, ...])

Returns a fiftyone.core.labels.Keypoint representation of the object.

to_detection(frame_size[, classes_map, ...])

Returns a fiftyone.core.labels.Detection representation of the object.

to_anno_dict()

Returns a COCO annotation dictionary representation of the object.

from_anno_dict(d[, extra_attrs])

Creates a COCOObject from a COCO annotation dict.

from_label(label, metadata[, image_id, ...])

Creates a COCOObject from a compatible fiftyone.core.labels.Label.

to_polyline(frame_size, classes_map=None, supercategory_map=None, tolerance=None, include_id=False)

Returns a fiftyone.core.labels.Polyline representation of the object.

Parameters:
  • frame_size – the (width, height) of the image

  • classes_map (None) – a dict mapping class IDs to class labels

  • supercategory_map (None) – a dict mapping class names to category dicts

  • tolerance (None) – a tolerance, in pixels, when generating approximate polylines for instance masks. Typical values are 1-3 pixels

  • include_id (False) – whether to include the COCO ID of the object as a label attribute

Returns:

a fiftyone.core.labels.Polyline, or None if no segmentation data is available
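To illustrate why frame_size is needed here: COCO polygon segmentations store each polygon as a flat list of pixel coordinates [x1, y1, x2, y2, ...], whereas FiftyOne polylines store points as (x, y) pairs relative to the image, in [0, 1]. The sketch below covers only the polygon case (not RLE masks, and no tolerance-based simplification); `polygons_to_relative_points` is a hypothetical helper, not a fiftyone function.

```python
def polygons_to_relative_points(segmentation, frame_size):
    """Convert flat pixel polygons into lists of relative (x, y) points."""
    width, height = frame_size
    shapes = []
    for flat in segmentation:
        if len(flat) % 2 != 0:
            raise ValueError("polygon must contain an even number of values")
        shapes.append(
            [(flat[i] / width, flat[i + 1] / height) for i in range(0, len(flat), 2)]
        )
    return shapes


# One rectangular polygon covering the top-left quarter of a 640x480 image
segmentation = [[0, 0, 320, 0, 320, 240, 0, 240]]
points = polygons_to_relative_points(segmentation, (640, 480))
print(points)
```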

to_keypoints(frame_size, classes_map=None, supercategory_map=None, include_id=False)

Returns a fiftyone.core.labels.Keypoint representation of the object.

Parameters:
  • frame_size – the (width, height) of the image

  • classes_map (None) – a dict mapping class IDs to class labels

  • supercategory_map (None) – a dict mapping class names to category dicts

  • include_id (False) – whether to include the COCO ID of the object as a label attribute

Returns:

a fiftyone.core.labels.Keypoint, or None if no keypoints data is available

to_detection(frame_size, classes_map=None, supercategory_map=None, load_segmentation=False, include_id=False)

Returns a fiftyone.core.labels.Detection representation of the object.

Parameters:
  • frame_size – the (width, height) of the image

  • classes_map (None) – a dict mapping class IDs to class labels

  • supercategory_map (None) – a dict mapping class names to category dicts

  • load_segmentation (False) – whether to load the segmentation mask for the object, if available

  • include_id (False) – whether to include the COCO ID of the object as a label attribute

Returns:

a fiftyone.core.labels.Detection, or None if no bbox data is available
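The coordinate conversion implied by frame_size can be sketched as follows. COCO boxes are [xmin, ymin, width, height] in pixels, while FiftyOne detections store [top-left-x, top-left-y, width, height] relative to the image, in [0, 1]. The two helpers below are hypothetical and only model that arithmetic, including the optional rounding that num_decimals applies when going back to pixels; they are not the methods of COCOObject.

```python
def coco_bbox_to_relative(bbox, frame_size):
    """Convert a pixel [xmin, ymin, w, h] box to relative coordinates."""
    xmin, ymin, box_w, box_h = bbox
    width, height = frame_size
    return [xmin / width, ymin / height, box_w / width, box_h / height]


def relative_to_coco_bbox(rel_bbox, frame_size, num_decimals=None):
    """Convert a relative box back to pixels, optionally rounding."""
    x, y, w, h = rel_bbox
    width, height = frame_size
    bbox = [x * width, y * height, w * width, h * height]
    if num_decimals is not None:
        bbox = [round(v, num_decimals) for v in bbox]
    return bbox


frame_size = (640, 480)  # (width, height)
rel = coco_bbox_to_relative([260, 177, 231, 199], frame_size)
restored = relative_to_coco_bbox(rel, frame_size, num_decimals=1)
print(rel)
print(restored)
```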

to_anno_dict()

Returns a COCO annotation dictionary representation of the object.

Returns:

a COCO annotation dict

classmethod from_anno_dict(d, extra_attrs=True)

Creates a COCOObject from a COCO annotation dict.

Parameters:
  • d – a COCO annotation dict

  • extra_attrs (True) –

    whether to load extra annotation attributes. Supported values are:

    • True: load all extra attributes

    • False: do not load extra attributes

    • a name or list of names of specific attributes to load

Returns:

a COCOObject

classmethod from_label(label, metadata, image_id=None, category_id=None, keypoint=None, extra_attrs=True, id_attr=None, iscrowd='iscrowd', num_decimals=None, tolerance=None)

Creates a COCOObject from a compatible fiftyone.core.labels.Label.

Parameters:
  • label – a fiftyone.core.labels.Detection, fiftyone.core.labels.Polyline, or fiftyone.core.labels.Keypoint

  • metadata – a fiftyone.core.metadata.ImageMetadata for the image

  • image_id (None) – an image ID

  • category_id (None) – the category ID for the object

  • keypoint (None) – an optional fiftyone.core.labels.Keypoint containing keypoints to include for the object

  • extra_attrs (True) –

    whether to include extra attributes from the object. Supported values are:

    • True: include all extra attributes found

    • False: do not include extra attributes

    • a name or list of names of specific attributes to include

  • id_attr (None) – the name of the attribute containing the annotation ID of the label, if any

  • iscrowd ("iscrowd") – the name of the crowd attribute (the value is automatically set to 0 if the attribute is not present)

  • num_decimals (None) – an optional number of decimal places at which to round bounding box pixel coordinates. By default, no rounding is done

  • tolerance (None) – a tolerance, in pixels, when generating approximate polylines for instance masks. Typical values are 1-3 pixels

Returns:

a COCOObject

fiftyone.utils.coco.load_coco_detection_annotations(json_path, extra_attrs=True)

Loads the COCO annotations from the given JSON file.

See this page for format details.

Parameters:
  • json_path – the path to the annotations JSON file

  • extra_attrs (True) –

    whether to load extra annotation attributes. Supported values are:

    • True: load all extra attributes found

    • False: do not load extra attributes

    • a name or list of names of specific attributes to load

Returns:

a tuple of

  • info: a dict of dataset info

  • classes_map: a dict mapping class IDs to labels

  • supercategory_map: a dict mapping class labels to category dicts

  • images: a dict mapping image IDs to image dicts

  • annotations: a dict mapping image IDs to lists of COCOObject instances, or None for unlabeled datasets
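The shape of the images and annotations entries in the returned tuple can be sketched in plain Python. This is an assumption-laden illustration, not the library's loader: `sketch_index_coco` is a hypothetical helper, it keeps annotations as raw dicts rather than COCOObject instances, and it omits info, classes_map, and supercategory_map.

```python
import json


def sketch_index_coco(json_text):
    """Index images and annotations by image ID (illustrative only)."""
    d = json.loads(json_text)
    images = {img["id"]: img for img in d.get("images", [])}

    raw_annos = d.get("annotations")
    if raw_annos is None:
        # Unlabeled dataset: there is no "annotations" key at all
        return images, None

    annotations = {}
    for anno in raw_annos:
        annotations.setdefault(anno["image_id"], []).append(anno)
    return images, annotations


labeled = json.dumps({
    "images": [
        {"id": 1, "file_name": "001.jpg", "height": 480, "width": 640},
        {"id": 2, "file_name": "002.jpg", "height": 480, "width": 640},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 2, "bbox": [260, 177, 231, 199]},
        {"id": 11, "image_id": 1, "category_id": 2, "bbox": [10, 20, 30, 40]},
    ],
})
unlabeled = json.dumps({"images": [{"id": 1, "file_name": "001.jpg"}]})

images, annotations = sketch_index_coco(labeled)
print(sorted(images))
print(len(annotations[1]))
print(annotations.get(2, []))  # image 2 has no annotations
print(sketch_index_coco(unlabeled)[1])
```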

fiftyone.utils.coco.parse_coco_categories(categories)

Parses the COCO categories list.

Parameters:

categories –

a list of dicts of the form:

[
    ...
    {
        "id": 2,
        "name": "cat",
        "supercategory": "animal",
        "keypoints": ["nose", "head", ...],
        "skeleton": [[12, 14], [14, 16], ...]
    },
    ...
]

Returns:

a tuple of

  • classes_map: a dict mapping class IDs to labels

  • supercategory_map: a dict mapping class labels to category dicts
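The two returned mappings can be pictured with a short sketch. `sketch_parse_categories` below is a hypothetical stand-in that builds dicts of the documented shape; it is not guaranteed to match the library's behavior in edge cases such as duplicate names. `classes_list_to_map` illustrates the related convention, used by the categories argument of add_coco_labels(), that a plain list of class labels is interpreted with 1-based category IDs.

```python
def sketch_parse_categories(categories):
    """Build (classes_map, supercategory_map) from a COCO categories list."""
    classes_map = {c["id"]: c["name"] for c in categories}
    supercategory_map = {c["name"]: c for c in categories}
    return classes_map, supercategory_map


def classes_list_to_map(classes):
    """Interpret a list of labels as 1-based category IDs."""
    return {i: label for i, label in enumerate(classes, 1)}


categories = [
    {"id": 1, "name": "dog", "supercategory": "animal"},
    {"id": 2, "name": "cat", "supercategory": "animal"},
]

classes_map, supercategory_map = sketch_parse_categories(categories)
print(classes_map)
print(supercategory_map["cat"]["supercategory"])
print(classes_list_to_map(["dog", "cat"]))
```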

fiftyone.utils.coco.download_coco_dataset_split(dataset_dir, split, year='2017', label_types=None, classes=None, image_ids=None, num_workers=None, shuffle=None, seed=None, max_samples=None, raw_dir=None, scratch_dir=None)

Utility that downloads full or partial splits of the COCO dataset.

See this page for the format in which dataset_dir will be arranged.

Any existing files are not re-downloaded.

Parameters:
  • dataset_dir – the directory to download the dataset

  • split – the split to download. Supported values are ("train", "validation", "test")

  • year ("2017") – the dataset year to download. Supported values are ("2014", "2017")

  • label_types (None) – a label type or list of label types to load. The supported values are ("detections", "segmentations"). By default, all label types are loaded

  • classes (None) – a string or list of strings specifying required classes to load. Only samples containing at least one instance of a specified class will be loaded

  • image_ids (None) –

    an optional list of specific image IDs to load. Can be provided in any of the following formats:

    • a list of <image-id> ints or strings

    • a list of <split>/<image-id> strings

    • the path to a text (newline-separated), JSON, or CSV file containing the list of image IDs to load in either of the first two formats

  • num_workers (None) – a suggested number of threads to use when downloading individual images

  • shuffle (False) – whether to randomly shuffle the order in which samples are chosen for partial downloads

  • seed (None) – a random seed to use when shuffling

  • max_samples (None) – a maximum number of samples to load. If label_types and/or classes are also specified, first priority will be given to samples that contain all of the specified label types and/or classes, followed by samples that contain at least one of the specified label types or classes. The actual number of samples loaded may be less than this maximum value if the dataset does not contain sufficient samples matching your requirements. By default, all matching samples are loaded

  • raw_dir (None) – a directory in which full annotations files may be stored to avoid re-downloads in the future

  • scratch_dir (None) – a scratch directory to use to download any necessary temporary files
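The image_ids formats listed above (plain IDs, <split>/<image-id> strings, or newline-separated text) can be normalized along these lines. This is a sketch under the assumption that image IDs are integers, possibly zero-padded when given as strings; `normalize_image_ids` and `read_id_lines` are hypothetical helpers and do not reproduce the library's own parsing.

```python
def normalize_image_ids(image_ids):
    """Return (split, image_id) pairs; split is None when not given."""
    parsed = []
    for item in image_ids:
        if isinstance(item, int):
            parsed.append((None, item))
            continue
        item = item.strip()
        if "/" in item:
            # "<split>/<image-id>" form
            split, _, raw_id = item.rpartition("/")
        else:
            split, raw_id = None, item
        parsed.append((split, int(raw_id)))
    return parsed


def read_id_lines(text):
    """Split newline-separated text into non-empty ID strings."""
    return [line for line in text.splitlines() if line.strip()]


mixed = normalize_image_ids([42, "000000000139", "validation/285"])
from_text = normalize_image_ids(read_id_lines("train/9\n\n25\n"))
print(mixed)
print(from_text)
```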

Returns:

a tuple of

  • num_samples: the total number of downloaded images

  • classes: the list of all classes

  • did_download: whether any content was downloaded (True) or if all necessary files were already downloaded (False)