fiftyone.utils.sama

Sama utilities.

Copyright 2017-2025, Voxel51, Inc.
voxel51.com

Functions:

download_sama_coco_dataset_split(...[, ...])

Utility that downloads full or partial data splits of the COCO dataset with annotation splits found at https://www.sama.com/sama-coco-dataset.

fiftyone.utils.sama.download_sama_coco_dataset_split(dataset_dir, split, label_types=None, classes=None, image_ids=None, num_workers=None, shuffle=None, seed=None, max_samples=None, raw_dir=None, scratch_dir=None)

Utility that downloads full or partial data splits of the COCO dataset with annotation splits found at https://www.sama.com/sama-coco-dataset.

See this page for the format in which dataset_dir will be arranged.

Any existing files are not re-downloaded.

Parameters:

dataset_dir – the directory in which to download the dataset
split – the split to download. Supported values are ("train", "validation", "test")
label_types (None) – a label type or list of label types to load. The supported values are ("detections", "segmentations"). By default, all label types are loaded
classes (None) – a string or list of strings specifying required classes to load. Only samples containing at least one instance of a specified class will be loaded
image_ids (None) – an optional list of specific image IDs to load. Can be provided in any of the following formats:
- a list of <image-id> ints or strings
- a list of <split>/<image-id> strings
- the path to a text (newline-separated), JSON, or CSV file containing the list of image IDs to load in either of the first two formats
num_workers (None) – a suggested number of threads to use when downloading individual images
shuffle (None) – whether to randomly shuffle the order in which samples are chosen for partial downloads. By default, samples are not shuffled
seed (None) – a random seed to use when shuffling
max_samples (None) – a maximum number of samples to load. If label_types and/or classes are also specified, first priority will be given to samples that contain all of the specified label types and/or classes, followed by samples that contain at least one of the specified label types or classes. The actual number of samples loaded may be less than this maximum value if the dataset does not contain sufficient samples matching your requirements. By default, all matching samples are loaded
raw_dir (None) – a directory in which full annotations files may be stored to avoid re-downloads in the future
scratch_dir (None) – a scratch directory to use to download any necessary temporary files
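
The priority rule described for max_samples can be illustrated with a small, self-contained sketch. This is not FiftyOne's implementation: the function name select_samples and the input layout (a dict mapping image IDs to the set of classes present in each image) are assumptions made purely for illustration, and shuffling is ignored.

```python
def select_samples(image_classes, classes, max_samples=None):
    """Order image IDs by the documented priority rule.

    Hypothetical helper (not part of FiftyOne): images containing ALL of the
    requested classes come first, then images containing AT LEAST ONE.
    """
    required = set(classes)
    all_match = []
    any_match = []

    for image_id, present in image_classes.items():
        present = set(present)
        if required <= present:
            # Every requested class appears in this image
            all_match.append(image_id)
        elif required & present:
            # Only some of the requested classes appear
            any_match.append(image_id)
        # Images with none of the requested classes are skipped

    selected = all_match + any_match
    if max_samples is not None:
        selected = selected[:max_samples]

    return selected


# Toy data, invented for this example
image_classes = {
    1: {"cat", "dog"},
    2: {"cat"},
    3: {"person"},
    4: {"dog", "cat", "person"},
    5: {"dog"},
}

print(select_samples(image_classes, ["cat", "dog"], max_samples=3))
# [1, 4, 2]
```

Asking for more samples than match (for example max_samples=10 on the toy data above) simply returns the four matching IDs, mirroring the note that fewer samples than the maximum may be loaded.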

Returns:

a tuple of

- num_samples: the total number of downloaded images
- classes: the list of all classes
- did_download: whether any content was downloaded (True) or if all necessary files were already downloaded (False)
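
A minimal usage sketch follows, assuming FiftyOne is installed and network access is available. It writes image IDs to a newline-separated text file (one of the accepted image_ids formats) and unpacks the returned tuple. The helper names write_image_ids and download_subset are hypothetical, as are any directory paths and image IDs you pass to them; only download_sama_coco_dataset_split and its parameters come from the reference above.

```python
from pathlib import Path


def write_image_ids(path, image_ids):
    """Write one image ID per line (newline-separated text format).

    Hypothetical helper; each entry may be an <image-id> or a
    <split>/<image-id> string.
    """
    Path(path).write_text("".join(f"{image_id}\n" for image_id in image_ids))
    return str(path)


def download_subset(dataset_dir, image_ids_path):
    """Download only the listed validation images and report the result."""
    # Imported here so that the helper above works without FiftyOne installed
    import fiftyone.utils.sama as fous

    num_samples, classes, did_download = fous.download_sama_coco_dataset_split(
        dataset_dir,
        "validation",
        label_types="detections",
        image_ids=image_ids_path,
    )

    print(f"{num_samples} images, {len(classes)} classes")
    if did_download:
        print("New content was downloaded")
    else:
        print("All necessary files were already present")

    return num_samples, classes, did_download
```

Calling download_subset a second time with the same arguments should report that nothing new was downloaded, since existing files are not re-downloaded.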