fiftyone.utils.data.base

Data utilities.

Copyright 2017-2025, Voxel51, Inc.
voxel51.com

Functions:

`parse_images_dir`(dataset_dir[, recursive])	Parses the contents of the given directory of images.
`parse_videos_dir`(dataset_dir[, recursive])	Parses the contents of the given directory of videos.
`parse_image_classification_dir_tree`(dataset_dir)	Parses the contents of the given image classification dataset directory tree.
`download_image_classification_dataset`(…[, …])	Downloads the classification dataset specified by the given CSV file.
`download_images`(image_urls, output_dir[, …])	Downloads the images from the given URLs.

fiftyone.utils.data.base.parse_images_dir(dataset_dir, recursive=True)

Parses the contents of the given directory of images.

Parameters

dataset_dir – the dataset directory
recursive (True) – whether to recursively traverse subdirectories

Returns

a list of image paths
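
For intuition about what such a listing involves, here is a minimal standard-library sketch. It is not FiftyOne's implementation: the helper name `list_image_paths` is hypothetical, and treating a file as an image whenever `mimetypes` guesses an `image/*` type is an assumption made purely for illustration.

```python
import mimetypes
import os


def list_image_paths(dataset_dir, recursive=True):
    """Hypothetical helper: returns a sorted list of image paths in dataset_dir."""
    paths = []
    for root, dirs, files in os.walk(dataset_dir):
        for name in files:
            mime_type, _ = mimetypes.guess_type(name)
            # Assumption: a file counts as an image if its guessed MIME type is image/*
            if mime_type is not None and mime_type.startswith("image/"):
                paths.append(os.path.join(root, name))
        if not recursive:
            # Clearing dirs in place stops os.walk from descending into subdirectories
            dirs.clear()
    return sorted(paths)
```

With `recursive=False`, only files directly inside `dataset_dir` are considered, which mirrors the role of the `recursive` parameter described above.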

fiftyone.utils.data.base.parse_videos_dir(dataset_dir, recursive=True)

Parses the contents of the given directory of videos.

Parameters

dataset_dir – the dataset directory
recursive (True) – whether to recursively traverse subdirectories

Returns

a list of video paths

fiftyone.utils.data.base.parse_image_classification_dir_tree(dataset_dir)

Parses the contents of the given image classification dataset directory tree, which should have the following format:

<dataset_dir>/
    <classA>/
        <image1>.<ext>
        <image2>.<ext>
        ...
    <classB>/
        <image1>.<ext>
        <image2>.<ext>
        ...

Parameters

dataset_dir – the dataset directory

Returns

a tuple of

samples – a list of (image_path, target) pairs
classes – a list of class label strings
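
The relationship between the directory layout and the returned samples and classes can be sketched with the standard library alone. This is an illustration, not FiftyOne's code: the helper name `parse_classification_tree` is hypothetical, and it assumes that each target is the name of the enclosing class directory and that classes are returned in sorted order.

```python
import os


def parse_classification_tree(dataset_dir):
    """Hypothetical helper: returns (samples, classes) for a class-per-directory tree."""
    # Each immediate subdirectory of dataset_dir is treated as one class
    classes = sorted(
        name
        for name in os.listdir(dataset_dir)
        if os.path.isdir(os.path.join(dataset_dir, name))
    )

    samples = []
    for label in classes:
        class_dir = os.path.join(dataset_dir, label)
        for filename in sorted(os.listdir(class_dir)):
            path = os.path.join(class_dir, filename)
            if os.path.isfile(path):
                # Assumption: the target is the name of the enclosing class directory
                samples.append((path, label))

    return samples, classes
```

Files placed directly in `dataset_dir`, outside any class directory, are ignored by this sketch because they have no class to be assigned to.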

fiftyone.utils.data.base.download_image_classification_dataset(csv_path, dataset_dir, classes=None, num_workers=None)

Downloads the classification dataset specified by the given CSV file, which should have the following format:

<label1>,<image_url1>
<label2>,<image_url2>
...

The image filenames are the basenames of the URLs, which are assumed to be unique.

The dataset is written to disk in fiftyone.types.FiftyOneImageClassificationDataset format.

Parameters

csv_path – a CSV file containing the labels and image URLs
dataset_dir – the directory to write the dataset
classes (None) – an optional list of classes. By default, this will be inferred from the contents of csv_path
num_workers (None) – a suggested number of threads to use to download images
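
To make the CSV format concrete, the following sketch parses `<label>,<image_url>` rows and derives each filename from the URL basename. It is a standard-library illustration under stated assumptions, not the library's implementation: `read_label_url_rows` is a hypothetical name, the URLs are placeholders, the basename is taken from the URL path (so any query string is ignored), and inferred classes are assumed to be the sorted set of labels seen.

```python
import csv
import io
import posixpath
from urllib.parse import urlparse


def read_label_url_rows(csv_file, classes=None):
    """Hypothetical helper: parses `<label>,<image_url>` rows from a file-like object.

    Returns (samples, classes), where samples are (filename, label) pairs.
    """
    samples = []
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        label, url = row[0], row[1]
        # Assumption: the filename is the basename of the URL's path component
        filename = posixpath.basename(urlparse(url).path)
        samples.append((filename, label))

    if classes is None:
        # Assumption: inferred classes are the sorted set of observed labels
        classes = sorted({label for _, label in samples})

    return samples, classes


# Placeholder data in the documented two-column format
example = io.StringIO(
    "cat,https://example.com/images/001.jpg\n"
    "dog,https://example.com/images/002.jpg\n"
)
samples, classes = read_label_url_rows(example)
```

Passing an explicit `classes` list bypasses the inference step, which corresponds to the optional `classes` parameter described above.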

fiftyone.utils.data.base.download_images(image_urls, output_dir, num_workers=None)

Downloads the images from the given URLs.

The filenames in output_dir are the basenames of the URLs, which are assumed to be unique.

Parameters

image_urls – a list of image URLs to download
output_dir – the directory to write the images
num_workers (None) – a suggested number of threads to use

Returns

the list of downloaded image paths
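
Because the output filenames are assumed to be unique, it can be worth checking a URL list for basename collisions before downloading. The sketch below is one way to do that with the standard library. The helper name `plan_output_paths` is hypothetical and not part of FiftyOne, the URLs are placeholders, and taking the basename from the URL path is an assumption.

```python
import os
import posixpath
from collections import Counter
from urllib.parse import urlparse


def plan_output_paths(image_urls, output_dir):
    """Hypothetical helper: maps each URL to output_dir/<basename>.

    Raises ValueError if two URLs share a basename, since one download
    would otherwise overwrite the other.
    """
    # Assumption: the basename is taken from the URL's path component
    names = [posixpath.basename(urlparse(url).path) for url in image_urls]

    duplicates = sorted(name for name, count in Counter(names).items() if count > 1)
    if duplicates:
        raise ValueError("Duplicate basenames: %s" % ", ".join(duplicates))

    return [os.path.join(output_dir, name) for name in names]


# Placeholder URLs with distinct basenames
paths = plan_output_paths(
    ["https://example.com/a/1.jpg", "https://example.com/b/2.jpg"],
    "out",
)
```

Running such a check first surfaces collisions as an explicit error instead of leaving fewer files on disk than URLs requested.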