COCO-2017
COCO is a large-scale object detection, segmentation, and captioning dataset.
This version contains images, bounding boxes, and segmentations for the 2017 version of the dataset.
Note
With support from the COCO team, FiftyOne is a recommended tool for downloading, visualizing, and evaluating on the COCO dataset!
Check out this guide for more details on using FiftyOne to work with COCO.
Notes
COCO defines 91 classes but the data only uses 80 classes
Some images from the train and validation sets don’t have annotations
The test set does not have annotations
COCO 2014 and 2017 use the same images, but the splits are different
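The first two notes can be pictured with a COCO-style annotation dictionary. The sketch below is a minimal illustration on made-up, in-memory data that mimics the `images`, `annotations`, and `categories` keys of a COCO instances JSON file. The helper name `summarize_coco` is hypothetical, and the toy values say nothing about the real dataset; a real file would be read with `json.load()` first.

```python
def summarize_coco(coco):
    """Return (names of unused categories, IDs of images with no annotations)."""
    used_category_ids = {ann["category_id"] for ann in coco["annotations"]}
    annotated_image_ids = {ann["image_id"] for ann in coco["annotations"]}

    # Categories that are defined but never referenced by an annotation
    unused = sorted(
        cat["name"]
        for cat in coco["categories"]
        if cat["id"] not in used_category_ids
    )

    # Images that no annotation points to
    unannotated = sorted(
        img["id"]
        for img in coco["images"]
        if img["id"] not in annotated_image_ids
    )
    return unused, unannotated


# Toy data for illustration only (not taken from COCO)
toy_coco = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 3},
        {"image_id": 3, "category_id": 1},
    ],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 2, "name": "bicycle"},
        {"id": 3, "name": "car"},
    ],
}

unused, unannotated = summarize_coco(toy_coco)
print(unused, unannotated)
```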
Details
Dataset name: coco-2017
Dataset source: http://cocodataset.org/#home
Dataset license: CC-BY-4.0
Dataset size: 25.20 GB
Tags: image, detection, segmentation
Supported splits: train, validation, test
ZooDataset class: COCO2017Dataset
Full split stats
Train split: 118,287 images
Test split: 40,670 images
Validation split: 5,000 images
Partial downloads
FiftyOne provides parameters that can be used to efficiently download specific subsets of the COCO dataset to suit your needs. When you request a new subset, FiftyOne reuses any data that has already been downloaded and only fetches additional data from the web when necessary.
The following parameters are available to configure a partial download of COCO-2017 by passing them to load_zoo_dataset():
split (None) and splits (None): a string or list of strings, respectively, specifying the splits to load. Supported values are ("train", "test", "validation"). If neither is provided, all available splits are loaded
label_types (None): a label type or list of label types to load. Supported values are ("detections", "segmentations"). By default, only detections are loaded
classes (None): a string or list of strings specifying required classes to load. If provided, only samples containing at least one instance of a specified class will be loaded
image_ids (None): a list of specific image IDs to load. The IDs can be specified either as <split>/<image-id> strings or <image-id> ints or strings. Alternatively, you can provide the path to a TXT (newline-separated), JSON, or CSV file containing the list of image IDs to load in either of the first two formats
include_id (False): whether to include the COCO ID of each sample in the loaded labels
include_license (False): whether to include the COCO license of each sample in the loaded labels, if available. The supported values are:
    False (default): don’t load the license
    True/"name": store the string license name
    "id": store the integer license ID
    "url": store the license URL
only_matching (False): whether to only load labels that match the classes or attrs requirements that you provide (True), or to load all labels for samples that match the requirements (False)
num_workers (None): the number of processes to use when downloading individual images. By default, multiprocessing.cpu_count() is used
shuffle (False): whether to randomly shuffle the order in which samples are chosen for partial downloads
seed (None): a random seed to use when shuffling
max_samples (None): a maximum number of samples to load per split. If label_types and/or classes are also specified, first priority will be given to samples that contain all of the specified label types and/or classes, followed by samples that contain at least one of the specified label types or classes. The actual number of samples loaded may be less than this maximum value if the dataset does not contain sufficient samples matching your requirements
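The prioritization described for max_samples can be pictured with a small, library-free sketch. This is not FiftyOne's implementation: `select_samples` and the toy `image_classes` mapping are hypothetical names used only to illustrate the "all classes first, then at least one class" ordering.

```python
def select_samples(image_classes, classes, max_samples):
    """Pick up to max_samples image IDs, preferring images with all classes.

    image_classes: dict mapping image ID -> set of class names in that image
    """
    required = set(classes)

    # Tier 1: images that contain every requested class
    all_match = [
        image_id
        for image_id, present in image_classes.items()
        if required <= present
    ]

    # Tier 2: images that contain at least one, but not all, requested classes
    any_match = [
        image_id
        for image_id, present in image_classes.items()
        if (required & present) and not required <= present
    ]

    # Images containing none of the requested classes are never selected
    return (all_match + any_match)[:max_samples]


# Toy data for illustration only
image_classes = {
    1: {"cat", "dog"},
    2: {"cat"},
    3: {"person"},
    4: {"dog", "person"},
    5: {"cat", "dog", "person"},
}

selected = select_samples(image_classes, ["cat", "dog"], max_samples=3)
print(selected)  # [1, 5, 2]
```

Asking for more samples than exist, for example `max_samples=10` on the toy data, returns only the four matching IDs, mirroring the note that fewer samples than the maximum may be loaded.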
Note
See COCO2017Dataset and COCODetectionDatasetImporter for complete descriptions of the optional keyword arguments that you can pass to load_zoo_dataset().
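As a rough sketch of how several of these parameters might be combined, the snippet below collects them in a dictionary and forwards them to load_zoo_dataset(). The parameter names come from the list above; the particular values, the seed, and the wrapper function `load_subset` are assumptions chosen for illustration.

```python
# Keyword arguments taken from the parameter list above; values are illustrative
partial_kwargs = dict(
    split="validation",
    label_types=["detections"],
    classes=["cat", "dog"],
    only_matching=True,  # keep only the cat/dog labels on matching samples
    include_id=True,     # store the COCO ID of each sample
    shuffle=True,
    seed=51,             # arbitrary seed so the random subset is repeatable
    max_samples=25,
)


def load_subset():
    """Load the subset described by partial_kwargs (hypothetical wrapper)."""
    # Imported lazily so that defining the configuration does not require
    # FiftyOne to be installed
    import fiftyone.zoo as foz

    return foz.load_zoo_dataset("coco-2017", **partial_kwargs)
```

Calling `load_subset()` performs the actual load and downloads any required images that are not already on disk.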
Example usage
import fiftyone as fo
import fiftyone.zoo as foz

#
# Load 50 random samples from the validation split
#
# Only the required images will be downloaded (if necessary).
# By default, only detections are loaded
#

dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    max_samples=50,
    shuffle=True,
)

session = fo.launch_app(dataset)

#
# Load segmentations for 25 samples from the validation split that
# contain cats and dogs
#
# Images that contain all `classes` will be prioritized first, followed
# by images that contain at least one of the required `classes`. If
# there are not enough images matching `classes` in the split to meet
# `max_samples`, only the available images will be loaded.
#
# Images will only be downloaded if necessary
#

dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    label_types=["segmentations"],
    classes=["cat", "dog"],
    max_samples=25,
)

session.dataset = dataset

#
# Download the entire validation split and load both detections and
# segmentations
#
# Subsequent partial loads of the validation split will never require
# downloading any images
#

dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    label_types=["detections", "segmentations"],
)

session.dataset = dataset

#
# Load 50 samples from the validation split
#
# Only the required images will be downloaded (if necessary).
# By default, only detections are loaded
#

fiftyone zoo datasets load coco-2017 \
    --split validation \
    --kwargs \
        max_samples=50

fiftyone app launch coco-2017-validation-50

#
# Load segmentations for 25 samples from the validation split that
# contain cats and dogs
#
# Images that contain all `classes` will be prioritized first, followed
# by images that contain at least one of the required `classes`. If
# there are not enough images matching `classes` in the split to meet
# `max_samples`, only the available images will be loaded.
#
# Images will only be downloaded if necessary
#

fiftyone zoo datasets load coco-2017 \
    --split validation \
    --kwargs \
        label_types=segmentations \
        classes=cat,dog \
        max_samples=25

fiftyone app launch coco-2017-validation-25

#
# Download the entire validation split and load both detections and
# segmentations
#
# Subsequent partial loads of the validation split will never require
# downloading any images
#

fiftyone zoo datasets load coco-2017 \
    --split validation \
    --kwargs \
        label_types=detections,segmentations

fiftyone app launch coco-2017-validation
