Note

This is a Hugging Face dataset. For large datasets, ensure huggingface_hub>=1.1.3 to avoid rate limits. Learn more in the Hugging Face integration docs.

Dataset Card for Lecture Test Set for Coursera MOOC - Hands-on Data Centric Visual AI#

This dataset is the test dataset for the in-class lectures of the Hands-on Data Centric Visual AI Coursera course.

This is a FiftyOne dataset with 4159 samples.

Installation#

If you haven’t already, install FiftyOne:

pip install -U fiftyone

Usage#

import fiftyone as fo
import fiftyone.utils.huggingface as fouh

# Load the dataset
# Note: other available arguments include 'max_samples', etc
dataset = fouh.load_from_hub("Voxel51/Coursera_lecture_dataset_test")

# Launch the App
session = fo.launch_app(dataset)

Dataset Details#

Dataset Description#

This dataset is a modified subset of the LVIS dataset.

This dataset contains only detections. NONE of the test set’s labels have been artificially perturbed.

This dataset has the following labels:

‘jacket’
‘coat’
‘jean’
‘trousers’
‘short_pants’
‘trash_can’
‘bucket’
‘flowerpot’
‘helmet’
‘baseball_cap’
‘hat’
‘sunglasses’
‘goggles’
‘doughnut’
‘pastry’
‘onion’
‘tomato’

Dataset Sources#

Repository: https://www.lvisdataset.org/
Paper: https://arxiv.org/abs/1908.03195

Uses#

The labels in this dataset have NOT been perturbed, unlike those in the corresponding training dataset.

Dataset Structure#

Each image in the dataset comes with detailed annotations in FiftyOne detection format. A typical annotation looks like this:

<Detection: {
    'id': '66a2f24cce2f9d11d98d39f3',
    'attributes': {},
    'tags': [],
    'label': 'trousers',
    'bounding_box': [
        0.5562343750000001,
        0.4614166666666667,
        0.1974375,
        0.29300000000000004,
    ],
    'mask': None,
    'confidence': None,
    'index': None,
}>
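
FiftyOne stores `bounding_box` as relative `[top-left-x, top-left-y, width, height]` values in the range [0, 1], so a pixel-space box has to be derived from the image size. The sketch below shows one way to do that conversion for the annotation above. The helper name `to_pixel_box` and the 640x480 image size are assumptions made for illustration; in practice the dimensions would come from the sample's own image metadata.

```python
from typing import Sequence, Tuple


def to_pixel_box(
    bounding_box: Sequence[float], image_width: int, image_height: int
) -> Tuple[int, int, int, int]:
    """Convert a relative [x, y, width, height] box to pixel corners.

    Returns (x_min, y_min, x_max, y_max), each rounded to the nearest pixel.
    """
    x, y, w, h = bounding_box
    x_min = round(x * image_width)
    y_min = round(y * image_height)
    x_max = round((x + w) * image_width)
    y_max = round((y + h) * image_height)
    return x_min, y_min, x_max, y_max


# Box copied from the example annotation; the image size is an assumption.
box = [0.5562343750000001, 0.4614166666666667, 0.1974375, 0.29300000000000004]
pixel_box = to_pixel_box(box, image_width=640, image_height=480)
print(pixel_box)
```

Returning corner coordinates rather than width and height is only a convenience for drawing or cropping; the relative format itself is unchanged.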

Dataset Creation#

Curation Rationale#

The labels in this dataset were selected because these objects can confuse a model, which makes them a great choice for demonstrating data centric AI techniques.
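
To make the idea of confusable labels concrete, the sketch below groups the 17 labels into sets of visually similar classes and checks whether a misprediction stays inside one of those sets. The grouping and the group names are assumptions inferred from the order of the label list; the dataset card does not define them, and `is_within_group_confusion` is a hypothetical helper.

```python
from typing import Dict, List

# Hypothetical grouping inferred from the label list; not defined by the dataset.
CONFUSABLE_GROUPS: Dict[str, List[str]] = {
    "outerwear": ["jacket", "coat"],
    "legwear": ["jean", "trousers", "short_pants"],
    "containers": ["trash_can", "bucket", "flowerpot"],
    "headwear": ["helmet", "baseball_cap", "hat"],
    "eyewear": ["sunglasses", "goggles"],
    "baked_goods": ["doughnut", "pastry"],
    "produce": ["onion", "tomato"],
}

# Reverse lookup: label -> name of the group it belongs to.
LABEL_TO_GROUP: Dict[str, str] = {
    label: group
    for group, labels in CONFUSABLE_GROUPS.items()
    for label in labels
}


def is_within_group_confusion(true_label: str, predicted_label: str) -> bool:
    """Return True if the labels differ but belong to the same assumed group."""
    if true_label == predicted_label:
        return False
    true_group = LABEL_TO_GROUP.get(true_label)
    return true_group is not None and true_group == LABEL_TO_GROUP.get(predicted_label)


print(is_within_group_confusion("jacket", "coat"))
print(is_within_group_confusion("jacket", "tomato"))
```

A check like this could help separate errors between similar classes from errors between unrelated ones when reviewing model predictions.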

Source Data#

This is a subset of the LVIS dataset.

Citation#

BibTeX:

@inproceedings{gupta2019lvis,
  title={{LVIS}: A Dataset for Large Vocabulary Instance Segmentation},
  author={Gupta, Agrim and Dollar, Piotr and Girshick, Ross},
  booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
  year={2019}
}