Note

This is a Hugging Face dataset. For large datasets, ensure huggingface_hub>=1.1.3 to avoid rate limits. Learn more in the Hugging Face integration docs.

Dataset Card for Homework Test Set for Coursera MOOC - Hands-on Data Centric Visual AI#

This dataset is the test dataset for the homework in the Hands-on Data Centric Visual AI Coursera course.

This is a FiftyOne dataset with 4572 samples.

Installation#

If you haven’t already, install FiftyOne:

pip install -U fiftyone

Usage#

import fiftyone as fo
import fiftyone.utils.huggingface as fouh

# Load the dataset
# Note: load_from_hub() accepts other arguments, such as 'max_samples'
dataset = fouh.load_from_hub("Voxel51/Coursera_homework_dataset_test")

# Launch the App
session = fo.launch_app(dataset)

Dataset Details#

Dataset Description#

This dataset is a modified subset of the LVIS dataset.

This dataset contains only detections, NONE of which have been artificially perturbed.

This dataset has the following labels:

‘bolt’
‘knob’
‘tag’
‘button’
‘bottle_cap’
‘belt’
‘strap’
‘necktie’
‘shirt’
‘sweater’
‘streetlight’
‘pole’
‘reflector’
‘headlight’
‘taillight’
‘traffic_light’
‘rearview_mirror’
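
As a quick sanity check after loading, the label list above can be compared against the labels actually present in the dataset. The following is a minimal sketch, not part of the original card: the helper name `check_labels` is hypothetical, and the field name `ground_truth` in the usage comment is an assumption (print the dataset to see the actual detections field).

```python
# The 17 class names listed in this dataset card.
EXPECTED_LABELS = {
    "bolt", "knob", "tag", "button", "bottle_cap", "belt", "strap",
    "necktie", "shirt", "sweater", "streetlight", "pole", "reflector",
    "headlight", "taillight", "traffic_light", "rearview_mirror",
}


def check_labels(observed):
    """Compare observed label names against the card's label list.

    Returns a (missing, unexpected) pair of sets: labels listed in the
    card but not observed, and labels observed but not listed.
    """
    observed = set(observed)
    return EXPECTED_LABELS - observed, observed - EXPECTED_LABELS


# Usage with FiftyOne (the field name "ground_truth" is an assumption):
#   observed = dataset.distinct("ground_truth.detections.label")
#   missing, unexpected = check_labels(observed)
```

If both returned sets are empty, the loaded dataset contains exactly the labels listed here.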

Dataset Sources#

Repository: https://www.lvisdataset.org/
Paper: https://arxiv.org/abs/1908.03195

Uses#

Unlike the training dataset for the course, the labels in this dataset HAVE NOT been perturbed.

Dataset Structure#

Each image in the dataset comes with detailed annotations in FiftyOne detection format. A typical annotation looks like this:

<Detection: {
    'id': '66a2f24cce2f9d11d98d3a21',
    'attributes': {},
    'tags': [],
    'label': 'shirt',
    'bounding_box': [
        0.25414,
        0.35845238095238097,
        0.041960000000000004,
        0.051011904761904765,
    ],
    'mask': None,
    'confidence': None,
    'index': None,
}>
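
FiftyOne stores `bounding_box` as `[top-left-x, top-left-y, width, height]` in relative coordinates between 0 and 1, so the values must be scaled by the image size before drawing or cropping. The sketch below shows one way to do that conversion; the helper name `to_pixel_box` and the 640x480 image size are illustrative assumptions, not values taken from this dataset.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a relative [x, y, w, h] box to integer pixel values.

    `bounding_box` uses FiftyOne's convention: top-left corner plus
    width and height, each expressed as a fraction of the image size.
    """
    x, y, w, h = bounding_box
    return (
        round(x * image_width),
        round(y * image_height),
        round(w * image_width),
        round(h * image_height),
    )


# The box from the annotation above, scaled to a hypothetical 640x480 image.
box = [0.25414, 0.35845238095238097, 0.041960000000000004, 0.051011904761904765]
print(to_pixel_box(box, 640, 480))
```

In practice, the real image dimensions can be read from each sample's metadata after calling `dataset.compute_metadata()`.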

Dataset Creation#

Curation Rationale#

The labels in this dataset were selected because these objects can be confusing to a model, which makes them a great choice for demonstrating data-centric AI techniques.

Source Data#

This is a subset of the LVIS dataset.

Citation#

BibTeX:

@inproceedings{gupta2019lvis,
  title={{LVIS}: A Dataset for Large Vocabulary Instance Segmentation},
  author={Gupta, Agrim and Dollar, Piotr and Girshick, Ross},
  booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
  year={2019}
}