Note

This is a Hugging Face dataset. Learn how to load datasets from the Hub in the Hugging Face integration docs.

Dataset Card for Homework Test Set for Coursera MOOC - Hands-on Data Centric Visual AI#

This dataset is the test dataset for the homework in the Hands-on Data Centric Visual AI Coursera course.

This is a FiftyOne dataset with 4572 samples.

Installation#

If you haven’t already, install FiftyOne:

pip install -U fiftyone

Usage#

import fiftyone as fo
import fiftyone.utils.huggingface as fouh

# Load the dataset
# Note: other available arguments include 'max_samples', etc
dataset = fouh.load_from_hub("Voxel51/Coursera_homework_dataset_test")

# Launch the App
session = fo.launch_app(dataset)
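
The comment above mentions the 'max_samples' argument. As a minimal sketch, a small wrapper can load only a preview of the dataset rather than all 4572 samples. The helper name load_preview and its default of 50 samples are illustrative assumptions, not part of FiftyOne or of this dataset card.

```python
def load_preview(max_samples=50):
    """Load at most `max_samples` samples of the test set from the Hub.

    Hypothetical convenience helper; the name and default are assumptions.
    """
    # Imported here so that defining the helper does not itself require
    # FiftyOne; calling it does (pip install -U fiftyone).
    import fiftyone.utils.huggingface as fouh

    return fouh.load_from_hub(
        "Voxel51/Coursera_homework_dataset_test",
        max_samples=max_samples,
    )
```

Calling load_preview() downloads data from the Hub, so it needs network access. The returned dataset can be passed to fo.launch_app() in the same way as the full dataset.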

Dataset Details#

Dataset Description#

This dataset is a modified subset of the LVIS dataset.

The dataset here only contains detections, NONE of which have been artificially perturbed.

This dataset has the following labels:

  • ‘bolt’

  • ‘knob’

  • ‘tag’

  • ‘button’

  • ‘bottle_cap’

  • ‘belt’

  • ‘strap’

  • ‘necktie’

  • ‘shirt’

  • ‘sweater’

  • ‘streetlight’

  • ‘pole’

  • ‘reflector’

  • ‘headlight’

  • ‘taillight’

  • ‘traffic_light’

  • ‘rearview_mirror’

Dataset Sources#

  • Repository: https://www.lvisdataset.org/

  • Paper: https://arxiv.org/abs/1908.03195

Uses#

Unlike the training dataset for the course, the labels in this dataset HAVE NOT been perturbed.

Dataset Structure#

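Each image in the dataset comes with detailed annotations in FiftyOne detection format, where 'bounding_box' stores [top-left-x, top-left-y, width, height] as fractions of the image width and height. A typical annotation looks like this: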

<Detection: {
    'id': '66a2f24cce2f9d11d98d3a21',
    'attributes': {},
    'tags': [],
    'label': 'shirt',
    'bounding_box': [
        0.25414,
        0.35845238095238097,
        0.041960000000000004,
        0.051011904761904765,
    ],
    'mask': None,
    'confidence': None,
    'index': None,
}>
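
FiftyOne stores bounding boxes as [top-left-x, top-left-y, width, height] in relative coordinates between 0 and 1, so converting to pixels requires the image size. The sketch below applies that conversion to the box from the annotation above. The function name to_pixel_box and the 640x480 image size are assumptions chosen for illustration; the actual image dimensions are not given here.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a relative [x, y, width, height] box to rounded pixel values."""
    x, y, w, h = bounding_box
    return (
        round(x * image_width),   # left edge in pixels
        round(y * image_height),  # top edge in pixels
        round(w * image_width),   # box width in pixels
        round(h * image_height),  # box height in pixels
    )


# Bounding box copied from the example annotation above
box = [0.25414, 0.35845238095238097, 0.041960000000000004, 0.051011904761904765]

# Assumed image size, for illustration only
pixel_box = to_pixel_box(box, image_width=640, image_height=480)
print(pixel_box)
```

In practice, the real width and height would come from each sample's image metadata rather than from fixed values.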

Dataset Creation#

Curation Rationale#

These labels were selected because the objects they describe can be confusing to a model, which makes them a good choice for demonstrating data centric AI techniques.

Source Data#

This is a subset of the LVIS dataset.

Citation#

BibTeX:

@inproceedings{gupta2019lvis,
  title={{LVIS}: A Dataset for Large Vocabulary Instance Segmentation},
  author={Gupta, Agrim and Doll{\'a}r, Piotr and Girshick, Ross},
  booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
  year={2019}
}