Adding Object Detections to a Dataset
This recipe provides a glimpse into the possibilities for integrating FiftyOne into your ML workflows. Specifically, it covers:
Loading an object detection dataset from the Dataset Zoo
Adding predictions from an object detector to the dataset
Launching the FiftyOne App and visualizing/exploring your data
Integrating the App into your data analysis workflow
Setup
If you haven’t already, install FiftyOne:
[ ]:
!pip install fiftyone
In this recipe, we’ll use an off-the-shelf Faster R-CNN detection model provided by PyTorch. To use it, you’ll need to install torch and torchvision, if you haven’t already.
[ ]:
!pip install torch torchvision
Loading a detection dataset
In this recipe, we’ll work with the validation split of the COCO 2017 dataset, which is available for download via the FiftyOne Dataset Zoo.
The snippet below will download the validation split and load it into FiftyOne.
[2]:
import fiftyone as fo
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    dataset_name="detector-recipe",
)
Split 'validation' already downloaded
Loading 'coco-2017' split 'validation'
100% |████████████████████| 5000/5000 [43.3s elapsed, 0s remaining, 114.9 samples/s]
Dataset 'detector-recipe' created
Let’s inspect the dataset to see what we downloaded:
[3]:
# Print some information about the dataset
print(dataset)
Name: detector-recipe
Media type: image
Num samples: 5000
Persistent: False
Info: {'classes': ['0', 'person', 'bicycle', ...]}
Tags: ['validation']
Sample fields:
filepath: fiftyone.core.fields.StringField
tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
metadata: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
[4]:
# Print a ground truth detection
sample = dataset.first()
print(sample.ground_truth.detections[0])
<Detection: {
'id': '602fea44db78a9b44e6ae129',
'attributes': BaseDict({}),
'label': 'potted plant',
'bounding_box': BaseList([
0.37028125,
0.3345305164319249,
0.038593749999999996,
0.16314553990610328,
]),
'mask': None,
'confidence': None,
'index': None,
'area': 531.8071000000001,
'iscrowd': 0.0,
}>
Note that the ground truth detections are stored in the ground_truth field of the samples.
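The bounding_box values printed above are relative coordinates in [top-left-x, top-left-y, width, height] order. As a minimal sketch of what those numbers mean, the snippet below scales them back to pixels. The helper name to_pixel_box and the 640 x 426 image size are assumptions made for illustration; they are not read from the dataset.

```python
def to_pixel_box(rel_box, width, height):
    """Scale a relative [x, y, w, h] box in [0, 1] to pixel units."""
    x, y, w, h = rel_box
    return [x * width, y * height, w * width, h * height]


# Relative box copied from the detection printed above
rel_box = [
    0.37028125,
    0.3345305164319249,
    0.038593749999999996,
    0.16314553990610328,
]

# ASSUMPTION: the image is 640 x 426 pixels (illustrative, not read from disk)
pixel_box = to_pixel_box(rel_box, width=640, height=426)
print([round(v, 2) for v in pixel_box])
```

In a real session, the width and height would come from the image itself rather than from hard-coded values.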
Before we go further, let’s launch the FiftyOne App and use the GUI to explore the dataset visually:
[5]:
session = fo.launch_app(dataset)
Adding model predictions
Now let’s add some predictions from an object detector to the dataset.
We’ll use an off-the-shelf Faster R-CNN detection model provided by PyTorch. The following cell downloads the model and loads it:
[1]:
import torch
import torchvision
# Run the model on GPU if it is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Load a pre-trained Faster R-CNN model
# (newer torchvision releases prefer the `weights=` argument over `pretrained=True`)
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.to(device)
model.eval()
print("Model ready")
Model ready
The code below runs the model on a randomly chosen subset of 100 samples from the dataset and stores the predictions in a predictions field of the samples.
[6]:
# Choose a random subset of 100 samples to add predictions to
predictions_view = dataset.take(100, seed=51)
[7]:
from PIL import Image
from torchvision.transforms import functional as func
import fiftyone as fo
# Get class list
classes = dataset.default_classes
# Add predictions to samples
with fo.ProgressBar() as pb:
    for sample in pb(predictions_view):
        # Load image
        image = Image.open(sample.filepath)
        image = func.to_tensor(image).to(device)
        c, h, w = image.shape
        # Perform inference
        preds = model([image])[0]
        labels = preds["labels"].cpu().detach().numpy()
        scores = preds["scores"].cpu().detach().numpy()
        boxes = preds["boxes"].cpu().detach().numpy()
        # Convert detections to FiftyOne format
        detections = []
        for label, score, box in zip(labels, scores, boxes):
            # Convert to [top-left-x, top-left-y, width, height]
            # in relative coordinates in [0, 1] x [0, 1]
            x1, y1, x2, y2 = box
            rel_box = [x1 / w, y1 / h, (x2 - x1) / w, (y2 - y1) / h]
            detections.append(
                fo.Detection(
                    label=classes[label],
                    bounding_box=rel_box,
                    confidence=score,
                )
            )
        # Save predictions to dataset
        sample["predictions"] = fo.Detections(detections=detections)
        sample.save()
100% |██████████████████████| 100/100 [12.7m elapsed, 0s remaining, 0.1 samples/s]
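The conversion step in the loop above can be tried in isolation, without a model or a dataset. The sketch below uses plain Python lists and dicts in place of the model's tensors and fo.Detection objects. The function names, the truncated class list, and all input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def to_relative_box(box, width, height):
    """Convert pixel [x1, y1, x2, y2] to relative [x, y, w, h] in [0, 1]."""
    x1, y1, x2, y2 = box
    return [x1 / width, y1 / height, (x2 - x1) / width, (y2 - y1) / height]


def convert_predictions(labels, scores, boxes, classes, width, height):
    """Build plain-dict records with the same fields passed to fo.Detection."""
    records = []
    for label, score, box in zip(labels, scores, boxes):
        records.append(
            {
                # The model emits integer class indices; look up the name
                "label": classes[int(label)],
                "bounding_box": to_relative_box(box, width, height),
                "confidence": float(score),
            }
        )
    return records


# HYPOTHETICAL inputs, not real model output
classes = ["0", "person", "bicycle"]  # truncated class list for illustration
records = convert_predictions(
    labels=[1, 2],
    scores=[0.9, 0.5],
    boxes=[[64.0, 48.0, 192.0, 240.0], [0.0, 0.0, 320.0, 120.0]],
    classes=classes,
    width=640,
    height=480,
)
for record in records:
    print(record)
```

Keeping the box arithmetic in a small function like this makes it easy to check that every converted value falls in [0, 1] before anything is written to the dataset.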
Let’s load predictions_view in the App to visualize the predictions that we added:
[11]:
session.view = predictions_view