Downloading and Evaluating Open Images
Downloading Google’s Open Images dataset is now easier than ever with the FiftyOne Dataset Zoo! You can load all three splits of Open Images V7, including image-level labels, detections, segmentations, visual relationships, and point labels.
FiftyOne also natively supports Open Images-style evaluation, so you can easily evaluate your object detection models and explore the results directly in the library.
This walkthrough covers:
Computing predictions using a model from the FiftyOne Model Zoo
Performing Open Images-style evaluation in FiftyOne to evaluate a model and compute its mAP
Exploring the dataset and evaluation results
So, what’s the takeaway?
Starting a new ML project takes data and time, and the datasets in the FiftyOne Dataset Zoo can help jump-start the development process.
Open Images in particular is one of the largest publicly available datasets for object detection, classification, segmentation, and more. Additionally, with Open Images evaluation available natively in FiftyOne, you can quickly evaluate your models and compute mAP and PR curves.
While metrics like mAP are often used to compare models, the best way to improve your model’s performance isn’t to look at aggregate metrics but instead to get hands-on with your evaluation and visualize how your model performs on individual samples. All of this is made easy with FiftyOne!
If you haven’t already, install FiftyOne:
!pip install fiftyone
In this tutorial, we’ll use some TensorFlow models and PyTorch to generate predictions and embeddings, and we’ll use the UMAP method to reduce the dimensionality of embeddings, so we need to install the corresponding packages:
!pip install tensorflow torch torchvision umap-learn
This tutorial also includes some of FiftyOne’s interactive plotting capabilities.
The recommended way to work with FiftyOne’s interactive plots is in Jupyter notebooks or JupyterLab. In these environments, you can leverage the full power of plots by attaching them to the FiftyOne App and bidirectionally interacting with the plots and the App to identify interesting subsets of your data.
To use interactive plots in Jupyter notebooks, ensure that you have the ipywidgets package installed:
!pip install 'ipywidgets>=8,<9'
If you’re working in JupyterLab, refer to these instructions to get set up.
Support for interactive plots in non-notebook contexts and Google Colab is coming soon! In the meantime, you can still use FiftyOne’s plotting features in those environments, but you must manually call plot.show() to update the state of a plot to match the state of a connected session, and any callbacks that would normally be triggered in response to interacting with a plot will not be triggered.
Loading Open Images
In this section, we’ll load various subsets of Open Images from the FiftyOne Dataset Zoo and visualize them using FiftyOne.
Let’s start by downloading a small sample of 100 randomly chosen images + annotations:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "open-images-v7",
    split="validation",
    max_samples=100,
    seed=51,
    shuffle=True,
)
session = fo.launch_app(dataset.view())
Connected to FiftyOne on port 5151 at localhost. If you are not connecting to a remote session, you may need to start a new session and specify a port
When loading Open Images from the dataset zoo, there are a variety of available parameters that you can pass to load_zoo_dataset() to specify a subset of the images and/or label types to download:
label_types - a list of label types to load. The supported values for Open Images V7 are ("detections", "classifications", "points", "segmentations", "relationships"). Open Images V6 is the same except that it does not contain point labels. By default, all available label types will be loaded. Specifying label_types=[] will load only the images
classes - a list of classes of interest. If specified, only samples with at least one object, segmentation, or image-level label in the specified classes will be downloaded
attrs - a list of attributes of interest. If specified, only samples that contain at least one attribute in attrs or one class in classes will be downloaded (only applicable when label_types includes "relationships")
load_hierarchy - whether to load the class hierarchy into dataset.info["hierarchy"]
image_ids - an array of specific image IDs to download
image_ids_file - a path to a .json file containing image IDs to download
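As a quick sketch of the image_ids_file option, you could write a .json file of IDs and pass its path to the loader. The IDs below are illustrative placeholders in the Open Images "split/id" format, not guaranteed to exist:

```python
import json
import os
import tempfile

# Illustrative Open Images-style IDs; substitute real ones for an actual download
image_ids = ["validation/0001eeaf4aed83f9", "validation/000595fe6fee6369"]

ids_path = os.path.join(tempfile.mkdtemp(), "image_ids.json")
with open(ids_path, "w") as f:
    json.dump(image_ids, f)

# The file can then be passed to the loader:
# dataset = foz.load_zoo_dataset(
#     "open-images-v7",
#     split="validation",
#     image_ids_file=ids_path,
# )
```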
In addition, like all other zoo datasets, you can specify:
max_samples - the maximum number of samples to load
shuffle - whether to randomly choose which samples to load if max_samples is specified
seed - a random seed to use when shuffling
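These last three options interact: shuffling with a fixed seed yields a reproducible random subset. The sketch below illustrates the idea using Python's random module; it is a conceptual illustration, not FiftyOne's actual implementation:

```python
import random

def pick_samples(ids, max_samples, seed):
    # Deterministically shuffle a copy, then keep the first max_samples
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:max_samples]

all_ids = [f"img_{i:03d}" for i in range(10)]

subset_a = pick_samples(all_ids, max_samples=3, seed=51)
subset_b = pick_samples(all_ids, max_samples=3, seed=51)
assert subset_a == subset_b  # same seed -> identical subset
```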
Let’s use some of these parameters to download a 100-sample subset of Open Images containing segmentations and image-level labels for the classes “Burrito”, “Cheese”, and “Popcorn”.
dataset = foz.load_zoo_dataset(
    "open-images-v7",
    split="validation",
    label_types=["segmentations", "classifications"],
    classes=["Burrito", "Cheese", "Popcorn"],
    max_samples=100,
    seed=51,
    shuffle=True,
    dataset_name="open-images-food",
)
Downloading split 'validation' to 'datasets/open-images-v7/validation' if necessary
Only found 83 (<100) samples matching your requirements
Necessary images already downloaded
Existing download of split 'validation' is sufficient
Loading existing dataset 'open-images-food'. To reload from disk, either delete the existing dataset or provide a custom `dataset_name` to use
session.view = dataset.view()
session.freeze() # screenshots App for sharing
We can do the same for visual relationships. For example, we can download only samples that contain a relationship with the “Wooden” attribute.
dataset = foz.load_zoo_dataset(
    "open-images-v7",
    split="validation",
    label_types=["relationships"],
    attrs=["Wooden"],
    max_samples=100,
    seed=51,
    shuffle=True,
    dataset_name="open-images-relationships",
)
You can visualize relationships in the App by clicking on a sample to open the App’s expanded view. From there, you can hover over objects to see their attributes in a tooltip.
Alternatively, you can use the settings menu in the lower-right corner of the media player to set show_attributes to True, which makes attributes appear as persistent boxes. This can also be achieved programmatically by configuring the App:
# Launch a new App instance with a customized config
app_config = fo.AppConfig()
app_config.show_attributes = True

session = fo.launch_app(dataset, config=app_config)