Albumentations Integration#

Albumentations is a widely used open-source library for image augmentation in machine learning. It is popular in the computer vision community for its extensive collection of augmentations and its high performance.

Now, we’ve integrated Albumentations transformation pipelines directly with FiftyOne datasets, enabling you to visualize Albumentations augmentations and test their effects on your data directly within the FiftyOne App!

This integration takes the form of a FiftyOne plugin, which is easy to install and can be used entirely via a convenient graphical interface.

With the FiftyOne Albumentations plugin, you can transform any and all labels of type Detections, Keypoints, Segmentation, and Heatmap, or just the images themselves.

This integration guide will focus on the setup process and the functionality of the plugin. For a tutorial on how to curate your augmentations, check out the Data Augmentation Tutorial.

Overview#

Before we get started, let’s take a look at the main features of the FiftyOne Albumentations integration.

Supported transformations#

Albumentations supports 80+ transformations, spanning pixel-level transformations, geometric transformations, and more.

The FiftyOne Albumentations plugin currently supports all but the following transformations:

Functionality#

The FiftyOne Albumentations plugin provides the following functionality:

Apply Albumentations transformations to your dataset, your current view, or selected samples
Visualize the effects of these transformations directly within the FiftyOne App
View samples generated by the last applied transformation
Save augmented samples to the dataset
Get info about the last applied transformation
Save transformation pipelines to the dataset for reproducibility

Setup#

To get started, first make sure you have FiftyOne and Albumentations installed:

$ pip install -U fiftyone albumentations

Next, install the FiftyOne Albumentations plugin:

$ fiftyone plugins download https://github.com/jacobmarks/fiftyone-albumentations-plugin

Note

If you have the FiftyOne Plugin Utils plugin installed, you can also install the Albumentations plugin via the install_plugin operator, selecting the Albumentations plugin from the community dropdown menu.

You will also need to load (and download if necessary) a dataset to apply the augmentations to. For this guide, we’ll use the quickstart dataset:

import fiftyone as fo
import fiftyone.zoo as foz

# only take 5 samples for a quick demonstration
dataset = foz.load_zoo_dataset("quickstart", max_samples=5)

# only keep the ground truth labels
dataset.select_fields("ground_truth").keep_fields()

session = fo.launch_app(dataset)

Note

The quickstart dataset only contains Detections labels. If you want to test Albumentations transformations on other label types, here are some quick examples to get you started, using FiftyOne’s Hugging Face Transformers and Ultralytics integrations:

$ pip install -U transformers ultralytics

import fiftyone as fo
import fiftyone.zoo as foz

from ultralytics import YOLO

# assumes `dataset` is the quickstart dataset loaded above

# Keypoints
model = YOLO("yolov8l-pose.pt")
dataset.apply_model(model, label_field="keypoints")

# Instance Segmentation
model = YOLO("yolov8l-seg.pt")
dataset.apply_model(model, label_field="instances")

# Semantic Segmentation
model = foz.load_zoo_model(
    "segmentation-transformer-torch",
    name_or_path="Intel/dpt-large-ade",
)
dataset.apply_model(model, label_field="mask")

# Heatmap
model = foz.load_zoo_model(
    "depth-estimation-transformer-torch",
    name_or_path="LiheYoung/depth-anything-small-hf",
)
dataset.apply_model(model, label_field="depth_map")

Apply transformations#

To apply Albumentations transformations to your dataset, you can use the augment_with_albumentations operator. Press the backtick key (‘`’) to open the operator modal, and select the augment_with_albumentations operator from the dropdown menu.

You can then configure the transformations to apply:

Number of augmentations per sample: The number of augmented samples to generate for each input sample. The default is 1, which is sufficient for deterministic transformations, but for probabilistic transformations, you may want to generate multiple samples to see the range of possible outputs.
Number of transforms: The number of transformations to compose into the pipeline to be applied to each sample. The default is 1, but you can set this as high as you’d like — the more transformations, the more complex the augmentations will be. You will be able to configure each transform separately.
Target view: The view to which the transformations will be applied. The default is dataset, but you can also apply the transformations to the current view or to currently selected samples within the app.
Execution mode: If you set delegated=False, the operation will be executed immediately. If you set delegated=True, the operation will be queued as a job, which you can then run in the background from your terminal with:

$ fiftyone delegated launch
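
As a rough illustration of why probabilistic transformations benefit from more than one augmentation per sample, consider a transformation applied with probability p. If each augmentation is drawn independently (an assumption of this sketch, not something the plugin documents), the chance that at least one of k augmented copies actually shows the transformation is 1 - (1 - p)^k. The helper names below are hypothetical and are not part of the plugin or of Albumentations:

```python
def prob_at_least_one_applied(p: float, k: int) -> float:
    """Chance that a transform with application probability `p` fires in at
    least one of `k` independent augmentations of the same sample."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - p) ** k


def total_augmented_samples(num_samples: int, num_augs_per_sample: int) -> int:
    """Number of new samples generated for a target view of `num_samples`,
    given the "Number of augmentations per sample" setting."""
    return num_samples * num_augs_per_sample


if __name__ == "__main__":
    # How quickly does the chance of seeing a p=0.5 transform grow with k?
    for k in (1, 2, 4):
        print(k, prob_at_least_one_applied(0.5, k))

    # 5 input samples with 4 augmentations each
    print(total_augmented_samples(5, 4))
```

With p=0.5, a single augmentation shows the transformation only half the time, while four augmentations per sample raise that chance to roughly 94%.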

For each transformation, you can select either a “primitive” transformation from the Albumentations library, or a “saved” transformation pipeline that you have previously saved to the dataset. These saved pipelines can consist of one or more transformations.

When you apply a primitive transformation, you can configure the parameters of the transformation directly within the app. The available parameters, their default values, types, and docstrings are all integrated directly from the Albumentations library.

When you apply a saved pipeline, there will not be any parameters to configure.

Visualize transformations#

Once you’ve applied the transformations, you can inspect their effects directly within the FiftyOne App. All augmented samples are added to the dataset and tagged as augmented, so you can easily filter for just augmented or non-augmented samples in the App.

You can also filter for augmented samples programmatically with the match_tags() method:

# get just the augmented samples
augmented_view = dataset.match_tags("augmented")

# get just the non-augmented samples
non_augmented_view = dataset.match_tags("augmented", bool=False)

However, matching on these tags returns all samples that have been generated by any augmentation, not just the samples generated by the last applied transformation, because augmented samples can be saved to the dataset (see Save augmentations). To get just the samples generated by the last applied transformation, you can use the view_last_albumentations_run operator:

Note

Every sample added to the dataset by the FiftyOne Albumentations plugin has a "transform" field, which records not only the pipeline that was applied but also the specific parameters used for that application of the pipeline. For example, if you had a HorizontalFlip transformation with an application probability of p=0.5, the contents of the "transform" field tell you whether or not the flip was actually applied to the sample!
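
To make this concrete, here is a minimal, library-free sketch of a probabilistic horizontal flip that records whether it fired. It assumes FiftyOne's relative [top-left-x, top-left-y, width, height] bounding box format. The function names and the layout of the returned record are hypothetical and do not reflect the actual schema of the plugin's "transform" field:

```python
import random
from typing import Dict, List, Optional, Tuple


def hflip_box(box: List[float]) -> List[float]:
    """Mirror a relative [x, y, w, h] box about the vertical center line."""
    x, y, w, h = box
    # The new left edge is the mirror image of the old right edge (x + w)
    return [1.0 - x - w, y, w, h]


def maybe_hflip(
    box: List[float],
    p: float = 0.5,
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], Dict[str, object]]:
    """Flip `box` with probability `p` and report what happened.

    The returned record is a hypothetical stand-in for per-sample
    transform metadata; it is NOT the plugin's real schema.
    """
    if rng is None:
        rng = random.Random()

    applied = rng.random() < p
    new_box = hflip_box(box) if applied else list(box)
    record = {"name": "HorizontalFlip", "p": p, "applied": applied}
    return new_box, record


if __name__ == "__main__":
    box = [0.1, 0.2, 0.3, 0.4]
    new_box, record = maybe_hflip(box, p=0.5, rng=random.Random(51))
    print(new_box, record)
```

Without a record like this, a flipped and an unflipped output of the same pipeline would be indistinguishable from the pipeline definition alone, which is why storing the per-application outcome alongside each sample is useful.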

Save augmentations#

By default, all augmentations are temporary, as the FiftyOne Albumentations plugin is primarily designed for rapid prototyping and experimentation. This means that when you generate a new batch of augmented samples, the previous batch of augmented samples will be removed from the dataset, and the image files will be deleted from disk.

However, if you want to save the augmented samples to the dataset, you can use the save_albumentations_augmentations operator, which will save the augmented samples to the dataset while keeping the augmented tag on the samples.

Get last transformation info#

When you apply a transformation pipeline to samples in your dataset using the FiftyOne Albumentations plugin, the details of that pipeline are captured and stored using FiftyOne’s custom runs. This means that you can easily access the information about the last applied transformation.

In the FiftyOne App, you can use the get_last_albumentations_run_info operator to display a formatted summary of the relevant information:

Note

You can also access this information programmatically by getting info about the custom run that the information is stored in. For the Albumentations plugin, this info is stored via the key '_last_albumentations_run':

last_run_info = dataset.get_run_info("_last_albumentations_run")
print(last_run_info)

Save transformations#

If you are satisfied with the transformation pipeline you have created, you can save the entire composition of transformations to the dataset, hyperparameters and all. This means that after your rapid prototyping phase, you can easily move to a more reproducible workflow, and you can share your transformations or port them to other datasets.

To save a transformation pipeline, you can use the save_albumentations_transform operator:

After doing so, you will be able to view the information about this saved transformation pipeline using the get_albumentations_run_info operator:

Additionally, you will have access to this saved transformation pipeline under the “saved” tab for each transformation in the augment_with_albumentations operator modal.