Label Studio Integration
Label Studio is a popular open-source data labeling tool with a friendly UI. The integration between FiftyOne and Label Studio allows you to easily upload your data directly from FiftyOne to Label Studio for labeling.
You can get started with Label Studio through a simple pip install to get a local server up and running. FiftyOne provides simple setup instructions that you can use to specify the necessary account credentials and server endpoint to use.
Note
Did you know? You can request, manage, and import annotations from within the FiftyOne App by installing the @voxel51/annotation plugin!
FiftyOne provides an API to create projects, upload data, define label schemas, and download annotations using Label Studio, all programmatically in Python. All of the following label types are supported for image datasets:
Basic recipe
The basic workflow to use Label Studio to add or edit labels on your FiftyOne datasets is as follows:
1. Load a labeled or unlabeled dataset into FiftyOne
2. Explore the dataset using the App or dataset views to locate either unlabeled samples that you wish to annotate or labeled samples whose annotations you want to edit
3. Use the annotate() method on your dataset or view to upload the samples and optionally their existing labels to Label Studio by setting the parameter backend="labelstudio"
4. In Label Studio, perform the necessary annotation work
5. Back in FiftyOne, load your dataset and use the load_annotations() method to merge the annotations back into your FiftyOne dataset
6. If desired, delete the Label Studio tasks and the record of the annotation run from your FiftyOne dataset
The example below demonstrates this workflow.
Note
You must start by installing and setting up Label Studio as described in the Setup section below.
Note that you can also store your credentials to avoid entering them manually each time you interact with Label Studio.
First, we create the annotation tasks in Label Studio:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

# Step 1: Load your data into FiftyOne

dataset = foz.load_zoo_dataset(
    "quickstart", dataset_name="ls-annotation-example"
)
dataset.persistent = True

dataset.evaluate_detections(
    "predictions", gt_field="ground_truth", eval_key="eval"
)

# Step 2: Locate a subset of your data requiring annotation

# Create a view that contains only high confidence false positive model
# predictions, with samples containing the most false positives first
most_fp_view = (
    dataset
    .filter_labels("predictions", (F("confidence") > 0.8) & (F("eval") == "fp"))
    .sort_by(F("predictions.detections").length(), reverse=True)
)

# Retrieve the sample with the most high confidence false positives
sample_id = most_fp_view.first().id
view = dataset.select(sample_id)

# Step 3: Send samples to Label Studio

# A unique identifier for this run
anno_key = "labelstudio_basic_recipe"

label_schema = {
    "new_ground_truth": {
        "type": "detections",
        "classes": dataset.distinct("ground_truth.detections.label"),
    },
}

view.annotate(
    anno_key,
    backend="labelstudio",
    label_schema=label_schema,
    launch_editor=True,
)
print(dataset.get_annotation_info(anno_key))

# Step 4: Perform annotation in Label Studio and save the tasks
Then, once the annotation work is complete, we merge the annotations back into FiftyOne:
import fiftyone as fo

anno_key = "labelstudio_basic_recipe"

# Step 5: Merge annotations back into FiftyOne dataset

dataset = fo.load_dataset("ls-annotation-example")
dataset.load_annotations(anno_key)

# Load the view that was annotated in the App
view = dataset.load_annotation_view(anno_key)
session = fo.launch_app(view=view)

# Step 6: Cleanup

# Delete tasks from Label Studio
results = dataset.load_annotation_results(anno_key)
results.cleanup()

# Delete run record (not the labels) from FiftyOne
dataset.delete_annotation_run(anno_key)
Setup
The easiest way to get started with Label Studio is to install it locally and create an account.
pip install label-studio
# Launch it!
label-studio
Installing the Label Studio client
In order to use the Label Studio backend, you must install the Label Studio Python SDK:
pip install label-studio-sdk
Using the Label Studio backend
By default, calling annotate() will use the CVAT backend.
To use the Label Studio backend, simply set the optional backend parameter of annotate() to "labelstudio":
view.annotate(anno_key, backend="labelstudio", ...)
Alternatively, you can permanently configure FiftyOne to use the Label Studio backend by setting the FIFTYONE_ANNOTATION_DEFAULT_BACKEND environment variable:
export FIFTYONE_ANNOTATION_DEFAULT_BACKEND=labelstudio
or by setting the default_backend parameter of your annotation config located at ~/.fiftyone/annotation_config.json:
{
    "default_backend": "labelstudio"
}
Authentication
In order to connect to a Label Studio server, you must provide your API key, which can be done in a variety of ways.
Environment variables (recommended)
The recommended way to configure your Label Studio API key is to store it in the FIFTYONE_LABELSTUDIO_API_KEY environment variable. This is automatically accessed by FiftyOne whenever a connection to Label Studio is made.
export FIFTYONE_LABELSTUDIO_API_KEY=...
FiftyOne annotation config
You can also store your credentials in your annotation config located at ~/.fiftyone/annotation_config.json:
{
    "backends": {
        "labelstudio": {
            "api_key": ...
        }
    }
}
Note that this file will not exist until you create it.
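Since the config file does not exist by default, it can be convenient to create or update it from Python. The following is a minimal sketch using only the standard library. The helper name update_annotation_config is hypothetical (it is not part of FiftyOne), and the sketch assumes the nested backends/labelstudio layout shown above:

```python
import json
import os


def update_annotation_config(config_path, **labelstudio_settings):
    """Create or update the `labelstudio` entry of an annotation config file.

    Hypothetical helper, not part of FiftyOne. Settings for other backends
    and other top-level keys are preserved.
    """
    config_path = os.path.expanduser(config_path)

    # Start from the existing config, if there is one
    config = {}
    if os.path.isfile(config_path):
        with open(config_path, "r") as f:
            config = json.load(f)

    # Merge the new settings into backends -> labelstudio
    backend = config.setdefault("backends", {}).setdefault("labelstudio", {})
    backend.update(labelstudio_settings)

    # Create the parent directory if needed, then write the file
    parent = os.path.dirname(config_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)

    return config
```

For example, update_annotation_config("~/.fiftyone/annotation_config.json", api_key="<your-api-key>") would write the structure shown above. Keep in mind that this stores the key in plain text, which is one reason the environment variable is the recommended option.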
Keyword arguments
You can manually provide your API key as a keyword argument each time you call methods like annotate() and load_annotations() that require connections to Label Studio:
view.annotate(
    anno_key,
    backend="labelstudio",
    label_field="ground_truth",
    api_key=...,
)
Command line prompt
If you have not stored your API key via another method, you will be prompted to enter it interactively in your shell each time you call a method that requires a connection to Label Studio:
view.annotate(
    anno_key,
    backend="labelstudio",
    label_field="ground_truth",
    launch_editor=True,
)
Please enter your API key.
You can avoid this in the future by setting your `FIFTYONE_LABELSTUDIO_API_KEY` environment variable.
API key: ...
Server URL
You can configure the URL to the desired Label Studio server in any of the following ways:
Set the FIFTYONE_LABELSTUDIO_URL environment variable:
export FIFTYONE_LABELSTUDIO_URL=http://localhost:8080
Store the url of your server in your annotation config at ~/.fiftyone/annotation_config.json:
{
    "backends": {
        "labelstudio": {
            "url": "http://localhost:8080"
        }
    }
}
Pass the url parameter manually each time you call annotate():
view.annotate(
    anno_key,
    backend="labelstudio",
    label_field="ground_truth",
    url="http://localhost:8080",
    api_key=...,
)
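The environment variable options for the API key and server URL can also be set from Python rather than from your shell. This is a minimal sketch under the assumption that the variables are set before fiftyone is imported, so that they are visible when the annotation config is loaded. The helper set_labelstudio_env is hypothetical and not part of FiftyOne:

```python
import os


def set_labelstudio_env(api_key=None, url=None, environ=None):
    """Set the Label Studio environment variables that FiftyOne reads.

    Hypothetical helper. Only the values that are provided are set, and
    `environ` defaults to the current process environment.
    """
    if environ is None:
        environ = os.environ

    if api_key is not None:
        environ["FIFTYONE_LABELSTUDIO_API_KEY"] = api_key
    if url is not None:
        environ["FIFTYONE_LABELSTUDIO_URL"] = url

    return environ


# Example: point FiftyOne at a local server before importing fiftyone
set_labelstudio_env(url="http://localhost:8080")
```

Variables set this way only affect the current Python process and its children, so the shell export remains the better choice for a persistent setup.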
Configuring local file storage
If you are using FiftyOne on the same machine that is hosting Label Studio, then you can make use of the local storage feature of Label Studio to avoid needing to copy your media.
To enable this, you just need to configure the LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT and LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED environment variables as defined in the Label Studio documentation.
Then, when you request annotations, if all of the samples in your Dataset or DatasetView reside in a subdirectory of the LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT, the media will not be copied over and only the filepaths for your media will be used to create the Label Studio project.
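Before requesting annotations, you may want to confirm that your media actually lives under the document root. The sketch below is one way to check this with the standard library. It is an illustration of the condition described above, not the check that FiftyOne itself performs, and the function name all_under_document_root is hypothetical:

```python
import os


def all_under_document_root(filepaths, document_root=None):
    """Return True if every filepath is inside `document_root`.

    If `document_root` is None, it is read from the
    LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT environment variable.
    """
    if document_root is None:
        document_root = os.environ.get("LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT")
    if not document_root:
        return False

    root = os.path.realpath(document_root)
    for filepath in filepaths:
        path = os.path.realpath(filepath)
        try:
            # commonpath() equals the root only when `path` is the root
            # itself or is located somewhere beneath it
            if os.path.commonpath([root, path]) != root:
                return False
        except ValueError:
            # Raised, for example, for paths on different Windows drives
            return False

    return True
```

With a FiftyOne view, the filepaths could be obtained via view.values("filepath") and passed to this function.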
Requesting annotations
Use the annotate() method to send the samples and optionally existing labels in a Dataset or DatasetView to Label Studio for annotation.
The basic syntax is:
anno_key = "..."
view.annotate(anno_key, backend="labelstudio", ...)
The anno_key argument defines a unique identifier for the annotation run, and you will provide it to methods like load_annotations(), get_annotation_info(), load_annotation_results(), rename_annotation_run(), and delete_annotation_run() to manage the run in the future.
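To illustrate how the annotation key ties these calls together, here is a small sketch of a wrapper that starts a run and immediately reads back its configuration using the same key. The function name start_annotation_run is hypothetical, and the view argument is assumed to be a FiftyOne Dataset or DatasetView:

```python
def start_annotation_run(view, anno_key, **kwargs):
    """Upload `view` to Label Studio and return the run's info.

    `view` is expected to be a FiftyOne Dataset or DatasetView. Extra
    keyword arguments are forwarded to `annotate()` unchanged.
    """
    view.annotate(anno_key, backend="labelstudio", **kwargs)

    # The same key identifies the run in all later calls
    return view.get_annotation_info(anno_key)
```

For instance, start_annotation_run(view, "my_run", label_field="ground_truth") would upload the view and return the stored run information, where "my_run" is a placeholder key.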
Note
Calling annotate() will upload the source media files to the Label Studio server.
In addition, annotate() provides various parameters that you can use to customize the annotation tasks that you wish to be performed.
The following parameters are supported by all annotation backends:
- backend (None): the annotation backend to use. Use "labelstudio" for the Label Studio backend. The supported values are fiftyone.annotation_config.backends.keys() and the default is fiftyone.annotation_config.default_backend
- media_field ("filepath"): the sample field containing the path to the source media to upload
- launch_editor (False): whether to launch the annotation backend's editor after uploading the samples
The following parameters allow you to configure the labeling schema to use for your annotation tasks. See the Label schema section below for more details:
- label_schema (None): a dictionary defining the label schema to use. If this argument is provided, it takes precedence over label_field and label_type
- label_field (None): a string indicating a new or existing label field to annotate
- label_type (None): a string indicating the type of labels to annotate. The possible label types are:
    - "classification": a single classification stored in Classification fields
    - "classifications": multilabel classifications stored in Classifications fields
    - "detections": object detections stored in Detections fields
    - "instances": instance segmentations stored in Detections fields with their mask attributes populated
    - "polylines": polylines stored in Polylines fields with their filled attributes set to False
    - "polygons": polygons stored in Polylines fields with their filled attributes set to True
    - "keypoints": keypoints stored in Keypoints fields
    - "segmentation": semantic segmentations stored in Segmentation fields
  All new label fields must have their type specified via this argument or in label_schema
- classes (None): a list of strings indicating the class options for label_field or all fields in label_schema without classes specified. All new label fields must have a class list provided via one of the supported methods. For existing label fields, if classes are not provided by this argument nor label_schema, they are parsed from Dataset.classes or Dataset.default_classes
- mask_targets (None): a dict mapping pixel values to semantic label strings. Only applicable when annotating semantic segmentations
In addition, the following Label Studio-specific parameters from LabelStudioBackendConfig can also be provided:
- project_name (None): a name for the Label Studio project that will be created. The default is "FiftyOne_<dataset_name>"
Label schema
The label_schema, label_field, label_type, classes, and mask_targets parameters to annotate() allow you to define the annotation schema that you wish to be used.
The label schema may define new label field(s) that you wish to populate, and it may also include existing label field(s), in which case you can add, delete, or edit the existing labels on your FiftyOne dataset.
The label_schema argument is the most flexible way to define how to construct tasks in Label Studio. In its most verbose form, it is a dictionary that defines the label type, annotation type, and possible classes for each label field:
anno_key = "..."

label_schema = {
    "new_field": {
        "type": "detections",
        "classes": ["class1", "class2"],
    },
    "existing_field": {
        "classes": ["class3", "class4"],
    },
}

dataset.annotate(anno_key, backend="labelstudio", label_schema=label_schema)
Alternatively, if you are only editing or creating a single label field, you can use the label_field, label_type, classes, and mask_targets parameters to specify the components of the label schema individually:
anno_key = "..."

# Note: no trailing comma here, which would turn the value into a tuple
label_field = "new_field"
label_type = "detections"
classes = ["class1", "class2"]

dataset.annotate(
    anno_key,
    backend="labelstudio",
    label_field=label_field,
    label_type=label_type,
    classes=classes,
)
When you are annotating existing label fields, you can omit some of these parameters from annotate(), as FiftyOne can infer the appropriate values to use:
- label_type: if omitted, the Label type of the field will be used to infer the appropriate value for this parameter
- classes: if omitted for a non-semantic segmentation field, the class lists from the classes or default_classes properties of your dataset will be used, if available. Otherwise, the observed labels on your dataset will be used to construct a classes list
- mask_targets: if omitted for a semantic segmentation field, the mask targets from the mask_targets or default_mask_targets properties of your dataset will be used, if available
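To make the relationship between the two styles concrete, the sketch below expresses the single-field parameters as an equivalent label_schema dictionary. It is illustrative only and is not FiftyOne's internal implementation. The helper name build_label_schema is hypothetical, and the use of a "mask_targets" key inside the schema is an assumption that mirrors the parameter of the same name:

```python
def build_label_schema(label_field, label_type=None, classes=None, mask_targets=None):
    """Express single-field parameters as a `label_schema` dict.

    Illustrative only. Values that are omitted are left out of the schema,
    so that they can be inferred for existing fields.
    """
    field_schema = {}
    if label_type is not None:
        field_schema["type"] = label_type
    if classes is not None:
        field_schema["classes"] = list(classes)
    if mask_targets is not None:
        # Assumed key name, mirroring the `mask_targets` parameter
        field_schema["mask_targets"] = dict(mask_targets)

    return {label_field: field_schema}


label_schema = build_label_schema(
    "new_field", label_type="detections", classes=["class1", "class2"]
)
```

The resulting dictionary has the same shape as the "new_field" entry in the verbose example, and calling the helper with only a field name yields an empty entry that relies entirely on inference.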
Label attributes
Warning
The Label Studio integration does not yet support annotating label attributes.
Loading annotations
After your annotation tasks in the annotation backend are complete, you can use the load_annotations() method to download them and merge them back into your FiftyOne dataset.
view.load_annotations(anno_key)
The anno_key parameter is the unique identifier for the annotation run that you provided when calling annotate(). You can use list_annotation_runs() to see the available keys on a dataset.
Note
By default, calling load_annotations() will not delete any information for the run from the annotation backend.
However, you can pass cleanup=True to delete all information associated with the run from the backend after the annotations are downloaded.
You can use the optional dest_field parameter to override the task's label schema and instead load annotations into different field name(s) of your dataset. This can be useful, for example, when editing existing annotations, if you would like to do a before/after comparison of the edits that you import. If the annotation run involves multiple fields, dest_field should be a dictionary mapping label schema field names to destination field names.
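As a sketch of the before/after use case, the helper below builds a dest_field dictionary that sends each annotated field to a new field with a suffix, leaving the originals untouched. The function name load_edits_for_comparison and the "_edited" suffix are hypothetical choices, and the dataset argument is assumed to be a FiftyOne dataset with an existing annotation run:

```python
def load_edits_for_comparison(dataset, anno_key, fields, suffix="_edited", cleanup=False):
    """Load annotations into new fields named `<field><suffix>`.

    The original fields are left untouched so that the edits can be
    compared against them. Returns the `dest_field` mapping that was used.
    """
    # Map each label schema field name to its destination field name
    dest_field = {field: field + suffix for field in fields}

    dataset.load_annotations(anno_key, dest_field=dest_field, cleanup=cleanup)

    return dest_field
```

For a run that edited ground_truth, this would load the results into ground_truth_edited, and passing cleanup=True would additionally delete the run's information from Label Studio once the download completes.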
Managing annotation runs
FiftyOne provides a variety of methods that you can use to manage in-progress or completed annotation runs.
For example, you can call list_annotation_runs() to see the available annotation keys on a dataset:
dataset.list_annotation_runs()
Or, you can use get_annotation_info() to retrieve information about the configuration of an annotation run:
info = dataset.get_annotation_info(anno_key)
print(info)
Use load_annotation_results() to load the AnnotationResults instance for an annotation run.
All results objects provide a cleanup() method that you can use to delete all information associated with a run from the annotation backend.
results = dataset.load_annotation_results(anno_key)
results.cleanup()
In addition, the AnnotationResults subclasses for each backend may provide additional utilities such as support for programmatically monitoring the status of the annotation tasks in the run.
You can use rename_annotation_run() to rename the annotation key associated with an existing annotation run:
dataset.rename_annotation_run(anno_key, new_anno_key)
Finally, you can use delete_annotation_run() to delete the record of an annotation run from your FiftyOne dataset:
dataset.delete_annotation_run(anno_key)
Note
Calling delete_annotation_run() only deletes the record of the annotation run from your FiftyOne dataset; it will not delete any annotations loaded onto your dataset via load_annotations(), nor will it delete any associated information from the annotation backend.
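Because the run record and the backend data are deleted separately, a full cleanup needs both calls. The sketch below combines the methods described in this section into one helper. The name purge_annotation_runs is hypothetical, and the dataset argument is assumed to be a FiftyOne dataset. Note that list_annotation_runs() returns the runs for every backend, so pass explicit keys if you only want to remove particular runs:

```python
def purge_annotation_runs(dataset, anno_keys=None):
    """Delete backend data and run records for the given annotation runs.

    If `anno_keys` is None, all runs on the dataset are purged. Labels
    that were already loaded onto the dataset are not affected.
    """
    if anno_keys is None:
        anno_keys = dataset.list_annotation_runs()

    purged = []
    for anno_key in list(anno_keys):
        # Delete the run's information from the annotation backend
        results = dataset.load_annotation_results(anno_key)
        results.cleanup()

        # Delete the record of the run from the FiftyOne dataset
        dataset.delete_annotation_run(anno_key)
        purged.append(anno_key)

    return purged
```

The backend cleanup is performed first so that the run's results are still available when cleanup() is called.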
Annotating videos
Warning
The Label Studio integration does not currently support annotating videos.
Acknowledgements
Note
Special thanks to Rustem Galiullin, Ganesh Tata, and Emil Zakirov for building this integration!