Using Sample Parsers#
This page describes how to use the SampleParser
interface to add samples to
your FiftyOne dataset from a stream of in-memory data.
The SampleParser
interface provides native support for loading samples in a
variety of common formats, and it can be easily
extended to import datasets in custom formats,
allowing you to automate the dataset loading process.
Warning
The SampleParser
interface is only recommended for specific use cases.
In most cases, you’ll likely prefer adding samples manually or using dataset importers to load data into FiftyOne.
Adding samples to datasets#
Basic recipe#
The basic recipe for using the SampleParser
interface to add samples to a
Dataset
is to create a parser of the appropriate type and then pass the
parser along with an iterable of samples to the appropriate Dataset
method.
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of samples and an UnlabeledImageSampleParser to parse them
samples = ...
sample_parser = foud.ImageSampleParser()  # for example

# Add the image samples to the dataset
dataset.add_images(samples, sample_parser)
```
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of samples and a LabeledImageSampleParser to parse them
samples = ...
sample_parser = foud.ImageClassificationSampleParser()  # for example

# Add the labeled image samples to the dataset
dataset.add_labeled_images(samples, sample_parser)
```
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of samples and an UnlabeledVideoSampleParser to parse them
samples = ...
sample_parser = foud.VideoSampleParser()  # for example

# Add the video samples to the dataset
dataset.add_videos(samples, sample_parser)
```
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of samples and a LabeledVideoSampleParser to parse them
samples = ...
sample_parser = foud.FiftyOneVideoLabelsSampleParser()  # for example

# Add the labeled video samples to the dataset
dataset.add_labeled_videos(samples, sample_parser)
```
Note
A typical use case is that samples in the above recipe is a torch.utils.data.Dataset or an iterable generated by tf.data.Dataset.as_numpy_iterator().
Adding unlabeled images#
FiftyOne provides a few convenient ways to add unlabeled images to FiftyOne datasets.
Adding a directory of images#
Use Dataset.add_images_dir()
to add a directory of images to a dataset:
```python
import fiftyone as fo

dataset = fo.Dataset()

# A directory of images to add
images_dir = "/path/to/images"

# Add images to the dataset
dataset.add_images_dir(images_dir)
```
Adding a glob pattern of images#
Use Dataset.add_images_patt()
to add a glob pattern of images to a dataset:
```python
import fiftyone as fo

dataset = fo.Dataset()

# A glob pattern of images to add
images_patt = "/path/to/images/*.jpg"

# Add images to the dataset
dataset.add_images_patt(images_patt)
```
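Before adding a pattern, it can help to preview which files it matches. The sketch below uses the standard library's `glob` module on a throwaway directory. The filenames and the `preview_glob()` helper are hypothetical, and FiftyOne's own pattern expansion is assumed, not guaranteed, to behave the same way.

```python
import glob
import os
import tempfile


def preview_glob(images_patt):
    """Returns the sorted list of paths that match the glob pattern."""
    return sorted(glob.glob(images_patt))


# Demonstrate on a temporary directory with hypothetical filenames
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("a.jpg", "b.jpg", "notes.txt"):
        open(os.path.join(tmp_dir, name), "w").close()

    matches = preview_glob(os.path.join(tmp_dir, "*.jpg"))
    matched_names = [os.path.basename(p) for p in matches]

print(matched_names)
```

Only the two `.jpg` files match; the text file is ignored because it does not fit the pattern.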
Adding images using a SampleParser#
Use Dataset.add_images()
to add an iterable of unlabeled images that can be parsed via a specified
UnlabeledImageSampleParser
to a dataset.
Example
FiftyOne provides an ImageSampleParser that handles samples that contain either an image that can be converted to numpy format via np.asarray() or the path to an image on disk.
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of images or image paths and the UnlabeledImageSampleParser
# to use to parse them
samples = ...
sample_parser = foud.ImageSampleParser()

# Add images to the dataset
dataset.add_images(samples, sample_parser)
```
Adding labeled images#
Use Dataset.add_labeled_images()
to add an iterable of samples that can be parsed via a specified
LabeledImageSampleParser
to a dataset.
Example
FiftyOne provides an ImageClassificationSampleParser that handles samples that contain (image_or_path, target) tuples, where:
- image_or_path is either an image that can be converted to numpy format via np.asarray() or the path to an image on disk
- target is either a class ID or a label string
The snippet below adds an iterable of image classification data in the above format to a dataset:
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of `(image_or_path, target)` tuples and the
# LabeledImageSampleParser to use to parse them
samples = ...
sample_parser = foud.ImageClassificationSampleParser()

# Add labeled images to the dataset
dataset.add_labeled_images(samples, sample_parser)
```
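Any Python iterable can serve as `samples`. As a minimal sketch, the generator below yields `(image_path, target)` tuples from an in-memory mapping. The `iter_classification_samples()` helper, the paths, and the labels are hypothetical; they only illustrate the shape that each sample is expected to have.

```python
def iter_classification_samples(labels_by_path):
    """Yields one (image_path, target) tuple per image, ordered by path."""
    for image_path, target in sorted(labels_by_path.items()):
        yield image_path, target


# Hypothetical paths and label strings, for illustration only
labels_by_path = {
    "/path/to/images/002.jpg": "dog",
    "/path/to/images/001.jpg": "cat",
}

samples = list(iter_classification_samples(labels_by_path))
print(samples)
```

A generator like this could be passed directly as `samples`; it is materialized into a list here only so that the result can be printed.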
Adding unlabeled videos#
FiftyOne provides a few convenient ways to add unlabeled videos to FiftyOne datasets.
Adding a directory of videos#
Use Dataset.add_videos_dir()
to add a directory of videos to a dataset:
```python
import fiftyone as fo

dataset = fo.Dataset()

# A directory of videos to add
videos_dir = "/path/to/videos"

# Add videos to the dataset
dataset.add_videos_dir(videos_dir)
```
Adding a glob pattern of videos#
Use Dataset.add_videos_patt()
to add a glob pattern of videos to a dataset:
```python
import fiftyone as fo

dataset = fo.Dataset()

# A glob pattern of videos to add
videos_patt = "/path/to/videos/*.mp4"

# Add videos to the dataset
dataset.add_videos_patt(videos_patt)
```
Adding videos using a SampleParser#
Use Dataset.add_videos()
to add an iterable of unlabeled videos that can be parsed via a specified
UnlabeledVideoSampleParser
to a dataset.
Example
FiftyOne provides a
VideoSampleParser
that handles samples that directly contain the path to the video on disk.
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of video paths and the UnlabeledVideoSampleParser to use to
# parse them
samples = ...
sample_parser = foud.VideoSampleParser()

# Add videos to the dataset
dataset.add_videos(samples, sample_parser)
```
Adding labeled videos#
Use Dataset.add_labeled_videos()
to add an iterable of samples that can be parsed via a specified
LabeledVideoSampleParser
to a dataset.
Example
FiftyOne provides a VideoLabelsSampleParser that handles samples that contain (video_path, video_labels_or_path) tuples, where:
- video_path is the path to a video on disk
- video_labels_or_path is an eta.core.video.VideoLabels instance, a serialized dict representation of one, or the path to one on disk
The snippet below adds an iterable of labeled video samples in the above format to a dataset:
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of `(video_path, video_labels_or_path)` tuples and the
# LabeledVideoSampleParser to use to parse them
samples = ...
sample_parser = foud.VideoLabelsSampleParser()

# Add labeled videos to the dataset
dataset.add_labeled_videos(samples, sample_parser)
```
Ingesting samples into datasets#
Creating FiftyOne datasets typically does not create copies of the source media,
since Sample
instances store the filepath
to the media, not the media itself.
However, in certain circumstances, such as loading data from binary sources like TFRecords or creating a FiftyOne dataset from unorganized and/or temporary files on disk, it can be desirable to ingest the raw media for each sample into a common backing location.
FiftyOne provides support for ingesting samples and their underlying source media in common formats, and this support can be extended to ingest datasets in custom formats.
Basic recipe#
The basic recipe for ingesting samples and their source media into a Dataset is to create a SampleParser appropriate for the type of sample that you’re loading and then pass the parser along with an iterable of samples to the appropriate Dataset method.
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# The iterable of samples and the UnlabeledImageSampleParser to use
# to parse them
samples = ...
sample_parser = foud.ImageSampleParser()  # for example

# A directory in which the images will be written; if `None`, a default directory
# based on the dataset's `name` will be used
dataset_dir = ...

# Ingest the image samples into the dataset
# The source images are copied into `dataset_dir`
dataset.ingest_images(samples, sample_parser, dataset_dir=dataset_dir)
```
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# The iterable of samples and the LabeledImageSampleParser to use
# to parse them
samples = ...
sample_parser = foud.ImageClassificationSampleParser()  # for example

# A directory in which the images will be written; if `None`, a default directory
# based on the dataset's `name` will be used
dataset_dir = ...

# Ingest the labeled image samples into the dataset
dataset.ingest_labeled_images(samples, sample_parser, dataset_dir=dataset_dir)
```
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# The iterable of samples and the UnlabeledVideoSampleParser to use
# to parse them
samples = ...
sample_parser = foud.VideoSampleParser()  # for example

# A directory in which the videos will be written; if `None`, a default directory
# based on the dataset's `name` will be used
dataset_dir = ...

# Ingest the video samples into the dataset
# The source videos are copied into `dataset_dir`
dataset.ingest_videos(samples, sample_parser, dataset_dir=dataset_dir)
```
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# The iterable of samples and the LabeledVideoSampleParser to use
# to parse them
samples = ...
sample_parser = foud.VideoLabelsSampleParser()  # for example

# A directory in which the videos will be written; if `None`, a default directory
# based on the dataset's `name` will be used
dataset_dir = ...

# Ingest the labeled video samples into the dataset
dataset.ingest_labeled_videos(samples, sample_parser, dataset_dir=dataset_dir)
```
Note
A typical use case is that samples in the above recipe is a torch.utils.data.Dataset or an iterable generated by tf.data.Dataset.as_numpy_iterator().
Ingesting unlabeled images#
Use Dataset.ingest_images()
to ingest an iterable of unlabeled images that can be parsed via a specified
UnlabeledImageSampleParser
into a dataset.
The has_image_path property of the parser may be either True or False. If the parser provides image paths, the source images will be directly copied from their source locations into the backing directory for the dataset; otherwise, the image will be read in-memory via get_image() and then written to the backing directory.
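The branching described above can be sketched in plain Python. This is an illustration only, not FiftyOne's implementation: `StubParser`, `ingest_one()`, the dict-shaped samples, and the use of raw bytes in place of decoded images are all assumptions made to keep the example self-contained.

```python
import os
import shutil
import tempfile


class StubParser:
    """Hypothetical stand-in exposing the parser members described above."""

    def __init__(self, has_image_path):
        self.has_image_path = has_image_path
        self.current_sample = None

    def with_sample(self, sample):
        self.current_sample = sample

    def get_image_path(self):
        return self.current_sample["path"]

    def get_image(self):
        # Raw bytes stand in for a decoded image in this sketch
        return self.current_sample["data"]


def ingest_one(parser, sample, dataset_dir, filename):
    """Places the sample's media in `dataset_dir` and returns the new path."""
    parser.with_sample(sample)
    out_path = os.path.join(dataset_dir, filename)
    if parser.has_image_path:
        # The parser knows the source path, so copy the file directly
        shutil.copy(parser.get_image_path(), out_path)
    else:
        # No path available, so write the in-memory media instead
        with open(out_path, "wb") as f:
            f.write(parser.get_image())
    return out_path


with tempfile.TemporaryDirectory() as tmp_dir:
    src_path = os.path.join(tmp_dir, "source.bin")
    with open(src_path, "wb") as f:
        f.write(b"on-disk media")

    dataset_dir = os.path.join(tmp_dir, "dataset")
    os.makedirs(dataset_dir)

    copied = ingest_one(
        StubParser(True), {"path": src_path}, dataset_dir, "000001.bin"
    )
    written = ingest_one(
        StubParser(False), {"data": b"in-memory media"}, dataset_dir, "000002.bin"
    )

    with open(copied, "rb") as f:
        copied_bytes = f.read()
    with open(written, "rb") as f:
        written_bytes = f.read()
```

Either way the media ends up in the backing directory; the flag only decides whether it arrives by file copy or by writing the in-memory data.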
Example
FiftyOne provides an ImageSampleParser that handles samples that contain either an image that can be converted to numpy format via np.asarray() or the path to an image on disk.
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of images or image paths and the UnlabeledImageSampleParser
# to use to parse them
samples = ...
sample_parser = foud.ImageSampleParser()

# A directory in which the images will be written; if `None`, a default directory
# based on the dataset's `name` will be used
dataset_dir = ...

# Ingest the images into the dataset
# The source images are copied into `dataset_dir`
dataset.ingest_images(samples, sample_parser, dataset_dir=dataset_dir)
```
Ingesting labeled images#
Use Dataset.ingest_labeled_images()
to ingest an iterable of samples that can be parsed via a specified
LabeledImageSampleParser
into a dataset.
The has_image_path property of the parser may be either True or False. If the parser provides image paths, the source images will be directly copied from their source locations into the backing directory for the dataset; otherwise, the image will be read in-memory via get_image() and then written to the backing directory.
Example
FiftyOne provides an ImageClassificationSampleParser that handles samples that contain (image_or_path, target) tuples, where:
- image_or_path is either an image that can be converted to numpy format via np.asarray() or the path to an image on disk
- target is either a class ID or a label string
The snippet below ingests an iterable of image classification data in the above format into a FiftyOne dataset:
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of `(image_or_path, target)` tuples and the
# LabeledImageSampleParser to use to parse them
samples = ...
sample_parser = foud.ImageClassificationSampleParser()  # for example

# A directory in which the images will be written; if `None`, a default directory
# based on the dataset's `name` will be used
dataset_dir = ...

# Ingest the labeled images into the dataset
# The source images are copied into `dataset_dir`
dataset.ingest_labeled_images(samples, sample_parser, dataset_dir=dataset_dir)
```
Ingesting unlabeled videos#
Use Dataset.ingest_videos()
to ingest an iterable of unlabeled videos that can be parsed via a specified
UnlabeledVideoSampleParser
into a dataset.
The source videos will be directly copied from their source locations into the backing directory for the dataset.
Example
FiftyOne provides a
VideoSampleParser
that handles samples that directly contain the paths to videos on disk.
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of video paths and the UnlabeledVideoSampleParser
# to use to parse them
samples = ...
sample_parser = foud.VideoSampleParser()

# A directory in which the videos will be written; if `None`, a default directory
# based on the dataset's `name` will be used
dataset_dir = ...

# Ingest the videos into the dataset
# The source videos are copied into `dataset_dir`
dataset.ingest_videos(samples, sample_parser, dataset_dir=dataset_dir)
```
Ingesting labeled videos#
Use Dataset.ingest_labeled_videos()
to ingest an iterable of samples that can be parsed via a specified
LabeledVideoSampleParser
into a dataset.
The source videos will be directly copied from their source locations into the backing directory for the dataset.
Example
FiftyOne provides a VideoLabelsSampleParser that handles samples that contain (video_path, video_labels_or_path) tuples, where:
- video_path is the path to a video on disk
- video_labels_or_path is an eta.core.video.VideoLabels instance, a serialized dict representation of one, or the path to one on disk
The snippet below ingests an iterable of labeled videos in the above format into a FiftyOne dataset:
```python
import fiftyone as fo
import fiftyone.utils.data as foud

dataset = fo.Dataset()

# An iterable of `(video_path, video_labels_or_path)` tuples and the
# LabeledVideoSampleParser to use to parse them
samples = ...
sample_parser = foud.VideoLabelsSampleParser()  # for example

# A directory in which the videos will be written; if `None`, a default directory
# based on the dataset's `name` will be used
dataset_dir = ...

# Ingest the labeled videos into the dataset
# The source videos are copied into `dataset_dir`
dataset.ingest_labeled_videos(samples, sample_parser, dataset_dir=dataset_dir)
```
Built-in SampleParser classes#
The table below lists the common data formats for which FiftyOne provides
built-in SampleParser
implementations. You can also write a
custom SampleParser to automate the parsing of
samples in your own custom data format.
You can use a SampleParser
to
add samples to datasets and
ingest samples into datasets.
| SampleParser | Description |
|---|---|
| | A sample parser that parses raw image samples. |
| | A sample parser that parses raw video samples. |
| | Generic parser for image classification samples whose labels are represented as |
| | Generic parser for image detection samples whose labels are represented as |
| | Generic parser for image detection samples whose labels are stored in ETA ImageLabels format. |
| | Parser for samples in FiftyOne image classification datasets. See |
| | Parser for samples in FiftyOne image detection datasets. See |
| | Parser for samples in FiftyOne image labels datasets. See |
| | Parser for samples in FiftyOne video labels datasets. See |
| | Parser for image classification samples stored as TFRecords. |
| | Parser for image detection samples stored in TF Object Detection API format. |
Writing a custom SampleParser#
FiftyOne provides a variety of built-in SampleParser classes to parse data in common formats. However, if your samples are stored in a custom format, you can write a custom SampleParser class and provide it to FiftyOne when adding or ingesting samples into your datasets.
The SampleParser
interface provides a mechanism for defining methods that
parse a data sample that is stored in a particular (external to FiftyOne)
format and return various elements of the sample in a format that FiftyOne
understands.
SampleParser
itself is an abstract interface; the concrete interface that you
should implement is determined by the type of samples that you are importing.
For example, LabeledImageSampleParser
defines an interface for parsing
information from a labeled image sample, such as the path to the image on
disk, the image itself, metadata about the image, and the label (e.g.,
classification or object detections) associated with the image.
To define a custom parser for unlabeled images, implement the
UnlabeledImageSampleParser
interface.
The pseudocode below provides a template for a custom UnlabeledImageSampleParser:
```python
import fiftyone.utils.data as foud

class CustomUnlabeledImageSampleParser(foud.UnlabeledImageSampleParser):
    """Custom parser for unlabeled image samples."""

    @property
    def has_image_path(self):
        """Whether this parser produces paths to images on disk for samples
        that it parses.
        """
        # Return True or False here
        pass

    @property
    def has_image_metadata(self):
        """Whether this parser produces
        :class:`fiftyone.core.metadata.ImageMetadata` instances for samples
        that it parses.
        """
        # Return True or False here
        pass

    def get_image(self):
        """Returns the image from the current sample.

        Returns:
            a numpy image
        """
        # Return the image in `self.current_sample` here
        pass

    def get_image_path(self):
        """Returns the image path for the current sample.

        Returns:
            the path to the image on disk
        """
        # Return the image path for `self.current_sample` here, or raise
        # an error if `has_image_path == False`
        pass

    def get_image_metadata(self):
        """Returns the image metadata for the current sample.

        Returns:
            a :class:`fiftyone.core.metadata.ImageMetadata` instance
        """
        # Return the image metadata for `self.current_sample` here, or
        # raise an error if `has_image_metadata == False`
        pass
```
When Dataset.add_images() is called with a custom UnlabeledImageSampleParser, the import is effectively performed via the pseudocode below:
```python
import fiftyone as fo

dataset = fo.Dataset(...)

samples = ...
sample_parser = CustomUnlabeledImageSampleParser(...)

for sample in samples:
    sample_parser.with_sample(sample)

    image_path = sample_parser.get_image_path()

    if sample_parser.has_image_metadata:
        metadata = sample_parser.get_image_metadata()
    else:
        metadata = None

    sample = fo.Sample(filepath=image_path, metadata=metadata)

    dataset.add_sample(sample)
```
The base SampleParser
interface provides a
with_sample()
method that ingests the next sample and makes it available via the
current_sample
property of the parser. Subsequent calls to the parser’s get_XXX()
methods
return information extracted from the current sample.
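The with_sample() and current_sample contract can be illustrated with a simplified stand-in. `MiniSampleParser` and `PathParser` below are hypothetical classes written for this sketch; they mimic the pattern described above but are not FiftyOne's implementation, and the sample tuples are made up.

```python
class MiniSampleParser:
    """Hypothetical, simplified stand-in for the parser protocol."""

    def __init__(self):
        self._current_sample = None

    @property
    def current_sample(self):
        if self._current_sample is None:
            raise ValueError("No current sample; call with_sample() first")
        return self._current_sample

    def with_sample(self, sample):
        """Stores the next sample so that the getters can read from it."""
        self._current_sample = sample


class PathParser(MiniSampleParser):
    """Parses samples that are (image_path, label) tuples."""

    def get_image_path(self):
        return self.current_sample[0]

    def get_label(self):
        return self.current_sample[1]


parser = PathParser()
parsed = []
for sample in [("/path/to/a.jpg", "cat"), ("/path/to/b.jpg", "dog")]:
    parser.with_sample(sample)
    parsed.append((parser.get_image_path(), parser.get_label()))

print(parsed)
```

A single parser instance is reused for every sample, which is why each getter reads from the current sample rather than taking the sample as an argument.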
The UnlabeledImageSampleParser interface provides a has_image_path property that declares whether the sample parser can return the path to the current sample’s image on disk via get_image_path(). Similarly, the has_image_metadata property declares whether the sample parser can return an ImageMetadata for the current sample’s image via get_image_metadata().
By convention, all UnlabeledImageSampleParser implementations must make the current sample’s image available via get_image().
To define a custom parser for labeled images, implement the
LabeledImageSampleParser
interface.
The pseudocode below provides a template for a custom LabeledImageSampleParser:
```python
import fiftyone.utils.data as foud

class CustomLabeledImageSampleParser(foud.LabeledImageSampleParser):
    """Custom parser for labeled image samples."""

    @property
    def has_image_path(self):
        """Whether this parser produces paths to images on disk for samples
        that it parses.
        """
        # Return True or False here
        pass

    @property
    def has_image_metadata(self):
        """Whether this parser produces
        :class:`fiftyone.core.metadata.ImageMetadata` instances for samples
        that it parses.
        """
        # Return True or False here
        pass

    @property
    def label_cls(self):
        """The :class:`fiftyone.core.labels.Label` class(es) returned by this
        parser.

        This can be any of the following:

        - a :class:`fiftyone.core.labels.Label` class. In this case, the
          parser is guaranteed to return labels of this type
        - a list or tuple of :class:`fiftyone.core.labels.Label` classes. In
          this case, the parser can produce a single label field of any of
          these types
        - a dict mapping keys to :class:`fiftyone.core.labels.Label` classes.
          In this case, the parser will return label dictionaries with keys
          and value-types specified by this dictionary. Not all keys need be
          present in the imported labels
        - ``None``. In this case, the parser makes no guarantees about the
          labels that it may return
        """
        # Return the appropriate value here
        pass

    def get_image(self):
        """Returns the image from the current sample.

        Returns:
            a numpy image
        """
        # Return the image in `self.current_sample` here
        pass

    def get_image_path(self):
        """Returns the image path for the current sample.

        Returns:
            the path to the image on disk
        """
        # Return the image path for `self.current_sample` here, or raise
        # an error if `has_image_path == False`
        pass

    def get_image_metadata(self):
        """Returns the image metadata for the current sample.

        Returns:
            a :class:`fiftyone.core.metadata.ImageMetadata` instance
        """
        # Return the image metadata for `self.current_sample` here, or
        # raise an error if `has_image_metadata == False`
        pass

    def get_label(self):
        """Returns the label for the current sample.

        Returns:
            a :class:`fiftyone.core.labels.Label` instance, or a dictionary
            mapping field names to :class:`fiftyone.core.labels.Label`
            instances, or ``None`` if the sample is unlabeled
        """
        # Return the label for `self.current_sample` here
        pass
```
When Dataset.add_labeled_images() is called with a custom LabeledImageSampleParser, the import is effectively performed via the pseudocode below:
```python
import fiftyone as fo

dataset = fo.Dataset(...)

samples = ...
sample_parser = CustomLabeledImageSampleParser(...)
label_field = ...

if isinstance(label_field, dict):
    label_key = lambda k: label_field.get(k, k)
elif label_field is not None:
    label_key = lambda k: label_field + "_" + k
else:
    label_field = "ground_truth"
    label_key = lambda k: k

for sample in samples:
    sample_parser.with_sample(sample)

    image_path = sample_parser.get_image_path()

    if sample_parser.has_image_metadata:
        metadata = sample_parser.get_image_metadata()
    else:
        metadata = None

    label = sample_parser.get_label()

    sample = fo.Sample(filepath=image_path, metadata=metadata)

    if isinstance(label, dict):
        sample.update_fields({label_key(k): v for k, v in label.items()})
    elif label is not None:
        sample[label_field] = label

    dataset.add_sample(sample)
```
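The field-naming rules in the pseudocode above can be isolated and run without FiftyOne. In the sketch below, `resolve_fields()` is a hypothetical helper that mirrors only the `label_field` / `label_key` logic, and plain strings and integers stand in for label objects.

```python
def resolve_fields(label, label_field=None):
    """Maps parsed label(s) to field names, mirroring the pseudocode above."""
    if isinstance(label_field, dict):
        # Rename keys that appear in the mapping; keep the others as-is
        label_key = lambda k: label_field.get(k, k)
    elif label_field is not None:
        # Use the given name as a prefix for each key
        label_key = lambda k: label_field + "_" + k
    else:
        label_field = "ground_truth"
        label_key = lambda k: k

    if isinstance(label, dict):
        return {label_key(k): v for k, v in label.items()}
    if label is not None:
        return {label_field: label}
    return {}


print(resolve_fields("cat"))
print(resolve_fields("cat", "gt"))
print(resolve_fields({"a": 1, "b": 2}, "gt"))
print(resolve_fields({"a": 1, "b": 2}, {"a": "animal"}))
```

A single label lands in `label_field` (or `"ground_truth"` by default), while a dictionary of labels has each key renamed or prefixed according to `label_field`.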
The base SampleParser
interface provides a
with_sample()
method that ingests the next sample and makes it available via the
current_sample
property of the parser. Subsequent calls to the parser’s get_XXX()
methods
return information extracted from the current sample.
The LabeledImageSampleParser interface provides a has_image_path property that declares whether the sample parser can return the path to the current sample’s image on disk via get_image_path(). Similarly, the has_image_metadata property declares whether the sample parser can return an ImageMetadata for the current sample’s image via get_image_metadata(). Additionally, the label_cls property of the parser declares the type of label(s) that the parser will produce.
By convention, all LabeledImageSampleParser implementations must make the current sample’s image available via get_image(), and they must make the current sample’s label available via get_label().
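The four forms that label_cls can take may be easier to follow as a small consistency check. The sketch below is hypothetical throughout: `label_matches()` is not part of FiftyOne, and the empty `Classification` and `Detections` classes are local stand-ins rather than the real label types.

```python
class Classification:
    """Local stand-in for a label type (not the FiftyOne class)."""


class Detections:
    """Local stand-in for a label type (not the FiftyOne class)."""


def label_matches(label, label_cls):
    """Returns True if `label` is consistent with the declared `label_cls`."""
    if label is None or label_cls is None:
        # Unlabeled sample, or the parser makes no guarantees
        return True
    if isinstance(label_cls, dict):
        if not isinstance(label, dict):
            return False
        # Every returned key must be declared; declared keys may be absent
        return all(
            key in label_cls and isinstance(value, label_cls[key])
            for key, value in label.items()
        )
    if isinstance(label_cls, (list, tuple)):
        # A single label of any of the listed types
        return isinstance(label, tuple(label_cls))
    return isinstance(label, label_cls)


declared = {"animal": Classification, "objects": Detections}
print(label_matches(Classification(), Classification))
print(label_matches(Detections(), [Classification, Detections]))
print(label_matches({"animal": Classification()}, declared))
print(label_matches({"other": Classification()}, declared))
```

The dictionary case accepts a partial set of keys, matching the statement that not all keys need be present in the imported labels.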
To define a custom parser for unlabeled videos, implement the
UnlabeledVideoSampleParser
interface.
The pseudocode below provides a template for a custom UnlabeledVideoSampleParser:
```python
import fiftyone.utils.data as foud

class CustomUnlabeledVideoSampleParser(foud.UnlabeledVideoSampleParser):
    """Custom parser for unlabeled video samples."""

    @property
    def has_video_metadata(self):
        """Whether this parser produces
        :class:`fiftyone.core.metadata.VideoMetadata` instances for samples
        that it parses.
        """
        # Return True or False here
        pass

    def get_video_path(self):
        """Returns the video path for the current sample.

        Returns:
            the path to the video on disk
        """
        # Return the video path for `self.current_sample` here
        pass

    def get_video_metadata(self):
        """Returns the video metadata for the current sample.

        Returns:
            a :class:`fiftyone.core.metadata.VideoMetadata` instance
        """
        # Return the video metadata for `self.current_sample` here, or
        # raise an error if `has_video_metadata == False`
        pass
```
When Dataset.add_videos() is called with a custom UnlabeledVideoSampleParser, the import is effectively performed via the pseudocode below:
```python
import fiftyone as fo

dataset = fo.Dataset(...)

samples = ...
sample_parser = CustomUnlabeledVideoSampleParser(...)

for sample in samples:
    sample_parser.with_sample(sample)

    video_path = sample_parser.get_video_path()

    # Video parsers expose video metadata, not image metadata
    if sample_parser.has_video_metadata:
        metadata = sample_parser.get_video_metadata()
    else:
        metadata = None

    sample = fo.Sample(filepath=video_path, metadata=metadata)

    dataset.add_sample(sample)
```
The base SampleParser
interface provides a
with_sample()
method that ingests the next sample and makes it available via the
current_sample
property of the parser. Subsequent calls to the parser’s get_XXX()
methods
return information extracted from the current sample.
The UnlabeledVideoSampleParser interface provides a get_video_path() method to get the video path for the current sample. The has_video_metadata property declares whether the sample parser can return a VideoMetadata for the current sample’s video via get_video_metadata().
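The role of has_video_metadata as a gate can be shown with a stub. `StubVideoParser`, `collect_records()`, and the dict-shaped samples below are hypothetical and exist only for this sketch; a real parser would subclass UnlabeledVideoSampleParser and return a VideoMetadata instance instead of a plain dict.

```python
class StubVideoParser:
    """Hypothetical stand-in for an unlabeled video parser."""

    def __init__(self, has_video_metadata):
        self.has_video_metadata = has_video_metadata
        self.current_sample = None

    def with_sample(self, sample):
        self.current_sample = sample

    def get_video_path(self):
        return self.current_sample["path"]

    def get_video_metadata(self):
        if not self.has_video_metadata:
            raise ValueError("This parser does not provide video metadata")
        return self.current_sample["metadata"]


def collect_records(samples, parser):
    """Collects (path, metadata) pairs, requesting metadata only if declared."""
    records = []
    for sample in samples:
        parser.with_sample(sample)
        if parser.has_video_metadata:
            metadata = parser.get_video_metadata()
        else:
            metadata = None
        records.append((parser.get_video_path(), metadata))
    return records


samples = [{"path": "/path/to/video.mp4", "metadata": {"frame_rate": 30.0}}]
with_metadata = collect_records(samples, StubVideoParser(True))
without_metadata = collect_records(samples, StubVideoParser(False))

print(with_metadata)
print(without_metadata)
```

Because the caller checks the property first, a parser that declares no metadata never has its get_video_metadata() method invoked.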
To define a custom parser for labeled videos, implement the
LabeledVideoSampleParser
interface.
The pseudocode below provides a template for a custom LabeledVideoSampleParser:
```python
import fiftyone.utils.data as foud

class CustomLabeledVideoSampleParser(foud.LabeledVideoSampleParser):
    """Custom parser for labeled video samples."""

    @property
    def has_video_metadata(self):
        """Whether this parser produces
        :class:`fiftyone.core.metadata.VideoMetadata` instances for samples
        that it parses.
        """
        # Return True or False here
        pass

    @property
    def label_cls(self):
        """The :class:`fiftyone.core.labels.Label` class(es) returned by this
        parser within the sample-level labels that it produces.

        This can be any of the following:

        - a :class:`fiftyone.core.labels.Label` class. In this case, the
          parser is guaranteed to return sample-level labels of this type
        - a list or tuple of :class:`fiftyone.core.labels.Label` classes. In
          this case, the parser can produce a single sample-level label field
          of any of these types
        - a dict mapping keys to :class:`fiftyone.core.labels.Label` classes.
          In this case, the parser will return sample-level label
          dictionaries with keys and value-types specified by this
          dictionary. Not all keys need be present in the imported labels
        - ``None``. In this case, the parser makes no guarantees about the
          sample-level labels that it may return
        """
        # Return the appropriate value here
        pass

    @property
    def frame_labels_cls(self):
        """The :class:`fiftyone.core.labels.Label` class(es) returned by this
        parser within the frame labels that it produces.

        This can be any of the following:

        - a :class:`fiftyone.core.labels.Label` class. In this case, the
          parser is guaranteed to return frame labels of this type
        - a list or tuple of :class:`fiftyone.core.labels.Label` classes. In
          this case, the parser can produce a single frame label field of any
          of these types
        - a dict mapping keys to :class:`fiftyone.core.labels.Label` classes.
          In this case, the parser will return frame label dictionaries with
          keys and value-types specified by this dictionary. Not all keys
          need be present in each frame
        - ``None``. In this case, the parser makes no guarantees about the
          frame labels that it may return
        """
        # Return the appropriate value here
        pass

    def get_video_path(self):
        """Returns the video path for the current sample.

        Returns:
            the path to the video on disk
        """
        # Return the video path for `self.current_sample` here
        pass

    def get_video_metadata(self):
        """Returns the video metadata for the current sample.

        Returns:
            a :class:`fiftyone.core.metadata.VideoMetadata` instance
        """
        # Return the video metadata for `self.current_sample` here, or
        # raise an error if `has_video_metadata == False`
        pass

    def get_label(self):
        """Returns the sample-level labels for the current sample.

        Returns:
            a :class:`fiftyone.core.labels.Label` instance, or a dictionary
            mapping field names to :class:`fiftyone.core.labels.Label`
            instances, or ``None`` if the sample has no sample-level labels
        """
        # Return the sample labels for `self.current_sample` here
        pass

    def get_frame_labels(self):
        """Returns the frame labels for the current sample.

        Returns:
            a dictionary mapping frame numbers to dictionaries that map label
            fields to :class:`fiftyone.core.labels.Label` instances for each
            video frame, or ``None`` if the sample has no frame labels
        """
        # Return the frame labels for `self.current_sample` here
        pass
```
When Dataset.add_labeled_videos() is called with a custom LabeledVideoSampleParser, the import is effectively performed via the pseudocode below:
```python
import fiftyone as fo

dataset = fo.Dataset(...)

samples = ...
sample_parser = CustomLabeledVideoSampleParser(...)
label_field = ...

if isinstance(label_field, dict):
    label_key = lambda k: label_field.get(k, k)
elif label_field is not None:
    label_key = lambda k: label_field + "_" + k
else:
    label_field = "ground_truth"
    label_key = lambda k: k

for sample in samples:
    sample_parser.with_sample(sample)

    video_path = sample_parser.get_video_path()

    if sample_parser.has_video_metadata:
        metadata = sample_parser.get_video_metadata()
    else:
        metadata = None

    label = sample_parser.get_label()
    frames = sample_parser.get_frame_labels()

    sample = fo.Sample(filepath=video_path, metadata=metadata)

    if isinstance(label, dict):
        sample.update_fields({label_key(k): v for k, v in label.items()})
    elif label is not None:
        sample[label_field] = label

    if frames is not None:
        frame_labels = {}

        for frame_number, _label in frames.items():
            if isinstance(_label, dict):
                frame_labels[frame_number] = {
                    label_key(k): v for k, v in _label.items()
                }
            elif _label is not None:
                frame_labels[frame_number] = {label_field: _label}

        sample.frames.merge(frame_labels)

    dataset.add_sample(sample)
```
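The frame-label handling in the pseudocode above can be exercised on its own. The sketch below is a hypothetical helper, `resolve_frame_fields()`, that reproduces only the per-frame loop; strings stand in for label objects, and the field names are invented for illustration.

```python
def resolve_frame_fields(frames, label_field="ground_truth", label_key=lambda k: k):
    """Builds the {frame_number: {field: label}} dict from the loop above."""
    frame_labels = {}
    for frame_number, _label in frames.items():
        if isinstance(_label, dict):
            # Multiple labels for this frame: rename each key
            frame_labels[frame_number] = {
                label_key(k): v for k, v in _label.items()
            }
        elif _label is not None:
            # A single label for this frame: store it under `label_field`
            frame_labels[frame_number] = {label_field: _label}
    return frame_labels


# Frame numbers map to a dict of labels, a single label, or None
frames = {
    1: {"objects": "detections for frame 1"},
    2: "classification for frame 2",
    3: None,
}

resolved = resolve_frame_fields(
    frames, label_field="gt", label_key=lambda k: "gt_" + k
)
print(resolved)
```

Frames whose label is `None` are skipped entirely, so the resulting dictionary contains entries only for frames that actually carry labels.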
The base SampleParser
interface provides a
with_sample()
method that ingests the next sample and makes it available via the
current_sample
property of the parser. Subsequent calls to the parser’s get_XXX()
methods
return information extracted from the current sample.
The LabeledVideoSampleParser interface provides a get_video_path() method to get the video path for the current sample. The has_video_metadata property declares whether the sample parser can return a VideoMetadata for the current sample’s video via get_video_metadata().
The label_cls property of the parser declares the type of sample-level label(s) that the parser may produce (if any). The frame_labels_cls property of the parser declares the type of frame-level label(s) that the parser may produce (if any). By convention, all LabeledVideoSampleParser implementations must make the current sample’s sample-level labels available via get_label() and its frame-level labels available via get_frame_labels().