fiftyone.core.sample
Dataset samples.
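Functions:
get_default_sample_fields([include_private, use_db_fields]): Returns the default fields present on all samples.
Classes:
Sample(filepath[, tags, metadata]): A sample in a fiftyone.core.dataset.Dataset.
SampleView(doc, view[, selected_fields, …]): A view into a Sample in a dataset.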
-
fiftyone.core.sample.get_default_sample_fields(include_private=False, use_db_fields=False)
Returns the default fields present on all samples.
- Parameters
include_private (False) – whether to include fields starting with _
use_db_fields (False) – whether to return database fields rather than user-facing fields, when applicable
- Returns
a tuple of field names
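The effect of the two flags can be modeled in plain Python. This is only a sketch: the field names in ALL_FIELDS and the DB_NAMES mapping are assumptions chosen for illustration, not the tuple that FiftyOne actually returns, and default_fields is a hypothetical helper.

```python
# Hypothetical field names, used only to illustrate the two flags
ALL_FIELDS = ("id", "filepath", "tags", "metadata", "_media_type")

# Assumed mapping from user-facing names to database names
DB_NAMES = {"id": "_id"}


def default_fields(include_private=False, use_db_fields=False):
    """Returns a tuple of field names, filtered and renamed per the flags."""
    fields = ALL_FIELDS
    if not include_private:
        # Drop fields whose user-facing name starts with an underscore
        fields = tuple(f for f in fields if not f.startswith("_"))

    if use_db_fields:
        # Swap in the database name where one is defined
        fields = tuple(DB_NAMES.get(f, f) for f in fields)

    return fields
```

In this model, private fields are filtered by their user-facing names before any database names are substituted, so a public field that maps to an underscore-prefixed database name is still returned.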
-
class fiftyone.core.sample.Sample(filepath, tags=None, metadata=None, **kwargs)
Bases: fiftyone.core.sample._SampleMixin, fiftyone.core.document.Document
A sample in a fiftyone.core.dataset.Dataset.
Samples store all information associated with a particular piece of data in a dataset, including basic metadata about the data, one or more sets of labels (ground truth, user-provided, or FiftyOne-generated), and additional features associated with subsets of the data and/or label sets.
Note
Sample instances that are in datasets are singletons, i.e., dataset[sample_id] will always return the same Sample instance.
- Parameters
filepath – the path to the data on disk. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path()
tags (None) – a list of tags for the sample
metadata (None) – a fiftyone.core.metadata.Metadata instance
**kwargs – additional fields to dynamically set on the sample
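The singleton behavior described in the note can be pictured with a small registry keyed by ID. The sketch below is an assumption about how such a guarantee could be provided, not a description of FiftyOne's internals; Record and its get method are hypothetical names.

```python
import weakref


class Record:
    """Hypothetical stand-in for a sample that lives in a dataset."""

    # One live instance per ID; entries vanish once nothing references them
    _instances = weakref.WeakValueDictionary()

    def __init__(self, record_id):
        self.id = record_id

    @classmethod
    def get(cls, record_id):
        """Returns the existing instance for the ID, creating it if needed."""
        obj = cls._instances.get(record_id)
        if obj is None:
            obj = cls(record_id)
            cls._instances[record_id] = obj
        return obj


first = Record.get("abc")
second = Record.get("abc")
first.note = "edited"  # the edit is visible through every reference
```

Because both lookups return the same object, an in-memory edit made through one reference is immediately visible through the other.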
Methods:
reload([hard]): Reloads the sample from the database.
save(): Saves the sample to the database.
from_frame(frame[, filepath]): Creates a sample from the given frame.
from_doc(doc[, dataset]): Creates a sample backed by the given document.
from_dict(d): Loads the sample from a JSON dictionary.
add_labels(labels[, label_field, …]): Adds the given labels to the sample.
clear_field(field_name): Clears the value of a field of the document.
compute_metadata([overwrite, skip_failures]): Populates the metadata field of the sample.
copy([fields, omit_fields]): Returns a deep copy of the sample that has not been added to the database.
from_json(s): Loads the document from a JSON string.
get_field(field_name): Gets the value of a field of the document.
has_field(field_name): Determines whether the document has the given field.
iter_fields([include_id, include_timestamps]): Returns an iterator over the (name, value) pairs of the public fields of the document.
merge(sample[, fields, omit_fields, …]): Merges the fields of the given sample into this sample.
set_field(field_name, value[, create, …]): Sets the value of a field of the document.
to_dict([include_frames, include_private]): Serializes the sample to a JSON dictionary.
to_json([pretty_print]): Serializes the document to a JSON string.
to_mongo_dict([include_id]): Serializes the document to a BSON dictionary equivalent to the representation that would be stored in the database.
update_fields(fields_dict[, expand_schema, …]): Sets the dictionary of fields on the document.
Attributes:
dataset: The dataset to which this document belongs, or None if it has not been added to a dataset.
field_names: An ordered tuple of the public field names of this document.
filename: The basename of the media’s filepath.
in_dataset: Whether the document has been added to a dataset.
media_type: The media type of the sample.
-
reload(hard=False)
Reloads the sample from the database.
- Parameters
hard (False) – whether to reload the sample’s schema in addition to its field values. This is necessary if new fields may have been added to the dataset schema
-
save()
Saves the sample to the database.
-
classmethod from_frame(frame, filepath=None)
Creates a sample from the given frame.
- Parameters
frame – a fiftyone.core.frame.Frame
filepath (None) – the path to the corresponding image frame on disk, if not available
- Returns
a Sample
-
classmethod from_doc(doc, dataset=None)
Creates a sample backed by the given document.
- Parameters
doc – a fiftyone.core.odm.sample.DatasetSampleDocument or fiftyone.core.odm.sample.NoDatasetSampleDocument
dataset (None) – the fiftyone.core.dataset.Dataset that the sample belongs to
- Returns
a Sample
-
classmethod from_dict(d)
Loads the sample from a JSON dictionary.
The returned sample will not belong to a dataset.
- Returns
a Sample
-
add_labels(labels, label_field=None, confidence_thresh=None, expand_schema=True, validate=True, dynamic=False)
Adds the given labels to the sample.
The provided labels can be any of the following:
A fiftyone.core.labels.Label instance, in which case the labels are directly saved in the specified label_field
A dict mapping keys to fiftyone.core.labels.Label instances. In this case, the labels are added as follows:
    for key, value in labels.items():
        sample[label_key(key)] = value
A dict mapping frame numbers to fiftyone.core.labels.Label instances. In this case, the provided labels are interpreted as frame-level labels that should be added as follows:
    sample.frames.merge(
        {
            frame_number: {label_field: label}
            for frame_number, label in labels.items()
        }
    )
A dict mapping frame numbers to dicts mapping keys to fiftyone.core.labels.Label instances. In this case, the provided labels are interpreted as frame-level labels that should be added as follows:
    sample.frames.merge(
        {
            frame_number: {
                label_key(key): value
                for key, value in frame_dict.items()
            }
            for frame_number, frame_dict in labels.items()
        }
    )
In the above, the label_key function maps label dict keys to field names, and is defined from label_field as follows:
    if isinstance(label_field, dict):
        label_key = lambda k: label_field.get(k, k)
    elif label_field is not None:
        label_key = lambda k: label_field + "_" + k
    else:
        label_key = lambda k: k
- Parameters
labels – a fiftyone.core.labels.Label or dict of labels per the description above
label_field (None) – the sample field, prefix, or dict defining in which field(s) to save the labels
confidence_thresh (None) – an optional confidence threshold to apply to any applicable labels before saving them
expand_schema (True) – whether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if any fields are not in the dataset schema
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic attributes
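To see how label_field determines the destination field names, the label_key rule from the add_labels description can be run on its own. The helpers make_label_key and resolve_fields are hypothetical wrappers written for this sketch, and plain strings stand in for fiftyone.core.labels.Label instances.

```python
def make_label_key(label_field):
    """Builds the key -> field name function from the add_labels description."""
    if isinstance(label_field, dict):
        return lambda k: label_field.get(k, k)

    if label_field is not None:
        return lambda k: label_field + "_" + k

    return lambda k: k


def resolve_fields(labels, label_field=None):
    """Returns a dict mapping destination field names to labels."""
    label_key = make_label_key(label_field)
    return {label_key(key): value for key, value in labels.items()}


# Strings stand in for Label instances
labels = {"cat": "label-1", "dog": "label-2"}

by_prefix = resolve_fields(labels, label_field="predictions")
by_dict = resolve_fields(labels, label_field={"cat": "animal"})
unchanged = resolve_fields(labels)
```

A string label_field acts as a prefix, a dict label_field renames only the keys it contains, and None leaves the keys as the field names.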
-
clear_field(field_name)
Clears the value of a field of the document.
- Parameters
field_name – the name of the field to clear
- Raises
AttributeError – if the field does not exist
-
compute_metadata(overwrite=False, skip_failures=False)
Populates the metadata field of the sample.
- Parameters
overwrite (False) – whether to overwrite existing metadata
skip_failures (False) – whether to gracefully continue without raising an error if metadata cannot be computed
-
copy(fields=None, omit_fields=None)
Returns a deep copy of the sample that has not been added to the database.
- Parameters
fields (None) – an optional field or iterable of fields to which to restrict the copy. This can also be a dict mapping existing field names to new field names
omit_fields (None) – an optional field or iterable of fields to exclude from the copy
- Returns
a Sample
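The three forms accepted by fields (a single name, an iterable of names, or a dict of renames) can be modeled on a plain dict. This is a simplified sketch, not FiftyOne's implementation: copy_fields is a hypothetical helper, and it ignores any fields that a real copy may always retain.

```python
import copy


def copy_fields(doc, fields=None, omit_fields=None):
    """Deep-copies a dict of fields, optionally restricting or renaming them."""
    if isinstance(fields, str):
        fields = [fields]
    if isinstance(omit_fields, str):
        omit_fields = [omit_fields]

    # Map existing field names to the names used in the copy
    if fields is None:
        mapping = {name: name for name in doc}
    elif isinstance(fields, dict):
        mapping = dict(fields)
    else:
        mapping = {name: name for name in fields}

    for name in omit_fields or []:
        mapping.pop(name, None)

    return {new: copy.deepcopy(doc[old]) for old, new in mapping.items()}


doc = {"filepath": "/data/a.jpg", "tags": ["train"], "gt": {"label": "cat"}}

renamed = copy_fields(doc, fields={"gt": "truth"})
without_tags = copy_fields(doc, omit_fields="tags")
```

Because the values are deep-copied, editing a nested value in the copy leaves the original untouched.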
-
property dataset
The dataset to which this document belongs, or None if it has not been added to a dataset.
-
property dataset_id
-
property field_names
An ordered tuple of the public field names of this document.
-
property filename
The basename of the media’s filepath.
-
classmethod from_json(s)
Loads the document from a JSON string.
The returned document will not belong to a dataset.
- Parameters
s – the JSON string
- Returns
a Document
-
get_field(field_name)
Gets the value of a field of the document.
- Parameters
field_name – the field name
- Returns
the field value
- Raises
AttributeError – if the field does not exist
-
has_field(field_name)
Determines whether the document has the given field.
- Parameters
field_name – the field name
- Returns
True/False
-
property in_dataset
Whether the document has been added to a dataset.
-
iter_fields(include_id=False, include_timestamps=False)
Returns an iterator over the (name, value) pairs of the public fields of the document.
- Parameters
include_id (False) – whether to include the id field
include_timestamps (False) – whether to include the created_at and last_modified_at fields
- Returns
an iterator that emits (name, value) tuples
-
property media_type
The media type of the sample.
-
merge(sample, fields=None, omit_fields=None, merge_lists=True, overwrite=True, expand_schema=True, validate=True, dynamic=False)
Merges the fields of the given sample into this sample.
The behavior of this method is highly customizable. By default, all top-level fields from the provided sample are merged in, overwriting any existing values for those fields, with the exception of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields), in which case the elements of the lists themselves are merged. In the case of label list fields, labels with the same id in both samples are updated rather than duplicated.
To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.
This method can be configured in numerous ways, including:
Whether new fields can be added to the dataset schema
Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
Whether to merge only specific fields, or all but certain fields
Whether to map input sample fields to different field names of this sample
- Parameters
sample – a fiftyone.core.sample.Sample
fields (None) – an optional field or iterable of fields to which to restrict the merge. May contain frame fields for video samples. This can also be a dict mapping field names of the input sample to field names of this sample
omit_fields (None) – an optional field or iterable of fields to exclude from the merge. May contain frame fields for video samples
merge_lists (True) – whether to merge the elements of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields) rather than merging the entire top-level field like other field types. For label list fields, existing fiftyone.core.labels.Label elements are either replaced (when overwrite is True) or kept (when overwrite is False) when their id matches a label from the provided sample
overwrite (True) – whether to overwrite (True) or skip (False) existing fields and label elements
expand_schema (True) – whether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if any fields are not in the dataset schema
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields
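The interaction between merge_lists, overwrite, and None-valued fields is easier to follow on plain dicts. The sketch below is a simplified model of the documented rules, not FiftyOne's implementation: merge_fields is a hypothetical helper, and it leaves out label list fields, where elements are matched by id.

```python
def merge_fields(dst, src, fields=None, omit_fields=None,
                 merge_lists=True, overwrite=True):
    """Merges the dict `src` into the dict `dst` in place and returns `dst`."""
    if isinstance(fields, str):
        fields = [fields]
    if isinstance(omit_fields, str):
        omit_fields = [omit_fields]

    # Map source field names to destination field names
    if fields is None:
        mapping = {name: name for name in src}
    elif isinstance(fields, dict):
        mapping = dict(fields)
    else:
        mapping = {name: name for name in fields}

    for name in omit_fields or []:
        mapping.pop(name, None)

    for src_name, dst_name in mapping.items():
        value = src.get(src_name)
        if value is None:
            continue  # None-valued fields are treated as missing

        current = dst.get(dst_name)
        if merge_lists and isinstance(value, list) and isinstance(current, list):
            # Merge list elements, skipping ones that are already present
            dst[dst_name] = current + [v for v in value if v not in current]
        elif overwrite or current is None:
            dst[dst_name] = value

    return dst


merged = merge_fields(
    {"tags": ["train"], "label": "cat", "score": 0.5},
    {"tags": ["train", "hard"], "label": "dog", "score": None, "extra": 1},
)
```

With the defaults, the tags lists are combined element by element, label is overwritten, score keeps its existing value because the incoming value is None, and extra is added as a new field.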
-
set_field(field_name, value, create=True, validate=True, dynamic=False)
Sets the value of a field of the document.
- Parameters
field_name – the field name
value – the field value
create (True) – whether to create the field if it does not exist
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields
- Raises
ValueError – if field_name is not an allowed field name
AttributeError – if the field does not exist and create == False
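The two error conditions of set_field can be illustrated with a minimal stand-in class. Doc is hypothetical, and the rule used here for a disallowed name (a leading underscore) is an assumption made for the sketch; the source does not say which names FiftyOne rejects.

```python
class Doc:
    """Hypothetical minimal document backed by a dict of fields."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def has_field(self, field_name):
        return field_name in self._fields

    def get_field(self, field_name):
        if field_name not in self._fields:
            raise AttributeError("Document has no field '%s'" % field_name)
        return self._fields[field_name]

    def set_field(self, field_name, value, create=True):
        # Assumed rule: names starting with an underscore are not allowed
        if field_name.startswith("_"):
            raise ValueError("'%s' is not an allowed field name" % field_name)

        if not self.has_field(field_name) and not create:
            raise AttributeError("Document has no field '%s'" % field_name)

        self._fields[field_name] = value


doc = Doc(filepath="/data/a.jpg")
doc.set_field("weather", "sunny")  # created because create=True by default
```

Passing create=False turns a typo in a field name into an AttributeError instead of silently creating a new field.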
-
to_dict(include_frames=False, include_private=False)
Serializes the sample to a JSON dictionary.
- Parameters
include_frames (False) – whether to include the frame labels for video samples
include_private (False) – whether to include private fields
- Returns
a JSON dict
-
to_json(pretty_print=False)
Serializes the document to a JSON string.
The document ID and private fields are excluded in this representation.
- Parameters
pretty_print (False) – whether to render the JSON in human readable format with newlines and indentations
- Returns
a JSON string
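The documented behavior of to_json (ID and private fields dropped, optional pretty-printing) can be approximated with the standard json module. In this sketch the function operates on a plain dict, the ID key is assumed to be "id", private fields are assumed to start with an underscore, and the indent width is an arbitrary choice.

```python
import json


def to_json(doc, pretty_print=False):
    """Serializes the public, non-ID fields of a dict to a JSON string."""
    public = {
        name: value
        for name, value in doc.items()
        if name != "id" and not name.startswith("_")
    }

    if pretty_print:
        # Newlines and indentation for human readability
        return json.dumps(public, indent=4)

    return json.dumps(public)


doc = {"id": "abc123", "_rand": 0.42, "filepath": "/data/a.jpg", "tags": ["train"]}

compact = to_json(doc)
pretty = to_json(doc, pretty_print=True)
```

Both strings decode to the same dict; only the whitespace differs.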
-
to_mongo_dict(include_id=False)
Serializes the document to a BSON dictionary equivalent to the representation that would be stored in the database.
- Parameters
include_id (False) – whether to include the document ID
- Returns
a BSON dict
-
update_fields(fields_dict, expand_schema=True, validate=True, dynamic=False)
Sets the dictionary of fields on the document.
- Parameters
fields_dict – a dict mapping field names to values
expand_schema (True) – whether to dynamically add new fields encountered to the document schema. If False, an error is raised if any fields are not in the document schema
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields
- Raises
AttributeError – if expand_schema == False and a field does not exist
-
class fiftyone.core.sample.SampleView(doc, view, selected_fields=None, excluded_fields=None, filtered_fields=None)
Bases: fiftyone.core.sample._SampleMixin, fiftyone.core.document.DocumentView
A view into a Sample in a dataset.
Like Sample instances, the fields of a SampleView instance can be modified, new fields can be created, and any changes can be saved to the database.
SampleView instances differ from Sample instances in the following ways:
A sample view may contain only a subset of the fields of its source sample, as a result of selecting and/or excluding specific fields
A sample view may contain array fields or embedded array fields that have been filtered, thus containing only a subset of the array elements from the source sample
Excluded fields of a sample view may not be accessed or modified
Note
Sample views should never be created manually; they are generated when accessing the samples in a fiftyone.core.view.DatasetView.
- Parameters
doc – a fiftyone.core.odm.mixins.DatasetSampleDocument
view – the fiftyone.core.view.DatasetView that the sample belongs to
selected_fields (None) – a set of field names that this sample view is restricted to, if any
excluded_fields (None) – a set of field names that are excluded from this sample view, if any
filtered_fields (None) – a set of field names of list fields that are filtered in this sample view, if any
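The way selected_fields and excluded_fields narrow what a view exposes can be sketched with a small wrapper over a dict. FieldView is a hypothetical class written for this illustration, and raising AttributeError for an unavailable field is an assumption; the source only states that excluded fields may not be accessed or modified.

```python
class FieldView:
    """Hypothetical restricted view over a dict of fields."""

    def __init__(self, doc, selected_fields=None, excluded_fields=None):
        self._doc = doc
        self._selected = selected_fields
        self._excluded = excluded_fields

    @property
    def field_names(self):
        """An ordered tuple of the field names visible in this view."""
        names = tuple(self._doc)
        if self._selected is not None:
            names = tuple(n for n in names if n in self._selected)
        if self._excluded is not None:
            names = tuple(n for n in names if n not in self._excluded)
        return names

    def get_field(self, field_name):
        if field_name not in self.field_names:
            # Assumed error type for fields hidden by the view
            raise AttributeError(
                "Field '%s' is not available in this view" % field_name
            )
        return self._doc[field_name]


source = {"filepath": "/data/a.jpg", "gt": "cat", "pred": "dog"}
view = FieldView(source, excluded_fields={"pred"})
```

Here view.field_names omits pred, and reading pred through the view fails even though the source dict still holds it.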
Methods:
to_dict([include_frames, include_private]): Serializes the sample view to a JSON dictionary.
save(): Saves the sample view to the database.
add_labels(labels[, label_field, …]): Adds the given labels to the sample.
clear_field(field_name): Clears the value of a field of the document.
compute_metadata([overwrite, skip_failures]): Populates the metadata field of the sample.
copy([fields, omit_fields]): Returns a deep copy of the sample that has not been added to the database.
get_field(field_name): Gets the value of a field of the document.
has_field(field_name): Determines whether the document has the given field.
iter_fields([include_id, include_timestamps]): Returns an iterator over the (name, value) pairs of the public fields of the document.
merge(sample[, fields, omit_fields, …]): Merges the fields of the given sample into this sample.
set_field(field_name, value[, create, …]): Sets the value of a field of the document.
to_json([pretty_print]): Serializes the document to a JSON string.
to_mongo_dict([include_id]): Serializes the document to a BSON dictionary equivalent to the representation that would be stored in the database.
update_fields(fields_dict[, expand_schema, …]): Sets the dictionary of fields on the document.
Attributes:
dataset: The dataset to which this document belongs, or None if it has not been added to a dataset.
excluded_field_names: The set of field names that are excluded on this document view, or None if no fields are explicitly excluded.
field_names: An ordered tuple of field names of this document view.
filename: The basename of the media’s filepath.
filtered_field_names: The set of field names or embedded.field.names that have been filtered on this document view, or None if no fields are filtered.
in_dataset: Whether the document has been added to a dataset.
media_type: The media type of the sample.
selected_field_names: The set of field names that are selected on this document view, or None if no fields are explicitly selected.
-
to_dict(include_frames=False, include_private=False)
Serializes the sample view to a JSON dictionary.
- Parameters
include_frames (False) – whether to include the frame labels for video samples
include_private (False) – whether to include private fields
- Returns
a JSON dict
-
save()
Saves the sample view to the database.
Warning
This will permanently delete any omitted or filtered contents from the source dataset.
-
add_labels(labels, label_field=None, confidence_thresh=None, expand_schema=True, validate=True, dynamic=False)
Adds the given labels to the sample.
The provided labels can be any of the following:
A fiftyone.core.labels.Label instance, in which case the labels are directly saved in the specified label_field
A dict mapping keys to fiftyone.core.labels.Label instances. In this case, the labels are added as follows:
    for key, value in labels.items():
        sample[label_key(key)] = value
A dict mapping frame numbers to fiftyone.core.labels.Label instances. In this case, the provided labels are interpreted as frame-level labels that should be added as follows:
    sample.frames.merge(
        {
            frame_number: {label_field: label}
            for frame_number, label in labels.items()
        }
    )
A dict mapping frame numbers to dicts mapping keys to fiftyone.core.labels.Label instances. In this case, the provided labels are interpreted as frame-level labels that should be added as follows:
    sample.frames.merge(
        {
            frame_number: {
                label_key(key): value
                for key, value in frame_dict.items()
            }
            for frame_number, frame_dict in labels.items()
        }
    )
In the above, the label_key function maps label dict keys to field names, and is defined from label_field as follows:
    if isinstance(label_field, dict):
        label_key = lambda k: label_field.get(k, k)
    elif label_field is not None:
        label_key = lambda k: label_field + "_" + k
    else:
        label_key = lambda k: k
- Parameters
labels – a fiftyone.core.labels.Label or dict of labels per the description above
label_field (None) – the sample field, prefix, or dict defining in which field(s) to save the labels
confidence_thresh (None) – an optional confidence threshold to apply to any applicable labels before saving them
expand_schema (True) – whether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if any fields are not in the dataset schema
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic attributes
-
clear_field(field_name)
Clears the value of a field of the document.
- Parameters
field_name – the name of the field to clear
- Raises
AttributeError – if the field does not exist
-
compute_metadata(overwrite=False, skip_failures=False)
Populates the metadata field of the sample.
- Parameters
overwrite (False) – whether to overwrite existing metadata
skip_failures (False) – whether to gracefully continue without raising an error if metadata cannot be computed
-
copy(fields=None, omit_fields=None)
Returns a deep copy of the sample that has not been added to the database.
- Parameters
fields (None) – an optional field or iterable of fields to which to restrict the copy. This can also be a dict mapping existing field names to new field names
omit_fields (None) – an optional field or iterable of fields to exclude from the copy
- Returns
a Sample
-
property dataset
The dataset to which this document belongs, or None if it has not been added to a dataset.
-
property dataset_id
-
property excluded_field_names
The set of field names that are excluded on this document view, or None if no fields are explicitly excluded.
-
property field_names
An ordered tuple of field names of this document view.
This may be a subset of all fields of the document if fields have been selected or excluded.
-
property filename
The basename of the media’s filepath.
-
property filtered_field_names
The set of field names or embedded.field.names that have been filtered on this document view, or None if no fields are filtered.
-
get_field(field_name)
Gets the value of a field of the document.
- Parameters
field_name – the field name
- Returns
the field value
- Raises
AttributeError – if the field does not exist
-
has_field(field_name)
Determines whether the document has the given field.
- Parameters
field_name – the field name
- Returns
True/False
-
property in_dataset
Whether the document has been added to a dataset.
-
iter_fields(include_id=False, include_timestamps=False)
Returns an iterator over the (name, value) pairs of the public fields of the document.
- Parameters
include_id (False) – whether to include the id field
include_timestamps (False) – whether to include the created_at and last_modified_at fields
- Returns
an iterator that emits (name, value) tuples
-
property media_type
The media type of the sample.
-
merge(sample, fields=None, omit_fields=None, merge_lists=True, overwrite=True, expand_schema=True, validate=True, dynamic=False)
Merges the fields of the given sample into this sample.
The behavior of this method is highly customizable. By default, all top-level fields from the provided sample are merged in, overwriting any existing values for those fields, with the exception of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields), in which case the elements of the lists themselves are merged. In the case of label list fields, labels with the same id in both samples are updated rather than duplicated.
To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.
This method can be configured in numerous ways, including:
Whether new fields can be added to the dataset schema
Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
Whether to merge only specific fields, or all but certain fields
Whether to map input sample fields to different field names of this sample
- Parameters
sample – a fiftyone.core.sample.Sample
fields (None) – an optional field or iterable of fields to which to restrict the merge. May contain frame fields for video samples. This can also be a dict mapping field names of the input sample to field names of this sample
omit_fields (None) – an optional field or iterable of fields to exclude from the merge. May contain frame fields for video samples
merge_lists (True) – whether to merge the elements of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields) rather than merging the entire top-level field like other field types. For label list fields, existing fiftyone.core.labels.Label elements are either replaced (when overwrite is True) or kept (when overwrite is False) when their id matches a label from the provided sample
overwrite (True) – whether to overwrite (True) or skip (False) existing fields and label elements
expand_schema (True) – whether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if any fields are not in the dataset schema
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields
-
property selected_field_names
The set of field names that are selected on this document view, or None if no fields are explicitly selected.
-
set_field(field_name, value, create=True, validate=True, dynamic=False)
Sets the value of a field of the document.
- Parameters
field_name – the field name
value – the field value
create (True) – whether to create the field if it does not exist
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields
- Raises
ValueError – if field_name is not an allowed field name
AttributeError – if the field does not exist and create == False
-
to_json(pretty_print=False)
Serializes the document to a JSON string.
The document ID and private fields are excluded in this representation.
- Parameters
pretty_print (False) – whether to render the JSON in human readable format with newlines and indentations
- Returns
a JSON string
-
to_mongo_dict(include_id=False)
Serializes the document to a BSON dictionary equivalent to the representation that would be stored in the database.
- Parameters
include_id (False) – whether to include the document ID
- Returns
a BSON dict
-
update_fields(fields_dict, expand_schema=True, validate=True, dynamic=False)
Sets the dictionary of fields on the document.
- Parameters
fields_dict – a dict mapping field names to values
expand_schema (True) – whether to dynamically add new fields encountered to the document schema. If False, an error is raised if any fields are not in the document schema
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields
- Raises
AttributeError – if expand_schema == False and a field does not exist