fiftyone.core.document¶

Base classes for objects that are backed by database documents.

Copyright 2017-2024, Voxel51, Inc.
voxel51.com

Classes:

`Document`(**kwargs)	Abstract base class for objects that are associated with `fiftyone.core.dataset.Dataset` instances and are backed by documents in database collections.
`DocumentView`(doc, view[, selected_fields, …])	A view into a `Document` in a dataset.

class fiftyone.core.document.Document(**kwargs)¶

Bases: fiftyone.core.document._Document

Abstract base class for objects that are associated with fiftyone.core.dataset.Dataset instances and are backed by documents in database collections.

Document subclasses whose in-dataset instances should be singletons can inherit this behavior by deriving from the fiftyone.core.singletons.DocumentSingleton metaclass.

Parameters: **kwargs – field names and values

Methods:

`copy`([fields, omit_fields])	Returns a deep copy of the document that has not been added to the database.
`reload`([hard])	Reloads the document from the database.
`from_doc`(doc[, dataset])	Creates a document backed by the given database document.
`from_dict`(d)	Loads the document from a JSON dictionary.
`from_json`(s)	Loads the document from a JSON string.
`clear_field`(field_name)	Clears the value of a field of the document.
`get_field`(field_name)	Gets the value of a field of the document.
`has_field`(field_name)	Determines whether the document has the given field.
`iter_fields`([include_id, include_timestamps])	Returns an iterator over the `(name, value)` pairs of the public fields of the document.
`merge`(document[, fields, omit_fields, …])	Merges the fields of the document into this document.
`save`()	Saves the document to the database.
`set_field`(field_name, value[, create, …])	Sets the value of a field of the document.
`to_dict`([include_private])	Serializes the document to a JSON dictionary.
`to_json`([pretty_print])	Serializes the document to a JSON string.
`to_mongo_dict`([include_id])	Serializes the document to a BSON dictionary equivalent to the representation that would be stored in the database.
`update_fields`(fields_dict[, expand_schema, …])	Sets the dictionary of fields on the document.

Attributes:

`dataset`	The dataset to which this document belongs, or `None` if it has not been added to a dataset.
`field_names`	An ordered tuple of the public field names of this document.
`in_dataset`	Whether the document has been added to a dataset.

copy(fields=None, omit_fields=None)¶

Returns a deep copy of the document that has not been added to the database.

Parameters

fields (None) – an optional field or iterable of fields to which to restrict the copy. This can also be a dict mapping existing field names to new field names
omit_fields (None) – an optional field or iterable of fields to exclude from the copy

Returns

a Document

reload(hard=False)¶

Reloads the document from the database.

Parameters: hard (False) – whether to reload the document’s schema in addition to its field values. This is necessary if new fields may have been added to the document schema

classmethod from_doc(doc, dataset=None)¶

Creates a document backed by the given database document.

Parameters

doc – a fiftyone.core.odm.document.Document
dataset (None) – the fiftyone.core.dataset.Dataset that the document belongs to, if any

Returns

a Document

classmethod from_dict(d)¶

Loads the document from a JSON dictionary.

The returned document will not belong to a dataset.

Returns: a Document

classmethod from_json(s)¶

Loads the document from a JSON string.

The returned document will not belong to a dataset.

Parameters: s – the JSON string
Returns: a Document

clear_field(field_name)¶

Clears the value of a field of the document.

Parameters: field_name – the name of the field to clear
Raises: AttributeError – if the field does not exist

property dataset¶: The dataset to which this document belongs, or None if it has not been added to a dataset.

property field_names¶: An ordered tuple of the public field names of this document.

get_field(field_name)¶

Gets the value of a field of the document.

Parameters: field_name – the field name
Returns: the field value
Raises: AttributeError – if the field does not exist

has_field(field_name)¶

Determines whether the document has the given field.

Parameters: field_name – the field name
Returns: True/False

property in_dataset¶: Whether the document has been added to a dataset.

iter_fields(include_id=False, include_timestamps=False)¶

Returns an iterator over the (name, value) pairs of the public fields of the document.

Parameters

include_id (False) – whether to include the id field
include_timestamps (False) – whether to include the created_at and last_modified_at fields

Returns

an iterator that emits (name, value) tuples

merge(document, fields=None, omit_fields=None, merge_lists=True, overwrite=True, expand_schema=True, validate=True, dynamic=False)¶

Merges the fields of the document into this document.

The behavior of this method is highly customizable. By default, all top-level fields from the provided document are merged in, overwriting any existing values for those fields, with the exception of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields), in which case the elements of the lists themselves are merged. In the case of label list fields, labels with the same id in both documents are updated rather than duplicated.

To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.

This method can be configured in numerous ways, including:

Whether new fields can be added to the document schema
Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
Whether to merge only specific fields, or all but certain fields
Mapping input document fields to different field names of this document

Parameters

document – a Document or DocumentView of the same type
fields (None) – an optional field or iterable of fields to which to restrict the merge. This can also be a dict mapping field names of the input document to field names of this document
omit_fields (None) – an optional field or iterable of fields to exclude from the merge
merge_lists (True) – whether to merge the elements of top-level list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields) rather than merging the entire top-level field like other field types. For label lists fields, existing fiftyone.core.label.Label elements are either replaced (when overwrite is True) or kept (when overwrite is False) when their id matches a label from the provided document
overwrite (True) – whether to overwrite (True) or skip (False) existing fields and label elements
expand_schema (True) – whether to dynamically add new fields encountered to the document schema. If False, an error is raised if any fields are not in the document schema
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields

Raises

AttributeError – if expand_schema == False and a field does not exist

save()¶: Saves the document to the database.

set_field(field_name, value, create=True, validate=True, dynamic=False)¶

Sets the value of a field of the document.

Parameters

field_name – the field name
value – the field value
create (True) – whether to create the field if it does not exist
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields

Raises

ValueError – if field_name is not an allowed field name
AttributeError – if the field does not exist and create == False

to_dict(include_private=False)¶

Serializes the document to a JSON dictionary.

Parameters: include_private (False) – whether to include private fields
Returns: a JSON dict

to_json(pretty_print=False)¶

Serializes the document to a JSON string.

The document ID and private fields are excluded in this representation.

Parameters: pretty_print (False) – whether to render the JSON in human readable format with newlines and indentations
Returns: a JSON string

to_mongo_dict(include_id=False)¶

Serializes the document to a BSON dictionary equivalent to the representation that would be stored in the database.

Parameters: include_id (False) – whether to include the document ID
Returns: a BSON dict

update_fields(fields_dict, expand_schema=True, validate=True, dynamic=False)¶

Sets the dictionary of fields on the document.

Parameters

fields_dict – a dict mapping field names to values
expand_schema (True) – whether to dynamically add new fields encountered to the document schema. If False, an error is raised if any fields are not in the document schema
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields

Raises

AttributeError – if expand_schema == False and a field does not exist

class fiftyone.core.document.DocumentView(doc, view, selected_fields=None, excluded_fields=None, filtered_fields=None)¶

Bases: fiftyone.core.document._Document

A view into a Document in a dataset.

Like Document instances, the fields of a DocumentView instance can be modified, new fields can be created, and any changes can be saved to the database.

DocumentView instances differ from Document instances in the following ways:

A document view may contain only a subset of the fields of its source document, either by selecting and/or excluding specific fields
A document view may contain array fields or embedded array fields that have been filtered, thus containing only a subset of the array elements from the source document
Excluded fields of a document view may not be accessed or modified

Note

DocumentView.save() will not delete any excluded fields or filtered array elements from the source document.

Document views should never be created manually; they are generated when accessing the contents of a fiftyone.core.view.DatasetView.

Parameters

doc – a fiftyone.core.odm.document.Document
view – the fiftyone.core.view.DatasetView that the document belongs to
selected_fields (None) – a set of field names that this document view is restricted to, if any
excluded_fields (None) – a set of field names that are excluded from this document view, if any
filtered_fields (None) – a set of field names of array fields that are filtered in this document view, if any

Attributes:

`field_names`	An ordered tuple of field names of this document view.
`selected_field_names`	The set of field names that are selected on this document view, or `None` if no fields are explicitly selected.
`excluded_field_names`	The set of field names that are excluded on this document view, or `None` if no fields are explicitly excluded.
`filtered_field_names`	The set of field names or `embedded.field.names` that have been filtered on this document view, or `None` if no fields are filtered.
`dataset`	The dataset to which this document belongs, or `None` if it has not been added to a dataset.
`in_dataset`	Whether the document has been added to a dataset.

Methods:

`has_field`(field_name)	Determines whether the document has the given field.
`get_field`(field_name)	Gets the value of a field of the document.
`set_field`(field_name, value[, create, …])	Sets the value of a field of the document.
`clear_field`(field_name)	Clears the value of a field of the document.
`to_dict`([include_private])	Serializes the document to a JSON dictionary.
`to_mongo_dict`([include_id])	Serializes the document to a BSON dictionary equivalent to the representation that would be stored in the database.
`copy`([fields, omit_fields])	Returns a deep copy of the document that has not been added to the database.
`save`()	Saves the document view to the database.
`iter_fields`([include_id, include_timestamps])	Returns an iterator over the `(name, value)` pairs of the public fields of the document.
`merge`(document[, fields, omit_fields, …])	Merges the fields of the document into this document.
`to_json`([pretty_print])	Serializes the document to a JSON string.
`update_fields`(fields_dict[, expand_schema, …])	Sets the dictionary of fields on the document.

property field_names¶

An ordered tuple of field names of this document view.

This may be a subset of all fields of the document if fields have been selected or excluded.

property selected_field_names¶: The set of field names that are selected on this document view, or None if no fields are explicitly selected.

property excluded_field_names¶: The set of field names that are excluded on this document view, or None if no fields are explicitly excluded.

property filtered_field_names¶: The set of field names or embedded.field.names that have been filtered on this document view, or None if no fields are filtered.

has_field(field_name)¶

Determines whether the document has the given field.

Parameters: field_name – the field name
Returns: True/False

get_field(field_name)¶

Gets the value of a field of the document.

Parameters: field_name – the field name
Returns: the field value
Raises: AttributeError – if the field does not exist

set_field(field_name, value, create=True, validate=True, dynamic=False)¶

Sets the value of a field of the document.

Parameters

field_name – the field name
value – the field value
create (True) – whether to create the field if it does not exist
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields

Raises

ValueError – if field_name is not an allowed field name
AttributeError – if the field does not exist and create == False

clear_field(field_name)¶

Clears the value of a field of the document.

Parameters: field_name – the name of the field to clear
Raises: AttributeError – if the field does not exist

to_dict(include_private=False)¶

Serializes the document to a JSON dictionary.

Parameters: include_private (False) – whether to include private fields
Returns: a JSON dict

to_mongo_dict(include_id=False)¶

Serializes the document to a BSON dictionary equivalent to the representation that would be stored in the database.

Parameters: include_id (False) – whether to include the document ID
Returns: a BSON dict

copy(fields=None, omit_fields=None)¶

Returns a deep copy of the document that has not been added to the database.

Parameters

fields (None) – an optional field or iterable of fields to which to restrict the copy. This can also be a dict mapping existing field names to new field names
omit_fields (None) – an optional field or iterable of fields to exclude from the copy

Returns

a Document

save()¶: Saves the document view to the database.

property dataset¶: The dataset to which this document belongs, or None if it has not been added to a dataset.

property in_dataset¶: Whether the document has been added to a dataset.

iter_fields(include_id=False, include_timestamps=False)¶

Returns an iterator over the (name, value) pairs of the public fields of the document.

Parameters

include_id (False) – whether to include the id field
include_timestamps (False) – whether to include the created_at and last_modified_at fields

Returns

an iterator that emits (name, value) tuples

merge(document, fields=None, omit_fields=None, merge_lists=True, overwrite=True, expand_schema=True, validate=True, dynamic=False)¶

Merges the fields of the document into this document.

The behavior of this method is highly customizable. By default, all top-level fields from the provided document are merged in, overwriting any existing values for those fields, with the exception of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields), in which case the elements of the lists themselves are merged. In the case of label list fields, labels with the same id in both documents are updated rather than duplicated.

To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.

This method can be configured in numerous ways, including:

Whether new fields can be added to the document schema
Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
Whether to merge only specific fields, or all but certain fields
Mapping input document fields to different field names of this document

Parameters

document – a Document or DocumentView of the same type
fields (None) – an optional field or iterable of fields to which to restrict the merge. This can also be a dict mapping field names of the input document to field names of this document
omit_fields (None) – an optional field or iterable of fields to exclude from the merge
merge_lists (True) – whether to merge the elements of top-level list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields) rather than merging the entire top-level field like other field types. For label lists fields, existing fiftyone.core.label.Label elements are either replaced (when overwrite is True) or kept (when overwrite is False) when their id matches a label from the provided document
overwrite (True) – whether to overwrite (True) or skip (False) existing fields and label elements
expand_schema (True) – whether to dynamically add new fields encountered to the document schema. If False, an error is raised if any fields are not in the document schema
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields

Raises

AttributeError – if expand_schema == False and a field does not exist

to_json(pretty_print=False)¶

Serializes the document to a JSON string.

The document ID and private fields are excluded in this representation.

Parameters: pretty_print (False) – whether to render the JSON in human readable format with newlines and indentations
Returns: a JSON string

update_fields(fields_dict, expand_schema=True, validate=True, dynamic=False)¶

Sets the dictionary of fields on the document.

Parameters

fields_dict – a dict mapping field names to values
expand_schema (True) – whether to dynamically add new fields encountered to the document schema. If False, an error is raised if any fields are not in the document schema
validate (True) – whether to validate values for existing fields
dynamic (False) – whether to declare dynamic embedded document fields

Raises

AttributeError – if expand_schema == False and a field does not exist