fiftyone.core.odm.database
Database utilities.
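Classes:
DatabaseConfigDocument | Backing document for the database config.
Functions:
get_db_config | Retrieves the database config.
establish_db_conn | Establishes the database connection.
aggregate | Executes one or more aggregations on a collection.
ensure_connection | Ensures that a database connection exists.
get_db_client | Returns a database client.
get_db_conn | Returns a connection to the database.
get_async_db_client | Returns an async database client.
get_async_db_conn | Returns an async connection to the database.
drop_database | Drops the database.
sync_database | Syncs all pending database writes to disk.
list_collections | Returns a list of all collection names in the database.
drop_collection | Drops the specified collection from the database.
drop_orphan_collections | Drops all orphan collections from the database.
drop_orphan_saved_views | Drops all orphan saved views from the database.
drop_orphan_runs | Drops all orphan runs from the database.
drop_orphan_stores | Drops all orphan execution stores from the database.
stream_collection | Streams the contents of the collection to stdout.
get_collection_stats | Gets stats about the collection.
count_documents |
export_document | Exports the document to disk in JSON format.
export_collection | Exports the collection to disk in JSON format.
import_document | Imports a document from JSON on disk.
import_collection | Imports the collection from JSON on disk.
insert_documents | Inserts documents into a collection.
bulk_write | Performs a batch of write operations on a collection.
list_datasets | Returns the list of available FiftyOne datasets.
patch_saved_views | Ensures that the saved view documents in the views collection for the given dataset exactly match the IDs in its dataset document.
patch_workspaces | Ensures that the workspace documents in the workspaces collection for the given dataset exactly match the IDs in its dataset document.
patch_annotation_runs | Ensures that the annotation runs in the runs collection for the given dataset exactly match the values in its dataset document.
patch_brain_runs | Ensures that the brain method runs in the runs collection for the given dataset exactly match the values in its dataset document.
patch_evaluations | Ensures that the evaluation runs in the runs collection for the given dataset exactly match the values in its dataset document.
patch_runs | Ensures that the runs in the runs collection for the given dataset exactly match the values in its dataset document.
delete_dataset | Deletes the dataset with the given name.
delete_saved_view | Deletes the saved view with the given name from the dataset with the given name.
delete_saved_views | Deletes all saved views from the dataset with the given name.
delete_annotation_run | Deletes the annotation run with the given key from the dataset with the given name.
delete_annotation_runs | Deletes all annotation runs from the dataset with the given name.
delete_brain_run | Deletes the brain method run with the given key from the dataset with the given name.
delete_brain_runs | Deletes all brain method runs from the dataset with the given name.
delete_evaluation | Deletes the evaluation run with the given key from the dataset with the given name.
delete_evaluations | Deletes all evaluations from the dataset with the given name.
delete_run | Deletes the run with the given key from the dataset with the given name.
delete_runs | Deletes all runs from the dataset with the given name.
get_indexed_values | Returns the values of the field(s) for all samples in the given collection that are covered by the index.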
- class fiftyone.core.odm.database.DatabaseConfigDocument(conn, version=None, type=None, *args, **kwargs)
Bases: object
Backing document for the database config.
Attributes:
version
type
Methods:
save()
- version: str
- type: str
- save()
- fiftyone.core.odm.database.get_db_config()
Retrieves the database config.
- Returns:
a DatabaseConfigDocument
- fiftyone.core.odm.database.establish_db_conn(config)
Establishes the database connection.
If fiftyone.config.database_uri is defined, then we connect to that URI. Otherwise, a fiftyone.core.service.DatabaseService is created.
- Parameters:
config – a fiftyone.core.config.FiftyOneConfig
- Raises:
ConnectionError – if a connection to mongod could not be established
FiftyOneConfigError – if fiftyone.config.database_uri is not defined and mongod could not be found
ServiceExecutableNotFound – if fiftyone.core.service.DatabaseService startup was attempted, but mongod was not found in fiftyone.db.bin
RuntimeError – if the mongod found does not meet FiftyOne’s requirements, or validation could not occur
- fiftyone.core.odm.database.aggregate(collection, pipelines, hints=None, maxTimeMS=None, _stream=False)
Executes one or more aggregations on a collection.
Multiple aggregations are executed using multiple threads, and their results are returned as lists rather than cursors.
- Parameters:
collection – a pymongo.collection.Collection or motor.motor_asyncio.AsyncIOMotorCollection
pipelines – a MongoDB aggregation pipeline or a list of pipelines
hints (None) – a corresponding index hint or list of index hints for each pipeline
maxTimeMS (None) – max timeout for the request(s)
- Returns:
If a single pipeline is provided, a pymongo.command_cursor.CommandCursor or motor.motor_asyncio.AsyncIOMotorCommandCursor is returned
If multiple pipelines are provided, each cursor is extracted into a list and the list of lists is returned
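The single-versus-multiple behavior described for aggregate() can be illustrated with the standard library alone. The sketch below is not FiftyOne's implementation: FakeCollection and run_pipelines are hypothetical names, and the fake aggregate() understands only a toy equality $match stage. It shows one pipeline yielding a cursor-like iterator, and several pipelines being run on threads with each result drained into a list.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeCollection:
    """Hypothetical stand-in for a collection; not a pymongo class."""

    def __init__(self, docs):
        self._docs = list(docs)

    def aggregate(self, pipeline):
        # Toy behavior: only {"$match": {field: value}} stages are understood
        docs = self._docs
        for stage in pipeline:
            for field, value in stage.get("$match", {}).items():
                docs = [d for d in docs if d.get(field) == value]
        # Return an iterator to mimic a cursor
        return iter(docs)


def run_pipelines(collection, pipelines):
    """Runs one pipeline (returns the cursor) or many (returns a list of lists)."""
    # A single pipeline is a list of stage dicts; many pipelines is a list of lists
    is_single = not pipelines or isinstance(pipelines[0], dict)
    if is_single:
        return collection.aggregate(pipelines)

    # One thread per pipeline; each cursor is drained into a list
    with ThreadPoolExecutor(max_workers=len(pipelines)) as pool:
        return list(pool.map(lambda p: list(collection.aggregate(p)), pipelines))


coll = FakeCollection([{"label": "cat"}, {"label": "dog"}, {"label": "cat"}])
single = run_pipelines(coll, [{"$match": {"label": "cat"}}])
multi = run_pipelines(
    coll,
    [[{"$match": {"label": "cat"}}], [{"$match": {"label": "dog"}}]],
)
```

Here single is an iterator that must be consumed by the caller, while multi is already a list containing one list of results per pipeline, in the order the pipelines were given.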
- fiftyone.core.odm.database.ensure_connection()
Ensures that a database connection exists.
- fiftyone.core.odm.database.get_db_client()
Returns a database client.
- Returns:
a pymongo.mongo_client.MongoClient
- fiftyone.core.odm.database.get_db_conn()
Returns a connection to the database.
- Returns:
a pymongo.database.Database
- fiftyone.core.odm.database.get_async_db_client(use_global=False)
Returns an async database client.
- Parameters:
use_global – whether to use the global client singleton
- Returns:
a motor.motor_asyncio.AsyncIOMotorClient
- fiftyone.core.odm.database.get_async_db_conn(use_global=False)
Returns an async connection to the database.
- Returns:
a motor.motor_asyncio.AsyncIOMotorDatabase
- fiftyone.core.odm.database.drop_database()
Drops the database.
- fiftyone.core.odm.database.sync_database()
Syncs all pending database writes to disk.
- fiftyone.core.odm.database.list_collections()
Returns a list of all collection names in the database.
- Returns:
a list of all collection names
- fiftyone.core.odm.database.drop_collection(collection_name)
Drops the specified collection from the database.
- Parameters:
collection_name – the collection name
- fiftyone.core.odm.database.drop_orphan_collections(dry_run=False)
Drops all orphan collections from the database.
Orphan collections are collections that are not associated with any known dataset or other collections used by FiftyOne.
- Parameters:
dry_run (False) – whether to log the actions that would be taken but not perform them
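The orphan definition and the dry_run flag can be pictured with a small standard-library sketch. It is only an illustration under stated assumptions: find_orphans and drop_orphans are hypothetical helpers, and a plain dict stands in for the database. Orphans are computed as a set difference, and dry_run=True logs the planned drops without removing anything.

```python
import logging

logger = logging.getLogger(__name__)


def find_orphans(all_names, known_names):
    """Returns the names in all_names that no known owner accounts for."""
    return sorted(set(all_names) - set(known_names))


def drop_orphans(db, known_names, dry_run=False):
    """Drops orphans from a dict-based stand-in database.

    With dry_run=True, the planned drops are logged but nothing is removed.
    """
    orphans = find_orphans(db.keys(), known_names)
    for name in orphans:
        if dry_run:
            logger.info("(dry run) Would drop collection '%s'", name)
        else:
            logger.info("Dropping collection '%s'", name)
            del db[name]

    return orphans


# Hypothetical database: collection name -> documents
db = {"datasets": [], "samples.abc": [], "samples.stale": []}
known = {"datasets", "samples.abc"}

planned = drop_orphans(db, known, dry_run=True)
names_after_dry_run = sorted(db)
dropped = drop_orphans(db, known, dry_run=False)
names_after_drop = sorted(db)
```

Running with dry_run=True first and reviewing the logged names before repeating the call with dry_run=False is a cautious way to use any of the destructive functions on this page that accept the flag.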
- fiftyone.core.odm.database.drop_orphan_saved_views(dry_run=False)
Drops all orphan saved views from the database.
Orphan saved views are saved view documents that are not associated with any known dataset or other collections used by FiftyOne.
- Parameters:
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.drop_orphan_runs(dry_run=False)
Drops all orphan runs from the database.
Orphan runs are runs that are not associated with any known dataset or other collections used by FiftyOne.
- Parameters:
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.drop_orphan_stores(dry_run=False)
Drops all orphan execution stores from the database.
Orphan stores are those that are associated with a dataset that no longer exists in the database.
- Parameters:
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.stream_collection(collection_name)
Streams the contents of the collection to stdout.
- Parameters:
collection_name – the name of the collection
- fiftyone.core.odm.database.get_collection_stats(collection_name)
Gets stats about the collection.
- Parameters:
collection_name – the name of the collection
- Returns:
a stats dict
- fiftyone.core.odm.database.count_documents(coll, pipeline)
- fiftyone.core.odm.database.export_document(doc, json_path)
Exports the document to disk in JSON format.
- Parameters:
doc – a BSON document dict
json_path – the path to write the JSON file
- fiftyone.core.odm.database.export_collection(docs, json_dir_or_path, key='documents', patt='{idx:06d}-{id}.json', num_docs=None, progress=None)
Exports the collection to disk in JSON format.
- Parameters:
docs – an iterable containing the documents to export
json_dir_or_path – the path to write a single JSON file containing the entire collection, or a directory in which to write per-document JSON files
key ("documents") – the field name under which to store the documents when json_dir_or_path is a single JSON file
patt ("{idx:06d}-{id}.json") – a filename pattern to use when json_dir_or_path is a directory. The pattern may contain idx to refer to the index of the document in docs or id to refer to the document’s ID
num_docs (None) – the total number of documents. If omitted, this must be computable via len(docs)
progress (None) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
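The default patt value is an ordinary Python format string, so its effect can be previewed with str.format. The sketch below uses hypothetical documents and string IDs. Whether the library counts idx from 0 or 1 is not stated here, so the zero-based enumerate() default is an assumption of this example.

```python
patt = "{idx:06d}-{id}.json"

# Hypothetical documents; real IDs would come from each document's _id
docs = [{"_id": "64a1f0"}, {"_id": "64a1f1"}]

# Assumption: zero-based indexing, purely for illustration
filenames = [patt.format(idx=idx, id=doc["_id"]) for idx, doc in enumerate(docs)]
```

The 06d format spec zero-pads the index to six digits, which keeps the per-document files in export order when they are listed alphabetically.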
- fiftyone.core.odm.database.import_document(json_path)
Imports a document from JSON on disk.
- Parameters:
json_path – the path to the document
- Returns:
a BSON document dict
- fiftyone.core.odm.database.import_collection(json_dir_or_path, key='documents')
Imports the collection from JSON on disk.
- Parameters:
json_dir_or_path – the path to a JSON file on disk, or a directory containing per-document JSON files
key ("documents") – the field name under which the documents are stored when json_dir_or_path is a single JSON file
- Returns:
a tuple of
an iterable of BSON documents
the number of documents
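The single-file layout implied by the key parameter, with all documents stored under one top-level field, can be sketched as a round trip using the standard json module. The helpers write_single_file and read_single_file are hypothetical and are not FiftyOne functions. Plain json is an assumption made for brevity: real BSON documents containing ObjectIds or dates would need a BSON-aware encoder.

```python
import json
import os
import tempfile


def write_single_file(docs, json_path, key="documents"):
    """Writes all documents under one top-level key in a single JSON file."""
    with open(json_path, "w") as f:
        json.dump({key: list(docs)}, f)


def read_single_file(json_path, key="documents"):
    """Reads documents back, returning (iterable of docs, number of docs)."""
    with open(json_path, "r") as f:
        docs = json.load(f).get(key, [])

    return iter(docs), len(docs)


docs = [{"filepath": "/tmp/a.jpg"}, {"filepath": "/tmp/b.jpg"}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "collection.json")
    write_single_file(docs, path)
    loaded, num_docs = read_single_file(path)
    loaded = list(loaded)
```

Returning the count alongside the iterable mirrors the documented return value and lets a caller size a progress bar without consuming the documents first.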
- fiftyone.core.odm.database.insert_documents(docs, coll, ordered=False, batcher=None, progress=None, num_docs=None)
Inserts documents into a collection.
The _id field of the input documents will be populated if it is not already set.
- Parameters:
docs – an iterable of BSON document dicts
coll – a pymongo collection
ordered (False) – whether the documents must be inserted in order
batcher (None) – an optional fiftyone.core.utils.Batcher class to use to batch the documents, or False to strictly insert the documents in a single batch. By default, fiftyone.config.default_batcher is used
progress (None) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
num_docs (None) – the total number of documents. Only used when progress=True. If omitted, this will be computed via len(docs), if possible
- Returns:
a list of IDs of the inserted documents
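The combination of batching and in-place _id population can be pictured with a standard-library sketch. Nothing here is FiftyOne's code: iter_batches and insert_in_batches are hypothetical helpers, a list stands in for the collection, a fixed batch size stands in for a Batcher, and uuid4 hex strings stand in for MongoDB ObjectIds.

```python
import itertools
import uuid


def iter_batches(docs, batch_size):
    """Yields successive lists of at most batch_size documents."""
    it = iter(docs)
    while True:
        batch = list(itertools.islice(it, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(docs, store, batch_size=2):
    """Appends docs to store (a list) batch by batch and returns their IDs."""
    ids = []
    for batch in iter_batches(docs, batch_size):
        for doc in batch:
            # Populate a missing _id in place; uuid4 is a stand-in for an ObjectId
            doc.setdefault("_id", uuid.uuid4().hex)
        store.extend(batch)
        ids.extend(doc["_id"] for doc in batch)

    return ids


store = []
docs = [{"_id": "a"}, {"x": 1}, {"x": 2}]
ids = insert_in_batches(docs, store, batch_size=2)
```

Because the IDs are written into the caller's dicts, the input documents can be used afterwards to look up what was inserted, which matches the note above about _id being populated.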
- fiftyone.core.odm.database.bulk_write(ops, coll, ordered=False, batcher=None, progress=False)
Performs a batch of write operations on a collection.
- Parameters:
ops – a list of pymongo operations
coll – a pymongo collection
ordered (False) – whether the operations must be performed in order
batcher (None) – an optional fiftyone.core.utils.Batcher class to use to batch the operations, or False to strictly perform the operations in a single batch. By default, fiftyone.config.default_batcher is used
progress (False) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
- Returns:
a list of pymongo.results.BulkWriteResult objects
- fiftyone.core.odm.database.list_datasets()
Returns the list of available FiftyOne datasets.
This is a low-level implementation of dataset listing that does not call fiftyone.core.dataset.list_datasets(), which is helpful if a database may be corrupted.
- Returns:
a list of Dataset names
- fiftyone.core.odm.database.patch_saved_views(dataset_name, dry_run=False)
Ensures that the saved view documents in the views collection for the given dataset exactly match the IDs in its dataset document.
- Parameters:
dataset_name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
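Making two sets of IDs "exactly match" starts with finding where they differ. The sketch below shows only that comparison step, using a hypothetical helper diff_ids and made-up IDs. How the patch functions resolve each kind of mismatch is not described on this page, so the sketch deliberately stops at reporting the differences.

```python
def diff_ids(referenced_ids, stored_ids):
    """Compares IDs listed on a dataset document with IDs found in a collection.

    Returns (missing, unreferenced):
      missing      - referenced by the dataset document but absent from the collection
      unreferenced - present in the collection but not referenced by the dataset document
    """
    referenced, stored = set(referenced_ids), set(stored_ids)
    return sorted(referenced - stored), sorted(stored - referenced)


# Hypothetical IDs for illustration
missing, unreferenced = diff_ids(["v1", "v2", "v3"], ["v2", "v3", "v4"])
in_sync = diff_ids(["v1"], ["v1"]) == ([], [])
```

When both returned lists are empty, the two sides already agree and no patching would be needed.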
- fiftyone.core.odm.database.patch_workspaces(dataset_name, dry_run=False)
Ensures that the workspace documents in the workspaces collection for the given dataset exactly match the IDs in its dataset document.
- Parameters:
dataset_name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.patch_annotation_runs(dataset_name, dry_run=False)
Ensures that the annotation runs in the runs collection for the given dataset exactly match the values in its dataset document.
- Parameters:
dataset_name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.patch_brain_runs(dataset_name, dry_run=False)
Ensures that the brain method runs in the runs collection for the given dataset exactly match the values in its dataset document.
- Parameters:
dataset_name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.patch_evaluations(dataset_name, dry_run=False)
Ensures that the evaluation runs in the runs collection for the given dataset exactly match the values in its dataset document.
- Parameters:
dataset_name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.patch_runs(dataset_name, dry_run=False)
Ensures that the runs in the runs collection for the given dataset exactly match the values in its dataset document.
- Parameters:
dataset_name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_dataset(name, dry_run=False)
Deletes the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
- Parameters:
name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_saved_view(dataset_name, view_name, dry_run=False)
Deletes the saved view with the given name from the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset() or fiftyone.core.collections.SampleCollection.load_saved_view(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
- Parameters:
dataset_name – the name of the dataset
view_name – the name of the saved view
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_saved_views(dataset_name, dry_run=False)
Deletes all saved views from the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset() or fiftyone.core.collections.SampleCollection.load_saved_view(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
- Parameters:
dataset_name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_annotation_run(name, anno_key, dry_run=False)
Deletes the annotation run with the given key from the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset() or fiftyone.core.collections.SampleCollection.delete_annotation_run(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
Note that, as this method does not load fiftyone.core.runs.Run instances, it does not call fiftyone.core.runs.Run.cleanup().
- Parameters:
name – the name of the dataset
anno_key – the annotation key
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_annotation_runs(name, dry_run=False)
Deletes all annotation runs from the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset() or fiftyone.core.collections.SampleCollection.delete_annotation_runs(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
Note that, as this method does not load fiftyone.core.runs.Run instances, it does not call fiftyone.core.runs.Run.cleanup().
- Parameters:
name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_brain_run(name, brain_key, dry_run=False)
Deletes the brain method run with the given key from the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset() or fiftyone.core.collections.SampleCollection.delete_brain_run(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
Note that, as this method does not load fiftyone.core.runs.Run instances, it does not call fiftyone.core.runs.Run.cleanup().
- Parameters:
name – the name of the dataset
brain_key – the brain key
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_brain_runs(name, dry_run=False)
Deletes all brain method runs from the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset() or fiftyone.core.collections.SampleCollection.delete_brain_runs(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
Note that, as this method does not load fiftyone.core.runs.Run instances, it does not call fiftyone.core.runs.Run.cleanup().
- Parameters:
name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_evaluation(name, eval_key, dry_run=False)
Deletes the evaluation run with the given key from the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset() or fiftyone.core.collections.SampleCollection.delete_evaluation(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
Note that, as this method does not load fiftyone.core.runs.Run instances, it does not call fiftyone.core.runs.Run.cleanup().
- Parameters:
name – the name of the dataset
eval_key – the evaluation key
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_evaluations(name, dry_run=False)
Deletes all evaluations from the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset() or fiftyone.core.collections.SampleCollection.delete_evaluations(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
Note that, as this method does not load fiftyone.core.runs.Run instances, it does not call fiftyone.core.runs.Run.cleanup().
- Parameters:
name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_run(name, run_key, dry_run=False)
Deletes the run with the given key from the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset() or fiftyone.core.collections.SampleCollection.delete_run(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
Note that, as this method does not load fiftyone.core.runs.Run instances, it does not call fiftyone.core.runs.Run.cleanup().
- Parameters:
name – the name of the dataset
run_key – the run key
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.delete_runs(name, dry_run=False)
Deletes all runs from the dataset with the given name.
This is a low-level implementation of deletion that does not call fiftyone.core.dataset.load_dataset() or fiftyone.core.collections.SampleCollection.delete_runs(), which is helpful if a dataset’s backing document or collections are corrupted and cannot be loaded via the normal pathways.
Note that, as this method does not load fiftyone.core.runs.Run instances, it does not call fiftyone.core.runs.Run.cleanup().
- Parameters:
name – the name of the dataset
dry_run (False) – whether to log the actions that would be taken but not perform them
- fiftyone.core.odm.database.get_indexed_values(collection, field_or_fields, *, index_key=None, query=None, values_only=False, _stream=False)
Returns the values of the field(s) for all samples in the given collection that are covered by the index. Raises an error if the field is not indexed.
- Parameters:
collection – a pymongo.collection.Collection or motor.motor_asyncio.AsyncIOMotorCollection
field_or_fields – the field name or list of field names to retrieve.
index_key (None) – the name of the index to use. If None, the default index name will be constructed from the field name(s).
query (None) – a dict selection filter to apply when querying. For performance, this should only include fields that are in the specified index.
values_only (False) – whether to remove field names from the resulting list. If True, the field names are removed and only the values will be returned as a list for each sample. If False, the field names are preserved and the values will be returned as a dict for each sample.
- Returns:
a list of values for the specified field or index keys for each sample sorted in the same order as the index
- Raises:
ValueError – if the field is not indexed
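The two result shapes selected by values_only can be shown without a database. In the sketch below, shape_values is a hypothetical helper and the rows and field names are made up. It mimics only the per-sample output format described above, not the index-covered query itself.

```python
def shape_values(rows, fields, values_only=False):
    """Formats per-sample rows as dicts (default) or as bare value lists."""
    if values_only:
        return [[row[f] for f in fields] for row in rows]

    return [{f: row[f] for f in fields} for row in rows]


# Hypothetical rows, assumed to be in index order already
rows = [
    {"filepath": "/data/a.jpg", "split": "train", "extra": 1},
    {"filepath": "/data/b.jpg", "split": "val", "extra": 2},
]
fields = ["filepath", "split"]

as_dicts = shape_values(rows, fields)
as_lists = shape_values(rows, fields, values_only=True)
```

With values_only=True the position of each value in the inner list is the only link back to its field, so the order of the requested fields matters to the caller.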