fiftyone.core.aggregations
Aggregations.
Classes:
Aggregation | Abstract base class for all aggregations.
Bounds | Computes the bounds of a numeric field of a collection.
Count | Counts the number of field values in a collection.
CountValues | Counts the occurrences of field values in a collection.
Distinct | Computes the distinct values of a field in a collection.
FacetAggregations | Efficiently computes a set of aggregations rooted at a common path using faceted computations.
HistogramValues | Computes a histogram of the field values in a collection.
Min | Computes the minimum of a numeric field of a collection.
Max | Computes the maximum of a numeric field of a collection.
Mean | Computes the arithmetic mean of the field values of a collection.
Quantiles | Computes the quantile(s) of the field values of a collection.
Schema | Extracts the names and types of the attributes of a specified embedded document field across all samples in a collection.
ListSchema | Extracts the value type(s) in a specified list field across all samples in a collection.
Std | Computes the standard deviation of the field values of a collection.
Sum | Computes the sum of the field values of a collection.
Values | Extracts the values of the field from all samples in a collection.
Exceptions:
AggregationError | An error raised during the execution of an Aggregation.
class fiftyone.core.aggregations.Aggregation(field_or_expr, expr=None, safe=False)
Bases: object
Abstract base class for all aggregations.
Aggregation instances represent an aggregation or reduction of a fiftyone.core.collections.SampleCollection instance.
Parameters:
- field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
- expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
- safe (False) – whether to ignore nan/inf values when dealing with floating point values
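To make the effect of the safe flag concrete, here is a minimal pure-Python sketch of what "ignoring nan/inf values" means for a reduction. It does not use FiftyOne; the helper name drop_nonfinite is hypothetical and only illustrates the idea.

```python
import math


def drop_nonfinite(values):
    """Returns the values with nan/inf floats removed (the safe=True idea)."""
    return [
        v for v in values
        if not (isinstance(v, float) and not math.isfinite(v))
    ]


values = [1.0, float("nan"), 4.0, float("inf")]

# Without filtering, non-finite values contaminate the reduction
print(sum(values))  # nan

# With filtering, only the finite values take part
print(sum(drop_nonfinite(values)))  # 5.0
```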
Attributes:
field_name | The name of the field being computed on, if any.
expr | The expression being computed, if any.
safe | Whether nan/inf values will be ignored when dealing with floating point values.
Methods:
to_mongo(sample_collection[, context]) | Returns the MongoDB aggregation pipeline for this aggregation.
parse_result(d) | Parses the output of to_mongo().
default_result() | Returns the default result for this aggregation.
property field_name
The name of the field being computed on, if any.

property expr
The expression being computed, if any.

property safe
Whether nan/inf values will be ignored when dealing with floating point values.

to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
Parameters:
- sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
- context (None) – a path context from which to resolve
Returns:
a MongoDB aggregation pipeline (list of dicts)

parse_result(d)
Parses the output of to_mongo().
Parameters:
- d – the result dict, or, when _is_big_batchable() is True, the iterable of result dicts
Returns:
the aggregation result

default_result()
Returns the default result for this aggregation.
Default results are used when aggregations are applied to empty collections.
Returns:
the aggregation result
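The three methods above form a small contract: build a pipeline, parse what comes back, and fall back to a default when nothing comes back. The sketch below mirrors that contract with a hypothetical ToyCount class and a hypothetical finalize() helper. Neither is part of FiftyOne, and the toy to_mongo() omits the sample_collection argument for brevity.

```python
class ToyCount:
    """Hypothetical stand-in that mirrors the three documented methods."""

    def __init__(self, field_name):
        self.field_name = field_name

    def to_mongo(self):
        # A pipeline is a list of stage dicts
        return [
            {"$match": {self.field_name: {"$ne": None}}},
            {"$count": "count"},
        ]

    def parse_result(self, d):
        # `d` is the single result dict produced by the pipeline
        return d["count"]

    def default_result(self):
        # Used when the collection is empty and no result dict exists
        return 0


def finalize(aggregation, result_docs):
    """Returns the parsed result, or the default when nothing came back."""
    if not result_docs:
        return aggregation.default_result()
    return aggregation.parse_result(result_docs[0])


agg = ToyCount("numeric_field")
print(finalize(agg, [{"count": 3}]))  # 3
print(finalize(agg, []))  # 0
```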
exception fiftyone.core.aggregations.AggregationError
Bases: Exception
An error raised during the execution of an Aggregation.

args

with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
class fiftyone.core.aggregations.Bounds(field_or_expr, expr=None, safe=False, _count_nonfinites=False)
Bases: fiftyone.core.aggregations.Aggregation
Computes the bounds of a numeric field of a collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric or date field types (or lists of such types).
Examples:

    import fiftyone as fo
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                numeric_field=1.0,
                numeric_list_field=[1, 2, 3],
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                numeric_field=4.0,
                numeric_list_field=[1, 2],
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                numeric_field=None,
                numeric_list_field=None,
            ),
        ]
    )

    #
    # Compute the bounds of a numeric field
    #

    aggregation = fo.Bounds("numeric_field")
    bounds = dataset.aggregate(aggregation)
    print(bounds)  # (min, max)

    #
    # Compute the bounds of a numeric list field
    #

    aggregation = fo.Bounds("numeric_list_field")
    bounds = dataset.aggregate(aggregation)
    print(bounds)  # (min, max)

    #
    # Compute the bounds of a transformation of a numeric field
    #

    aggregation = fo.Bounds(2 * (F("numeric_field") + 1))
    bounds = dataset.aggregate(aggregation)
    print(bounds)  # (min, max)
Parameters:
- field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
- expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
- safe (False) – whether to ignore nan/inf values when dealing with floating point values
Methods:
default_result() | Returns the default result for this aggregation.
parse_result(d) | Parses the output of to_mongo().
to_mongo(sample_collection[, context]) | Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr | The expression being computed, if any.
field_name | The name of the field being computed on, if any.
safe | Whether nan/inf values will be ignored when dealing with floating point values.

default_result()
Returns the default result for this aggregation.
Returns:
(None, None)

parse_result(d)
Parses the output of to_mongo().
Parameters:
- d – the result dict
Returns:
the (min, max) bounds
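As a rough illustration of the semantics described for Bounds (None values ignored, list fields contributing every element, (None, None) for empty input), here is a pure-Python sketch. The flatten() and bounds() helpers are hypothetical and operate on plain lists rather than a FiftyOne collection.

```python
def flatten(values):
    """Yields scalar values, skipping None and expanding nested lists."""
    for v in values:
        if v is None:
            continue
        if isinstance(v, (list, tuple)):
            yield from flatten(v)
        else:
            yield v


def bounds(values):
    """Returns (min, max), or (None, None) when there is nothing to compare."""
    flat = list(flatten(values))
    if not flat:
        return (None, None)
    return (min(flat), max(flat))


numeric_field = [1.0, 4.0, None]
numeric_list_field = [[1, 2, 3], [1, 2], None]

print(bounds(numeric_field))  # (1.0, 4.0)
print(bounds(numeric_list_field))  # (1, 3)

# A transformation is applied to each value before reducing
print(bounds([2 * (v + 1) for v in flatten(numeric_field)]))  # (4.0, 10.0)

print(bounds([]))  # (None, None)
```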
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
Parameters:
- sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
- context (None) – a path context from which to resolve
Returns:
a MongoDB aggregation pipeline (list of dicts)

property expr
The expression being computed, if any.

property field_name
The name of the field being computed on, if any.

property safe
Whether nan/inf values will be ignored when dealing with floating point values.
class fiftyone.core.aggregations.Count(field_or_expr=None, expr=None, safe=False, _unwind=True)
Bases: fiftyone.core.aggregations.Aggregation
Counts the number of field values in a collection.
None-valued fields are ignored.
If no field or expression is provided, the samples themselves are counted.
Examples:

    import fiftyone as fo
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                predictions=fo.Detections(
                    detections=[
                        fo.Detection(label="cat"),
                        fo.Detection(label="dog"),
                    ]
                ),
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                predictions=fo.Detections(
                    detections=[
                        fo.Detection(label="cat"),
                        fo.Detection(label="rabbit"),
                        fo.Detection(label="squirrel"),
                    ]
                ),
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                predictions=None,
            ),
        ]
    )

    #
    # Count the number of samples in the dataset
    #

    aggregation = fo.Count()
    count = dataset.aggregate(aggregation)
    print(count)  # the count

    #
    # Count the number of samples with `predictions`
    #

    aggregation = fo.Count("predictions")
    count = dataset.aggregate(aggregation)
    print(count)  # the count

    #
    # Count the number of objects in the `predictions` field
    #

    aggregation = fo.Count("predictions.detections")
    count = dataset.aggregate(aggregation)
    print(count)  # the count

    #
    # Count the number of objects in samples with > 2 predictions
    #

    aggregation = fo.Count(
        (F("predictions.detections").length() > 2).if_else(
            F("predictions.detections"), None
        )
    )
    count = dataset.aggregate(aggregation)
    print(count)  # the count
Parameters:
- field_or_expr (None) – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. If neither field_or_expr nor expr is provided, the samples themselves are counted
- expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
- safe (False) – whether to ignore nan/inf values when dealing with floating point values
Methods:
default_result() | Returns the default result for this aggregation.
parse_result(d) | Parses the output of to_mongo().
to_mongo(sample_collection[, context]) | Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr | The expression being computed, if any.
field_name | The name of the field being computed on, if any.
safe | Whether nan/inf values will be ignored when dealing with floating point values.

default_result()
Returns the default result for this aggregation.
Returns:
0

parse_result(d)
Parses the output of to_mongo().
Parameters:
- d – the result dict
Returns:
the count
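The three counting modes shown for Count (samples, non-None field values, and elements of a list field) can be sketched in plain Python over dicts. The count() helper and the dict layout below are hypothetical simplifications of the example dataset, not FiftyOne objects.

```python
samples = [
    {"predictions": {"detections": ["cat", "dog"]}},
    {"predictions": {"detections": ["cat", "rabbit", "squirrel"]}},
    {"predictions": None},
]


def count(samples, path=None):
    """Counts samples (no path) or non-None values found at a dotted path."""
    if path is None:
        return len(samples)

    total = 0
    for sample in samples:
        value = sample
        for key in path.split("."):
            value = value.get(key) if isinstance(value, dict) else None
            if value is None:
                break

        if value is None:
            continue  # None-valued fields are ignored

        # List fields contribute one count per element
        total += len(value) if isinstance(value, list) else 1

    return total


print(count(samples))  # 3
print(count(samples, "predictions"))  # 2
print(count(samples, "predictions.detections"))  # 5
```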
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
Parameters:
- sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
- context (None) – a path context from which to resolve
Returns:
a MongoDB aggregation pipeline (list of dicts)

property expr
The expression being computed, if any.

property field_name
The name of the field being computed on, if any.

property safe
Whether nan/inf values will be ignored when dealing with floating point values.
class fiftyone.core.aggregations.CountValues(field_or_expr, expr=None, safe=False, _first=None, _sort_by='count', _asc=True, _include=None, _search='', _selected=[])
Bases: fiftyone.core.aggregations.Aggregation
Counts the occurrences of field values in a collection.
This aggregation is typically applied to countable field types (or lists of such types).
Examples:

    import fiftyone as fo
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                tags=["sunny"],
                predictions=fo.Detections(
                    detections=[
                        fo.Detection(label="cat"),
                        fo.Detection(label="dog"),
                    ]
                ),
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                tags=["cloudy"],
                predictions=fo.Detections(
                    detections=[
                        fo.Detection(label="cat"),
                        fo.Detection(label="rabbit"),
                    ]
                ),
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                predictions=None,
            ),
        ]
    )

    #
    # Compute the tag counts in the dataset
    #

    aggregation = fo.CountValues("tags")
    counts = dataset.aggregate(aggregation)
    print(counts)  # dict mapping values to counts

    #
    # Compute the predicted label counts in the dataset
    #

    aggregation = fo.CountValues("predictions.detections.label")
    counts = dataset.aggregate(aggregation)
    print(counts)  # dict mapping values to counts

    #
    # Compute the predicted label counts after some normalization
    #

    aggregation = fo.CountValues(
        F("predictions.detections.label").map_values(
            {"cat": "pet", "dog": "pet"}
        ).upper()
    )
    counts = dataset.aggregate(aggregation)
    print(counts)  # dict mapping values to counts
Parameters:
- field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
- expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
- safe (False) – whether to treat nan/inf values as None when dealing with floating point values
Methods:
default_result() | Returns the default result for this aggregation.
parse_result(d) | Parses the output of to_mongo().
to_mongo(sample_collection[, context]) | Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr | The expression being computed, if any.
field_name | The name of the field being computed on, if any.
safe | Whether nan/inf values will be ignored when dealing with floating point values.

default_result()
Returns the default result for this aggregation.
Returns:
{}

parse_result(d)
Parses the output of to_mongo().
Parameters:
- d – the result dict
Returns:
a dict mapping values to counts
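The "dict mapping values to counts" result can be pictured with collections.Counter. This is only a sketch of the result shape on plain Python lists; count_values() is a hypothetical helper, and the sample data deliberately contains no None values because their treatment is not described above.

```python
from collections import Counter


def count_values(values):
    """Returns a dict mapping each value to its number of occurrences."""
    counter = Counter()
    for v in values:
        if isinstance(v, list):
            counter.update(v)  # list fields contribute each element
        else:
            counter[v] += 1
    return dict(counter)


labels = [["cat", "dog"], ["cat", "rabbit"]]
print(count_values(labels))  # {'cat': 2, 'dog': 1, 'rabbit': 1}

# Normalization before counting, in the spirit of map_values(...).upper()
mapping = {"cat": "pet", "dog": "pet"}
normalized = [[mapping.get(v, v).upper() for v in vals] for vals in labels]
print(count_values(normalized))  # {'PET': 3, 'RABBIT': 1}
```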
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
Parameters:
- sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
- context (None) – a path context from which to resolve
Returns:
a MongoDB aggregation pipeline (list of dicts)

property expr
The expression being computed, if any.

property field_name
The name of the field being computed on, if any.

property safe
Whether nan/inf values will be ignored when dealing with floating point values.
class fiftyone.core.aggregations.Distinct(field_or_expr, expr=None, safe=False)
Bases: fiftyone.core.aggregations.Aggregation
Computes the distinct values of a field in a collection.
None-valued fields are ignored.
This aggregation is typically applied to countable field types (or lists of such types).
Examples:

    import fiftyone as fo
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                tags=["sunny"],
                predictions=fo.Detections(
                    detections=[
                        fo.Detection(label="cat"),
                        fo.Detection(label="dog"),
                    ]
                ),
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                tags=["sunny", "cloudy"],
                predictions=fo.Detections(
                    detections=[
                        fo.Detection(label="cat"),
                        fo.Detection(label="rabbit"),
                    ]
                ),
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                predictions=None,
            ),
        ]
    )

    #
    # Get the distinct tags in a dataset
    #

    aggregation = fo.Distinct("tags")
    values = dataset.aggregate(aggregation)
    print(values)  # list of distinct values

    #
    # Get the distinct predicted labels in a dataset
    #

    aggregation = fo.Distinct("predictions.detections.label")
    values = dataset.aggregate(aggregation)
    print(values)  # list of distinct values

    #
    # Get the distinct predicted labels after some normalization
    #

    aggregation = fo.Distinct(
        F("predictions.detections.label").map_values(
            {"cat": "pet", "dog": "pet"}
        ).upper()
    )
    values = dataset.aggregate(aggregation)
    print(values)  # list of distinct values
Parameters:
- field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
- expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
- safe (False) – whether to ignore nan/inf values when dealing with floating point values
Methods:
default_result() | Returns the default result for this aggregation.
parse_result(d) | Parses the output of to_mongo().
to_mongo(sample_collection[, context]) | Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr | The expression being computed, if any.
field_name | The name of the field being computed on, if any.
safe | Whether nan/inf values will be ignored when dealing with floating point values.

default_result()
Returns the default result for this aggregation.
Returns:
[]

parse_result(d)
Parses the output of to_mongo().
Parameters:
- d – the result dict
Returns:
a sorted list of distinct values
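The "sorted list of distinct values" result, with None ignored and list fields expanded, corresponds to the following pure-Python sketch. The distinct() helper is hypothetical and works on plain lists only.

```python
def distinct(values):
    """Returns the sorted distinct values, ignoring None and expanding lists."""
    seen = set()
    for v in values:
        if v is None:
            continue
        if isinstance(v, list):
            seen.update(x for x in v if x is not None)
        else:
            seen.add(v)
    return sorted(seen)


tags = [["sunny"], ["sunny", "cloudy"], None]
print(distinct(tags))  # ['cloudy', 'sunny']
print(distinct([]))  # []
```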
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
Parameters:
- sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
- context (None) – a path context from which to resolve
Returns:
a MongoDB aggregation pipeline (list of dicts)

property expr
The expression being computed, if any.

property field_name
The name of the field being computed on, if any.

property safe
Whether nan/inf values will be ignored when dealing with floating point values.
class fiftyone.core.aggregations.FacetAggregations(field_name, aggregations, _compiled=False)
Bases: fiftyone.core.aggregations.Aggregation
Efficiently computes a set of aggregations rooted at a common path using faceted computations.
Note: All aggregations provided to this method are interpreted relative to the provided field_name.
Examples:

    import fiftyone as fo

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                tags=["sunny"],
                predictions=fo.Detections(
                    detections=[
                        fo.Detection(label="cat", confidence=0.4),
                        fo.Detection(label="dog", confidence=0.5),
                    ]
                ),
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                tags=["sunny", "cloudy"],
                predictions=fo.Detections(
                    detections=[
                        fo.Detection(label="cat", confidence=0.6),
                        fo.Detection(label="rabbit", confidence=0.7),
                    ]
                ),
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                predictions=None,
            ),
        ]
    )

    #
    # Compute prediction label value counts and confidence bounds
    #

    values, bounds = dataset.aggregate(
        fo.FacetAggregations(
            "predictions.detections",
            [fo.CountValues("label"), fo.Bounds("confidence")]
        )
    )
    print(values)  # label value counts
    print(bounds)  # confidence bounds
Parameters:
- field_name – a field name or embedded.field.name
- aggregations – a list or dict of Aggregation instances
Methods:
default_result() | Returns the default result for this aggregation.
parse_result(d) | Parses the output of to_mongo().
to_mongo(sample_collection[, context]) | Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr | The expression being computed, if any.
field_name | The name of the field being computed on, if any.
safe | Whether nan/inf values will be ignored when dealing with floating point values.

default_result()
Returns the default result for this aggregation.
Returns:
the default result of each sub-aggregation in the same container type as the sub-aggregations were provided (list or dict)

parse_result(d)
Parses the output of to_mongo().
Parameters:
- d – the result dict
Returns:
the parsed result of each sub-aggregation in the same container type as the sub-aggregations were provided (list or dict)
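The "same container type" rule for results can be illustrated without FiftyOne. In the sketch below, sub-aggregations are plain functions over a list of detection dicts rooted at one common path; run_facets(), label_counts() and confidence_bounds() are hypothetical names used only for illustration.

```python
from collections import Counter


def label_counts(detections):
    return dict(Counter(d["label"] for d in detections))


def confidence_bounds(detections):
    scores = [d["confidence"] for d in detections]
    return (min(scores), max(scores)) if scores else (None, None)


def run_facets(aggregations, detections):
    """Applies every sub-aggregation and mirrors the input container type."""
    if isinstance(aggregations, dict):
        return {key: fn(detections) for key, fn in aggregations.items()}
    return [fn(detections) for fn in aggregations]


detections = [
    {"label": "cat", "confidence": 0.4},
    {"label": "dog", "confidence": 0.5},
    {"label": "cat", "confidence": 0.6},
    {"label": "rabbit", "confidence": 0.7},
]

# A list of sub-aggregations yields a list of results
values, bounds = run_facets([label_counts, confidence_bounds], detections)
print(values)  # {'cat': 2, 'dog': 1, 'rabbit': 1}
print(bounds)  # (0.4, 0.7)

# A dict of sub-aggregations yields a dict of results with the same keys
named = run_facets(
    {"counts": label_counts, "bounds": confidence_bounds}, detections
)
print(named["bounds"])  # (0.4, 0.7)
```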
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
Parameters:
- sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
- context (None) – a path context from which to resolve
Returns:
a MongoDB aggregation pipeline (list of dicts)

property expr
The expression being computed, if any.

property field_name
The name of the field being computed on, if any.

property safe
Whether nan/inf values will be ignored when dealing with floating point values.
class fiftyone.core.aggregations.HistogramValues(field_or_expr, expr=None, bins=None, range=None, auto=False)
Bases: fiftyone.core.aggregations.Aggregation
Computes a histogram of the field values in a collection.
This aggregation is typically applied to numeric or date field types (or lists of such types).
Examples:

    import numpy as np
    import matplotlib.pyplot as plt

    import fiftyone as fo
    from fiftyone import ViewField as F

    samples = []
    for idx in range(100):
        samples.append(
            fo.Sample(
                filepath="/path/to/image%d.png" % idx,
                numeric_field=np.random.randn(),
                numeric_list_field=list(np.random.randn(10)),
            )
        )

    dataset = fo.Dataset()
    dataset.add_samples(samples)

    def plot_hist(counts, edges):
        counts = np.asarray(counts)
        edges = np.asarray(edges)
        left_edges = edges[:-1]
        widths = edges[1:] - edges[:-1]
        plt.bar(left_edges, counts, width=widths, align="edge")

    #
    # Compute a histogram of a numeric field
    #

    aggregation = fo.HistogramValues("numeric_field", bins=50)
    counts, edges, other = dataset.aggregate(aggregation)

    plot_hist(counts, edges)
    plt.show(block=False)

    #
    # Compute the histogram of a numeric list field
    #

    aggregation = fo.HistogramValues("numeric_list_field", bins=50)
    counts, edges, other = dataset.aggregate(aggregation)

    plot_hist(counts, edges)
    plt.show(block=False)

    #
    # Compute the histogram of a transformation of a numeric field
    #

    aggregation = fo.HistogramValues(2 * (F("numeric_field") + 1), bins=50)
    counts, edges, other = dataset.aggregate(aggregation)

    plot_hist(counts, edges)
    plt.show(block=False)
Parameters:
- field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
- expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
- bins (None) – can be either an integer number of bins to generate or a monotonically increasing sequence specifying the bin edges to use. By default, 10 bins are created. If bins is an integer and no range is specified, bin edges are automatically computed from the bounds of the field
- range (None) – a (lower, upper) tuple specifying a range in which to generate equal-width bins. Only applicable when bins is an integer or None
- auto (False) – whether to automatically choose bin edges in an attempt to evenly distribute the counts in each bin. If this option is chosen, bins will only be used if it is an integer, and the range parameter is ignored
Methods:
default_result() | Returns the default result for this aggregation.
parse_result(d) | Parses the output of to_mongo().
to_mongo(sample_collection[, context]) | Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr | The expression being computed, if any.
field_name | The name of the field being computed on, if any.
safe | Whether nan/inf values will be ignored when dealing with floating point values.

default_result()
Returns the default result for this aggregation.
Returns:
a tuple of
- counts: []
- edges: []
- other: 0

parse_result(d)
Parses the output of to_mongo().
Parameters:
- d – the result dict
Returns:
a tuple of
- counts: a list of counts in each bin
- edges: an increasing list of bin edges of length len(counts) + 1. Note that each bin is treated as having an inclusive lower boundary and exclusive upper boundary, [lower, upper), including the rightmost bin
- other: the number of items outside the bins
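The [lower, upper) convention and the other count are easy to get wrong, so here is a small pure-Python sketch of them. The equal_width_edges() and histogram() helpers are hypothetical; they follow the bin convention stated above on explicit edges and make no claim about how FiftyOne chooses edges automatically.

```python
from bisect import bisect_right


def equal_width_edges(lower, upper, bins=10):
    """Returns bins + 1 edges that split [lower, upper] into equal widths."""
    width = (upper - lower) / bins
    return [lower + i * width for i in range(bins)] + [upper]


def histogram(values, edges):
    """Returns (counts, edges, other) using half-open [lower, upper) bins."""
    counts = [0] * (len(edges) - 1)
    other = 0
    for v in values:
        # The rightmost bin is also half-open, so v == edges[-1] is outside
        if v < edges[0] or v >= edges[-1]:
            other += 1
        else:
            counts[bisect_right(edges, v) - 1] += 1
    return counts, edges, other


edges = equal_width_edges(0.0, 4.0, bins=4)
print(edges)  # [0.0, 1.0, 2.0, 3.0, 4.0]

counts, edges, other = histogram([0.0, 0.5, 1.0, 3.9, 4.0, -1.0], edges)
print(counts)  # [2, 1, 0, 1]
print(other)  # 2
```

Note that len(edges) == len(counts) + 1, matching the documented return shape.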
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
Parameters:
- sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
- context (None) – a path context from which to resolve
Returns:
a MongoDB aggregation pipeline (list of dicts)

property expr
The expression being computed, if any.

property field_name
The name of the field being computed on, if any.

property safe
Whether nan/inf values will be ignored when dealing with floating point values.
class fiftyone.core.aggregations.Min(field_or_expr, expr=None, safe=False)
Bases: fiftyone.core.aggregations.Aggregation
Computes the minimum of a numeric field of a collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric or date field types (or lists of such types).
Examples:

    import fiftyone as fo
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                numeric_field=1.0,
                numeric_list_field=[1, 2, 3],
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                numeric_field=4.0,
                numeric_list_field=[1, 2],
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                numeric_field=None,
                numeric_list_field=None,
            ),
        ]
    )

    #
    # Compute the minimum of a numeric field
    #

    aggregation = fo.Min("numeric_field")
    min = dataset.aggregate(aggregation)
    print(min)  # the min

    #
    # Compute the minimum of a numeric list field
    #

    aggregation = fo.Min("numeric_list_field")
    min = dataset.aggregate(aggregation)
    print(min)  # the min

    #
    # Compute the minimum of a transformation of a numeric field
    #

    aggregation = fo.Min(2 * (F("numeric_field") + 1))
    min = dataset.aggregate(aggregation)
    print(min)  # the min
Parameters:
- field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
- expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
- safe (False) – whether to ignore nan/inf values when dealing with floating point values
Methods:
default_result() | Returns the default result for this aggregation.
parse_result(d) | Parses the output of to_mongo().
to_mongo(sample_collection[, context]) | Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr | The expression being computed, if any.
field_name | The name of the field being computed on, if any.
safe | Whether nan/inf values will be ignored when dealing with floating point values.

default_result()
Returns the default result for this aggregation.
Returns:
None

parse_result(d)
Parses the output of to_mongo().
Parameters:
- d – the result dict
Returns:
the minimum value

to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
Parameters:
- sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
- context (None) – a path context from which to resolve
Returns:
a MongoDB aggregation pipeline (list of dicts)

property expr
The expression being computed, if any.

property field_name
The name of the field being computed on, if any.

property safe
Whether nan/inf values will be ignored when dealing with floating point values.
class fiftyone.core.aggregations.Max(field_or_expr, expr=None, safe=False)
Bases: fiftyone.core.aggregations.Aggregation
Computes the maximum of a numeric field of a collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric or date field types (or lists of such types).
Examples:

    import fiftyone as fo
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                numeric_field=1.0,
                numeric_list_field=[1, 2, 3],
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                numeric_field=4.0,
                numeric_list_field=[1, 2],
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                numeric_field=None,
                numeric_list_field=None,
            ),
        ]
    )

    #
    # Compute the maximum of a numeric field
    #

    aggregation = fo.Max("numeric_field")
    max = dataset.aggregate(aggregation)
    print(max)  # the max

    #
    # Compute the maximum of a numeric list field
    #

    aggregation = fo.Max("numeric_list_field")
    max = dataset.aggregate(aggregation)
    print(max)  # the max

    #
    # Compute the maximum of a transformation of a numeric field
    #

    aggregation = fo.Max(2 * (F("numeric_field") + 1))
    max = dataset.aggregate(aggregation)
    print(max)  # the max
Parameters:
- field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
- expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
- safe (False) – whether to ignore nan/inf values when dealing with floating point values
Methods:
default_result() | Returns the default result for this aggregation.
parse_result(d) | Parses the output of to_mongo().
to_mongo(sample_collection[, context]) | Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr | The expression being computed, if any.
field_name | The name of the field being computed on, if any.
safe | Whether nan/inf values will be ignored when dealing with floating point values.

default_result()
Returns the default result for this aggregation.
Returns:
None

parse_result(d)
Parses the output of to_mongo().
Parameters:
- d – the result dict
Returns:
the maximum value

to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
Parameters:
- sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
- context (None) – a path context from which to resolve
Returns:
a MongoDB aggregation pipeline (list of dicts)

property expr
The expression being computed, if any.

property field_name
The name of the field being computed on, if any.

property safe
Whether nan/inf values will be ignored when dealing with floating point values.
class fiftyone.core.aggregations.Mean(field_or_expr, expr=None, safe=False)
Bases: fiftyone.core.aggregations.Aggregation
Computes the arithmetic mean of the field values of a collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric field types (or lists of such types).
Examples:

    import fiftyone as fo
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                numeric_field=1.0,
                numeric_list_field=[1, 2, 3],
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                numeric_field=4.0,
                numeric_list_field=[1, 2],
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                numeric_field=None,
                numeric_list_field=None,
            ),
        ]
    )

    #
    # Compute the mean of a numeric field
    #

    aggregation = fo.Mean("numeric_field")
    mean = dataset.aggregate(aggregation)
    print(mean)  # the mean

    #
    # Compute the mean of a numeric list field
    #

    aggregation = fo.Mean("numeric_list_field")
    mean = dataset.aggregate(aggregation)
    print(mean)  # the mean

    #
    # Compute the mean of a transformation of a numeric field
    #

    aggregation = fo.Mean(2 * (F("numeric_field") + 1))
    mean = dataset.aggregate(aggregation)
    print(mean)  # the mean
Parameters:
- field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
- expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
- safe (False) – whether to ignore nan/inf values when dealing with floating point values
Methods:
default_result() | Returns the default result for this aggregation.
parse_result(d) | Parses the output of to_mongo().
to_mongo(sample_collection[, context]) | Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr | The expression being computed, if any.
field_name | The name of the field being computed on, if any.
safe | Whether nan/inf values will be ignored when dealing with floating point values.

default_result()
Returns the default result for this aggregation.
Returns:
0

parse_result(d)
Parses the output of to_mongo().
Parameters:
- d – the result dict
Returns:
the mean
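As a plain-Python picture of the Mean semantics described here (None ignored, list fields contributing every element, and 0 as the default for empty input), consider the sketch below. The mean() helper is hypothetical and operates on ordinary lists, not on a FiftyOne collection.

```python
def mean(values):
    """Returns the arithmetic mean of the non-None values, or 0 if none exist."""
    flat = []
    for v in values:
        if v is None:
            continue
        if isinstance(v, list):
            flat.extend(x for x in v if x is not None)
        else:
            flat.append(v)

    if not flat:
        return 0  # mirrors the documented default result

    return sum(flat) / len(flat)


print(mean([1.0, 4.0, None]))  # 2.5
print(mean([[1, 2, 3], [1, 2], None]))  # 1.8
print(mean([]))  # 0
```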
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
Parameters:
- sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
- context (None) – a path context from which to resolve
Returns:
a MongoDB aggregation pipeline (list of dicts)

property expr
The expression being computed, if any.

property field_name
The name of the field being computed on, if any.

property safe
Whether nan/inf values will be ignored when dealing with floating point values.
class fiftyone.core.aggregations.Quantiles(field_or_expr, quantiles, expr=None, safe=False)
Bases: fiftyone.core.aggregations.Aggregation
Computes the quantile(s) of the field values of a collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric field types (or lists of such types).
Examples:

    import fiftyone as fo
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                numeric_field=1.0,
                numeric_list_field=[1, 2, 3],
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                numeric_field=4.0,
                numeric_list_field=[1, 2],
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                numeric_field=None,
                numeric_list_field=None,
            ),
        ]
    )

    #
    # Compute the quantiles of a numeric field
    #

    aggregation = fo.Quantiles("numeric_field", [0.1, 0.5, 0.9])
    quantiles = dataset.aggregate(aggregation)
    print(quantiles)  # the quantiles

    #
    # Compute the quantiles of a numeric list field
    #

    aggregation = fo.Quantiles("numeric_list_field", [0.1, 0.5, 0.9])
    quantiles = dataset.aggregate(aggregation)
    print(quantiles)  # the quantiles

    #
    # Compute the quantiles of a transformation of a numeric field
    #

    aggregation = fo.Quantiles(2 * (F("numeric_field") + 1), [0.1, 0.5, 0.9])
    quantiles = dataset.aggregate(aggregation)
    print(quantiles)  # the quantiles
- Parameters
field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
quantiles – the quantile or iterable of quantiles to compute. Each quantile must be a numeric value in [0, 1]
expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False) – whether to ignore nan/inf values when dealing with floating point values
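To make the quantiles argument concrete, here is a minimal pure-Python sketch of one common quantile definition (nearest-rank). It is an illustration only: the helper name nearest_rank_quantiles is hypothetical, and the exact rule that FiftyOne applies in its MongoDB pipeline is not stated in this reference, so results may differ. The sketch assumes the behaviors described for this class: None values are ignored, list-valued fields are flattened, each quantile must lie in [0, 1], and an empty input yields None or a list of None values.

```python
import math
from collections.abc import Iterable


def nearest_rank_quantiles(values, quantiles):
    """Hypothetical helper: nearest-rank quantiles of `values`.

    `quantiles` may be a single number or an iterable of numbers in [0, 1].
    """
    scalar = not isinstance(quantiles, Iterable)
    qs = [quantiles] if scalar else list(quantiles)
    for q in qs:
        if not 0 <= q <= 1:
            raise ValueError(f"quantile {q} is not in [0, 1]")

    # Flatten list-valued entries and drop None values
    flat = []
    for value in values:
        if value is None:
            continue
        if isinstance(value, (list, tuple)):
            flat.extend(v for v in value if v is not None)
        else:
            flat.append(value)
    flat.sort()

    if not flat:
        # Nothing to aggregate: None for a scalar, a list of None otherwise
        return None if scalar else [None] * len(qs)

    n = len(flat)
    # Nearest-rank: pick the value at rank ceil(q * n), clamped to rank 1
    results = [flat[max(0, math.ceil(q * n) - 1)] for q in qs]
    return results[0] if scalar else results


print(nearest_rank_quantiles([1.0, 4.0, None], [0.1, 0.5, 0.9]))
print(nearest_rank_quantiles([[1, 2, 3], [1, 2], None], 0.5))
print(nearest_rank_quantiles([None], [0.1, 0.5, 0.9]))
```

Passing a single number returns a single value, while passing an iterable returns a list with one entry per requested quantile.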
Methods:
default_result() – Returns the default result for this aggregation.
parse_result(d) – Parses the output of to_mongo().
to_mongo(sample_collection[, context]) – Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr – The expression being computed, if any.
field_name – The name of the field being computed on, if any.
safe – Whether nan/inf values will be ignored when dealing with floating point values.
-
default_result()
Returns the default result for this aggregation.
- Returns
None or [None, None, None]
-
parse_result(d)
Parses the output of to_mongo().
- Parameters
d – the result dict
- Returns
the quantile or list of quantiles
-
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
- Parameters
sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
context (None) – a path context from which to resolve
- Returns
a MongoDB aggregation pipeline (list of dicts)
-
property expr
The expression being computed, if any.
-
property field_name
The name of the field being computed on, if any.
-
property safe
Whether nan/inf values will be ignored when dealing with floating point values.
-
class fiftyone.core.aggregations.Schema(field_or_expr, expr=None, dynamic_only=False, _doc_type=None, _include_private=False)
Bases: fiftyone.core.aggregations.Aggregation
Extracts the names and types of the attributes of a specified embedded document field across all samples in a collection.
Schema aggregations are useful for detecting the presence and types of dynamic attributes of fiftyone.core.labels.Label fields across a collection.
Examples:
    import fiftyone as fo

    dataset = fo.Dataset()

    sample1 = fo.Sample(
        filepath="image1.png",
        ground_truth=fo.Detections(
            detections=[
                fo.Detection(
                    label="cat",
                    bounding_box=[0.1, 0.1, 0.4, 0.4],
                    foo="bar",
                    hello=True,
                ),
                fo.Detection(
                    label="dog",
                    bounding_box=[0.5, 0.5, 0.4, 0.4],
                    hello=None,
                ),
            ]
        ),
    )

    sample2 = fo.Sample(
        filepath="image2.png",
        ground_truth=fo.Detections(
            detections=[
                fo.Detection(
                    label="rabbit",
                    bounding_box=[0.1, 0.1, 0.4, 0.4],
                    foo=None,
                ),
                fo.Detection(
                    label="squirrel",
                    bounding_box=[0.5, 0.5, 0.4, 0.4],
                    hello="there",
                ),
            ]
        ),
    )

    dataset.add_samples([sample1, sample2])

    #
    # Get schema of all dynamic attributes on the detections in a
    # `Detections` field
    #

    aggregation = fo.Schema("ground_truth.detections", dynamic_only=True)
    print(dataset.aggregate(aggregation))
    # {'foo': StringField, 'hello': [BooleanField, StringField]}
- Parameters
field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
dynamic_only (False) – whether to only include dynamically added attributes
Methods:
default_result() – Returns the default result for this aggregation.
parse_result(d) – Parses the output of to_mongo().
to_mongo(sample_collection[, context]) – Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr – The expression being computed, if any.
field_name – The name of the field being computed on, if any.
safe – Whether nan/inf values will be ignored when dealing with floating point values.
-
default_result()
Returns the default result for this aggregation.
- Returns
{}
-
parse_result(d)
Parses the output of to_mongo().
- Parameters
d – the result dict
- Returns
a dict mapping field names to fiftyone.core.fields.Field instances. If a field's values take multiple non-None types, the list of observed types is returned instead
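The shape of this return value can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helper infer_attribute_types is hypothetical, and it reports Python type names rather than fiftyone.core.fields.Field instances. It assumes only what the description states, namely that None values contribute no type and that an attribute observed with several types maps to a list of them. How the library reports an attribute that is None everywhere is not stated here; the sketch simply omits it.

```python
def infer_attribute_types(documents):
    """Hypothetical helper: maps attribute names to observed value types."""
    observed = {}
    for doc in documents:
        for name, value in doc.items():
            if value is None:
                continue  # None values do not contribute a type
            observed.setdefault(name, set()).add(type(value).__name__)

    schema = {}
    for name, types in observed.items():
        ordered = sorted(types)
        # A single observed type is returned bare, several as a list
        schema[name] = ordered[0] if len(ordered) == 1 else ordered
    return schema


# Dynamic attributes of the four detections from the Schema example
detections = [
    {"foo": "bar", "hello": True},
    {"hello": None},
    {"foo": None},
    {"hello": "there"},
]
print(infer_attribute_types(detections))
```

Here foo maps to a single type while hello maps to a list of two, which parallels the StringField and [BooleanField, StringField] entries in the example output.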
-
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
- Parameters
sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
context (None) – a path context from which to resolve
- Returns
a MongoDB aggregation pipeline (list of dicts)
-
property expr
The expression being computed, if any.
-
property field_name
The name of the field being computed on, if any.
-
property safe
Whether nan/inf values will be ignored when dealing with floating point values.
-
class fiftyone.core.aggregations.ListSchema(field_or_expr, expr=None)
Bases: fiftyone.core.aggregations.Aggregation
Extracts the value type(s) in a specified list field across all samples in a collection.
Examples:
    from datetime import datetime
    import fiftyone as fo

    dataset = fo.Dataset()

    sample1 = fo.Sample(
        filepath="image1.png",
        ground_truth=fo.Classification(
            label="cat",
            info=[
                fo.DynamicEmbeddedDocument(
                    task="initial_annotation",
                    author="Alice",
                    timestamp=datetime(1970, 1, 1),
                    notes=["foo", "bar"],
                ),
                fo.DynamicEmbeddedDocument(
                    task="editing_pass",
                    author="Bob",
                    timestamp=datetime.utcnow(),
                ),
            ],
        ),
    )

    sample2 = fo.Sample(
        filepath="image2.png",
        ground_truth=fo.Classification(
            label="dog",
            info=[
                fo.DynamicEmbeddedDocument(
                    task="initial_annotation",
                    author="Bob",
                    timestamp=datetime(2018, 10, 18),
                    notes=["spam", "eggs"],
                ),
            ],
        ),
    )

    dataset.add_samples([sample1, sample2])

    # Determine that `ground_truth.info` contains embedded documents
    aggregation = fo.ListSchema("ground_truth.info")
    print(dataset.aggregate(aggregation))
    # fo.EmbeddedDocumentField

    # Determine the fields of the embedded documents in the list
    aggregation = fo.Schema("ground_truth.info[]")
    print(dataset.aggregate(aggregation))
    # {'task': StringField, ..., 'notes': ListField}

    # Determine the type of the values in the nested `notes` list field
    # Since `ground_truth.info` is not yet declared on the dataset's
    # schema, we must manually include `[]` to unwind the info lists
    aggregation = fo.ListSchema("ground_truth.info[].notes")
    print(dataset.aggregate(aggregation))
    # fo.StringField

    # Declare the `ground_truth.info` field
    dataset.add_sample_field(
        "ground_truth.info",
        fo.ListField,
        subfield=fo.EmbeddedDocumentField,
        embedded_doc_type=fo.DynamicEmbeddedDocument,
    )

    # Now we can inspect the nested `notes` field without unwinding
    aggregation = fo.ListSchema("ground_truth.info.notes")
    print(dataset.aggregate(aggregation))
    # fo.StringField
- Parameters
field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
Methods:
default_result() – Returns the default result for this aggregation.
parse_result(d) – Parses the output of to_mongo().
to_mongo(sample_collection[, context]) – Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr – The expression being computed, if any.
field_name – The name of the field being computed on, if any.
safe – Whether nan/inf values will be ignored when dealing with floating point values.
-
default_result()
Returns the default result for this aggregation.
- Returns
[]
-
parse_result(d)
Parses the output of to_mongo().
- Parameters
d – the result dict
- Returns
a fiftyone.core.fields.Field or list of fiftyone.core.fields.Field instances describing the value type(s) in the list
-
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
- Parameters
sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
context (None) – a path context from which to resolve
- Returns
a MongoDB aggregation pipeline (list of dicts)
-
property expr
The expression being computed, if any.
-
property field_name
The name of the field being computed on, if any.
-
property safe
Whether nan/inf values will be ignored when dealing with floating point values.
-
class fiftyone.core.aggregations.Std(field_or_expr, expr=None, safe=False, sample=False)
Bases: fiftyone.core.aggregations.Aggregation
Computes the standard deviation of the field values of a collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric field types (or lists of such types).
Examples:
    import fiftyone as fo
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                numeric_field=1.0,
                numeric_list_field=[1, 2, 3],
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                numeric_field=4.0,
                numeric_list_field=[1, 2],
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                numeric_field=None,
                numeric_list_field=None,
            ),
        ]
    )

    #
    # Compute the standard deviation of a numeric field
    #

    aggregation = fo.Std("numeric_field")
    std = dataset.aggregate(aggregation)
    print(std)  # the standard deviation

    #
    # Compute the standard deviation of a numeric list field
    #

    aggregation = fo.Std("numeric_list_field")
    std = dataset.aggregate(aggregation)
    print(std)  # the standard deviation

    #
    # Compute the standard deviation of a transformation of a numeric field
    #

    aggregation = fo.Std(2 * (F("numeric_field") + 1))
    std = dataset.aggregate(aggregation)
    print(std)  # the standard deviation
- Parameters
field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False) – whether to ignore nan/inf values when dealing with floating point values
sample (False) – whether to compute the sample standard deviation rather than the population standard deviation
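The sample flag chooses between two standard formulas: the population standard deviation divides the summed squared deviations by n, while the sample standard deviation divides by n - 1. The sketch below shows the difference using Python's statistics module on the numeric_field values from the example, with None dropped as described for this class. It illustrates the two definitions only and does not call FiftyOne.

```python
import statistics

# The `numeric_field` values from the example, including the None entry
values = [1.0, 4.0, None]
numeric = [v for v in values if v is not None]

# Population standard deviation (divides by n), the sample=False case
population_std = statistics.pstdev(numeric)

# Sample standard deviation (divides by n - 1), the sample=True case
sample_std = statistics.stdev(numeric)

print(population_std, sample_std)
```

With only two values the gap between the two results is large; it shrinks as the number of values grows.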
Methods:
default_result() – Returns the default result for this aggregation.
parse_result(d) – Parses the output of to_mongo().
to_mongo(sample_collection[, context]) – Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr – The expression being computed, if any.
field_name – The name of the field being computed on, if any.
safe – Whether nan/inf values will be ignored when dealing with floating point values.
-
default_result()
Returns the default result for this aggregation.
- Returns
0
-
parse_result(d)
Parses the output of to_mongo().
- Parameters
d – the result dict
- Returns
the standard deviation
-
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
- Parameters
sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
context (None) – a path context from which to resolve
- Returns
a MongoDB aggregation pipeline (list of dicts)
-
property expr
The expression being computed, if any.
-
property field_name
The name of the field being computed on, if any.
-
property safe
Whether nan/inf values will be ignored when dealing with floating point values.
-
class fiftyone.core.aggregations.Sum(field_or_expr, expr=None, safe=False)
Bases: fiftyone.core.aggregations.Aggregation
Computes the sum of the field values of a collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric field types (or lists of such types).
Examples:
    import fiftyone as fo
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                numeric_field=1.0,
                numeric_list_field=[1, 2, 3],
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                numeric_field=4.0,
                numeric_list_field=[1, 2],
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                numeric_field=None,
                numeric_list_field=None,
            ),
        ]
    )

    #
    # Compute the sum of a numeric field
    #

    aggregation = fo.Sum("numeric_field")
    total = dataset.aggregate(aggregation)
    print(total)  # the sum

    #
    # Compute the sum of a numeric list field
    #

    aggregation = fo.Sum("numeric_list_field")
    total = dataset.aggregate(aggregation)
    print(total)  # the sum

    #
    # Compute the sum of a transformation of a numeric field
    #

    aggregation = fo.Sum(2 * (F("numeric_field") + 1))
    total = dataset.aggregate(aggregation)
    print(total)  # the sum
- Parameters
field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False) – whether to ignore nan/inf values when dealing with floating point values
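As a rough mental model of what the three example aggregations add up, the following pure-Python sketch applies the behaviors described for this class: None values are skipped, list-valued fields contribute each of their elements, and an optional transformation is applied per value. The helper sum_values and its transform parameter are hypothetical stand-ins for the field and expression arguments; this is not how the library computes the result, which happens in a MongoDB pipeline.

```python
def sum_values(values, transform=None):
    """Hypothetical helper: sums values, flattening lists and skipping None."""
    total = 0
    for value in values:
        if value is None:
            continue
        # Treat scalars as one-element lists so both cases share a loop
        items = value if isinstance(value, (list, tuple)) else [value]
        for item in items:
            if item is None:
                continue
            total += transform(item) if transform else item
    return total


# The field values from the example dataset
print(sum_values([1.0, 4.0, None]))
print(sum_values([[1, 2, 3], [1, 2], None]))
print(sum_values([1.0, 4.0, None], transform=lambda x: 2 * (x + 1)))
```

When every value is None the helper returns 0, which matches the default result documented for this aggregation.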
Methods:
default_result() – Returns the default result for this aggregation.
parse_result(d) – Parses the output of to_mongo().
to_mongo(sample_collection[, context]) – Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr – The expression being computed, if any.
field_name – The name of the field being computed on, if any.
safe – Whether nan/inf values will be ignored when dealing with floating point values.
-
default_result()
Returns the default result for this aggregation.
- Returns
0
-
parse_result(d)
Parses the output of to_mongo().
- Parameters
d – the result dict
- Returns
the sum
-
to_mongo(sample_collection, context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
- Parameters
sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
context (None) – a path context from which to resolve
- Returns
a MongoDB aggregation pipeline (list of dicts)
-
property expr
The expression being computed, if any.
-
property field_name
The name of the field being computed on, if any.
-
property safe
Whether nan/inf values will be ignored when dealing with floating point values.
-
class fiftyone.core.aggregations.Values(field_or_expr, expr=None, missing_value=None, unwind=False, _allow_missing=False, _big_result=True, _raw=False, _field=None)
Bases: fiftyone.core.aggregations.Aggregation
Extracts the values of the field from all samples in a collection.
Values aggregations are useful for efficiently extracting a slice of field or embedded field values across all samples in a collection. See the examples below for more details.
The dual function of Values is set_values(), which can be used to efficiently set a field or embedded field of all samples in a collection by providing lists of values with the same structure as those returned by this aggregation.
Note
Unlike other aggregations, Values does not automatically unwind list fields, which ensures that the returned values match the potentially-nested structure of the documents.
You can opt in to unwinding specific list fields using the [] syntax, or you can pass the optional unwind=True parameter to unwind all supported list fields. See Aggregating list fields for more information.
Examples:
    import fiftyone as fo
    import fiftyone.zoo as foz
    from fiftyone import ViewField as F

    dataset = fo.Dataset()
    dataset.add_samples(
        [
            fo.Sample(
                filepath="/path/to/image1.png",
                numeric_field=1.0,
                numeric_list_field=[1, 2, 3],
            ),
            fo.Sample(
                filepath="/path/to/image2.png",
                numeric_field=4.0,
                numeric_list_field=[1, 2],
            ),
            fo.Sample(
                filepath="/path/to/image3.png",
                numeric_field=None,
                numeric_list_field=None,
            ),
        ]
    )

    #
    # Get all values of a field
    #

    aggregation = fo.Values("numeric_field")
    values = dataset.aggregate(aggregation)
    print(values)  # [1.0, 4.0, None]

    #
    # Get all values of a list field
    #

    aggregation = fo.Values("numeric_list_field")
    values = dataset.aggregate(aggregation)
    print(values)  # [[1, 2, 3], [1, 2], None]

    #
    # Get all values of transformed field
    #

    aggregation = fo.Values(2 * (F("numeric_field") + 1))
    values = dataset.aggregate(aggregation)
    print(values)  # [4.0, 10.0, None]

    #
    # Get values from a label list field
    #

    dataset = foz.load_zoo_dataset("quickstart")

    # list of `Detections`
    aggregation = fo.Values("ground_truth")
    detections = dataset.aggregate(aggregation)

    # list of lists of `Detection` instances
    aggregation = fo.Values("ground_truth.detections")
    detections = dataset.aggregate(aggregation)

    # list of lists of detection labels
    aggregation = fo.Values("ground_truth.detections.label")
    labels = dataset.aggregate(aggregation)
- Parameters
field_or_expr – a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
expr (None) – a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
missing_value (None) – a value to insert for missing or None-valued fields
unwind (False) – whether to automatically unwind all recognized list fields (True) or unwind all list fields except the top-level sample field (-1)
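The effect of missing_value and unwind on the shape of the result can be pictured with a small pure-Python sketch. The helper extract_values is hypothetical and works on plain lists rather than a collection. It assumes that missing_value replaces None entries and that unwinding flattens one level of lists. How the library treats None entries while unwinding is not stated here, so the sketch simply skips them; the unwind=-1 mode is not modeled.

```python
def extract_values(values, missing_value=None, unwind=False):
    """Hypothetical helper mimicking `missing_value` and `unwind=True`."""
    if not unwind:
        # One entry per sample, with None replaced by `missing_value`
        return [missing_value if v is None else v for v in values]

    # Flatten one level of lists (assumption: None entries are skipped)
    flat = []
    for v in values:
        if v is None:
            continue
        flat.extend(v if isinstance(v, list) else [v])
    return flat


numeric = [1.0, 4.0, None]
numeric_lists = [[1, 2, 3], [1, 2], None]

print(extract_values(numeric, missing_value=-1))
print(extract_values(numeric_lists))
print(extract_values(numeric_lists, unwind=True))
```

Without unwinding, the result keeps one entry per sample, so it lines up index by index with the samples in the collection; after unwinding that correspondence is lost.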
Methods:
default_result() – Returns the default result for this aggregation.
parse_result(d) – Parses the output of to_mongo().
to_mongo(sample_collection[, big_field, context]) – Returns the MongoDB aggregation pipeline for this aggregation.
Attributes:
expr – The expression being computed, if any.
field_name – The name of the field being computed on, if any.
safe – Whether nan/inf values will be ignored when dealing with floating point values.
-
default_result()
Returns the default result for this aggregation.
- Returns
[]
-
parse_result(d)
Parses the output of to_mongo().
- Parameters
d – the result dict
- Returns
the list of field values
-
to_mongo(sample_collection, big_field='values', context=None)
Returns the MongoDB aggregation pipeline for this aggregation.
- Parameters
sample_collection – the fiftyone.core.collections.SampleCollection to which the aggregation is being applied
context (None) – a path context from which to resolve
- Returns
a MongoDB aggregation pipeline (list of dicts)
-
property expr
The expression being computed, if any.
-
property field_name
The name of the field being computed on, if any.
-
property safe
Whether nan/inf values will be ignored when dealing with floating point values.