fiftyone.utils.groups

Grouped dataset utilities.

Copyright 2017-2025, Voxel51, Inc.

Functions:

group_collections(coll_dict, group_key[, ...])

Merges the given collections into a grouped dataset using the specified field as a group key.

fiftyone.utils.groups.group_collections(coll_dict, group_key, group_field='group')

Merges the given collections into a grouped dataset using the specified field as a group key.

The returned dataset contains every sample from the input collections whose group_key value is not None. Samples that share a group_key value are placed in the same group.

Examples:

import fiftyone as fo
import fiftyone.utils.groups as foug

dataset1 = fo.Dataset()
dataset1.add_samples(
    [
        fo.Sample(filepath="image-left1.jpg", group_id=1),
        fo.Sample(filepath="image-left2.jpg", group_id=2),
        fo.Sample(filepath="image-left3.jpg", group_id=3),
        fo.Sample(filepath="skip-me1.jpg"),
    ]
)

dataset2 = fo.Dataset()
dataset2.add_samples(
    [
        fo.Sample(filepath="image-right1.jpg", group_id=1),
        fo.Sample(filepath="image-right2.jpg", group_id=2),
        fo.Sample(filepath="image-right4.jpg", group_id=4),
        fo.Sample(filepath="skip-me2.jpg"),
    ]
)

dataset = foug.group_collections(
    {"left": dataset1, "right": dataset2}, "group_id"
)
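The grouping rule described above can be illustrated without FiftyOne. The following is a minimal pure-Python sketch, not the library's implementation: plain dicts stand in for samples, and group_records is a hypothetical helper introduced only for this illustration. It assumes at most one record per slice for any given key.

```python
from collections import defaultdict


def group_records(coll_dict, group_key):
    """Illustrates the grouping rule only (hypothetical helper, not FiftyOne code).

    coll_dict maps slice names to lists of plain dicts standing in for samples.
    Returns a dict of the form {group_key value: {slice name: record}}.
    """
    groups = defaultdict(dict)
    for slice_name, records in coll_dict.items():
        for record in records:
            key = record.get(group_key)
            if key is None:
                # Records with no value for the group key are left out
                continue
            # Assumption: at most one record per slice for a given key
            groups[key][slice_name] = record
    return dict(groups)


left = [
    {"filepath": "image-left1.jpg", "group_id": 1},
    {"filepath": "image-left2.jpg", "group_id": 2},
    {"filepath": "image-left3.jpg", "group_id": 3},
    {"filepath": "skip-me1.jpg"},
]
right = [
    {"filepath": "image-right1.jpg", "group_id": 1},
    {"filepath": "image-right2.jpg", "group_id": 2},
    {"filepath": "image-right4.jpg", "group_id": 4},
    {"filepath": "skip-me2.jpg"},
]

groups = group_records({"left": left, "right": right}, "group_id")
for key in sorted(groups):
    print(key, sorted(groups[key]))
# 1 ['left', 'right']
# 2 ['left', 'right']
# 3 ['left']
# 4 ['right']
```

In this sketch, keys 3 and 4 each form a group with a single slice, and the two records without a group_id are dropped. This matches the statement that every sample with a non-None key is kept.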

Parameters:
  • coll_dict – a dict mapping slice names to fiftyone.core.collections.SampleCollection instances

  • group_key – the field to use as a group membership key. The field may contain values of any hashable type (int, string, etc.)

  • group_field ("group") – a name to use for the group field of the returned dataset

Returns:

a fiftyone.core.dataset.Dataset