Utilities for working with Hugging Face.
Lists all FiftyOne datasets available on the Hugging Face Hub.
Pushes a FiftyOne dataset to the Hugging Face Hub.
Loads a dataset from the Hugging Face Hub into FiftyOne.
Config for a Hugging Face Hub dataset.
Config for a Hugging Face Hub dataset that is stored as parquet files.
list_hub_datasets(info=False) Lists all FiftyOne datasets available on the Hugging Face Hub.
This method includes all datasets that are tagged to the
library in Hugging Face.
Examples:
    from fiftyone.utils.huggingface import list_hub_datasets

    datasets = list_hub_datasets()
    print(datasets)
- Parameters
info (False) – whether to return dataset names (False) or
objects (True)
- Returns
a list of dataset names or objects
(dataset, repo_name, description=None, license=None, tags=None, private=False, exist_ok=False, dataset_type=None, min_fiftyone_version=None, label_field=None, frame_labels_field=None, token=None, preview_path=None, chunk_size=None, **data_card_kwargs) Pushes a FiftyOne dataset to the Hugging Face Hub.
- Parameters
dataset – a FiftyOne dataset
repo_name – the name of the dataset repo to create. The repo ID will be
description (None) – a description of the dataset
license (None) – the license of the dataset
tags (None) – a list of tags for the dataset
private (False) – whether the repo should be private
exist_ok (False) – if True, do not raise an error if the repo already exists
dataset_type (None) – the type of the dataset to create
min_fiftyone_version (None) – the minimum version of FiftyOne required to load the dataset. For example
label_field (None) – controls the label field(s) to export. Only applicable to labeled datasets. Can be any of the following:
the name of a label field to export
a glob pattern of label field(s) to export
a list or tuple of label field(s) to export
a dictionary mapping label field names to keys to use when constructing the label dictionaries to pass to the exporter
frame_labels_field (None) – controls the frame label field(s) to export. The “frames.” prefix is optional. Only applicable to labeled video datasets. Can be any of the following:
the name of a frame label field to export
a glob pattern of frame label field(s) to export
a list or tuple of frame label field(s) to export
a dictionary mapping frame label field names to keys to use when constructing the frame label dictionaries to pass to the exporter
token (None) – a Hugging Face API token to use. May also be provided via the
environment variable
preview_path (None) – a path to a preview image or video to display on the readme of the dataset repo.
chunk_size (None) – the number of media files to put in each subdirectory, to avoid having too many files in a single directory. If None, no chunking is performed. If the dataset has more than 10,000 samples, it will be chunked by default to avoid exceeding the maximum number of files in a directory on Hugging Face Hub. This parameter is only applicable to fiftyone.types.dataset_types.FiftyOneDataset datasets.
data_card_kwargs – additional keyword arguments to pass to the DatasetCard constructor
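The effect of chunk_size can be pictured with a small sketch. This is not FiftyOne's implementation: the helper name chunk_media and the "data_<i>" subdirectory names are hypothetical, chosen only to show how a flat list of media files is split into groups of at most chunk_size files.

```python
def chunk_media(filenames, chunk_size=None):
    """Group filenames into subdirectories of at most chunk_size files.

    Hypothetical helper; the "data" and "data_<i>" names are illustrative.
    """
    if chunk_size is None:
        # No chunking: every file stays in a single directory
        return {"data": list(filenames)}

    chunks = {}
    for i, name in enumerate(filenames):
        # Integer division assigns consecutive files to the same subdirectory
        chunks.setdefault(f"data_{i // chunk_size}", []).append(name)
    return chunks


files = [f"img_{i}.jpg" for i in range(5)]
layout = chunk_media(files, chunk_size=2)
print(layout)
```

With five files and chunk_size=2, the last subdirectory holds the single leftover file.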
(repo_id, revision=None, split=None, splits=None, subset=None, subsets=None, max_samples=None, batch_size=None, num_workers=None, overwrite=False, persistent=False, name=None, token=None, config_file=None, **kwargs) Loads a dataset from the Hugging Face Hub into FiftyOne.
- Parameters
repo_id – the Hugging Face Hub identifier of the dataset
revision (None) – the revision of the dataset to load
split (None) – the split of the dataset to load
splits (None) – the splits of the dataset to load
subset (None) – the subset of the dataset to load
subsets (None) – the subsets of the dataset to load
max_samples (None) – the maximum number of samples to load
batch_size (None) – the batch size to use when loading samples
num_workers (None) – a suggested number of threads to use when downloading media
overwrite (False) – whether to overwrite an existing dataset with the same name
persistent (False) – whether the dataset should be persistent
name (None) – an optional name to give the dataset
token (None) – a Hugging Face API token to use. May also be provided via the
environment variable
config_file (None) – the path to a config file on disk specifying how to load the dataset if the repo has no
file
**kwargs – keyword arguments specifying config parameters to load the dataset if the repo has no
- Returns
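A minimal usage sketch for the loader follows. It assumes the function documented above is exposed as load_from_hub in fiftyone.utils.huggingface (the module used in the listing example); the wrapper name load_preview and the dataset name "hub-preview" are hypothetical, and the repo ID must be supplied by the caller.

```python
def load_preview(repo_id, max_samples=25):
    """Load a small preview of a Hub dataset into FiftyOne."""
    # Imported lazily so that defining this helper does not require FiftyOne
    import fiftyone.utils.huggingface as fouh

    return fouh.load_from_hub(
        repo_id,
        max_samples=max_samples,  # cap the number of samples loaded
        persistent=False,         # matches the documented default
        overwrite=True,           # replace an existing dataset with the same name
        name="hub-preview",       # hypothetical dataset name
    )
```

Calling load_preview requires FiftyOne to be installed and network access to the Hub; passing a small max_samples is a convenient way to inspect a dataset before loading it in full.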
(**kwargs) Bases:
Config for a Hugging Face Hub dataset.
- Parameters
name – the name of the dataset
repo_type – the type of the repository
repo_id – the identifier of the repository
revision – the revision of the dataset
filename – the name of the file
format – the format of the dataset
tags – the tags of the dataset
license – the license of the dataset
description – the description of the dataset
fiftyone – the fiftyone version requirement of the dataset
app_media_fields – the media fields visible in the App
grid_media_field – the media field to use in the grid view
Returns a list of class attributes to be serialized.
() Returns a ConfigBuilder instance for this class.
() Returns a deep copy of the object.
([dynamic, private]) Returns a customizable list of class attributes.
() Returns the default config instance.
(d) Constructs a Config object from a JSON dictionary.
(path, *args, **kwargs) Constructs a Serializable object from a JSON file.
(**kwargs) Constructs a Config object from keyword arguments.
(s, *args, **kwargs) Constructs a Serializable object from a JSON string.
Returns the fully-qualified class name string of this object.
Loads the default config instance from file.
(d, key[, default]) Parses a raw array attribute.
(d, key[, default]) Parses a boolean value.
(d, key, choices[, default]) Parses a categorical JSON field, which must take a value from among the given choices.
(d, key[, default]) Parses a dictionary attribute.
(d, key[, default]) Parses an integer attribute.
(fields) Parses a mutually exclusive dictionary of pre-parsed fields, which must contain exactly one field with a truthy value.
(d, key[, default]) Parses a number attribute.
(d, key, cls[, default]) Parses an object attribute.
(d, key, cls[, default]) Parses an array of objects.
(d, key, cls[, default]) Parses a dictionary whose values are objects.
(d, key[, default]) Parses a path attribute.
(d, key[, default]) Parses a raw (arbitrary) JSON field.
(d, key[, default]) Parses a string attribute.
([reflective]) Serializes the object into a dictionary.
([pretty_print]) Returns a string representation of this object.
(fields) Validates a dictionary of pre-parsed fields checking that either all or none of the fields have a truthy value.
(path[, pretty_print]) Serializes the object and writes it to disk.
() Returns a list of class attributes to be serialized.
This method is called internally by serialize() to determine the class attributes to serialize.
Subclasses can override this method, but, by default, all attributes in vars(self) are returned, minus private attributes, i.e., those starting with “_”. The order of the attributes in this list is preserved when serializing objects, so a common pattern is for subclasses to override this method if they want their JSON files to be organized in a particular way.
- Returns
a list of class attributes to be serialized
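The default behavior and the override pattern described above can be sketched with stand-in classes. Point and OrderedPoint are hypothetical and are not the real Serializable classes; only the vars(self) filtering and the ordering override come from the text.

```python
class Point:
    """Stand-in class; not the real Serializable base class."""

    def __init__(self, x, y):
        self.x = x
        self.y = y
        self._cache = None  # private, so excluded by default

    def attributes(self):
        # Default: public names from vars(self), in insertion order
        return [name for name in vars(self) if not name.startswith("_")]


class OrderedPoint(Point):
    def attributes(self):
        # Override to control the order in which fields are serialized
        return ["y", "x"]


print(Point(1, 2).attributes())
print(OrderedPoint(1, 2).attributes())
```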
() Returns a ConfigBuilder instance for this class.
() Returns a deep copy of the object.
- Returns
a Serializable instance
(dynamic=False, private=False) Returns a customizable list of class attributes.
By default, all attributes in vars(self) are returned, minus private attributes (those starting with “_”).
- Parameters
dynamic – whether to include dynamic properties, e.g., those defined by getter/setter methods or the @property decorator. By default, this is False
private – whether to include private properties, i.e., those starting with “_”. By default, this is False
- Returns
a list of class attributes
() Returns the default config instance.
By default, this method instantiates the class from an empty dictionary, which will only succeed if all attributes are optional. Otherwise, subclasses should override this method to provide the desired default configuration.
(d) Constructs a Config object from a JSON dictionary.
Config subclass constructors accept JSON dictionaries, so this method simply passes the dictionary to cls().
- Parameters
d – a dict of fields expected by cls
- Returns
an instance of cls
(path, *args, **kwargs) Constructs a Serializable object from a JSON file.
Subclasses may override this method, but, by default, this method simply reads the JSON and calls from_dict(), which subclasses must implement.
- Parameters
path – the path to the JSON file on disk
*args – optional positional arguments for self.from_dict()
**kwargs – optional keyword arguments for self.from_dict()
- Returns
an instance of the Serializable class
(**kwargs) Constructs a Config object from keyword arguments.
- Parameters
**kwargs – keyword arguments that define the fields expected by cls
- Returns
an instance of cls
(s, *args, **kwargs) Constructs a Serializable object from a JSON string.
Subclasses may override this method, but, by default, this method simply parses the string and calls from_dict(), which subclasses must implement.
- Parameters
s – a JSON string representation of a Serializable object
*args – optional positional arguments for self.from_dict()
**kwargs – optional keyword arguments for self.from_dict()
- Returns
an instance of the Serializable class
() Returns the fully-qualified class name string of this object.
() Loads the default config instance from file.
Subclasses must implement this method if they intend to support default instances.
(d, key, default=<eta.core.config.NoDefault object>) Parses a raw array attribute.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default list to return if key is not present
- Returns
a list of raw (untouched) values
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
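The NoDefault object in the signature is a sentinel: it lets the parser tell "no default supplied" apart from a legitimate default of None. A minimal sketch of that pattern, using stand-in names (this ConfigError, NO_DEFAULT, and parse_array are illustrative definitions, not the library's own code):

```python
class ConfigError(Exception):
    """Stand-in for the ConfigError named in the text."""


class _NoDefault:
    """Sentinel type meaning that no default was supplied."""


NO_DEFAULT = _NoDefault()


def parse_array(d, key, default=NO_DEFAULT):
    if key not in d:
        if default is NO_DEFAULT:
            # Key missing and no fallback available
            raise ConfigError(f"Expected key '{key}'")
        return default

    value = d[key]
    if not isinstance(value, list):
        raise ConfigError(
            f"Expected '{key}' to be a list, found {type(value).__name__}"
        )
    return value  # returned untouched


print(parse_array({"tags": ["a", "b"]}, "tags"))
print(parse_array({}, "tags", default=None))
```

The same missing-key and wrong-type checks apply to the other typed parsers documented below.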
(d, key, default=<eta.core.config.NoDefault object>) Parses a boolean value.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default bool to return if key is not present
- Returns
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
(d, key, choices, default=<eta.core.config.NoDefault object>) Parses a categorical JSON field, which must take a value from among the given choices.
- Parameters
d – a JSON dictionary
key – the key to parse
choices – either an iterable of possible values or an enum-like class whose attributes define the possible values
default – a default value to return if key is not present
- Returns
the raw (untouched) value of the given field, which is equal to a value from choices
- Raises
ConfigError – if the key was present in the dictionary but its value was not an allowed choice, or if no default value was provided and the key was not found in the dictionary
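The two accepted forms of choices can be illustrated as follows. This is a sketch under assumptions: the function name parse_categorical, the _MISSING sentinel, and the MediaType class are hypothetical, and reading an enum-like class through its public class attributes is one plausible interpretation of the description.

```python
import inspect


class ConfigError(Exception):
    """Stand-in for the ConfigError named in the text."""


_MISSING = object()  # sentinel: distinguishes "no default" from default=None


def parse_categorical(d, key, choices, default=_MISSING):
    if inspect.isclass(choices):
        # Enum-like class: its public class attributes define the choices
        allowed = [v for k, v in vars(choices).items() if not k.startswith("_")]
    else:
        allowed = list(choices)

    if key not in d:
        if default is _MISSING:
            raise ConfigError(f"Missing required key '{key}'")
        return default

    value = d[key]
    if value not in allowed:
        raise ConfigError(f"'{key}' must be one of {allowed}, found {value!r}")
    return value


class MediaType:
    IMAGE = "image"
    VIDEO = "video"


print(parse_categorical({"type": "image"}, "type", MediaType))
print(parse_categorical({"type": "video"}, "type", ["image", "video"]))
```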
(d, key, default=<eta.core.config.NoDefault object>) Parses a dictionary attribute.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default dict to return if key is not present
- Returns
a dictionary
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
(d, key, default=<eta.core.config.NoDefault object>) Parses an integer attribute.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default integer value to return if key is not present
- Returns
an int
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
(fields) Parses a mutually exclusive dictionary of pre-parsed fields, which must contain exactly one field with a truthy value.
- Parameters
fields – a dictionary of pre-parsed fields
- Returns
the (field, value) that was set
- Raises
ConfigError – if zero or more than one truthy value was found
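The "exactly one truthy value" rule can be sketched in a few lines. The function name and this ConfigError class are stand-ins for illustration, since the method name is not shown in the text.

```python
class ConfigError(Exception):
    """Stand-in for the ConfigError named in the text."""


def parse_mutually_exclusive_fields(fields):
    # Keep only the (field, value) pairs whose value is truthy
    set_fields = [(name, value) for name, value in fields.items() if value]
    if len(set_fields) != 1:
        raise ConfigError(
            f"Expected exactly one truthy field, found {len(set_fields)}"
        )
    return set_fields[0]


field, value = parse_mutually_exclusive_fields({"path": "/tmp/x", "url": None})
print(field, value)
```

A typical use is a config that accepts either of two alternative sources, where supplying both or neither is an error.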
(d, key, default=<eta.core.config.NoDefault object>) Parses a number attribute.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default numeric value to return if key is not present
- Returns
a number (e.g. int, float)
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
(d, key, cls, default=<eta.core.config.NoDefault object>) Parses an object attribute.
The value of d[key] can be either an instance of cls or a serialized dict from an instance of cls.
- Parameters
d – a JSON dictionary
key – the key to parse
cls – the class of d[key]
default – a default cls instance to return if key is not present
- Returns
an instance of cls
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
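The "instance or serialized dict" rule can be sketched as below. The names parse_object, Size, and _MISSING are hypothetical; the only behavior taken from the text is that d[key] may already be a cls instance or may be a dict that the cls constructor accepts.

```python
class ConfigError(Exception):
    """Stand-in for the ConfigError named in the text."""


class Size:
    """Hypothetical config-like class whose constructor accepts a dict."""

    def __init__(self, d):
        self.width = d["width"]
        self.height = d["height"]


_MISSING = object()  # sentinel: distinguishes "no default" from default=None


def parse_object(d, key, cls, default=_MISSING):
    if key not in d:
        if default is _MISSING:
            raise ConfigError(f"Missing required key '{key}'")
        return default

    value = d[key]
    if isinstance(value, cls):
        return value  # already an instance: returned as-is
    if isinstance(value, dict):
        return cls(value)  # serialized form: rebuilt through the constructor
    raise ConfigError(f"Expected '{key}' to be a dict or {cls.__name__}")


from_dict = parse_object({"size": {"width": 4, "height": 3}}, "size", Size)
existing = Size({"width": 1, "height": 1})
same = parse_object({"size": existing}, "size", Size)
print(from_dict.width, from_dict.height, same is existing)
```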
(d, key, cls, default=<eta.core.config.NoDefault object>) Parses an array of objects.
The values in d[key] can be either instances of cls or serialized dicts from instances of cls.
- Parameters
d – a JSON dictionary
key – the key to parse
cls – the class of the elements of list d[key]
default – the default list to return if key is not present
- Returns
a list of cls instances
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
(d, key, cls, default=<eta.core.config.NoDefault object>) Parses a dictionary whose values are objects.
The values in d[key] can be either instances of cls or serialized dicts from instances of cls.
- Parameters
d – a JSON dictionary
key – the key to parse
cls – the class of the values of dictionary d[key]
default – the default dict of cls instances to return if key is not present
- Returns
a dictionary whose values are cls instances
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
(d, key, default=<eta.core.config.NoDefault object>) Parses a path attribute.
The path is converted to an absolute path if necessary.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default string to return if key is not present
- Returns
a path string
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
(d, key, default=<eta.core.config.NoDefault object>) Parses a raw (arbitrary) JSON field.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default value to return if key is not present
- Returns
the raw (untouched) value of the given field
- Raises
ConfigError – if no default value was provided and the key was not found in the dictionary
(d, key, default=<eta.core.config.NoDefault object>) Parses a string attribute.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default string to return if key is not present
- Returns
a string
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
(reflective=False) Serializes the object into a dictionary.
Serialization is applied recursively to all attributes in the object, including element-wise serialization of lists and dictionary values.
- Parameters
reflective – whether to include reflective attributes when serializing the object. By default, this is False
- Returns
a JSON dictionary representation of the object
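Recursive, element-wise serialization can be sketched with a free function and a stand-in class. Node and serialize are hypothetical names, and the reflective option is deliberately left out; the sketch only shows how nested objects, lists, and dictionary values are walked.

```python
class Node:
    """Stand-in serializable object; not the real Serializable class."""

    def __init__(self, **fields):
        self.__dict__.update(fields)

    def attributes(self):
        return [name for name in vars(self) if not name.startswith("_")]


def serialize(obj):
    if hasattr(obj, "attributes"):
        # Serializable object: recurse into each listed attribute
        return {name: serialize(getattr(obj, name)) for name in obj.attributes()}
    if isinstance(obj, dict):
        # Dictionaries are serialized value by value
        return {key: serialize(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Lists are serialized element by element
        return [serialize(value) for value in obj]
    return obj  # primitives pass through unchanged


sample = Node(name="cat", boxes=[Node(x=1, y=2)], meta={"owner": Node(id=7)})
result = serialize(sample)
print(result)
```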
(pretty_print=True, **kwargs) Returns a string representation of this object.
- Parameters
pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is True
**kwargs – optional keyword arguments for self.serialize()
- Returns
a string representation of the object
(fields) Validates a dictionary of pre-parsed fields checking that either all or none of the fields have a truthy value.
- Parameters
fields – a dictionary of pre-parsed fields
- Raises
ConfigError – if some values are truthy and some are not
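The all-or-nothing check reduces to comparing any() and all() over the truthiness of the values. The function name and this ConfigError class are stand-ins for illustration.

```python
class ConfigError(Exception):
    """Stand-in for the ConfigError named in the text."""


def validate_all_or_nothing_fields(fields):
    truthy = [bool(value) for value in fields.values()]
    # A mix of truthy and falsy values is the only invalid case
    if any(truthy) and not all(truthy):
        raise ConfigError("Expected either all or none of the fields to be set")


validate_all_or_nothing_fields({"user": "me", "password": "secret"})  # all set
validate_all_or_nothing_fields({"user": None, "password": None})      # none set
```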
(path, pretty_print=False, **kwargs) Serializes the object and writes it to disk.
- Parameters
path – the output path
pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is False
**kwargs – optional keyword arguments for self.serialize()
(**kwargs) Bases:
Config for a Hugging Face Hub dataset that is stored as parquet files.
- Parameters
name – the name of the dataset
repo_type – the type of the repository
repo_id – the identifier of the repository
revision – the revision of the dataset
filename – the name of the file
format – the format of the dataset
tags – the tags of the dataset
license – the license of the dataset
description – the description of the dataset
fiftyone – the fiftyone version requirement of the dataset
label_fields – the label fields of the dataset
media_type – the media type of the dataset
default_media_fields – the default media fields of the dataset
additional_media_fields – the additional media fields of the dataset
(media_field_key, media_field_name, feature, download_dir) Bases:
(sample_dict, row)