fiftyone.utils.open_clip¶
CLIP model wrapper for the FiftyOne Model Zoo.
Classes:
TorchOpenClipModelConfig
(d)Configuration for running a TorchOpenClipModel.
TorchOpenClipModel
(config)Torch implementation of CLIP from https://github.com/mlfoundations/open_clip.
-
class
fiftyone.utils.open_clip.
TorchOpenClipModelConfig
(d)¶ Bases:
fiftyone.utils.torch.TorchImageModelConfig, fiftyone.zoo.models.HasZooModel
Configuration for running a TorchOpenClipModel.
See fiftyone.utils.torch.TorchImageModelConfig for additional arguments.
- Parameters
text_prompt – the text prompt to use, e.g., "A photo of"
clip_model ("ViT-B-32") – the Open CLIP model to use
pretrained ("openai") – the pretrained version to use
classes (None) – a list of custom classes for zero-shot prediction
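As a rough illustration of how text_prompt and classes might combine for zero-shot prediction, the sketch below joins the prompt prefix with each class name. The helper build_prompts is hypothetical and is not part of fiftyone.utils.open_clip; the exact prompt formatting used internally by TorchOpenClipModel is an assumption here.

```python
def build_prompts(text_prompt, classes):
    """Joins a prompt prefix with each class name (hypothetical helper)."""
    # Assumption: one prompt per class, formed as "<text_prompt> <class>"
    return ["%s %s" % (text_prompt, c) for c in classes]


prompts = build_prompts("A photo of", ["cat", "dog"])
print(prompts)
```

Each resulting string would then be embedded as text and compared against the image embedding.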
Methods:
attributes
()Returns a list of class attributes to be serialized.
builder
()Returns a ConfigBuilder instance for this class.
copy
()Returns a deep copy of the object.
custom_attributes
([dynamic, private])Returns a customizable list of class attributes.
default
()Returns the default config instance.
download_model_if_necessary
()Downloads the published model specified by the config, if necessary.
from_dict
(d)Constructs a Config object from a JSON dictionary.
from_json
(path, *args, **kwargs)Constructs a Serializable object from a JSON file.
from_kwargs
(**kwargs)Constructs a Config object from keyword arguments.
from_str
(s, *args, **kwargs)Constructs a Serializable object from a JSON string.
get_class_name
()Returns the fully-qualified class name string of this object.
init
(d)Initializes the published model config.
load_default
()Loads the default config instance from file.
parse_array
(d, key[, default])Parses a raw array attribute.
parse_bool
(d, key[, default])Parses a boolean value.
parse_categorical
(d, key, choices[, default])Parses a categorical JSON field, which must take a value from among the given choices.
parse_dict
(d, key[, default])Parses a dictionary attribute.
parse_int
(d, key[, default])Parses an integer attribute.
parse_mutually_exclusive_fields
(fields)Parses a mutually exclusive dictionary of pre-parsed fields, which must contain exactly one field with a truthy value.
parse_number
(d, key[, default])Parses a number attribute.
parse_object
(d, key, cls[, default])Parses an object attribute.
parse_object_array
(d, key, cls[, default])Parses an array of objects.
parse_object_dict
(d, key, cls[, default])Parses a dictionary whose values are objects.
parse_path
(d, key[, default])Parses a path attribute.
parse_raw
(d, key[, default])Parses a raw (arbitrary) JSON field.
parse_string
(d, key[, default])Parses a string attribute.
serialize
([reflective])Serializes the object into a dictionary.
to_str
([pretty_print])Returns a string representation of this object.
validate_all_or_nothing_fields
(fields)Validates a dictionary of pre-parsed fields checking that either all or none of the fields have a truthy value.
write_json
(path[, pretty_print])Serializes the object and writes it to disk.
-
attributes
()¶ Returns a list of class attributes to be serialized.
This method is called internally by serialize() to determine the class attributes to serialize.
Subclasses can override this method, but, by default, all attributes in vars(self) are returned, minus private attributes, i.e., those starting with “_”. The order of the attributes in this list is preserved when serializing objects, so a common pattern is for subclasses to override this method if they want their JSON files to be organized in a particular way.
- Returns
a list of class attributes to be serialized
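The default behavior and the override pattern described above can be sketched with a toy class. ToyConfig and OrderedToyConfig are hypothetical stand-ins, not the real Serializable implementation; only the filtering rule (drop names starting with "_") is taken from the description.

```python
class ToyConfig(object):
    """Stand-in for a serializable object (not the real library class)."""

    def __init__(self):
        self.clip_model = "ViT-B-32"
        self.pretrained = "openai"
        self._cache = None  # private, so excluded by default

    def attributes(self):
        # All instance attributes, minus those starting with "_";
        # vars() preserves insertion order on Python 3.7+
        return [a for a in vars(self) if not a.startswith("_")]


class OrderedToyConfig(ToyConfig):
    def attributes(self):
        # Override to control the order in which fields are serialized
        return ["pretrained", "clip_model"]


default_attrs = ToyConfig().attributes()
ordered_attrs = OrderedToyConfig().attributes()
```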
-
classmethod
builder
()¶ Returns a ConfigBuilder instance for this class.
-
copy
()¶ Returns a deep copy of the object.
- Returns
a Serializable instance
-
custom_attributes
(dynamic=False, private=False)¶ Returns a customizable list of class attributes.
By default, all attributes in vars(self) are returned, minus private attributes (those starting with “_”).
- Parameters
dynamic – whether to include dynamic properties, e.g., those defined by getter/setter methods or the @property decorator. By default, this is False
private – whether to include private properties, i.e., those starting with “_”. By default, this is False
- Returns
a list of class attributes
-
classmethod
default
()¶ Returns the default config instance.
By default, this method instantiates the class from an empty dictionary, which will only succeed if all attributes are optional. Otherwise, subclasses should override this method to provide the desired default configuration.
-
download_model_if_necessary
()¶ Downloads the published model specified by the config, if necessary.
After this method is called, the model_path attribute will always contain the path to the model on disk.
-
classmethod
from_dict
(d)¶ Constructs a Config object from a JSON dictionary.
Config subclass constructors accept JSON dictionaries, so this method simply passes the dictionary to cls().
- Parameters
d – a dict of fields expected by cls
- Returns
an instance of cls
-
classmethod
from_json
(path, *args, **kwargs)¶ Constructs a Serializable object from a JSON file.
Subclasses may override this method, but, by default, this method simply reads the JSON and calls from_dict(), which subclasses must implement.
- Parameters
path – the path to the JSON file on disk
*args – optional positional arguments for self.from_dict()
**kwargs – optional keyword arguments for self.from_dict()
- Returns
an instance of the Serializable class
-
classmethod
from_kwargs
(**kwargs)¶ Constructs a Config object from keyword arguments.
- Parameters
**kwargs – keyword arguments that define the fields expected by cls
- Returns
an instance of cls
-
classmethod
from_str
(s, *args, **kwargs)¶ Constructs a Serializable object from a JSON string.
Subclasses may override this method, but, by default, this method simply parses the string and calls from_dict(), which subclasses must implement.
- Parameters
s – a JSON string representation of a Serializable object
*args – optional positional arguments for self.from_dict()
**kwargs – optional keyword arguments for self.from_dict()
- Returns
an instance of the Serializable class
-
classmethod
get_class_name
()¶ Returns the fully-qualified class name string of this object.
-
init
(d)¶ Initializes the published model config.
This method should be called by ModelConfig.__init__(), and it performs the following tasks:
Parses the model_name and model_path parameters
Populates any default parameters in the provided ModelConfig dict
- Parameters
d – a ModelConfig dict
- Returns
a ModelConfig dict with any default parameters populated
-
classmethod
load_default
()¶ Loads the default config instance from file.
Subclasses must implement this method if they intend to support default instances.
-
static
parse_array
(d, key, default=<eta.core.config.NoDefault object>)¶ Parses a raw array attribute.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default list to return if key is not present
- Returns
a list of raw (untouched) values
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
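The parse_* helpers share one contract: return the value if the key is present and has the right type, fall back to default if one was given, and raise otherwise. A minimal sketch of that contract follows. The ConfigError class, the NO_DEFAULT sentinel, and this parse_array function are local stand-ins written for illustration, not the library's code.

```python
class ConfigError(Exception):
    """Local stand-in for the library's ConfigError."""


class _NoDefault(object):
    """Sentinel type, so that None can still be passed as a real default."""


NO_DEFAULT = _NoDefault()


def parse_array(d, key, default=NO_DEFAULT):
    if key not in d:
        if default is NO_DEFAULT:
            raise ConfigError("Expected key '%s' was not found" % key)
        return default

    val = d[key]
    if not isinstance(val, list):
        raise ConfigError("Field '%s' must be a list" % key)

    return val


classes = parse_array({"classes": ["cat", "dog"]}, "classes")
missing = parse_array({}, "classes", default=None)
```

Using a dedicated sentinel rather than None as the "no default" marker is what allows None itself to be a legitimate default value.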
-
static
parse_bool
(d, key, default=<eta.core.config.NoDefault object>)¶ Parses a boolean value.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default bool to return if key is not present
- Returns
True/False
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
-
static
parse_categorical
(d, key, choices, default=<eta.core.config.NoDefault object>)¶ Parses a categorical JSON field, which must take a value from among the given choices.
- Parameters
d – a JSON dictionary
key – the key to parse
choices – either an iterable of possible values or an enum-like class whose attributes define the possible values
default – a default value to return if key is not present
- Returns
the raw (untouched) value of the given field, which is equal to a value from choices
- Raises
ConfigError – if the key was present in the dictionary but its value was not an allowed choice, or if no default value was provided and the key was not found in the dictionary
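The two accepted forms of choices can be illustrated as follows. This is a sketch under assumptions: the Pretrained class, allowed_values, and check_choice are hypothetical names, and treating "public class attributes" as those not starting with "_" is a guess at how an enum-like class is interpreted.

```python
import inspect


class Pretrained(object):
    """Hypothetical enum-like class; its public attributes are the choices."""

    OPENAI = "openai"
    OTHER = "other"


def allowed_values(choices):
    # An enum-like class contributes the values of its public attributes;
    # any other iterable is used as-is
    if inspect.isclass(choices):
        return {v for k, v in vars(choices).items() if not k.startswith("_")}
    return set(choices)


def check_choice(value, choices):
    if value not in allowed_values(choices):
        raise ValueError("Unsupported value %r" % (value,))
    return value


from_class = check_choice("openai", Pretrained)
from_list = check_choice("openai", ["openai", "other"])
```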
-
static
parse_dict
(d, key, default=<eta.core.config.NoDefault object>)¶ Parses a dictionary attribute.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default dict to return if key is not present
- Returns
a dictionary
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
-
static
parse_int
(d, key, default=<eta.core.config.NoDefault object>)¶ Parses an integer attribute.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default integer value to return if key is not present
- Returns
an int
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
-
static
parse_mutually_exclusive_fields
(fields)¶ Parses a mutually exclusive dictionary of pre-parsed fields, which must contain exactly one field with a truthy value.
- Parameters
fields – a dictionary of pre-parsed fields
- Returns
the (field, value) that was set
- Raises
ConfigError – if zero or more than one truthy value was found
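The "exactly one truthy value" rule can be sketched in a few lines. The function below is an illustrative stand-in that raises ValueError rather than the library's ConfigError, and the field names in the usage are examples only.

```python
def parse_mutually_exclusive(fields):
    # Keep only the fields whose values are truthy
    truthy = [(k, v) for k, v in fields.items() if v]
    if len(truthy) != 1:
        raise ValueError(
            "Expected exactly one truthy field; found %d" % len(truthy)
        )
    return truthy[0]


chosen = parse_mutually_exclusive({"model_name": "clip", "model_path": None})
```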
-
static
parse_number
(d, key, default=<eta.core.config.NoDefault object>)¶ Parses a number attribute.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default numeric value to return if key is not present
- Returns
a number (e.g. int, float)
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
-
static
parse_object
(d, key, cls, default=<eta.core.config.NoDefault object>)¶ Parses an object attribute.
The value of d[key] can be either an instance of cls or a serialized dict from an instance of cls.
- Parameters
d – a JSON dictionary
key – the key to parse
cls – the class of d[key]
default – a default cls instance to return if key is not present
- Returns
an instance of cls
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
-
static
parse_object_array
(d, key, cls, default=<eta.core.config.NoDefault object>)¶ Parses an array of objects.
The values in d[key] can be either instances of cls or serialized dicts from instances of cls.
- Parameters
d – a JSON dictionary
key – the key to parse
cls – the class of the elements of list d[key]
default – the default list to return if key is not present
- Returns
a list of cls instances
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
-
static
parse_object_dict
(d, key, cls, default=<eta.core.config.NoDefault object>)¶ Parses a dictionary whose values are objects.
The values in d[key] can be either instances of cls or serialized dicts from instances of cls.
- Parameters
d – a JSON dictionary
key – the key to parse
cls – the class of the values of dictionary d[key]
default – the default dict of cls instances to return if key is not present
- Returns
a dictionary whose values are cls instances
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
-
static
parse_path
(d, key, default=<eta.core.config.NoDefault object>)¶ Parses a path attribute.
The path is converted to an absolute path if necessary via os.path.abspath(os.path.expanduser(value)).
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default string to return if key is not present
- Returns
a path string
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
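The path conversion named above uses only the standard library, so its effect is easy to check directly. The helper name to_abs_path and the example paths are hypothetical.

```python
import os


def to_abs_path(value):
    # Expand a leading "~" first, then resolve against the working directory
    return os.path.abspath(os.path.expanduser(value))


home_path = to_abs_path("~/models/clip.pt")
rel_path = to_abs_path("weights/clip.pt")
```

A relative path resolves against the current working directory at parse time, so the same config can yield different absolute paths when loaded from different directories.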
-
static
parse_raw
(d, key, default=<eta.core.config.NoDefault object>)¶ Parses a raw (arbitrary) JSON field.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default value to return if key is not present
- Returns
the raw (untouched) value of the given field
- Raises
ConfigError – if no default value was provided and the key was not found in the dictionary
-
static
parse_string
(d, key, default=<eta.core.config.NoDefault object>)¶ Parses a string attribute.
- Parameters
d – a JSON dictionary
key – the key to parse
default – a default string to return if key is not present
- Returns
a string
- Raises
ConfigError – if the field value was the wrong type or no default value was provided and the key was not found in the dictionary
-
serialize
(reflective=False)¶ Serializes the object into a dictionary.
Serialization is applied recursively to all attributes in the object, including element-wise serialization of lists and dictionary values.
- Parameters
reflective – whether to include reflective attributes when serializing the object. By default, this is False
- Returns
a JSON dictionary representation of the object
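Recursive serialization of nested objects, lists, and dictionary values can be sketched as below. ToyConfig and this standalone serialize function are hypothetical; the real method lives on the object and supports options such as reflective, which are not modeled here.

```python
class ToyConfig(object):
    """Stand-in for a serializable object (not the real library class)."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

    def attributes(self):
        return ["name", "children"]


def serialize(obj):
    # Objects exposing attributes() become dicts of their serialized fields
    if hasattr(obj, "attributes"):
        return {a: serialize(getattr(obj, a)) for a in obj.attributes()}
    # Lists are serialized element-wise
    if isinstance(obj, list):
        return [serialize(v) for v in obj]
    # Dictionary values are serialized; keys are left untouched
    if isinstance(obj, dict):
        return {k: serialize(v) for k, v in obj.items()}
    return obj


nested = ToyConfig("parent", children=[ToyConfig("child")])
result = serialize(nested)
```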
-
to_str
(pretty_print=True, **kwargs)¶ Returns a string representation of this object.
- Parameters
pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is True
**kwargs – optional keyword arguments for self.serialize()
- Returns
a string representation of the object
-
static
validate_all_or_nothing_fields
(fields)¶ Validates a dictionary of pre-parsed fields checking that either all or none of the fields have a truthy value.
- Parameters
fields – a dictionary of pre-parsed fields
- Raises
ConfigError – if some values are truthy and some are not
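The all-or-nothing check reduces to comparing any() and all() over the truthiness of the values. The function below is an illustrative stand-in that raises ValueError instead of the library's ConfigError.

```python
def validate_all_or_nothing(fields):
    flags = [bool(v) for v in fields.values()]
    # Valid when every value is truthy or every value is falsy
    if any(flags) and not all(flags):
        raise ValueError("Fields must be all set or all unset")


validate_all_or_nothing({"user": "me", "token": "abc"})  # all set: passes
validate_all_or_nothing({"user": None, "token": ""})  # none set: passes
```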
-
write_json
(path, pretty_print=False, **kwargs)¶ Serializes the object and writes it to disk.
- Parameters
path – the output path
pretty_print – whether to render the JSON in human readable format with newlines and indentations. By default, this is False
**kwargs – optional keyword arguments for self.serialize()
-
class
fiftyone.utils.open_clip.
TorchOpenClipModel
(config)¶ Bases:
fiftyone.utils.torch.TorchImageModel, fiftyone.core.models.PromptMixin
Torch implementation of CLIP from https://github.com/mlfoundations/open_clip.
- Parameters
config – a TorchOpenClipModelConfig
Attributes:
can_embed_prompts
Whether this instance can generate prompt embeddings.
classes
The list of class labels for the model, if known.
device
The torch.device that the model is using.
has_embeddings
Whether this instance has embeddings.
has_logits
Whether this instance can generate logits.
mask_targets
The mask targets for the model, if any.
media_type
The media type processed by the model.
num_classes
The number of classes for the model, if known.
preprocess
Whether to apply preprocessing transforms for inference, if any.
ragged_batches
Whether transforms() may return tensors of different sizes.
skeleton
The keypoint skeleton for the model, if any.
store_logits
Whether the model should store logits in its predictions.
transforms
A torchvision.transforms function that will be applied to each input before prediction, if any.
using_gpu
Whether the model is using GPU.
using_half_precision
Whether the model is using half precision.
Methods:
embed_prompt
(prompt)Generates an embedding for the given text prompt.
embed_prompts
(prompts)Generates an embedding for the given text prompts.
embed
(arg)Generates an embedding for the given data.
embed_all
(args)Generates embeddings for the given iterable of data.
from_config
(config)Instantiates a Configurable class from a <cls>Config instance.
from_dict
(d)Instantiates a Configurable class from a <cls>Config dict.
from_json
(json_path)Instantiates a Configurable class from a <cls>Config JSON file.
from_kwargs
(**kwargs)Instantiates a Configurable class from keyword arguments defining the attributes of a <cls>Config.
get_embeddings
()Returns the embeddings generated by the last forward pass of the model.
parse
(class_name[, module_name])Parses a Configurable subclass name string.
predict
(img)Performs prediction on the given image.
predict_all
(imgs)Performs prediction on the given batch of images.
validate
(config)Validates that the given config is an instance of <cls>Config.
-
property
can_embed_prompts
¶ Whether this instance can generate prompt embeddings.
This property returns False by default. Models that can generate prompt embeddings will override this by implementing the PromptMixin interface.
-
embed_prompt
(prompt)¶ Generates an embedding for the given text prompt.
- Parameters
prompt – a text string
- Returns
a numpy vector
-
embed_prompts
(prompts)¶ Generates an embedding for the given text prompts.
- Parameters
prompts – an iterable of text strings
- Returns
a num_prompts x num_dims array of prompt embeddings
-
property
classes
¶ The list of class labels for the model, if known.
-
property
device
¶ The torch.device that the model is using.
-
embed
(arg)¶ Generates an embedding for the given data.
Subclasses can override this method to increase efficiency, but, by default, this method simply calls predict() and then returns get_embeddings().
- Parameters
arg – the data. See predict() for details
- Returns
a numpy array containing the embedding
-
embed_all
(args)¶ Generates embeddings for the given iterable of data.
Subclasses can override this method to increase efficiency, but, by default, this method simply iterates over the data and applies embed() to each.
- Parameters
args – an iterable of data. See predict_all() for details
- Returns
a numpy array containing the embeddings stacked along axis 0
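The default relationship between predict(), get_embeddings(), embed(), and embed_all() can be sketched with a toy model. ToyModel is hypothetical, uses nested lists in place of numpy arrays, and fakes the forward pass; concatenating the 1 x D results into an N x D list is an assumption about how "stacked along axis 0" plays out.

```python
class ToyModel(object):
    """Stand-in model that mimics the default embed()/embed_all() flow."""

    def predict(self, arg):
        # Pretend forward pass: cache a 1 x 2 "embedding" for the input
        self._last = [[float(len(arg)), 1.0]]

    def get_embeddings(self):
        # First axis is the batch size, which is 1 after predict()
        return self._last

    def embed(self, arg):
        self.predict(arg)
        return self.get_embeddings()

    def embed_all(self, args):
        rows = []
        for arg in args:
            rows.extend(self.embed(arg))  # stack along the first axis
        return rows


embeddings = ToyModel().embed_all(["cat", "zebra"])
```

A subclass with a batched forward pass would typically override embed_all() to avoid one predict() call per input.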
-
classmethod
from_config
(config)¶ Instantiates a Configurable class from a <cls>Config instance.
-
classmethod
from_dict
(d)¶ Instantiates a Configurable class from a <cls>Config dict.
- Parameters
d – a dict to construct a <cls>Config
- Returns
an instance of cls
-
classmethod
from_json
(json_path)¶ Instantiates a Configurable class from a <cls>Config JSON file.
- Parameters
json_path – path to a JSON file for type <cls>Config
- Returns
an instance of cls
-
classmethod
from_kwargs
(**kwargs)¶ Instantiates a Configurable class from keyword arguments defining the attributes of a <cls>Config.
- Parameters
**kwargs – keyword arguments that define the fields of a <cls>Config dict
- Returns
an instance of cls
-
get_embeddings
()¶ Returns the embeddings generated by the last forward pass of the model.
By convention, this method should always return an array whose first axis represents batch size (which will always be 1 when predict() was last used).
- Returns
a numpy array containing the embedding(s)
-
property
has_embeddings
¶ Whether this instance has embeddings.
-
property
has_logits
¶ Whether this instance can generate logits.
-
property
mask_targets
¶ The mask targets for the model, if any.
-
property
media_type
¶ The media type processed by the model.
-
property
num_classes
¶ The number of classes for the model, if known.
-
static
parse
(class_name, module_name=None)¶ Parses a Configurable subclass name string.
Assumes both the Configurable class and the Config class are defined in the same module. The module containing the classes will be loaded if necessary.
- Parameters
class_name – a string containing the name of the Configurable class, e.g. “ClassName”, or a fully-qualified class name, e.g. “eta.core.config.ClassName”
module_name – a string containing the fully-qualified module name, e.g. “eta.core.config”, or None if class_name includes the module name. Set module_name = __name__ to load a class from the calling module
- Returns
cls: the Configurable class
config_cls: the Config class associated with cls
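The two ways of naming a class (fully-qualified, or bare name plus module_name) can be sketched with importlib. The load_class helper is hypothetical and returns only the class; it does not look up the associated Config class as parse() does.

```python
import collections
import importlib


def load_class(class_name, module_name=None):
    # A fully-qualified name carries its own module path
    if module_name is None:
        module_name, class_name = class_name.rsplit(".", 1)
    # Importing loads the module if it is not already loaded
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


by_full_name = load_class("collections.OrderedDict")
by_module = load_class("OrderedDict", module_name="collections")
```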
-
predict
(img)¶ Performs prediction on the given image.
- Parameters
img – the image to process, which can be any of the following:
A PIL image
A uint8 numpy array (HWC)
A Torch tensor (CHW)
- Returns
a fiftyone.core.labels.Label instance or dict of fiftyone.core.labels.Label instances containing the predictions
-
predict_all
(imgs)¶ Performs prediction on the given batch of images.
- Parameters
imgs – the batch of images to process, which can be any of the following:
A list of PIL images
A list of uint8 numpy arrays (HWC)
A list of Torch tensors (CHW)
A uint8 numpy tensor (NHWC)
A Torch tensor (NCHW)
- Returns
a list of fiftyone.core.labels.Label instances or a list of dicts of fiftyone.core.labels.Label instances containing the predictions
-
property
preprocess
¶ Whether to apply preprocessing transforms for inference, if any.
-
property
ragged_batches
¶ Whether transforms() may return tensors of different sizes. If True, then passing ragged lists of images to predict_all() may not be allowed.
-
property
skeleton
¶ The keypoint skeleton for the model, if any.
-
property
store_logits
¶ Whether the model should store logits in its predictions.
-
property
transforms
¶ A torchvision.transforms function that will be applied to each input before prediction, if any.
-
property
using_gpu
¶ Whether the model is using GPU.
-
property
using_half_precision
¶ Whether the model is using half precision.
-
classmethod
validate
(config)¶ Validates that the given config is an instance of <cls>Config.
- Raises
ConfigurableError – if config is not an instance of <cls>Config