Remotely-Sourced Zoo Models
This page describes how to work with and create zoo models whose definitions are hosted via GitHub repositories or public URLs.
Note
To download from a private GitHub repository that you have access to,
provide your GitHub personal access token by setting the GITHUB_TOKEN
environment variable.
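As a minimal sketch, a script could check that the token is present before it tries to register a private source. The helper name `github_token_is_set` is hypothetical and is not part of FiftyOne; it only inspects the environment.

```python
import os


def github_token_is_set(env=os.environ):
    """Returns True if a non-empty GITHUB_TOKEN is present in `env`.

    Hypothetical helper for illustration; not part of FiftyOne.
    """
    return bool(env.get("GITHUB_TOKEN", "").strip())


if not github_token_is_set():
    print("GITHUB_TOKEN is not set; downloads from private repositories may fail")
```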
Working with remotely-sourced models
Working with remotely-sourced zoo models is just like working with built-in zoo models, as both varieties support the full zoo API.
When specifying remote sources, you can provide any of the following:
A GitHub repo URL like https://github.com/<user>/<repo>
A GitHub ref like https://github.com/<user>/<repo>/tree/<branch> or https://github.com/<user>/<repo>/commit/<commit>
A GitHub ref string like <user>/<repo>[/<ref>]
A publicly accessible URL of an archive (e.g., zip or tar) file
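To make the four accepted forms concrete, here is a rough sketch that labels a source string by its shape. The function `classify_source`, its return labels, and the list of archive extensions are assumptions made for illustration; FiftyOne's own parsing may differ, and this sketch never contacts the network.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper, not part of FiftyOne: it only illustrates the four
# accepted forms of a remote source
_REF_STRING = re.compile(r"^[\w.-]+/[\w.-]+(/[\w./-]+)?$")
_ARCHIVE_EXTS = (".zip", ".tar", ".tar.gz", ".tgz")  # assumed extensions


def classify_source(source):
    """Returns a label describing which kind of remote source was given."""
    parsed = urlparse(source)

    if parsed.scheme in ("http", "https"):
        parts = [p for p in parsed.path.split("/") if p]
        if parsed.netloc == "github.com":
            # https://github.com/<user>/<repo>
            if len(parts) == 2:
                return "github-repo-url"
            # https://github.com/<user>/<repo>/tree/<branch> or /commit/<commit>
            if len(parts) >= 4 and parts[2] in ("tree", "commit"):
                return "github-ref-url"
        if parsed.path.lower().endswith(_ARCHIVE_EXTS):
            return "archive-url"
        return "unknown"

    # <user>/<repo>[/<ref>]
    if _REF_STRING.match(source):
        return "github-ref-string"

    return "unknown"


for src in (
    "https://github.com/voxel51/openai-clip",
    "voxel51/openai-clip",
):
    print(src, "->", classify_source(src))
```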
Here’s the basic recipe for working with remotely-sourced zoo models:
Use register_zoo_model_source() to register a remote source of zoo models:

import fiftyone as fo
import fiftyone.zoo as foz

foz.register_zoo_model_source("https://github.com/voxel51/openai-clip")
Use list_zoo_model_sources() to list all remote sources that have been registered locally:

remote_sources = foz.list_zoo_model_sources()

print(remote_sources)
# [..., "https://github.com/voxel51/openai-clip", ...]
Once you’ve registered a remote source, any models that it declares will subsequently appear as available zoo models when you call list_zoo_models():

available_models = foz.list_zoo_models()

print(available_models)
# [..., "voxel51/clip-vit-base32-torch", ...]
You can download a remote zoo model by calling download_zoo_model():

foz.download_zoo_model("voxel51/clip-vit-base32-torch")
You can also directly download a remote zoo model and implicitly register its source via the following syntax:
foz.download_zoo_model(
    "https://github.com/voxel51/openai-clip",
    model_name="voxel51/clip-vit-base32-torch",
)
You can load a remote zoo model and apply it to a dataset or view via load_zoo_model() and apply_model():

dataset = foz.load_zoo_dataset("quickstart")
model = foz.load_zoo_model("voxel51/clip-vit-base32-torch")

dataset.apply_model(model, label_field="clip")
You can delete the local copy of a remotely-sourced zoo model via delete_zoo_model():

foz.delete_zoo_model("voxel51/clip-vit-base32-torch")
You can unregister a remote source of zoo models and delete any local copies of models that it declares via delete_zoo_model_source():

foz.delete_zoo_model_source("https://github.com/voxel51/openai-clip")
Use fiftyone zoo models register-source to register a remote source of zoo models:
fiftyone zoo models register-source \
https://github.com/voxel51/openai-clip
Use fiftyone zoo models list-sources to list all remote sources that have been registered locally:
fiftyone zoo models list-sources
# contains a row for 'https://github.com/voxel51/openai-clip'
Once you’ve registered a remote source, any models that it declares will subsequently appear as available zoo models when you call fiftyone zoo models list:
fiftyone zoo models list
# contains a row for 'voxel51/clip-vit-base32-torch'
You can download a remote zoo model by calling fiftyone zoo models download:
fiftyone zoo models download voxel51/clip-vit-base32-torch
You can also directly download a remote zoo model and implicitly register its source via the following syntax:
fiftyone zoo models \
download https://github.com/voxel51/openai-clip \
--model-name voxel51/clip-vit-base32-torch
You can load a remote zoo model and apply it to a dataset via fiftyone zoo models apply:
MODEL_NAME=voxel51/clip-vit-base32-torch
DATASET_NAME=quickstart
LABEL_FIELD=clip
fiftyone zoo models apply $MODEL_NAME $DATASET_NAME $LABEL_FIELD
You can delete the local copy of a remotely-sourced zoo model via fiftyone zoo models delete:
fiftyone zoo models delete voxel51/clip-vit-base32-torch
You can unregister a remote source of zoo models and delete any local copies of models that it declares via fiftyone zoo models delete-source:
fiftyone zoo models delete-source https://github.com/voxel51/openai-clip
Creating remotely-sourced models
A remote source of models is defined by a directory with the following contents:
manifest.json
__init__.py
    def download_model(model_name, model_path):
        pass

    def load_model(model_name, model_path, **kwargs):
        pass
Each component is described in detail below.
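One way to start a new source is to generate this layout programmatically. The sketch below writes a minimal `manifest.json` and a stub `__init__.py` into a directory; the function `scaffold_source`, the directory name `my-models`, and the model name `your-user/your-model` are placeholders chosen for illustration.

```python
import json
import tempfile
from pathlib import Path

# Stub module body matching the layout described above
INIT_TEMPLATE = '''\
def download_model(model_name, model_path):
    pass


def load_model(model_name, model_path, **kwargs):
    pass
'''


def scaffold_source(root_dir, manifest):
    """Writes ``manifest.json`` and a stub ``__init__.py`` into ``root_dir``.

    Hypothetical helper for illustration; not part of FiftyOne.
    """
    root = Path(root_dir)
    root.mkdir(parents=True, exist_ok=True)
    (root / "manifest.json").write_text(json.dumps(manifest, indent=4))
    (root / "__init__.py").write_text(INIT_TEMPLATE)
    return root


# Placeholder names; replace them with your own
root = scaffold_source(
    Path(tempfile.mkdtemp()) / "my-models",
    {"models": [{"base_name": "your-user/your-model"}]},
)
print(sorted(p.name for p in root.iterdir()))  # ['__init__.py', 'manifest.json']
```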
Note
By convention, model sources also contain an optional README.md file that provides additional information about the models that it contains and example syntaxes for downloading and working with them.
manifest.json
The remote source’s manifest.json file defines relevant metadata about the model(s) that it contains:
Field | Required? | Description |
---|---|---|
base_name | yes | The base name of the model (no version info) |
base_filename | | The base filename or directory of the model (no version info), if applicable. This is required in order for … |
author | | The author of the model |
version | | The version of the model (if applicable). If a version is provided, then users can refer to a specific version of the model by appending … |
url | | The URL at which the model is hosted |
license | | The license under which the model is distributed |
source | | The original source of the model |
description | | A brief description of the model |
tags | | A list of tags for the model. Useful in conjunction with … |
size_bytes | | The size of the model on disk |
date_added | | The time that the model was added to the source |
requirements | | JSON description of the model’s package/runtime requirements |
manager | | A … |
default_deployment_config_dict | | A … |
It can also provide optional metadata about the remote source itself:
Field | Required? | Description |
---|---|---|
name | | A name for the remote model source |
url | | The URL of the remote model source |
Here’s an example model manifest file that declares a single model:
{
    "name": "voxel51/openai-clip",
    "url": "https://github.com/voxel51/openai-clip",
    "models": [
        {
            "base_name": "voxel51/clip-vit-base32-torch",
            "base_filename": "CLIP-ViT-B-32.pt",
            "author": "OpenAI",
            "license": "MIT",
            "source": "https://github.com/openai/CLIP",
            "description": "CLIP text/image encoder from Learning Transferable Visual Models From Natural Language Supervision (https://arxiv.org/abs/2103.00020) trained on 400M text-image pairs",
            "tags": [
                "classification",
                "logits",
                "embeddings",
                "torch",
                "clip",
                "zero-shot"
            ],
            "size_bytes": 353976522,
            "date_added": "2022-04-12 17:49:51",
            "requirements": {
                "packages": ["torch", "torchvision"],
                "cpu": {
                    "support": true
                },
                "gpu": {
                    "support": true
                }
            }
        }
    ]
}
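Before publishing a source, it can help to sanity-check its manifest. The sketch below verifies that every model declares the required `base_name` field and reports which models lack the `manager` and `default_deployment_config_dict` keys, since those models rely on the `download_model()` and `load_model()` functions described in the following sections. The function `check_manifest` is a hypothetical helper, not part of FiftyOne, and it checks nothing beyond these three keys.

```python
import json


def check_manifest(manifest):
    """Returns a list of (base_name, needs_download_fn, needs_load_fn) tuples.

    Raises a ValueError if any model lacks the required ``base_name`` field.
    """
    results = []
    for idx, model in enumerate(manifest.get("models", [])):
        if "base_name" not in model:
            raise ValueError(
                f"Model #{idx} is missing the required field 'base_name'"
            )

        # Without these keys, the source's __init__.py must define the
        # corresponding download_model() / load_model() function
        needs_download_fn = "manager" not in model
        needs_load_fn = "default_deployment_config_dict" not in model
        results.append((model["base_name"], needs_download_fn, needs_load_fn))

    return results


manifest = json.loads(
    """
    {
        "name": "voxel51/openai-clip",
        "models": [
            {
                "base_name": "voxel51/clip-vit-base32-torch",
                "base_filename": "CLIP-ViT-B-32.pt"
            }
        ]
    }
    """
)

for name, needs_download_fn, needs_load_fn in check_manifest(manifest):
    print(name, needs_download_fn, needs_load_fn)
# voxel51/clip-vit-base32-torch True True
```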
Download model
If a remote source contains model(s) that don’t use the manager key in its manifest, then it must contain an __init__.py file that defines a download_model() method with the signature below:

def download_model(model_name, model_path):
    """Downloads the model.

    Args:
        model_name: the name of the model to download, as declared by the
            ``base_name`` and optional ``version`` fields of the manifest
        model_path: the absolute filename or directory to which to download the
            model, as declared by the ``base_filename`` field of the manifest
    """

    # Determine where to download `model_name` from
    url = ...

    # Download `url` to `model_path`
    ...
This method is called under the hood when a user calls download_zoo_model() or load_zoo_model(), and its job is to download any relevant files from the web and organize and/or prepare them as necessary at the provided path.
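As one possible way to fill in that signature, the sketch below looks up a URL for the requested model and streams it to `model_path` using only the standard library. The `MODEL_URLS` mapping, the model name, and the example URL are placeholders, and a real source may need extra steps such as authentication or archive extraction.

```python
import os
import shutil
import urllib.request

# Hypothetical mapping from model names to download URLs
MODEL_URLS = {
    "your-user/your-model": "https://example.com/path/to/weights.pt",
}


def download_model(model_name, model_path):
    """Downloads the weights for `model_name` to `model_path`."""
    if model_name not in MODEL_URLS:
        raise ValueError(f"Unknown model '{model_name}'")

    url = MODEL_URLS[model_name]

    # Ensure that the parent directory exists
    parent_dir = os.path.dirname(model_path)
    if parent_dir:
        os.makedirs(parent_dir, exist_ok=True)

    # Stream to a temporary file first so that an interrupted download
    # never leaves a partial file at `model_path`
    tmp_path = model_path + ".tmp"
    with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as f:
        shutil.copyfileobj(response, f)

    os.replace(tmp_path, model_path)
```

Writing to a temporary path and renaming at the end is a design choice of this sketch rather than a requirement of the zoo API.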
Load model
If a remote source contains model(s) that don’t use the default_deployment_config_dict key in its manifest, then it must contain an __init__.py file that defines a load_model() method with the signature below:

import os


def load_model(model_name, model_path, **kwargs):
    """Loads the model.

    Args:
        model_name: the name of the model to load, as declared by the
            ``base_name`` and optional ``version`` fields of the manifest
        model_path: the absolute filename or directory to which the model was
            downloaded, as declared by the ``base_filename`` field of the
            manifest
        **kwargs: optional keyword arguments that configure how the model
            is loaded

    Returns:
        a :class:`fiftyone.core.models.Model`
    """

    # The directory containing this file
    model_dir = os.path.dirname(model_path)

    # Construct the specified `Model` instance, generally by importing
    # other modules in `model_dir`
    model = ...

    return model
This method’s job is to load the Model instance for the specified model whose associated weights are stored at the provided path.
Remotely-sourced models can optionally support customized loading by accepting optional keyword arguments to their load_model() method.
When load_zoo_model(name_or_url, ..., **kwargs) is called, any kwargs are passed through to load_model(..., **kwargs).
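The keyword passthrough can be illustrated with a stand-in class. In the sketch below, `PlaceholderModel` and its `device` and `confidence_thresh` parameters are hypothetical; a real source would return an instance of a `fiftyone.core.models.Model` subclass instead.

```python
class PlaceholderModel:
    """Hypothetical stand-in for a ``fiftyone.core.models.Model`` subclass."""

    def __init__(self, model_path, device="cpu", confidence_thresh=None):
        self.model_path = model_path
        self.device = device
        self.confidence_thresh = confidence_thresh


def load_model(model_name, model_path, **kwargs):
    """Loads the model, forwarding any keyword arguments to its constructor."""
    # Keywords that the constructor does not recognize raise a TypeError,
    # which surfaces typos in the caller's arguments
    return PlaceholderModel(model_path, **kwargs)


model = load_model("your-user/your-model", "/path/to/weights.pt", device="cuda")
print(model.device)  # cuda
```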
Note
Check out voxel51/openai-clip for an example of a remote model source.