Note

This is a community plugin, an external project maintained by its respective author. Community plugins are not part of FiftyOne core and may change independently. Please review each plugin’s documentation and license before use.

Image Captioning Plugin#

Plugin Overview#

This plugin lets you generate and store captions for your samples using state-of-the-art image captioning models.

Supported Models#

This version of the plugin supports the following models:

BLIP Base from Hugging Face
BLIPv2 (via Replicate)
Fuyu-8b from Adept AI (via Replicate)
GiT from Hugging Face
Llava-1.5-7b from Hugging Face
Llava-13b (via Replicate)
Qwen-vl-chat (via Replicate)
ViT-GPT2 from Hugging Face

Feel free to fork this plugin and add support for other models!

Installation#

Pre-requisites#

If you plan to use any of the Hugging Face models, install the transformers library:

pip install transformers

If you plan to use any of the models served via Replicate, install the Replicate library:

pip install replicate

Then add your Replicate API token to your environment:

export REPLICATE_API_TOKEN=<your-api-token>
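
Before running the plugin, it can help to confirm which of these prerequisites are in place. The following is a minimal sketch that uses only the Python standard library; `check_backends` is a hypothetical helper written for illustration and is not part of the plugin.

```python
import importlib.util
import os
from typing import Mapping, Optional


def check_backends(env: Optional[Mapping[str, str]] = None) -> dict:
    """Report which captioning prerequisites appear to be available.

    Hypothetical helper, not part of the plugin. `env` defaults to the
    current process environment and can be overridden for testing.
    """
    env = os.environ if env is None else env
    return {
        # find_spec returns None when a top-level package is not installed
        "transformers_installed": importlib.util.find_spec("transformers") is not None,
        "replicate_installed": importlib.util.find_spec("replicate") is not None,
        # An unset or empty token counts as missing
        "replicate_token_set": bool(env.get("REPLICATE_API_TOKEN")),
    }


status = check_backends()
for name, ok in status.items():
    print(f"{name}: {'yes' if ok else 'no'}")
```

This only checks that the packages can be found and that the variable is non-empty. It does not verify that the token is valid.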

Install the plugin#

fiftyone plugins download https://github.com/jacobmarks/fiftyone-image-captioning-plugin

Operators#

`caption_images`#

Applies the selected image captioning model to the desired target view, and stores the resulting captions in the specified field on the samples.
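
The flow described above (run a captioning model over each sample in a view, then write the result to a chosen field) can be sketched as follows. This is an illustration under stated assumptions, not the plugin's actual implementation: `caption_samples` and `fake_captioner` are hypothetical names, the samples are plain dictionaries standing in for dataset samples, and the captioner is a placeholder where a real model call would go.

```python
from typing import Callable, Iterable, MutableMapping


def caption_samples(
    samples: Iterable[MutableMapping],
    captioner: Callable[[str], str],
    caption_field: str = "caption",
) -> int:
    """Caption every sample and store the text in `caption_field`.

    Hypothetical helper for illustration. Each sample only needs to
    expose a "filepath" key and accept item assignment.
    Returns the number of samples captioned.
    """
    count = 0
    for sample in samples:
        # Run the model on the image and store the caption on the sample
        sample[caption_field] = captioner(sample["filepath"])
        count += 1
    return count


def fake_captioner(filepath: str) -> str:
    """Placeholder model: a real captioner would load and describe the image."""
    return f"a placeholder caption for {filepath}"


# Plain dicts stand in for the samples of the target view
demo_samples = [{"filepath": "/tmp/cat.jpg"}, {"filepath": "/tmp/dog.jpg"}]
n_captioned = caption_samples(demo_samples, fake_captioner, caption_field="blip_caption")
print(n_captioned, demo_samples[0]["blip_caption"])
```

Passing the captioner in as a callable keeps the loop independent of the model backend, so the same code path could serve a locally run Hugging Face model or a Replicate-hosted one.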