ModernVBERT/bimodernvbert

Note

This is a remotely-sourced model from the bimodernvbert plugin, maintained by the community. It is not part of FiftyOne core and may have special installation requirements. Please review the plugin documentation and license before use.

ModernVBERT is a suite of compact 250M-parameter vision-language encoders. BiModernVBERT is the bi-encoder variant, fine-tuned for visual document retrieval.
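To illustrate what retrieval with a bi-encoder generally looks like, here is a minimal sketch: the query and each document page are embedded independently, and pages are ranked by how similar their vectors are to the query vector. The helper names (`cosine_similarity`, `rank_documents`), the toy 3-dimensional vectors, and the choice of cosine similarity are assumptions made for illustration only; they are not the plugin's API, and the scoring BiModernVBERT actually uses may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_embedding, document_embeddings):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [
        (i, cosine_similarity(query_embedding, doc))
        for i, doc in enumerate(document_embeddings)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors standing in for real model embeddings
query = [1.0, 0.0, 1.0]
pages = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]

ranking = rank_documents(query, pages)
for index, score in ranking:
    print(f"page {index}: similarity {score:.3f}")
```

Because each side is embedded separately, document embeddings can be computed once and reused across queries, which is the usual motivation for a bi-encoder design.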

Details

Model name: ModernVBERT/bimodernvbert
Model source: https://huggingface.co/ModernVBERT/bimodernvbert
Model author: Paul Teiletche et al.
Model license: MIT
Exposes embeddings? yes
Tags: classification, logits, embeddings, torch, visual-document-retrieval, zero-shot

Requirements

Packages: huggingface-hub, transformers, torch, torchvision, colpali-engine
CPU support
- yes
GPU support
- yes

Example usage

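import fiftyone as fo
import fiftyone.zoo as foz

# Register the remote plugin repository as a zoo model source
foz.register_zoo_model_source("https://github.com/harpreetsahota204/bimodernvbert")

# Load a random 50-sample subset of the COCO-2017 validation split
dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    dataset_name=fo.get_default_dataset_name(),
    max_samples=50,
    shuffle=True,
)

# Load the model from the registered source
model = foz.load_zoo_model("ModernVBERT/bimodernvbert")

# Run the model on every sample and store its output in the "predictions" field
dataset.apply_model(model, label_field="predictions")

# Open the FiftyOne App to browse the results
session = fo.launch_app(dataset)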