microsoft/Florence-2-base

Note

This is a remotely-sourced model from the florence2 plugin, maintained by the community. It is not part of FiftyOne core and may have special installation requirements. Please review the plugin documentation and license before use.

Florence-2 is a vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks (https://arxiv.org/abs/2311.06242).

Details

Model name: microsoft/Florence-2-base
Model source: https://huggingface.co/microsoft/Florence-2-base
Model author: Microsoft
Model license: MIT
Exposes embeddings? no
Tags: detection, segmentation, ocr, VLM, zero-shot

Requirements

Packages: huggingface-hub, transformers, torch, torchvision, einops, timm, accelerate
CPU support: yes
GPU support: yes
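Before loading the model, it can help to confirm that the packages listed above are installed. The following is a minimal sketch using only the Python standard library. The helper name `missing_packages` is hypothetical, and the mapping from pip names to import names (replacing "-" with "_") is an assumption that happens to fit the packages in this list, not a general rule.

```python
import importlib.util

# Pip names from the Requirements list above. The import names are an
# assumption: each is taken to be the pip name with "-" replaced by "_".
REQUIRED_PACKAGES = [
    "huggingface-hub",
    "transformers",
    "torch",
    "torchvision",
    "einops",
    "timm",
    "accelerate",
]


def missing_packages(packages):
    """Return the pip names whose import module cannot be found."""
    missing = []
    for pip_name in packages:
        module_name = pip_name.replace("-", "_")
        # find_spec returns None when no top-level module has this name.
        if importlib.util.find_spec(module_name) is None:
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_PACKAGES)
    if missing:
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that each module can be located. It does not verify versions, so the plugin documentation remains the reference for any version constraints.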

Example usage

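import fiftyone as fo
import fiftyone.zoo as foz

# Register the community plugin repository as a remote zoo model source
foz.register_zoo_model_source("https://github.com/harpreetsahota204/florence2")

# Load 50 randomly chosen samples from the COCO-2017 validation split
dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    dataset_name=fo.get_default_dataset_name(),
    max_samples=50,
    shuffle=True,
)

# Download (if necessary) and load the model from the registered source
model = foz.load_zoo_model("microsoft/Florence-2-base")

# Run the model on every sample and store the output in the "predictions" field
dataset.apply_model(model, label_field="predictions")

# Open the FiftyOne App to browse the results
session = fo.launch_app(dataset)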