
DINOv3 visual search

1. Install Required Libraries

We start by installing FiftyOne and the Hugging Face transformers library. This allows us to load the DINOv3 model from Hugging Face and use FiftyOne's dataset visualization and analysis features.

Note: Since DINOv3 support is not yet available in a stable transformers release, we install transformers directly from its development branch to ensure compatibility with the latest DINOv3 features. See the FiftyOne + Hugging Face integration guide for more details on using experimental model versions.

[ ]:
!pip install --upgrade pip
!pip install git+https://github.com/huggingface/transformers
!pip install -q huggingface_hub
!pip install fiftyone

2. Log in to Hugging Face

We authenticate with Hugging Face to retrieve the latest model weights. You must have access to the model you want to load. See Hugging Face authentication docs for details.

[ ]:
from huggingface_hub import notebook_login
notebook_login()
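
If you are running outside a notebook, you can authenticate programmatically instead. A minimal sketch using standard huggingface_hub APIs (you can also run huggingface-cli login in a shell):

[ ]:
from huggingface_hub import login
login()  # prompts for a token; you can also pass login(token="hf_...")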

Load a quick start dataset

To explore more dataset options, visit the Dataset Zoo docs, or load your own dataset (a minimal sketch follows the next cell).

[ ]:
import fiftyone as fo
import fiftyone.zoo as foz

# Load the COCO-2017 validation split from the Dataset Zoo
dataset = foz.load_zoo_dataset(
    "https://github.com/voxel51/coco-2017",
    split="validation",
)
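
If you'd rather work with your own images, here is a minimal sketch (the directory path is a placeholder; fo.types.ImageDirectory is FiftyOne's built-in type for a flat folder of images):

[ ]:
dataset = fo.Dataset.from_dir(
    dataset_dir="/path/to/your/images",  # placeholder path
    dataset_type=fo.types.ImageDirectory,
    name="my-dataset",
)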

You can load any of the models available in the DINOv3 collection on Hugging Face: https://huggingface.co/collections/facebook/dinov3-68924841bd6b561778e31009

Thanks to FiftyOne's Hugging Face integration, we can perform multiple tasks with the model. Explore more in the Hugging Face integration guide.

Working with DINOv3 Embeddings in FiftyOne

In this example, we focus on using DINOv3 embeddings for visual search and similarity-based exploration in FiftyOne.

Workflow

  1. Compute embeddings
    We run each image through the DINOv3 model and extract either:
    • The class token embedding (a global representation of the image), or

    • The patch token embeddings (for more granular, region-level similarity).

    A minimal token-extraction sketch follows this list. Learn more: Computing embeddings in FiftyOne.

  2. Visualize embeddings
    We project the embeddings into 2D space using dimensionality reduction (e.g., t-SNE or UMAP) so we can see clusters of visually similar images.
    • FiftyOne makes this interactive through its Embeddings Visualization in the App.

  3. Run a similarity search
    Using FiftyOne's similarity search tools, we select a query image (in this example, the first image in the dataset, but you can choose any).
    • The system finds and ranks the most visually similar images based on embedding distance.

    • Docs: Similarity search in FiftyOne.

  4. Sort by similarity
    We display the results sorted from most to least similar, making it easy to:
    • Detect near-duplicates.

    • Explore visual clusters.

    • Identify outliers in the dataset.
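
Before computing embeddings at scale, here is a minimal sketch of how the class and patch tokens can be pulled from a DINOv3 ViT checkpoint's last_hidden_state (the small vits16 checkpoint is an assumption; the register-token offset mirrors the segmentation code later in this tutorial):

[ ]:
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

model_id = "facebook/dinov3-vits16-pretrain-lvd1689m"
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).eval()

img = Image.open(dataset.first().filepath).convert("RGB")
inputs = processor(images=img, return_tensors="pt")
with torch.inference_mode():
    hidden = model(**inputs).last_hidden_state  # [1, 1 + registers + patches, D]

num_reg = getattr(model.config, "num_register_tokens", 0)
cls_embedding = hidden[0, 0]                # global (class token) representation
patch_embeddings = hidden[0, 1 + num_reg:]  # granular, per-patch representations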

[51]:
import transformers
import fiftyone.utils.transformers as fouhft

# Load the raw Hugging Face model, then wrap it so FiftyOne can use it
transformers_model = transformers.AutoModel.from_pretrained("facebook/dinov3-vitl16-pretrain-lvd1689m")
model_config = fouhft.FiftyOneTransformerConfig(
    {
        "model": transformers_model,
        "name_or_path": "facebook/dinov3-vitl16-pretrain-lvd1689m",
    }
)
model = fouhft.FiftyOneTransformer(model_config)
[52]:
dataset.compute_embeddings(model, embeddings_field="embeddings_dinov3")
 100% |███████████████| 5000/5000 [5.1m elapsed, 0s remaining, 16.3 samples/s]
[54]:
dataset
[54]:
Name:        voxel51/coco-2017-validation
Media type:  image
Num samples: 5000
Persistent:  False
Tags:        []
Sample fields:
    id:                fiftyone.core.fields.ObjectIdField
    filepath:          fiftyone.core.fields.StringField
    tags:              fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:          fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    created_at:        fiftyone.core.fields.DateTimeField
    last_modified_at:  fiftyone.core.fields.DateTimeField
    detections:        fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    segmentations:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    embeddings_dinov3: fiftyone.core.fields.VectorField
[55]:
import fiftyone.brain as fob

viz = fob.compute_visualization(
    dataset,
    embeddings="embeddings_dinov3",
    brain_key="dino_dense_umap"
)
Generating visualization...
UMAP( verbose=True)
Fri Aug 15 20:04:17 2025 Construct fuzzy simplicial set
Fri Aug 15 20:04:17 2025 Finding Nearest Neighbors
Fri Aug 15 20:04:17 2025 Building RP forest with 9 trees
Fri Aug 15 20:04:23 2025 NN descent for 12 iterations
         1  /  12
         2  /  12
         3  /  12
         4  /  12
         5  /  12
        Stopping threshold met -- exiting after 5 iterations
Fri Aug 15 20:04:39 2025 Finished Nearest Neighbor Search
Fri Aug 15 20:04:39 2025 Construct embedding
        completed  0  /  500 epochs
        completed  50  /  500 epochs
        completed  100  /  500 epochs
        completed  150  /  500 epochs
        completed  200  /  500 epochs
        completed  250  /  500 epochs
        completed  300  /  500 epochs
        completed  350  /  500 epochs
        completed  400  /  500 epochs
        completed  450  /  500 epochs
Fri Aug 15 20:04:44 2025 Finished embedding
[57]:
session = fo.launch_app(dataset, port=5151)
print(session.url)
https://5151-gpu-t4-s-85ayl83jjz0q-a.us-west4-1.prod.colab.dev?polling=true
[58]:
idx = fob.compute_similarity(
    dataset,
    embeddings="embeddings_dinov3",
    metric="cosine",
    brain_key="dino_sim",
)
[59]:
query_id = dataset.first().id
view = dataset.sort_by_similarity(query_id, k=20)
[115]:
session.view = view
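
Workflow step 4 mentioned near-duplicate detection; the similarity index we just built supports this directly. A minimal sketch (the 0.1 cosine-distance threshold is an assumption to tune for your data):

[ ]:
# Flag samples whose embedding lies within the threshold of another sample's
idx.find_duplicates(thresh=0.1)
print(len(idx.duplicate_ids), "potential duplicates found")

# Inspect the flagged samples interactively in the App
session.view = idx.duplicates_view()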

Classification Tasks with DINOv3

In this example, we use the DINOv3 model to perform an image classification task by integrating its embeddings with a Logistic Regression (linear) classifier.

Workflow

  1. Extract embeddings
    We feed each image through the DINOv3 model and extract the class token embedding.
    • The class token acts as a compact representation of the entire image.

    • More on embeddings: FiftyOne embeddings guide.

  2. Train a linear classifier
    Using the extracted embeddings as input features and the ground truth labels from our dataset, we train a Logistic Regression (linear) classifier to predict image classes.
    • Linear classifiers are effective for high-dimensional feature spaces like DINOv3 embeddings.

  3. Run inference
    We pass images through the same pipeline to generate embeddings and predict their class labels using the trained logistic regression classifier (a held-out-split sketch follows the training cell below).
  4. Evaluate results in FiftyOne
    We visualize and analyze the model predictions in FiftyOne, using:
    • Classification evaluation to compute metrics like accuracy, precision, and recall.

    • Confusion matrix to see where the model is making mistakes.

Get the IDs, filepaths, embeddings, and classes needed to train a classifier.

[95]:
from collections import Counter
from sklearn.preprocessing import normalize
from sklearn.linear_model import LogisticRegression
import numpy as np

ids        = dataset.values("id")
paths      = dataset.values("filepath")
embs       = dataset.values("embeddings_dinov3")
det_lists  = dataset.values("detections.detections.label")

# Use each image's most common detection label as its image-level class
img_labels = [Counter(L).most_common(1)[0][0] if L else None for L in det_lists]

dataset.set_values(
    "image_label",
    [fo.Classification(label=l) if l is not None else None for l in img_labels],
)
[96]:
# Keep only samples that have both an embedding and a label
mask = [(x is not None) and (y is not None) for x, y in zip(embs, img_labels)]
X = normalize(np.stack([x for x, m in zip(embs, mask) if m], axis=0))
y = [lab for lab, m in zip(img_labels, mask) if m]
[97]:
# Train a tiny linear head on the normalized embeddings
clf = LogisticRegression(max_iter=2000, class_weight="balanced", n_jobs=-1).fit(X, y)
[ ]:
# --- inference on ALL samples using embeddings only ---
for sample in dataset.iter_samples(autosave=True, progress=True):
    v = sample["embeddings_dinov3"]
    if v is None:
        continue

    X = normalize(np.asarray(v, dtype=np.float32).reshape(1, -1))
    p = clf.predict_proba(X)[0]
    k = int(np.argmax(p))

    sample["predict_dinov3"] = fo.Classification(
        label=str(clf.classes_[k]),
        confidence=float(p[k]),
    )
 100% |███████████████| 5000/5000 [1.2m elapsed, 0s remaining, 93.4 samples/s]

Evaluate the classification results.

[99]:
results = dataset.evaluate_classifications(
    "predict_dinov3", gt_field="image_label", method="simple", eval_key="dino_simple"
)
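
The workflow above mentioned per-class metrics and a confusion matrix; both are available directly from the returned evaluation results:

[ ]:
# Per-class precision, recall, and F1
results.print_report()

# Interactive confusion matrix (renders inline in notebooks)
plot = results.plot_confusion_matrix()
plot.show()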

PCA/CLS Foreground Segmentation with DINOv3

This step builds a foreground mask per image straight from DINOv3's internal features, with no training required. It's useful to:

  • Quickly highlight the main subject (saliency-ish) to reduce background bias

  • Improve visual search by focusing on foreground regions

  • Speed up data curation (find images with weak/strong foreground, spot occlusions)

  • Generate lightweight pseudo-labels that you can review in the FiftyOne App

What we compute (high level)

  • ViT models (DINOv3 ViT): for each image patch, compute the cosine similarity to the CLS token. Patches more aligned with CLS are treated as foreground.

  • ConvNeXt-style models: compute the cosine similarity of each spatial feature to the global average feature vector.

We then normalize → optionally smooth → threshold the similarity map:

  1. Min–max scale to [0, 1]

  2. Optional average pooling in patch space to denoise

  3. Threshold to get a binary mask (foreground/background)

Finally, we upsample to the original image size and write the results into the dataset:

  • A binary segmentation field (fo.Segmentation)

  • Optionally, a soft heatmap field (fo.Heatmap) with values in [0, 1]

These overlays render natively in the FiftyOne App.

[117]:
import numpy as np
from PIL import Image, ImageOps
import torch
import torch.nn.functional as F
import fiftyone as fo
from fiftyone import Segmentation, Heatmap
from transformers import AutoImageProcessor, AutoModel

def build_pca_fg_masks(
    dataset: fo.Dataset,
    model_id: str = "facebook/dinov3-vits16-pretrain-lvd1689m",
    field: str = "pca_fg",                 # Segmentation field (binary 0/1)
    heatmap_field: str | None = None,      # optional: store soft map (0..1) as fo.Heatmap
    thresh: float = 0.5,                   # FG threshold after smoothing
    smooth_k: int = 3,                     # avg-pool kernel in patch space (0/1 to disable)
    device: str | None = None,
):
    """
    Compute a DINOv3 PCA/CLS-style foreground mask for every sample and write to dataset.

    - ViT: cosine similarity to the CLS token over the patch tokens
    - ConvNeXt: cosine similarity to the global-average feature over the feature map
    - Masks are overlaid natively in the FiftyOne App.
    """
    device = device or ("cuda" if torch.cuda.is_available() else "cpu")
    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id).to(device).eval()

    # --- schema (once) ---
    if not dataset.has_field(field):
        dataset.add_sample_field(field, fo.EmbeddedDocumentField, embedded_doc_type=fo.Segmentation)

    if heatmap_field and not dataset.has_field(heatmap_field):
        dataset.add_sample_field(heatmap_field, fo.EmbeddedDocumentField, embedded_doc_type=fo.Heatmap)

    # register mask targets (assigned via the dataset property) so the App shows class names
    mt = dict(dataset.mask_targets or {})
    mt[field] = {0: "background", 1: "foreground"}
    dataset.mask_targets = mt
    dataset.save()

    @torch.inference_mode()
    def _fg_mask(path: str) -> tuple[np.ndarray, np.ndarray]:
        """Returns (mask_uint8_HxW, soft_fg01_HxW_float32)."""
        img = ImageOps.exif_transpose(Image.open(path).convert("RGB"))
        W0, H0 = img.size

        bf = processor(images=img, return_tensors="pt").to(device)
        last = model(**bf).last_hidden_state  # ViT: [B,1+R+P,D]  |  ConvNeXt: [B,C,H,W]

        # ---- ViT path ----
        if last.ndim == 3:
            hs = last[0].float()                               # [1+R+P,D]
            num_reg = getattr(model.config, "num_register_tokens", 0)
            patch = getattr(model.config, "patch_size", 16)
            patches = hs[1 + num_reg :, :]                     # [P,D]
            _, _, Hc, Wc = bf["pixel_values"].shape
            gh, gw = Hc // patch, Wc // patch

            cls = hs[0:1, :]
            sims = (F.normalize(patches, dim=1) @ F.normalize(cls, dim=1).T).squeeze(1)  # [P]
            fg = sims.detach().cpu().view(gh, gw)               # CPU [gh,gw]

        # ---- ConvNeXt path ----
        else:
            fm = last[0].float()                                # [C,H,W]
            C, gh, gw = fm.shape
            grid = F.normalize(fm.permute(1, 2, 0).reshape(-1, C), dim=1)      # [H*W,C]
            gvec = F.normalize(fm.mean(dim=(1, 2), keepdim=True).squeeze().unsqueeze(0), dim=1)  # [1,C]
            fg = (grid @ gvec.T).detach().cpu().reshape(gh, gw) # CPU [gh,gw]

        # min-max → [0, 1]
        fg01 = (fg - fg.min()) / (fg.max() - fg.min() + 1e-8)

        # optional smoothing in patch space
        if smooth_k and smooth_k > 1:
            fg01 = F.avg_pool2d(fg01.unsqueeze(0).unsqueeze(0), smooth_k, 1, smooth_k // 2).squeeze()

        # threshold → binary mask on patch grid
        mask_small = (fg01 > thresh).to(torch.uint8).numpy()    # [gh,gw] {0,1}

        # upsample both to original size
        mask_full = Image.fromarray(mask_small * 255).resize((W0, H0), Image.NEAREST)
        soft_full = Image.fromarray((fg01.numpy() * 255).astype(np.uint8)).resize((W0, H0), Image.BILINEAR)

        mask = (np.array(mask_full) > 127).astype(np.uint8)     # HxW {0,1}
        soft = np.array(soft_full).astype(np.float32) / 255.0   # HxW [0,1]
        return mask, soft

    # --- process all samples ---
    skipped = 0
    for s in dataset.iter_samples(autosave=True, progress=True):
        try:
            m, soft = _fg_mask(s.filepath)
            s[field] = Segmentation(mask=m)
            if heatmap_field:
                # Heatmap expects a 2D float array in [0,1]; the App colors it
                s[heatmap_field] = Heatmap(map=soft)
        except Exception:
            s[field] = None
            if heatmap_field:
                s[heatmap_field] = None
            skipped += 1

    print(f"βœ“ wrote masks to '{field}'" + (f" and heatmaps to '{heatmap_field}'" if heatmap_field else "") + f". skipped: {skipped}")
[118]:
build_pca_fg_masks(dataset, field="pca_fg", heatmap_field="pca_fg_heat", thresh=0.5, smooth_k=3)
 100% |███████████████| 5000/5000 [6.1m elapsed, 0s remaining, 18.1 samples/s]
✓ wrote masks to 'pca_fg' and heatmaps to 'pca_fg_heat'. skipped: 0
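
To review the generated overlays in the App, you can restrict the view to samples whose mask was written (exists() is a standard FiftyOne view stage):

[ ]:
session.view = dataset.exists("pca_fg")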
