Note

This is a community plugin, an external project maintained by its respective author. Community plugins are not part of FiftyOne core and may change independently. Please review each plugin’s documentation and license before use.

mose-v2

A FiftyOne remote zoo dataset integration for MOSEv2, a large-scale video object segmentation benchmark: thousands of videos, instance masks, and diverse real-world conditions (occlusion, small objects, weather, low light, camouflage, etc.). See the project site and upstream repo for the full benchmark description.

Source and citation

Website: mose.video
GitHub: MOSEv2
Hugging Face (dataset card): FudanCVL/MOSEv2
License: CC BY-NC-SA 4.0 (original MOSEv2 terms)
Citation: See also other related citations from the official MOSEv2 website.

@article{MOSEv2,
  title={{MOSEv2}: A More Challenging Dataset for Video Object Segmentation in Complex Scenes},
  author={Ding, Henghui and Ying, Kaining and Liu, Chang and He, Shuting and Jiang, Xudong and Jiang, Yu-Gang and Torr, Philip HS and Bai, Song},
  journal={arXiv preprint arXiv:2508.05630},
  year={2025}
}

Quick start

Installation

pip install fiftyone
pip install gdown   # required for Google Drive download; see also requirements.txt

Load via the FiftyOne Dataset Zoo

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "https://github.com/voxel51/mose-v2",
    split="train",  # or "validation"
    max_samples=1000,  # optional, for quicker exploration
)

session = fo.launch_app(dataset)

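# Dynamic grouped view: one group per sequence, frames ordered by frame_number
grouped_view = dataset.group_by("sequence_id", order_by="frame_number")
session.view = grouped_view  # show the grouped view in the App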

Notes:

Downloads train and validation archives from Google Drive (file IDs are in __init__.py as DRIVE_FILE_IDS).
Extracts train/ and valid/ under the FiftyOne-managed dataset directory. A symlink validation → valid is created when needed so split names match FiftyOne’s expectations.

dataset_dir/
  train/
    JPEGImages/<sequence_name>/{00000,00001,...}.jpg
    Annotations/<sequence_name>/{00000,00001,...}.png
  valid/
    JPEGImages/<sequence_name>/{00000,00001,...}.jpg
    Annotations/<sequence_name>/00000.png

Registers one sample per video frame. Segmentation is stored as an indexed PNG per frame (ground_truth: fo.Segmentation with mask_path).
Annotation masks are 8-bit indexed PNGs: pixel value 0 is background; value N is object instance N.
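
The mask encoding described above can be read back with a few lines of Python. The sketch below is an illustration and not part of the plugin: `instance_pixel_counts` is a hypothetical helper that takes a mask already decoded into rows of palette indices and counts the pixels belonging to each object instance, skipping the background value 0. In practice the rows could come from something like `numpy.array(PIL.Image.open(mask_path))`, assuming Pillow and NumPy are installed; a small hand-written mask stands in for a real file here.

```python
from collections import Counter
from typing import Dict, Iterable


def instance_pixel_counts(mask_rows: Iterable[Iterable[int]]) -> Dict[int, int]:
    """Count pixels per object instance in a decoded indexed mask.

    Pixel value 0 is background and is skipped; value N is instance N.
    (Hypothetical helper for illustration, not part of the plugin.)
    """
    counts: Counter = Counter()
    for row in mask_rows:
        for value in row:
            value = int(value)  # also accepts NumPy integer scalars
            if value != 0:
                counts[value] += 1
    return dict(sorted(counts.items()))


# Tiny hand-written mask standing in for a decoded Annotations PNG
toy_mask = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [0, 2, 2, 0],
]
print(instance_pixel_counts(toy_mask))  # {1: 4, 2: 3}
```

The keys of the returned dictionary are the instance IDs present in that frame, so an empty dictionary means the frame contains only background.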

Sample fields

Field	Role
`filepath`	Path to the JPEG frame
`sequence_id`	Video sequence name
`frame_number`	Zero-based frame index
`tags`	Split and sequence (e.g. `train`, sequence id)
`ground_truth`	`Segmentation` with `mask_path` to the indexed PNG

Statistics

Split	Sequences	Total Samples	Annotated Samples
train	3,666	311,843	311,843
validation	433	66,526	433 (first frame only)
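
Because the validation split only annotates the first frame of each sequence, code that walks the extracted files directly has to tolerate frames without a mask. The following is a minimal sketch using only the standard library, assuming the `JPEGImages/` and `Annotations/` layout shown in the notes and assuming the frame number can be parsed from the zero-padded file name. `FrameRecord`, `iter_frames`, and `count_annotated` are hypothetical names introduced for illustration; this is not the plugin's own loader.

```python
from pathlib import Path
from typing import Iterator, NamedTuple, Optional, Tuple


class FrameRecord(NamedTuple):
    sequence_id: str
    frame_number: int
    image_path: Path
    mask_path: Optional[Path]  # None when the frame has no annotation


def iter_frames(split_dir: Path) -> Iterator[FrameRecord]:
    """Yield one record per JPEG frame under split_dir/JPEGImages."""
    images_root = split_dir / "JPEGImages"
    masks_root = split_dir / "Annotations"
    for seq_dir in sorted(p for p in images_root.iterdir() if p.is_dir()):
        for image_path in sorted(seq_dir.glob("*.jpg")):
            # A mask, if present, shares the frame's stem: 00000.jpg -> 00000.png
            mask_path = masks_root / seq_dir.name / f"{image_path.stem}.png"
            yield FrameRecord(
                sequence_id=seq_dir.name,
                frame_number=int(image_path.stem),  # assumes names like "00012"
                image_path=image_path,
                mask_path=mask_path if mask_path.exists() else None,
            )


def count_annotated(split_dir: Path) -> Tuple[int, int]:
    """Return (total frames, frames that have a mask) for one split."""
    total = annotated = 0
    for record in iter_frames(split_dir):
        total += 1
        if record.mask_path is not None:
            annotated += 1
    return total, annotated


# Hypothetical location; substitute your FiftyOne-managed dataset directory:
# total, annotated = count_annotated(Path("/path/to/dataset_dir/valid"))
```

Run against the `valid/` directory, the two counts should correspond to the "Total Samples" and "Annotated Samples" columns above, provided the extraction matches the documented layout.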

Visualize

Each image is tagged with its split and with its sequence name — frames that share a sequence_id belong to the same clip.

For a video-like browser in the App, use a dynamic grouped view — one group per sequence, frames ordered by frame_number.

MOSEv2 sample visualization (grid)

MOSEv2 grouped / carousel view