vitpose-base-simple-torch#
Simplified ViTPose with 90M parameters removing complex decoder components. Maintains 75.1 AP through direct heatmap prediction from transformer features. Streamlined architecture for easier deployment while preserving accuracy on human pose tasks..
Details
Model name:
vitpose-base-simple-torchModel source: ViTAE-Transformer/ViTPose
Model author: Yufei Xu, et al.
Model license: Apache 2.0
Model size: 327.75 MB
Exposes embeddings? no
Tags:
keypoints, coco, torch, transformers, pose-estimation
Requirements
Packages:
torch, torchvision, transformersCPU support
yes
GPU support
yes
Example usage
1import fiftyone as fo
2import fiftyone.zoo as foz
3
4dataset = foz.load_zoo_dataset(
5 "coco-2017",
6 split="validation",
7 dataset_name=fo.get_default_dataset_name(),
8 max_samples=50,
9 shuffle=True,
10)
11
12model = foz.load_zoo_model("vitpose-base-simple-torch")
13
14dataset.apply_model(model, prompt_field="ground_truth", label_field="predictions")
15
16session = fo.launch_app(dataset)