Note
This is a Hugging Face dataset. Learn how to load datasets from the Hub in the Hugging Face integration docs.
Dataset Card for aloha_pen_uncap

This dataset is a FiftyOne conversion of the LeRobot-format aloha_pen_uncap_diverse subset of BiPlay.
The aloha_pen_uncap_diverse subset is the task-specific portion of BiPlay devoted to a long-horizon, dexterous bimanual task: uncapping a pen under diverse conditions. In each episode the robot must grasp a pen and remove its cap, an action that requires coordination and dexterity, across a wide range of object placements, backgrounds, and distractor objects. This diversity is designed to benchmark policy generalization, testing whether learned policies (such as diffusion transformer-based ones) can adapt to varied real-world scenarios[4][5].
Key attributes of the aloha_pen_uncap_diverse subset:
• Task: Bimanual pen uncapping with an ALOHA robot, including significant variation in scene and object arrangement.
• Format: Converted into the LeRobot dataset v2.0 format for compatibility with common robotics learning frameworks[6][4].
• Data Contents: The dataset includes state sequences, action sequences, velocities, efforts, and high-resolution images from multiple camera viewpoints for each time step.
• Research Use: Commonly used to benchmark methods such as Diffusion Transformer Policies (DiT-Policy), which aim for robust, generalizable robotic manipulation through large-scale, language-annotated data[3][7].
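To make the per-time-step contents listed above more concrete, here is a minimal sketch of how one time step might be represented and checked for consistency. It is an illustration only, not the dataset's actual schema: the class name, field names, camera names, file paths, and the 14-value joint vector (two 7-DoF arms) are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical values, chosen for illustration only.
NUM_JOINTS = 14  # assumed: two 7-DoF arms, grippers included
CAMERAS = ("cam_high", "cam_left_wrist", "cam_right_wrist")  # hypothetical names


@dataclass
class TimeStep:
    """One time step of an episode (illustrative layout, not the real schema)."""
    state: List[float]      # joint positions
    action: List[float]     # commanded joint targets
    velocity: List[float]   # joint velocities
    effort: List[float]     # joint efforts
    images: Dict[str, str]  # camera name -> path of the frame (hypothetical paths)


def make_dummy_step(t: int) -> TimeStep:
    """Build a placeholder time step filled with zeros."""
    zeros = [0.0] * NUM_JOINTS
    images = {cam: f"episode_000/{cam}/frame_{t:06d}.png" for cam in CAMERAS}
    return TimeStep(list(zeros), list(zeros), list(zeros), list(zeros), images)


def validate_episode(steps: List[TimeStep]) -> int:
    """Check vector lengths and camera sets; return the number of steps."""
    if not steps:
        raise ValueError("episode has no time steps")
    cameras = set(steps[0].images)
    for i, step in enumerate(steps):
        for name in ("state", "action", "velocity", "effort"):
            vec = getattr(step, name)
            if len(vec) != NUM_JOINTS:
                raise ValueError(
                    f"step {i}: {name} has {len(vec)} values, expected {NUM_JOINTS}"
                )
        if set(step.images) != cameras:
            raise ValueError(f"step {i}: camera set differs from step 0")
    return len(steps)
```

For example, `validate_episode([make_dummy_step(t) for t in range(3)])` returns the step count, while a step whose `state` has the wrong length raises `ValueError`. The real field names and dimensions should be read from the loaded dataset rather than from this sketch.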
Installation
If you haven’t already, install FiftyOne:
pip install -U fiftyone
Usage
import fiftyone as fo
from fiftyone.utils.huggingface import load_from_hub
# Load the dataset
# Note: load_from_hub() also accepts other arguments, such as max_samples
dataset = load_from_hub("Voxel51/aloha_pen_uncap")
# Launch the App
session = fo.launch_app(dataset)
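Before downloading the full dataset, it can help to load a small preview and look at what it contains. The sketch below wraps the `max_samples` argument mentioned above in a helper; `load_preview` and its default of 10 samples are illustrative choices, and calling it assumes FiftyOne is installed and the Hub is reachable.

```python
def load_preview(repo_id="Voxel51/aloha_pen_uncap", max_samples=10):
    """Load a few samples from the Hub and summarize the resulting dataset."""
    # Imported inside the function so that defining this helper
    # does not itself require FiftyOne to be installed.
    from fiftyone.utils.huggingface import load_from_hub

    # max_samples limits how many samples are downloaded
    dataset = load_from_hub(repo_id, max_samples=max_samples)

    return {
        "num_samples": len(dataset),
        "media_type": dataset.media_type,
        "fields": sorted(dataset.get_field_schema().keys()),
    }
```

Calling `load_preview(max_samples=5)` returns a small dictionary that can be printed to check the media type and field names before committing to a full download.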
Dataset Sources
• Paper: https://huggingface.co/papers/2410.10088
• Code: https://github.com/sudeepdasari/dit-policy
Learn more about converting LeRobot format datasets into FiftyOne format: https://github.com/harpreetsahota204/fiftyone_lerobot_importer
Citation
@inproceedings{dasari2025ingredients,
title={The Ingredients for Robotic Diffusion Transformers},
author={Sudeep Dasari and Oier Mees and Sebastian Zhao and Mohan Kumar Srirama and Sergey Levine},
booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
year={2025},
address = {Atlanta, USA}
}