Augmentation for OXE dataset
Project Page · Paper · GitHub · Twitter
Legend: β = source robot | β = augmented demos available. For the full, current table, see the Dashboard or dataset READMEs.
| Dataset | Panda | UR5e | Xarm7 | WidowX | Sawyer | Kinova3 | IIWA | Jaco | | # Episodes |
|---|---|---|---|---|---|---|---|---|---|---|
| Berkeley AUTOLab UR5 | β | β | β | β | β | β | β | β | β | 1000 |
| TACO Play | β | β | β | β | β | β | β | β | β | 3603 |
| Austin BUDS | β | β | β | β | β | β | β | β | β | 50 |
| Austin Mutex | β | β | β | β | β | β | β | β | β | 1500 |
| Austin Sailor | β | β | β | β | β | β | β | β | β | 240 |
| CMU Franka Pick-Insert | β | β | β | β | β | β | β | β | β | 631 |
| KAIST Nonprehensile | β | β | β | β | β | β | β | β | β | 201 |
| NYU Franka Play | β | β | β | β | β | β | β | β | β | 456 |
| TOTO | β | β | β | β | β | β | β | β | β | 1003 |
| UTokyo xArm PickPlace | β | β | β | β | β | β | β | β | β | 102 |
| UCSD Kitchen | β | β | β | β | β | β | β | β | β | 150 |
| Austin VIOLA | β | β | β | β | β | β | β | β | β | 150 |
| Bridge | β | β | β | β | β | β | β | β | β | 38935 |
| RT-1 Robot Action | β | β | β | β | β | β | β | β | | 87212 |
| Jaco Play | β | β | β | β | β | β | β | β | β | 1084 |
| Language Table | β | β | β | β | β | β | β | β | | 442226 |
#!/usr/bin/env python3
import csv
from pathlib import Path
import torch
from PIL import Image
from lerobot.datasets.lerobot_dataset import LeRobotDataset
# --- change me ---
n = 2  # 0-based episode index to extract
REPO_ID = "oxe-aug/jaco_play_test_0_108"
ROBOT_ENTITY = "google_robot"

# Feature names for the chosen robot entity: one image stream plus five flat
# state tensors, all namespaced under ROBOT_ENTITY.
_STATE_FIELDS = (
    "base_orientation",
    "base_position",
    "ee_error",
    "ee_pose",
    "joints",
)
KEYS = {
    "image": f"observation.images.{ROBOT_ENTITY}",
    "state": [f"observation.{ROBOT_ENTITY}.{field}" for field in _STATE_FIELDS],
}
OUT = Path(f"./episode_{n:06d}_frames")
ds = LeRobotDataset(REPO_ID, video_backend="pyav")  # simple video backend
ep_key = "episode_index"


def find_episode_bounds(dataset, episode, key="episode_index"):
    """Return the half-open frame range [start, end) of the n-th episode.

    Episodes are assumed to be stored contiguously (they are in LeRobot
    datasets); the dataset is walked once, counting boundaries where `key`
    changes, until the requested 0-based `episode` is reached.

    Raises:
        ValueError: if the dataset contains fewer than `episode + 1` episodes
            (the original code left `start`/`end` unbound and crashed later
            with an opaque NameError).
    """
    episodes_seen = -1
    current = None
    for i in range(len(dataset)):
        value = dataset[i][key]
        if current is None or value != current:
            episodes_seen += 1
            current = value
            if episodes_seen == episode:
                # Found the first frame; scan forward to the end of this
                # episode. Start at i + 1 — frame i is already known to match.
                j = i + 1
                while j < len(dataset) and dataset[j][key] == value:
                    j += 1
                return i, j
    raise ValueError(
        f"episode {episode} not found: dataset has only {episodes_seen + 1} episode(s)"
    )


start, end = find_episode_bounds(ds, n, key=ep_key)
OUT.mkdir(parents=True, exist_ok=True)
def to_uint8_hwc(x: torch.Tensor) -> torch.Tensor:
    """Convert a float CHW image in [0, 1] to a uint8 HWC tensor on the CPU."""
    scaled = x.detach() * 255.0
    # clamp before round so out-of-range inputs saturate instead of wrapping
    quantized = scaled.clamp(0, 255).round().to(torch.uint8)
    return quantized.permute(1, 2, 0).contiguous().cpu()
csv_path = OUT / f"ep{n:06d}_{ROBOT_ENTITY}_state.csv"


def _flat_list(t):
    """Flatten a tensor into a plain Python list of scalars."""
    return t.detach().cpu().flatten().tolist()


first_sample = ds[start]
# Build the CSV header: one column per scalar element of each state tensor;
# the per-key widths are discovered from the episode's first frame.
header = ["frame"]
for state_key in KEYS["state"]:
    width = len(_flat_list(first_sample[state_key]))
    header.extend(f"{state_key}[{j}]" for j in range(width))
# Save every frame in [start, end) as a PNG while appending one CSV row of
# flattened state values per frame.
with open(csv_path, "w", newline="") as fcsv:
    writer = csv.writer(fcsv)
    writer.writerow(header)
    frame = 0
    for idx in range(start, end):
        # The first frame was already decoded for the header scan; reuse it
        # rather than re-reading it from the dataset.
        sample = first_sample if idx == start else ds[idx]
        hwc = to_uint8_hwc(sample[KEYS["image"]])
        Image.fromarray(hwc.numpy()).save(OUT / f"ep{n:06d}_f{frame:06d}.png")
        row = [frame]
        for state_key in KEYS["state"]:
            row.extend(_flat_list(sample[state_key]))
        writer.writerow(row)
        frame += 1
total_frames = frame
print(f"Saved {total_frames} PNGs to {OUT.resolve()}")
print(f"Saved {ROBOT_ENTITY} state CSV -> {csv_path.resolve()}")
If you use OXE-AUG datasets or tools, please cite:
@misc{ji2025oxeaug,
title = {OXE-Aug: A Large-Scale Robot Augmentation of OXE for Scaling Cross-Embodiment Policy Learning},
author = {Ji, Guanhua and Polavaram, Harsha and Chen, Lawrence Yunliang and Bajamahal, Sandeep and Ma, Zehan and Adebola, Simeon and Xu, Chenfeng and Goldberg, Ken},
year = {2025},
note = {Manuscript}
}
Also cite upstream datasets you rely on (see per-shard READMEs for references).