
ASkDAgger: Active Skill-level Data Aggregation for Interactive Imitation Learning


This repository contains the code for the CLIPort experiments from the paper:

ASkDAgger: Active Skill-level Data Aggregation for Interactive Imitation Learning
Jelle Luijkx, Zlatan Ajanović, Laura Ferranti, Jens Kober — TMLR 2025
[arXiv] [OpenReview] [Project page]

Related repositories from the same project: askdagger_mnist for the SAG validation experiments on MNIST.

Overview

ASkDAgger overview

Figure 1. The ASkDAgger framework consists of three main components: S-Aware Gating (SAG), Foresight Interactive Experience Replay (FIER), and Prioritized Interactive Experience Replay (PIER). In this interactive imitation learning framework, we allow the novice to say "I plan to do this, but I am uncertain." The uncertainty gating threshold is set by SAG to track a user-specified metric — sensitivity, specificity, or minimum system success rate. Teacher feedback is obtained with FIER, enabling demonstrations through validation, relabeling, or teacher annotation. PIER then prioritizes replay based on novice success, uncertainty, and demonstration age.

Videos and real-robot results are available on the project page.
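The gating and prioritization ideas in Figure 1 can be sketched in a few lines. This is an illustrative reconstruction, not the repository's implementation: the function names, the quantile-based threshold selection, and the PIER weighting below are simplifications of what the paper describes.

```python
# Illustrative sketch of SAG and PIER (simplified; not the repository's
# actual implementation or the paper's exact formulas).

def sag_threshold(history, target_sensitivity=0.9):
    """Pick an uncertainty threshold that tracks a target sensitivity.

    history: (uncertainty, succeeded) pairs from past novice actions.
    Returns the smallest threshold such that at least `target_sensitivity`
    of past failures had uncertainty >= threshold, i.e. would have been
    gated to the teacher.
    """
    failures = sorted(u for u, ok in history if not ok)
    if not failures:
        return float("inf")  # no failures observed yet: never ask
    k = int((1.0 - target_sensitivity) * len(failures))
    return failures[k]

def should_ask_teacher(uncertainty, threshold):
    """The novice says "I plan to do this, but I am uncertain" and
    defers to the teacher when its uncertainty exceeds the threshold."""
    return uncertainty >= threshold

def pier_priority(uncertainty, succeeded, age):
    """Hypothetical PIER-style weight: favor uncertain, failed, and
    recent demonstrations when sampling from the replay buffer."""
    failure_bonus = 0.0 if succeeded else 1.0
    recency = 1.0 / (1.0 + age)
    return (uncertainty + failure_bonus) * recency

history = [(0.2, True), (0.8, False), (0.5, True), (0.9, False), (0.7, False)]
threshold = sag_threshold(history, target_sensitivity=1.0)
print(should_ask_teacher(0.75, threshold))  # True: uncertain enough to ask
```

In this toy run, a target sensitivity of 1.0 places the threshold at the least uncertain observed failure (0.7), so any action at least that uncertain is routed to the teacher.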

Installation

Prerequisites: install uv

We advise using uv to install the dependencies of the askdagger_cliport package. If you don't have uv yet, install it by following its official installation instructions.

Install askdagger_cliport

Clone the repository and set the ASKDAGGER_ROOT environment variable:

git clone git@github.com:askdagger/askdagger_cliport.git
cd askdagger_cliport
export ASKDAGGER_ROOT=$(pwd)
echo "export ASKDAGGER_ROOT=$(pwd)" >> ~/.bashrc

Create and activate the virtual environment:

uv venv --python 3.10
source .venv/bin/activate

Install the package:

uv pip install -e .

Download the Google Scanned Objects used by the CLIPort tasks:

./scripts/google_objects_download.sh

Verify the installation

Run a short interactive training run with ASkDAgger (requires ~8GB of GPU memory):

python src/askdagger_cliport/train_interactive.py \
  interactive_demos=3 save_every=3 disp=True \
  exp_folder=exps_test train_interactive.batch_size=1

Then evaluate the resulting policy:

python src/askdagger_cliport/eval.py interactive_demos=3 n_demos=5 disp=True exp_folder=exps_test

The policy will likely fail — 3 demonstrations are not enough for convergence — but the code should run end-to-end without errors.

You can also run the test suite to confirm everything is working:

pytest tests

Reproducing the paper

Download pretrained results and generate the figures

To download the results used in the paper:

python scripts/results_download.py

Then generate each figure:

python figures/demo_types.py
python figures/domain_shift.py
python figures/training_scenario1.py
python figures/training_scenario2.py
python figures/training_scenario3.py
python figures/sensitivity.py
python figures/evaluation.py
python figures/real.py

The figures will appear in the figures/ directory.

Retrain from scratch (SLURM)

Full reproduction requires ~40GB of GPU memory per run. Example SLURM batch scripts are provided under scripts/ — treat them as templates and adjust for your cluster.

Training (run once with ASKDAGGER=True and once with ASKDAGGER=False):

sbatch --array=0-39 scripts/train_interactive.sh

Evaluation on seen and unseen objects (also once per ASKDAGGER value):

sbatch --array=0-239 scripts/eval.sh
sbatch --array=0-239 scripts/eval_unseen.sh

Domain-shift ablations (run three times, once per (PIER, FIER) configuration: (True, True), (True, False), (False, True)):

sbatch --array=0-9 scripts/train_interactive_domain_shift.sh

Train a single model

Train one model interactively:

python src/askdagger_cliport/train_interactive.py

Evaluate it:

python src/askdagger_cliport/eval.py

Available arguments for training and evaluation are defined in src/askdagger_cliport/cfg.

Interactive training after BC pretraining

You can also start from an offline Behavioral Cloning (BC) checkpoint and continue interactively. This example uses 50 offline demos (25 train, 25 val).

Generate demonstrations:

python src/askdagger_cliport/demos.py mode=train n=25
python src/askdagger_cliport/demos.py mode=val n=25

Pretrain with BC:

python src/askdagger_cliport/train.py \
  train.n_demos=25 train.n_val=25 train.n_steps=200 train.save_steps=[200]

Continue interactively with ASkDAgger:

python src/askdagger_cliport/train_interactive.py train_demos=25 train_steps=200

Evaluate:

python src/askdagger_cliport/eval.py train_demos=25

Notebook and Colab

A Jupyter notebook walks through the interactive training procedure and visualizes the novice's actions and demonstrations. Start Jupyter locally:

jupyter-lab $ASKDAGGER_ROOT

and open notebooks/askdagger_cliport.ipynb.

The notebook is also available on Colab.

Citation

If you find this work useful, please consider citing:

@article{luijkx2025askdagger,
  title   = {{ASkDAgger}: Active Skill-level Data Aggregation for Interactive Imitation Learning},
  author  = {Luijkx, Jelle and Ajanovi{\'c}, Zlatan and Ferranti, Laura and Kober, Jens},
  journal = {Transactions on Machine Learning Research (TMLR)},
  year    = {2025},
  url     = {https://openreview.net/forum?id=987Az9f8fT}
}

Acknowledgements

This work uses code from the following open-source projects and datasets.

CLIPort · repo · Apache 2.0. The code under src/ is primarily based on the CLIPort codebase. We added new files for interactive training — interactive_agent.py, pier.py, sag.py, train_interactive.py, uncertainty_quantification.py — removed some layers from clip_lingunet_lat.py and resnet_lat.py to reduce the GPU memory footprint, and replaced ReLU activations with LeakyReLU to mitigate vanishing-gradient issues observed during interactive training. Other changes are minor adjustments to support interactive training, demonstration relabeling, and prioritization with PIER.
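The ReLU-to-LeakyReLU swap mentioned above can be illustrated numerically. This is a plain-Python sketch of the activation gradients, not the repository's PyTorch code:

```python
# Gradients of ReLU vs. LeakyReLU at a negative pre-activation.
# With ReLU, a unit whose input stays negative receives zero gradient
# and stops learning ("dies"); LeakyReLU keeps a small gradient flowing.

def relu_grad(x):
    return 1.0 if x > 0 else 0.0

def leaky_relu_grad(x, negative_slope=0.01):
    return 1.0 if x > 0 else negative_slope

print(relu_grad(-2.0))        # 0.0
print(leaky_relu_grad(-2.0))  # 0.01
```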

CLIPort-Batchify · repo · Apache 2.0. We implemented batch training for CLIPort following the changes in this repo.

Google Ravens (TransporterNets) · repo · Apache 2.0. We use the tasks as adapted for CLIPort to include unseen objects as distractors, and added separate packing-seen-shapes and packing-unseen-shapes tasks.

OpenAI CLIP · repo · MIT. Used as adapted for CLIPort, with minor bug fixes.

Google Scanned Objects · dataset · CC BY 4.0. We use the objects as adapted for CLIPort with the center-of-mass fixed to the geometric center for selected objects.

U-Net · repo · GPL 3.0. Used as-is in src/askdagger_cliport/models/core/unet.py. Note: this file is GPL 3.0.