Website | Docs | Dataset | Model | NAVSIM Model | Supplementary | Paper
An open-source end-to-end driving stack for CARLA, achieving state-of-the-art closed-loop performance across all major Leaderboard 2.0 benchmarks.
## Table of Contents
- Updates
- Quick Start
- Training
- Data Collection
- Benchmarking
- Project Structure
- Common Issues
- Beyond CARLA: Cross-Benchmark Deployment
- Further Documentation
- Acknowledgements
- Citation
- License
## Updates

- **[2026/01/18] Deactivated Kalman filter.** By default, we deactivate the Kalman filter used for ego state estimation and GPS target-point smoothing to evaluate the policy in a fully end-to-end setting. While this may slightly reduce closed-loop performance, it avoids unrealistically noise-free target points. To turn the Kalman filter back on, set `use_kalman_filter=True` in config_closed_loop.py.
- **[2026/01/13] CARLA dataset and training documentation released.** We publicly release a CARLA dataset generated with the same pipeline as described in the paper. Note that due to subsequent refactoring and code cleanup, the released dataset differs from the original dataset used in our experiments. Performance on the new dataset is similar to the reported performance.
- **[2026/01/05] Deactivated stop-sign heuristic.** By default, we deactivate explicit stop-sign handling to evaluate the policy in a fully end-to-end setting. This may slightly reduce closed-loop performance compared to earlier runs. To turn the heuristic back on, set `slower_for_stop_sign=True` in config_closed_loop.py.
- **[2026/01/05] RoutePlanner bug fix.** Fixed an index error that caused the driving policy to crash at the end of routes in Town13. Driving scores have been updated accordingly.
- **[2025/12/24] Initial release.** Paper, checkpoints, expert driver, and inference code are now available.
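Both switches live in config_closed_loop.py. For reference, a minimal sketch of re-enabling them (the flag names come from the updates above; the surrounding file structure is an assumption, so adapt to your checkout):

```python
# config_closed_loop.py -- flag names from the updates above; the rest of
# the file's layout is assumed, check the actual config in your checkout.
use_kalman_filter = True      # Kalman-filtered ego state and GPS target points
slower_for_stop_sign = True   # explicit stop-sign heuristic
```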
## Quick Start

Clone the repository and map the project root to your environment:

```bash
git clone https://github.com/autonomousvision/lead.git
cd lead

# Setup environment, important!
echo -e "export LEAD_PROJECT_ROOT=$(pwd)" >> ~/.bashrc  # Set project root variable
echo "source $(pwd)/scripts/main.sh" >> ~/.bashrc       # Persist more environment variables
source ~/.bashrc                                        # Reload config
```

Please verify that ~/.bashrc reflects these paths correctly.
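As a quick sanity check that the variable is visible to new shells, a hypothetical one-liner (not part of the repository):

```python
# Verify the project root variable set in ~/.bashrc is visible to Python.
import os

root = os.environ.get("LEAD_PROJECT_ROOT")
assert root is not None, "LEAD_PROJECT_ROOT is unset; re-run `source ~/.bashrc`"
print(f"LEAD project root: {root}")
```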
We use Miniconda, conda-lock, and uv:
```bash
# Install conda-lock and create conda environment
pip install conda-lock && conda-lock install -n lead conda-lock.yml

# Activate conda environment
conda activate lead

# Install dependencies and set up git hooks
pip install uv && uv pip install -r requirements.txt && uv pip install -e .

# Install other tools needed for development
conda install -c conda-forge ffmpeg parallel tree gcc zip unzip

# Optional: Activate git hooks
pre-commit install
```
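Once the installation finishes, a quick import check confirms the editable install (a hypothetical smoke test, assuming `uv pip install -e .` registers the `lead` package):

```python
# Smoke test: the editable install should make the package importable
# from anywhere, resolving back into this repository.
import lead

print(lead.__file__)
```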
Meanwhile, we recommend setting up CARLA and downloading checkpoints in parallel:

```bash
# Download and set up CARLA at 3rd_party/CARLA_0915
bash scripts/setup_carla.sh
```

Pre-trained checkpoints are hosted on HuggingFace for reproducibility. These checkpoints follow the TFv6 architecture but differ in their sensor configurations, vision backbones, or dataset composition.
| Description | Bench2Drive | Longest6 v2 | Town13 | Checkpoint |
|---|---|---|---|---|
| Full TransFuser V6 | 95.2 | 62 | 5.24 | Link |
| ResNet34 backbone with 60M parameters | 94.7 | 57 | 5.01 | Link |
| Rear camera as additional input | 95.1 | 53 | TBD | Link |
| Radar sensor removed | 94.7 | 52 | TBD | Link |
| Vision only driving | 91.6 | 43 | TBD | Link |
| Removed Town13 from training set | 93.1 | 52 | 3.52 | Link |
To download checkpoints:

```bash
# Either download one for test purposes
bash scripts/download_one_checkpoint.sh

# Or clone them all (>10 GB)
git clone https://huggingface.co/ln2697/tfv6 outputs/checkpoints
cd outputs/checkpoints
git lfs pull
```

For VSCode, install the recommended extensions when prompted. We support debugging of data collection, training, and evaluation out of the box.
For PyCharm, add the CARLA Python API at 3rd_party/CARLA_0915/PythonAPI/carla to your Python path via Settings... → Python → Interpreter → Show All... → Show Interpreter Paths.
To initiate closed-loop evaluation and verify the setup, execute the following:

```bash
# Start driving environment
bash scripts/start_carla.sh

# Start policy on one route
python lead/leaderboard_wrapper.py \
    --checkpoint outputs/checkpoints/tfv6_resnet34 \
    --routes data/benchmark_routes/bench2drive/23687.xml \
    --bench2drive
```

Driving logs will be saved to outputs/local_evaluation with the following structure:
```
outputs/local_evaluation/1_town15_construction
├── 1_town15_construction_debug.mp4
├── 1_town15_construction_demo.mp4
├── 1_town15_construction_input.mp4
├── checkpoint_endpoint.json
├── debug_images
├── demo_images
├── input_images
├── input_log
├── infractions.json
├── metric_info.json
└── qualitative_results.mp4
```

Launch the interactive infraction dashboard to analyze driving failures more conveniently:
```bash
python lead/infraction_webapp/app.py
```

Navigate to http://localhost:5000 and enter outputs/local_evaluation in the input field to access the infraction dashboard, which is useful for analyzing large-scale evaluations.
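Infractions can also be inspected programmatically. A sketch assuming infractions.json maps infraction types to lists of events (verify the schema in your own logs first):

```python
# Hypothetical helper: count infraction events for a single route log.
import json
from pathlib import Path

log_dir = Path("outputs/local_evaluation/1_town15_construction")
with open(log_dir / "infractions.json") as f:
    infractions = json.load(f)

for name, events in infractions.items():  # assumed {type: [events]} layout
    print(f"{name}: {len(events)} event(s)")
```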
> [!TIP]
> - Disable video recording in config_closed_loop by turning off `produce_demo_video` and `produce_debug_video`.
> - If memory is limited, modify the file prefixes to load only the first checkpoint seed. By default, the pipeline loads all three seeds as an ensemble.
> - To save time, decrease the video FPS in config_closed_loop by increasing `produce_frame_frequency`.
## Training

For more detailed documentation, take a look at the documentation page. First, download the CARLA dataset from HuggingFace using git lfs:
```bash
# Download all routes
git clone https://huggingface.co/datasets/ln2697/lead_carla data/carla_leaderboard2/zip
cd data/carla_leaderboard2/zip
git lfs pull

# Or download a single route for testing
bash scripts/download_one_route.sh

# Unzip the routes
bash scripts/unzip_routes.sh

# Build data cache
python scripts/build_cache.py
```

Start pretraining:
```bash
# Train on a single GPU
python3 lead/training/train.py logdir=outputs/local_training/pretrain

# Or with Torch DDP
bash scripts/pretrain_ddp.sh
```

Training logs and checkpoints will be saved to outputs/local_training/pretrain. To fine-tune the pretrained model with the planning decoder enabled:
```bash
# Single GPU
python3 lead/training/train.py \
    logdir=outputs/local_training/posttrain \
    load_file=outputs/local_training/pretrain/model_0030.pth \
    use_planning_decoder=true

# Distributed Torch DDP
bash scripts/posttrain_ddp.sh
```

Post-training checkpoints will be saved to outputs/local_training/posttrain.
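Checkpoints such as model_0030.pth can be inspected before fine-tuning or evaluation. A sketch assuming a standard PyTorch checkpoint file (the stored object may be nested differently in practice):

```python
# Peek at a training checkpoint to verify it loads and gauge its size.
import torch

state = torch.load("outputs/local_training/pretrain/model_0030.pth", map_location="cpu")
if isinstance(state, dict):
    tensors = [v for v in state.values() if torch.is_tensor(v)]
    total = sum(t.numel() for t in tensors)
    print(f"{len(state)} entries, {total / 1e6:.1f}M tensor elements")
```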
For distributed training on SLURM, see this documentation page. For a complete SLURM workflow covering pre-training, post-training, and evaluation, see this example.
## Data Collection

To collect your own dataset, you can run the rule-based expert driver. To set up your own camera/LiDAR/radar calibration, see config_base.py and config_expert.py.
```bash
# Start CARLA
bash scripts/start_carla.sh

# Collect data
python lead/leaderboard_wrapper.py \
    --expert \
    --routes data/data_routes/lead/noScenarios/short_route.xml
```

Collected data will be saved to outputs/expert_evaluation/ with the following sensor outputs:
```
├── bboxes/                  # Per-frame 3D bounding boxes for all actors
├── depth/                   # Compressed and quantized depth maps
├── depth_perturbated/       # Depth from a perturbed ego state
├── hdmap/                   # Ego-centric rasterized HD map
├── hdmap_perturbated/       # HD map aligned to the perturbed ego pose
├── lidar/                   # LiDAR point clouds
├── metas/                   # Per-frame metadata and ego state
├── radar/                   # Radar detections
├── radar_perturbated/       # Radar detections from a perturbed ego state
├── rgb/                     # RGB images
├── rgb_perturbated/         # RGB images from a perturbed ego state
├── semantics/               # Semantic segmentation maps
├── semantics_perturbated/   # Semantics from a perturbed ego state
└── results.json             # Route-level summary and evaluation metadata
```

For large-scale data collection on SLURM clusters, see the data collection documentation. The Jupyter notebooks provide example scripts to visualize the collected data.
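For a quick look without the notebooks, per-frame metadata can be loaded directly. The file naming below is an assumption; the notebooks contain the canonical loaders:

```python
# Hypothetical example: read the metadata of the first collected frame.
import json
from pathlib import Path

route_dir = Path("outputs/expert_evaluation")              # one collected route
meta_files = sorted((route_dir / "metas").glob("*.json"))  # assumed JSON metadata

with open(meta_files[0]) as f:
    meta = json.load(f)
print(sorted(meta.keys()))  # available per-frame metadata and ego-state fields
```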
## Benchmarking

For more detailed documentation, take a look at the evaluation documentation.
Start the CARLA simulator before running evaluation:

```bash
bash scripts/start_carla.sh

# Bench2Drive
python lead/leaderboard_wrapper.py \
    --checkpoint outputs/checkpoints/tfv6_resnet34 \
    --routes data/benchmark_routes/bench2drive/23687.xml \
    --bench2drive

# Longest6 v2
python lead/leaderboard_wrapper.py \
    --checkpoint outputs/checkpoints/tfv6_resnet34 \
    --routes data/benchmark_routes/longest6/00.xml

# Town13
python lead/leaderboard_wrapper.py \
    --checkpoint outputs/checkpoints/tfv6_resnet34 \
    --routes data/benchmark_routes/Town13/0.xml

# Clean CARLA
bash scripts/clean_carla.sh
```

Results will be saved to outputs/local_evaluation/ with videos, infractions, and metrics. For distributed evaluation across multiple routes and benchmarks, see the SLURM evaluation documentation. For large-scale evaluation, we also provide a WandB logger.
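When evaluating many routes, the per-route checkpoint_endpoint.json files can be aggregated. The key path below follows the CARLA Leaderboard results format but is an assumption; adjust it to your own logs:

```python
# Hypothetical aggregation: average the composed driving score across routes.
import json
from pathlib import Path

scores = []
for result in Path("outputs/local_evaluation").glob("*/checkpoint_endpoint.json"):
    with open(result) as f:
        data = json.load(f)
    record = data.get("_checkpoint", {}).get("global_record", {})  # assumed layout
    score = record.get("scores_mean", {}).get("score_composed")
    if score is not None:
        scores.append(float(score))

if scores:
    print(f"Mean driving score over {len(scores)} routes: {sum(scores) / len(scores):.2f}")
```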
## Project Structure

The project is organized into several key directories:

- `lead` - Main Python package containing model architecture, training, inference, and the expert driver
- `3rd_party` - Third-party dependencies (CARLA, benchmarks, evaluation tools)
- `data` - Route definitions for training and evaluation; sensor data is also stored here
- `scripts` - Utility scripts for data processing, training, and evaluation
- `outputs` - Model checkpoints, evaluation results, and visualizations
- `notebooks` - Jupyter notebooks for data inspection and analysis
- `slurm` - SLURM job scripts for large-scale experiments
For a detailed breakdown of the codebase organization, see the project structure documentation.
## Common Issues

Most issues can be solved by:

- Deleting and rebuilding the training cache / buckets.
- Restarting the CARLA simulator.

When debugging the policy or expert, the script scripts/reset_carla_world.py can be handy for resetting the current map without restarting the simulator; the latter can be time-costly, especially on larger maps.
## Beyond CARLA: Cross-Benchmark Deployment

The LEAD pipeline and TFv6 models are deployed as reference implementations and benchmark entries across multiple autonomous driving simulators and evaluation suites:

- **Waymo Vision-based End-to-End Driving Challenge (DiffusionLTF)**: A strong baseline entry for the inaugural end-to-end driving challenge hosted by Waymo, achieving 2nd place on the final leaderboard.
- **NAVSIM v1 (LTFv6)**: Latent TransFuser v6 is an updated reference baseline for the `navtest` split, improving PDMS by +3 points over the Latent TransFuser baseline; it is used to evaluate navigation and control under diverse driving conditions.
- **NAVSIM v2 (LTFv6)**: The same Latent TransFuser v6 improves EPDMS by +6 points over the Latent TransFuser baseline, targeting distribution shift and scenario complexity.
- **NVIDIA AlpaSim Simulator (TransFuserModel)**: Adapting NAVSIM's Latent TransFuser v6 checkpoints, AlpaSim also features an official TransFuser driver, serving as a baseline policy for closed-loop simulation.
## Further Documentation

For more detailed instructions, see the full documentation.
## Acknowledgements

Special thanks to carla_garage for the foundational codebase. We also thank the creators of the numerous open-source projects and other helpful repositories we build on.
Long Nguyen led development of the project. Kashyap Chitta, Bernhard Jaeger, and Andreas Geiger contributed through technical discussion and advisory feedback. Daniel Dauner provided guidance with NAVSIM.
## Citation

If you find this work useful, please consider giving this repository a star ⭐ and citing our work in your research:
```bibtex
@article{Nguyen2025ARXIV,
  title={LEAD: Minimizing Learner-Expert Asymmetry in End-to-End Driving},
  author={Nguyen, Long and Fauth, Micha and Jaeger, Bernhard and Dauner, Daniel and Igl, Maximilian and Geiger, Andreas and Chitta, Kashyap},
  journal={arXiv preprint arXiv:2512.20563},
  year={2025}
}
```

## License

This project is released under the MIT License. See LICENSE for details.


