Claas A. Voelcker
Postdoc at the University of Texas at Austin, RL researcher focused on too many things, he/him, 🏳️‍🌈 🤖 🧙
GDC 4.306
2317 SPEEDWAY
AUSTIN, TX 78712
I am a postdoc in Reinforcement Learning and Machine Learning at the University of Texas at Austin, where I work with Peter Stone and Amy Zhang. Previously, I received my PhD from the University of Toronto and the Vector Institute, where I was fortunate to be advised by Profs. Amir-massoud Farahmand and Igor Gilitschenski.
Originally from Germany, I received my Bachelor's and Master's degrees with honors from the University of Darmstadt. There, I had the great pleasure of being supervised and mentored by Profs. Kristian Kersting and Jan Peters.
I am proud to serve as a core organizer for Queer in AI, where I help promote the interests of queer researchers and practitioners at AI/ML conferences and in the wider community.
Research Vision
To make good decisions, intelligent agents need to evaluate the consequences and quality of their actions. In Reinforcement Learning, this quality is captured by the value function. My driving research question is how to enable autonomous agents to learn good value functions and to accurately estimate the impact of their actions. To this end, I have worked on a variety of techniques that make value learning fast, efficient, and accurate.
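For readers newer to the field, the object at the center of this agenda can be written down compactly. Below is the standard textbook definition of the action-value function, in generic notation not tied to any one of my papers: it measures the expected discounted return of taking action $a$ in state $s$ and following the policy $\pi$ afterwards.

$$
Q^{\pi}(s, a) \;=\; \mathbb{E}_{\pi}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \;\middle|\; s_0 = s,\; a_0 = a \right]
$$

Learning accurate estimates of $Q^{\pi}$ from limited interaction data is the common thread running through the works below.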
Works that train and leverage value functions
- In Update-free Steering, we show how value functions can be used to improve pre-trained robotics policies at execution time.
- In REPPO, we present an algorithm that leverages strong action-value function learning for lightning-fast on-policy improvement on hard robotics tasks.
- In MAD-TD, we look at how simulated data from a learned world model can improve an agent's value estimation.
- In When does self-prediction help, we take a look at different auxiliary tasks and explain how they help stabilize value learning.
- In Dissecting Deep RL, we investigate architectural regularizations that prevent agents from overestimating the value of their actions. (A minimal sketch of the value-learning update these works build on follows below.)
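All of these works build, in one way or another, on the temporal-difference backbone of value learning. As an illustration only, here is a minimal tabular Q-learning sketch; the problem size, step size, and function names are hypothetical placeholders, and none of the papers above use this exact tabular form:

```python
import numpy as np

# Minimal tabular Q-learning sketch (illustrative toy example only):
# the agent improves its value estimates by bootstrapping from its
# own current predictions, the core idea behind temporal-difference
# (TD) value learning.

n_states, n_actions = 10, 4   # hypothetical toy problem size
gamma, alpha = 0.99, 0.1      # discount factor and learning rate
Q = np.zeros((n_states, n_actions))

def td_update(s, a, r, s_next):
    """One TD backup for the action value Q(s, a)."""
    target = r + gamma * Q[s_next].max()   # bootstrap from current estimate
    Q[s, a] += alpha * (target - Q[s, a])  # move estimate toward the target
```

Deep RL replaces the table with a neural network and the per-entry update with gradient steps on the squared TD error; much of the research above is about keeping exactly this bootstrapped update stable and accurate at scale.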
news
| Mar 12, 2026 | New papers! We released a preprint of “Update-Free On-Policy Steering via Verifiers”, a method that uses on-policy value functions to steer pre-trained robotics policies in the real world! We are also very grateful that “Relative Entropy Pathwise Policy Optimization” was accepted to ICLR 2026. See you in Rio! |
|---|---|
| Feb 12, 2026 | I gave a talk on REPPO at the BeNeRL Seminar. You can find my slides here. |
| Nov 04, 2025 | I finally started my postdoc position at the University of Texas at Austin! So excited for the coming years filled with RL and robotics discoveries. |
| Oct 02, 2025 | Our new paper Relative Entropy Pathwise Policy Optimization has a blog post that goes through everything you need to know about implementing it yourself and understanding the technical bits and pieces. |
| Jul 01, 2025 | Our paper Calibrated Value-Aware Model Learning with Probabilistic Environment Models will be presented at ICML 2025 in Vancouver next week! Let me know if you want to meet up for a coffee. |
latest posts
| Oct 02, 2025 | Relative Entropy Pathwise Policy Optimization - Technical Overview |
|---|---|
| Oct 02, 2025 | REPPO - Why build a new algorithm |
| Jul 21, 2025 | Loss Functions and Calibration |
selected publications
2026
- Update-Free On-Policy Steering via Verifiers, Mar 2026