Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Publications
Paper Title Number 1
Published in Journal 1, 2009
This paper is about the number 1. The number 2 is left for future work.
Recommended citation: Your Name, You. (2009). "Paper Title Number 1." Journal 1. 1(1).
Download Paper | Download Slides
Paper Title Number 2
Published in Journal 1, 2010
This paper is about the number 2. The number 3 is left for future work.
Recommended citation: Your Name, You. (2010). "Paper Title Number 2." Journal 1. 1(2).
Download Paper | Download Slides
Paper Title Number 3
Published in Journal 1, 2015
This paper is about the number 3. The number 4 is left for future work.
Recommended citation: Your Name, You. (2015). "Paper Title Number 3." Journal 1. 1(3).
Download Paper | Download Slides
Paper Title Number 4
Published in GitHub Journal of Bugs, 2024
This paper is about fixing template issue #693.
Recommended citation: Your Name, You. (2024). "Paper Title Number 4." GitHub Journal of Bugs. 1(3).
Download Paper
PipelineRL: Faster On-policy Reinforcement Learning for Long Sequence Generation
Published in arXiv preprint arXiv:2509.19128, 2025
We present PipelineRL, a fully pipelined on-policy reinforcement learning stack for large language models that keeps accelerators saturated without generating stale, off-policy trajectories. The system overlaps rollout, preference modeling, and policy updates to deliver up to 2× faster convergence on long-context reasoning benchmarks while retaining stability guarantees for PPO-style algorithms.
Recommended citation: Alexandre Piché, Ehsan Kamalloo, Rafael Pardinas, Xiaoyin Chen, Dzmitry Bahdanau. (2025). "PipelineRL: Faster On-policy Reinforcement Learning for Long Sequence Generation." arXiv preprint arXiv:2509.19128.
Download Paper
Talks
AI safety and reward misspecification
Published:




