Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

Publications

Paper Title Number 4

Published in GitHub Journal of Bugs, 2024

This paper is about fixing template issue #693.

Recommended citation: Your Name, You. (2024). "Paper Title Number 3." GitHub Journal of Bugs. 1(3).
Download Paper

PipelineRL: Faster On-policy Reinforcement Learning for Long Sequence Generation

Published in arXiv preprint arXiv:2509.19128, 2025

We present PipelineRL, a fully pipelined on-policy reinforcement learning stack for large language models that keeps accelerators saturated without generating stale, off-policy trajectories. The system overlaps rollout, preference modeling, and policy updates to deliver up to 2× faster convergence on long-context reasoning benchmarks while retaining stability guarantees for PPO-style algorithms.

Recommended citation: Alexandre Piché, Ehsan Kamalloo, Rafael Pardinas, Xiaoyin Chen, Dzmitry Bahdanau. (2025). "PipelineRL: Faster On-policy Reinforcement Learning for Long Sequence Generation." arXiv preprint arXiv:2509.19128.
Download Paper

Talks