Scaling Latent Reasoning via Looped Language Models

Abstract

Modern LLMs are trained to think primarily via explicit text generation, such as chain-of-thought (CoT), which defers reasoning to post-training and under-leverages pre-training data. We present and open-source Ouro, named after the recursive Ouroboros, a family of pre-trained Looped Language Models (LoopLM) that build reasoning into pre-training through (i) iterative computation in latent space, (ii) an entropy-regularized objective for learned depth allocation, and (iii) scaling to 7.7T tokens. The Ouro 1.4B and 2.6B models match the performance of state-of-the-art LLMs of up to 12B parameters across diverse benchmarks, not through increased knowledge capacity but through superior knowledge manipulation. LoopLM produces reasoning traces that align more closely with final outputs than explicit CoT, highlighting LoopLM as a promising scaling direction for the reasoning era.
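To make the two core ideas in the abstract concrete, here is a minimal, hypothetical sketch of a looped language model: a shared transformer block applied repeatedly in latent space, paired with an entropy-regularized objective over a learned depth-allocation distribution. All module names, shapes, hyperparameters, and the exact form of the loss are illustrative assumptions, not the released Ouro implementation.

```python
# Minimal sketch of a looped language model (LoopLM).
# Assumptions: the exit head, loop count, and loss form are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoopedLM(nn.Module):
    def __init__(self, vocab_size, d_model=512, n_heads=8,
                 n_shared_layers=4, max_loops=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # One shared block of layers that is re-applied ("looped") in latent space.
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.shared_block = nn.TransformerEncoder(layer, num_layers=n_shared_layers)
        # Hypothetical exit head: scores how much probability mass to place
        # on stopping after each loop iteration (learned depth allocation).
        self.exit_head = nn.Linear(d_model, 1)
        self.lm_head = nn.Linear(d_model, vocab_size)
        self.max_loops = max_loops

    def forward(self, input_ids):
        h = self.embed(input_ids)               # [batch, seq, d_model]
        exit_scores, step_logits = [], []
        for _ in range(self.max_loops):
            h = self.shared_block(h)            # iterative computation in latent space
            exit_scores.append(self.exit_head(h).mean(dim=1))  # [batch, 1] per depth
            step_logits.append(self.lm_head(h))                # prediction at this depth
        # Depth-allocation distribution over loop counts, one per sequence.
        q = F.softmax(torch.cat(exit_scores, dim=-1), dim=-1)  # [batch, max_loops]
        return q, step_logits


def depth_regularized_loss(q, step_logits, targets, beta=0.01):
    """Expected LM loss over depths plus an entropy bonus on the depth
    distribution q. This is only a sketch of an entropy-regularized
    depth-allocation objective; Ouro's actual objective may differ."""
    per_depth = torch.stack([
        F.cross_entropy(logits.transpose(1, 2), targets, reduction="none").mean(dim=1)
        for logits in step_logits
    ], dim=-1)                                  # [batch, max_loops]
    expected_loss = (q * per_depth).sum(dim=-1).mean()
    entropy = -(q * (q + 1e-9).log()).sum(dim=-1).mean()
    return expected_loss - beta * entropy
```

Because the block weights are shared across iterations, extra "depth" costs compute rather than parameters, and the entropy term keeps the model from collapsing onto a single loop count too early in training.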

Type
Industry research project
Publication

Paper

Tianyu Zhang
Ph.D. Student in Machine Learning

My research interests include Algorithmic Game Theory, Agent-based Model Simulation, AI for Climate Change, Multi-agent Reinforcement Learning, Self-supervised Learning, and Domain Adaptation. I am still exploring and learning.