[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2348":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":14,"forks30d":14,"starsTrendScore":18,"compositeScore":19,"rankGlobal":9,"rankLanguage":9,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":23,"topics":24,"createdAt":9,"pushedAt":9,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":14,"starSnapshotCount":14,"syncStatus":13,"lastSyncTime":28,"discoverSource":29},2348,"Awesome-Loop-Models","huskydoge\u002FAwesome-Loop-Models","huskydoge","A curated list of papers and selected technical blogs on Loop Models.",null,"Python",161,5,2,0,7,12,64,21,68.73,"MIT License",false,"main",true,[],"2026-06-12 04:00:14","\n\u003Cdiv align=\"center\">\n\n# Awesome Loop Models\n\n[![Awesome](https:\u002F\u002Fawesome.re\u002Fbadge.svg)](https:\u002F\u002Fawesome.re)\n[![Submit](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F📄%20Submit-blue?style=flat-square)](https:\u002F\u002Fhuskydoge.github.io\u002FAwesome-Loop-Models\u002Fsubmit.html)\n[![Website](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🌐%20Live%20Website-Link-blue?style=flat-square)](https:\u002F\u002Fhuskydoge.github.io\u002FAwesome-Loop-Models\u002Findex.html)\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT)\n\n\u003Cimg src=\"assets\u002Fcover.png\" alt=\"Loop architecture concept diagram\" width=\"100%\" \u002F>\n\n### 🌐 [**Interactive Browser**](https:\u002F\u002Fhuskydoge.github.io\u002FAwesome-Loop-Models\u002Findex.html) · 🧾 [**PR Submission Guide**](https:\u002F\u002Fhuskydoge.github.io\u002FAwesome-Loop-Models\u002Fsubmit.html)\n\n*Search, filter, and 
explore loop-model papers and selected technical blogs with links to arXiv, code, OpenReview, and more.*\n\n*Use the PR Submission Guide to generate YAML for papers or blogs, then copy the path and YAML into your fork \u002F branch for the final pull request step.*\n\n\u003C\u002Fdiv>\n\nA curated list of papers and selected long-form technical blogs on **Loop Models** — architectures where, within a single forward process, a shared learned internal layer, block, module, or operator is reused.\n\n---\n\n## News\n\n- **2026-04-24** — Awesome Loop Models is released. [Announcement](https:\u002F\u002Fx.com\u002Fhuskydogewoof\u002Fstatus\u002F2047655947942744285)\n\n---\n\n## What Counts as a Loop Model?\n\nThis repository uses a strict definition:\n\n> By \"loop model,\" we mean that, within a single forward pass of a model, a shared learned internal layer, block, module, or operator is reused.\n\nThis repo therefore includes papers that focus on loop models themselves, their mechanisms, applications, and designs. It excludes papers that are primarily about broader-scale iteration patterns that do not directly connect to loop models as defined above, such as agent loops, repeated full-model calls, external solver rounds, energy-based models, or plain sequence-time recurrence.\n\n> Admittedly, loop models are deeply connected to the broader field of architecture and algorithm design (Diffusion, Energy-Based Models, etc.). 
We also welcome work that explicitly connects adjacent topics to loop models.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fscope.png\" alt=\"Scope scale from agent loop to loop models\" width=\"100%\" \u002F>\n\u003C\u002Fp>\n\n> Only the **rightmost end** of this scale is in scope for the main paper list.\n\n## How the Repository Is Organized\n\nThe public browsing layer uses exactly three flat paper categories:\n- **Theoretical and Mechanical Analysis** — analytical papers whose main reader takeaway is understanding: theory, mechanism analysis, probing, diagnostics, or formal properties\n- **Architecture and Algorithm Designs** — papers that propose loop-model architectures or algorithms, often for better performance, efficiency, training, inference, or memory use\n- **Applications Focused** — papers whose main reader takeaway is loop-model performance on concrete external domains or tasks, such as robotics, VLA, multimodal tasks, tabular data, or graph data\n\nIn addition, selected long-form technical posts live in a separate flat **Blogs** section. Blogs can carry tags, but they do not use the paper taxonomy.\n\n> The paper categories are intentionally coarse. Foundation status plus Loop Mechanism \u002F focus \u002F domain tags carry secondary structure without introducing a separate lineage-tag axis.\n\nTop-level categories do the minimum amount of work. 
Finer distinctions are pushed into:\n- **Loop Mechanism** (`mechanism_tags`) — loop-form labels only: `hierarchical-loop`, `flat-loop`, `parallel-loop`, or `implicit-layer`\n- `focus_tags` — whether the paper mainly studies `objective-loss`, `training-algorithm`, `architecture`, `data`, or `inference-algorithm`\n- `domain_tags` — problem\u002Fdomain labels such as `language-modeling`, `robotics-vla`, `multimodal`, `tabular-data`, or `graph-data`\n- `tags` — optional aliases or model identifiers kept in YAML \u002F README metadata, such as `DEQ`, `UT`, `ACT`, or `Ouro`\n\nA paper can also carry `foundation: true` as a secondary badge when it is a canonical anchor such as ACT, Universal Transformers, or DEQ. Foundation is no longer a separate top-level shelf.\n\nIn the interactive browser, the visible tag filters are **Loop Mechanism**, `focus_tags`, and `domain_tags`. Alias-style `tags` are not shown as browser filter chips there.\n\nSee [TAGS.md](TAGS.md) for the current tag inventory and preferred spellings before proposing a new tag.\n\nSee [TAXONOMY.md](TAXONOMY.md) for the full inclusion rule, paper category definitions, tie-break rules, and the flat Blogs-section rule.\n\n---\n\n## Table of Contents\n\n- [Theoretical and Mechanical Analysis](#theoretical-and-mechanical-analysis) (21)\n- [Architecture and Algorithm Designs](#architecture-and-algorithm-designs) (57)\n- [Applications Focused](#applications-focused) (8)\n- [Blogs](#blogs) (6)\n\n> The paper shelves are intentionally coarse: Theoretical and Mechanical Analysis, Architecture and Algorithm Designs, and Applications Focused. Foundation status plus Loop Mechanism \u002F focus \u002F domain tags carry secondary structure without introducing lineage buckets.\n> Blogs are a separate flat section: they can carry tags, but they do not use the paper taxonomy.\n\n---\n\n\u003C!-- AUTO-GENERATED by scripts\u002Fbuild.py on 2026-05-28 07:26 UTC — DO NOT EDIT the lists below manually. 
Edit papers\u002F*.yaml or blogs\u002F*.yaml and run `python3 scripts\u002Fbuild.py` instead. -->\n\n## Theoretical and Mechanical Analysis\n\nTheoretical and Mechanical Analysis collects papers whose primary contribution is analysis: why loop models work, what formal properties they have, and what mechanisms they exhibit.\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F26\u002F2026] \u003Cstrong>Stabilizing Recurrent Dynamics for Test-Time Scalable Latent Reasoning in Looped Language Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.26733\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.26733-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.26733\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.26733-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Xiao-Wen Yang, Ziyu Han, Xi-Hua Zhang, Wen-Da Wei, Jie-Jing Shao, Lan-Zhe Guo, Yu-Feng Li · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · theory · scaling\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Analyzes why Looped Language Models can collapse at larger recurrence depths and proposes STARS, a spectral-radius-regularized training framework that pushes latent dynamics toward stable fixed points for reliable test-time scaling.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F20\u002F2026] \u003Cstrong>Interaction Locality in Hierarchical Recursive Reasoning\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.20784\">\u003Cimg alt=\"arXiv\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.20784-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.20784\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.20784-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Yosuke Miyanishi, Tetsuro Morimura · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> hierarchical-loop · flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> reasoning · algorithmic-reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Proposes interaction locality as a mechanistic measurement framework for HRM and TRM, showing how repeated recursive updates accumulate local writes into broader solution structure on grid reasoning benchmarks.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F18\u002F2026] \u003Cstrong>One Model, Two Roles: Emergent Specialization in a Shared Recurrent Transformer\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.17811\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.17811-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.17811\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.17811-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Jucheng Shen, Barbara Su, Anastasios Kyrillidis · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop · hierarchical-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  
\u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> reasoning · algorithmic-reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Analyzes Asymmetric Input Recurrence, a two-state shared-weight recurrent Transformer where the same model updates L\u002FH states, showing that state identity and input-injection asymmetry induce distinct proposal-vs-uncertainty roles on Sudoku-Extreme and Maze.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F08\u002F2026] \u003Cstrong>Bifurcation Models: Learning Set-Valued Solution Maps with Weight-Tied Dynamics\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.07277\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.07277-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.07277\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.07277-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Caleb Jore, Jialin Liu · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop · implicit-layer\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> theory · algorithmic-reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Studies weight-tied dynamics for set-valued solution maps, proving that regular equilibrium dynamics can represent multiple branches while repeated shared-operator iterations discover multiple valid equilibria on Ising and Allen-Cahn tasks.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F07\u002F2026] \u003Cstrong>Is One Layer Enough? 
Understanding Inference Dynamics in Tabular Foundation Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.06510\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.06510-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.06510\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.06510-7c3aed.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Famirbalef\u002Fis_one_layer_enough\u002Fstargazers\">\u003Cimg alt=\"GitHub stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Famirbalef\u002Fis_one_layer_enough?style=social\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Amir Rezaei Balef, Mykhailo Koshil, Katharina Eggensperger · ICML 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> tabular-data · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Analyzes layerwise inference dynamics in tabular foundation models and uses the observed depth redundancy to build a looped single-layer model that preserves comparable performance with about 20% of the original parameters.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F07\u002F2026] \u003Cstrong>Transformers Efficiently Perform In-Context Logistic Regression via Normalized Gradient Descent\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.06609\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.06609-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.06609\">\u003Cimg alt=\"AlphaXiv\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.06609-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Chenyang Zhang, Yuan Cao · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm · training-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> theory · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Proves that softmax transformers can implement in-context logistic regression by treating layers as normalized-gradient-descent steps, then trains one self-attention layer and applies it recurrently as a looped model with convergence and OOD guarantees.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F28\u002F2026] \u003Cstrong>On Halting vs Converging in Recurrent Graph Neural Networks\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.25551\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.25551-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.25551\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.25551-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Jeroen Bollen, Stijn Vansummeren · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> theory · algorithmic-reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Analyzes recurrent graph neural networks that repeatedly apply message passing until convergence or 
halting, proving expressiveness relationships between converging, output-converging, and halting RGNN variants.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F23\u002F2026] \u003Cstrong>Universal Transformers Need Memory: Depth-State Trade-offs in Adaptive Recursive Reasoning\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.21999\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.21999-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.21999\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.21999-7c3aed.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fche-shr-cat\u002Futm-jax\u002Fstargazers\">\u003Cimg alt=\"GitHub stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fche-shr-cat\u002Futm-jax?style=social\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Grigory Sapunov · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> reasoning · algorithmic-reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Studies a single-block Universal Transformer with ACT on Sudoku-Extreme, showing that learned memory tokens are necessary for non-trivial recursive-depth reasoning and that ACT initialization can trap the model in shallow computation.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F22\u002F2026] \u003Cstrong>How Much Is One Recurrence Worth? 
Iso-Depth Scaling Laws for Looped Language Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.21106\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.21106-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.21106\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.21106-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Kristian Schwethelm, Daniel Rueckert, Georgios Kaissis · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · scaling · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Measures the parameter value of recurrence in looped language models with iso-depth scaling laws, estimating how extra recurrent passes trade off against unique depth and training compute.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F16\u002F2026] \u003Cstrong>Stability and Generalization in Looped Transformers\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.15259\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.15259-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.15259\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.15259-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Asher Labovich · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop · implicit-layer\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> 
inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> reasoning · theory\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Analyzes stability and generalization in looped transformers through a fixed-point framework, characterizing when recall and normalization yield reachable, input-dependent, and trainable loop dynamics.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F15\u002F2026] \u003Cstrong>Hierarchical vs. Flat Iteration in Shared-Weight Transformers\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.14442\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.14442-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.14442\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.14442-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Sang-Il Han · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop · hierarchical-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · scaling\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Empirically compares hierarchical shared-weight recurrence against flat shared-weight iteration and independent-layer stacking, revealing a persistent representational gap for the recurrent hierarchy.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F13\u002F2026] \u003Cstrong>A Mechanistic Analysis of Looped Reasoning Language Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.11791\">\u003Cimg alt=\"arXiv\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.11791-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.11791\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.11791-7c3aed.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fx.com\u002FHughBlayney\u002Fstatus\u002F2046558050882899995?s=20\">\u003Cimg alt=\"Twitter\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTwitter-%40HughBlayney-1d9bf0.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Hugh Blayney, Álvaro Arroyo, Johan Obando-Ceron, Pablo Samuel Castro, Aaron Courville, Michael M. Bronstein, Xiaowen Dong · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> implicit-layer\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Analyzes looped reasoning LLMs mechanistically, showing recurrent cycles converge to layer-specific fixed points and that feedforward-like inference stages repeat across latent recurrences.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F10\u002F2026] \u003Cstrong>Relational Preference Encoding in Looped Transformer Internal States\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.09870\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.09870-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.09870\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.09870-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Jan Kirin · 2026\u003C\u002Fdiv>\n  
\u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> training-algorithm · architecture\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · alignment\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Probes looped transformer hidden states during iterative refinement, showing that human-preference information is encoded primarily in relational differences between loop states rather than independent per-state scores.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F09\u002F2026] \u003Cstrong>Loop, Think, &amp; Generalize: Implicit Reasoning in Recurrent-Depth Transformers\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.07822\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.07822-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.07822\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.07822-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Harsh Kohli, Srinivasan Parthasarathy, Huan Sun, Yuekun Yao · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop · implicit-layer\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Studies implicit reasoning in recurrent-depth transformers, showing that iterating shared transformer layers can unlock systematic generalization and depth extrapolation while also exposing overthinking limits.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[02\u002F05\u002F2026] 
\u003Cstrong>Inverse Depth Scaling From Most Layers Being Similar\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.05970\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2602.05970-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2602.05970\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2602.05970-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Yizhou Liu, Sara Kangaslahti, Ziming Liu, Jeff Gore · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · theory\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fx.com\u002Fhuskydogewoof\u002Fstatus\u002F2034158020322574556?s=20\">X Comment\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Analyzes LLMs and toy residual networks to show loss scales inversely with depth when many layers are functionally similar and primarily reduce error via ensemble averaging.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[09\u002F27\u002F2025] \u003Cstrong>Two-Scale Latent Dynamics for Recurrent-Depth Transformers\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.23314\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2509.23314-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2509.23314\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2509.23314-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> 
Francesco Pappone, Donato Crisostomi, Emanuele Rodolà · 2025\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Analyzes recurrent-depth transformers through a two-scale latent-dynamics lens, showing shrinking and increasingly orthogonal loop updates and deriving a second-order early-exit criterion that improves latency-quality trade-offs.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[07\u002F02\u002F2025] \u003Cstrong>Latent Chain-of-Thought? Decoding the Depth-Recurrent Transformer\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.02199\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2507.02199-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2507.02199\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2507.02199-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Wenquan Lu, Yuechuan Yang, Kyle Lee, Yanshu Li, Enqi Liu · 2025\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Probes a depth-recurrent Transformer to test whether latent chain-of-thought structure emerges across recurrence steps, finding limited evidence and recurrence-depth-dependent interpretability effects.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  
\u003Csummary>[02\u002F24\u002F2025] \u003Cstrong>Reasoning with Latent Thoughts: On the Power of Looped Transformers\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.17416\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2502.17416-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2502.17416\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2502.17416-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li, Sanjiv Kumar, Sashank J. Reddi · ICLR 2025\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fx.com\u002Freza_byt\u002Fstatus\u002F2045168844658950392?s=20\">Reza Bayat reading list (#7)\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Studies looped transformers as reasoning models, showing effective-depth scaling, latent-thought simulation of chain-of-thought, and a looping-based regularizer that improves the reasoning-versus-memorization trade-off.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[10\u002F02\u002F2024] \u003Cstrong>On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.01405\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2410.01405-b31b1b.svg\">\u003C\u002Fa> \u003Ca 
href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2410.01405\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2410.01405-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Kevin Xu, Issei Sato · 2024\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · theory\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Analyzes the expressive power of looped transformers, derives approximation-rate limits, and shows that timestep encoding improves their function-approximation behavior.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[11\u002F21\u002F2023] \u003Cstrong>Looped Transformers are Better at Learning Learning Algorithms\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.12424\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2311.12424-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2311.12424\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2311.12424-7c3aed.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FLeiay\u002Flooped_transformer\u002Fstargazers\">\u003Cimg alt=\"GitHub stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FLeiay\u002Flooped_transformer?style=social\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fopenreview.net\u002Fforum?id=HHbRxoDTxE\">\u003Cimg alt=\"OpenReview\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOpenReview-Paper-8E44AD.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Liu Yang, Kangwook Lee, Robert Nowak, Dimitris 
Papailiopoulos · ICLR 2024\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> algorithmic-reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fx.com\u002Fhuskydogewoof\u002Fstatus\u002F2033023167044727049?s=20\">Benhao&#x27;s reading note\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fx.com\u002Freza_byt\u002Fstatus\u002F2045168844658950392?s=20\">Reza Bayat reading list (#5)\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Proposes looped-transformer training for in-context data-fitting tasks, showing comparable performance to standard transformers with under 10% of the parameters by better matching iterative learning algorithms.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[01\u002F30\u002F2023] \u003Cstrong>Looped Transformers as Programmable Computers\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.13196\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2301.13196-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2301.13196\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2301.13196-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Angeliki Giannou, Shashank Rajput, Jy-yong Sohn, Kangwook Lee, Jason D. 
Lee, Dimitris Papailiopoulos · 2023\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> algorithmic-reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fx.com\u002Freza_byt\u002Fstatus\u002F2045168844658950392?s=20\">Reza Bayat reading list (#4)\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Shows that a shallow looped transformer can emulate instruction-set computation and iterative algorithms such as SGD or matrix inversion, with the recurrence acting as a reusable program counter.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n---\n\n## Architecture and Algorithm Designs\n\nArchitecture and Algorithm Designs collects the constructive side of the field: new looped architectures, algorithms, recurrent computation graphs, and efficiency or memory-compression methods.\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F25\u002F2026] \u003Cstrong>Looped Diffusion Language Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.26106\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.26106-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.26106\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.26106-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Sanghyun Lee, Chunsan Hong, Seungryong Kim, Jonghyun Lee, Jongho Park, Dongmin Park · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm · 
inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · efficiency · scaling\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces LoopMDM, selectively looping early-middle transformer layers in masked diffusion language models so training gains depth-scaling without extra parameters and inference can vary loop count for compute scaling.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F22\u002F2026] \u003Cstrong>Training-Free Looped Transformers\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.23872\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.23872-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.23872\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.23872-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Lizhang Chen, Jonathan Li, Chen Liang, Ni Lao, Qiang Liu · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · efficiency · scaling\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Retrofits frozen pretrained transformers with a training-free inference wrapper that repeatedly applies a contiguous mid-stack layer block as damped refinement sub-steps, improving several QA and reasoning benchmarks without fine-tuning.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F20\u002F2026] \u003Cstrong>Equilibrium Reasoners: Learning Attractors Enables Scalable Reasoning\u003C\u002Fstrong> \u003Ca 
href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.21488\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.21488-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.21488\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.21488-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Benhao Huang, Zhengyang Geng, Zico Kolter · ICML 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop · implicit-layer\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> reasoning · algorithmic-reasoning · scaling\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Formalizes Equilibrium Reasoners as learned latent dynamical systems whose repeated update rule converges toward task-conditioned attractors, enabling depth and breadth test-time scaling for reasoning.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F20\u002F2026] \u003Cstrong>LT2: Linear-Time Looped Transformers\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.20670\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.20670-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.20670\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.20670-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Chunyuan Deng, Yizhe Zhang, Rui-Jie Zhu, Yuanyuan Xu, Jiarui Liu, T. S. 
Eugene Ng, Hanjie Chen · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · efficiency · scaling\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces LT2, a looped-transformer family that replaces quadratic attention with linear or sparse attention so repeated loop steps refine memory and expand receptive field while keeping inference more scalable.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F19\u002F2026] \u003Cstrong>Generative Recursive Reasoning\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.19376\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.19376-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.19376\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.19376-7c3aed.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fahn-ml.github.io\u002Fgram-website\u002F\">\u003Cimg alt=\"Website\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-Link-blue\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Junyeob Baek, Mingyu Jo, Minsu Kim, Mengye Ren, Yoshua Bengio, Sungjin Ahn · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop · parallel-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · objective-loss · training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> reasoning · algorithmic-reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces GRAM, a probabilistic 
recursive-reasoning framework that models reasoning as stochastic latent trajectories, enabling multi-hypothesis computation, variational training, and inference-time scaling through depth and parallel sampling.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F19\u002F2026] \u003Cstrong>Probabilistic Tiny Recursive Model\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.19943\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.19943-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.19943\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.19943-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Amin Sghaier, Ali Parviz, Alexia Jolicoeur-Martineau · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> hierarchical-loop · flat-loop · parallel-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> reasoning · algorithmic-reasoning · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces PTRM, an inference-time scaling framework for Tiny Recursive Models that injects Gaussian noise into recursive latent updates, runs parallel trajectories, and selects the final answer with the model&#x27;s Q head without retraining.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F18\u002F2026] \u003Cstrong>HRM-Text: Efficient Pretraining Beyond Scaling\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fsapientinc.github.io\u002FHRM-Text\u002Fassets\u002FHRM_Text.pdf\">\u003Cimg alt=\"Paper\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-sapientinc.github.io-0366d6.svg\">\u003C\u002Fa> \u003Ca 
href=\"https:\u002F\u002Fgithub.com\u002Fsapientinc\u002FHRM-Text\u002Fstargazers\">\u003Cimg alt=\"GitHub stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsapientinc\u002FHRM-Text?style=social\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fsapientinc\u002FHRM-Text-1B\">\u003Cimg alt=\"HuggingFace\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FHuggingFace-sapientinc%2FHRM--Text--1B-ffb000.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fsapientinc.github.io\u002FHRM-Text\u002F\">\u003Cimg alt=\"Website\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-Link-blue\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Guan Wang, Changling Liu, Chenyu Wang, Cai Zhou, Yuhao Sun, Yifei Wu, Shuai Zhen, Luca Scimeca, Yasin Abbasi Yadkori · Preprint 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> hierarchical-loop · flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm · objective-loss · data\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces HRM-Text, a 1B Hierarchical Recurrent Model language model that combines dual-timescale recurrent Transformer modules with MagicNorm, warmup deep credit assignment, PrefixLM masking, and task-completion pretraining for efficient training from 40B unique tokens.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F15\u002F2026] \u003Cstrong>Looped SSMs: Depth-Recurrence and Input Reshaping for Time Series Classification\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.16048\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.16048-b31b1b.svg\">\u003C\u002Fa> \u003Ca 
href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.16048\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.16048-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Mónika Farsang, Ramin Hasani, Daniela Rus, Radu Grosu · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> sequence-modeling · efficiency · scaling\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Extends looped-transformer depth recurrence to state-space models by reusing the same SSM block across depth and adding input reshaping, showing tied-depth SSMs match or beat untied SSMs on six time-series benchmarks despite fewer parameters.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F12\u002F2026] \u003Cstrong>Solve the Loop: Attractor Models for Language and Reasoning\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.12466\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.12466-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.12466\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.12466-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Jacob Fein-Ashley, Paria Rashidinejad · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop · implicit-layer\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · scaling · 
efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces Attractor Models, where a backbone proposes output embeddings and an attractor module iteratively solves a fixed point with implicit differentiation, improving looped language modeling and small-model reasoning while allowing adaptive convergence-depth inference.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F11\u002F2026] \u003Cstrong>Simply Stabilizing the Loop via Fully Looped Transformer\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.18797\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.18797-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.18797\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.18797-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Rao Fu, Zixuan Yang, Jiankun Zhang, Jing Ma, Hechang Chen, Yu Li, Yi Chang · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · scaling · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Stabilizes looped transformers with parameter-free fully looped signal routing and attention injection, enabling stable training at higher loop counts while preserving test-time loop-depth control.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F10\u002F2026] \u003Cstrong>LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.11011\">\u003Cimg alt=\"arXiv\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.11011-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.11011\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.11011-7c3aed.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fthrillcrazyer.github.io\u002FLoopUS\">\u003Cimg alt=\"Website\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-Link-blue\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Taekhyun Park, Yongjae Lee, Dohee Kim, Hyerim Bae · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · efficiency · scaling\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Converts pretrained LLMs into encoder, looped reasoning block, and decoder components, using selective gating, random deep supervision, and adaptive early exiting to stabilize latent looping without training recurrent models from scratch.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F09\u002F2026] \u003Cstrong>Quantum Injection Pathways for Implicit Graph Neural Networks\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.09226\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.09226-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.09226\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.09226-7c3aed.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FcxMoonGlade\u002FQIP_IGNN\u002Fstargazers\">\u003Cimg alt=\"GitHub stars\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FcxMoonGlade\u002FQIP_IGNN?style=social\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Pengyuan Xu, Tristan Zaborniak, Luis F. Rivera, Hausi A. Müller · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> implicit-layer\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> theory · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Formulates quantum-signal injection pathways for graph deep-equilibrium models, comparing fixed, state-dependent, and backbone-dependent coupling inside the fixed-point operator with contraction guarantees and graph-classification experiments.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F09\u002F2026] \u003Cstrong>Sparse Layers are Critical to Scaling Looped Language Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.09165\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.09165-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.09165\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.09165-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Ryan Lee, Jacob Biloki, Edward J. 
Hu, Jonathan May · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · scaling · efficiency · MoE\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Shows that MoE-style sparse layers can make looped language models scale better than dense looped transformers, with routing divergence across repeated shared layers recovering expressivity and loop boundaries serving as effective early-exit points.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[05\u002F08\u002F2026] \u003Cstrong>Memory-Efficient Looped Transformer: Decoupling Compute from Memory in Looped Language Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.07721\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.07721-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2605.07721\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2605.07721-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Victor Conchello Vendrell, Arnau Padres Masdemont, Niccolò Grillo, Jordi Ros-Giralt, Arash Behboodi, Fabio Valerio Massoli · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · efficiency · memory-efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Memory-Efficient Looped Transformer enables constant-memory iterative reasoning by sharing a single KV cache across loops, 
achieving strong performance without the linear memory scaling of prior looped LLMs.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F23\u002F2026] \u003Cstrong>Hyperloop Transformers\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.21254\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.21254-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.21254\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.21254-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Abbas Zeitoun, Lucas Torroba-Hennigen, Yoon Kim · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · efficiency · memory-efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fx.com\u002FTheTuringPost\u002Fstatus\u002F2047720038342476187?s=20\">Turing Post\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces Hyperloop Transformers, a parameter-efficient looped Transformer that applies only a middle block recurrently and adds hyper-connections between loops to improve memory-efficient language modeling.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F20\u002F2026] \u003Cstrong>One Step Forward and K Steps Back: Better Reasoning with Denoising Recursion Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.18839\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.18839-b31b1b.svg\">\u003C\u002Fa> \u003Ca 
href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.18839\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.18839-7c3aed.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fwwwwwwwwz\u002FDenoisingRecursionModels\u002Fstargazers\">\u003Cimg alt=\"GitHub stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fwwwwwwwwz\u002FDenoisingRecursionModels?style=social\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Chris Cameron, Wangzheng Wang, Nikita Ivanov, Ashmita Bhattacharyya, Didier Chételat, Yingxue Zhang · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> training-algorithm · inference-algorithm · architecture\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> reasoning · algorithmic-reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces Denoising Recursion Models, a looped-transformer training method that corrupts targets and trains recursive refinement over multiple steps, improving ARC-AGI reasoning over TRM.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F19\u002F2026] \u003Cstrong>LASER: Low-Rank Activation SVD for Efficient Recursion\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.17224\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.17224-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.17224\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.17224-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Ege Çakar, Ketan Ali Raghu, Lia Zheng · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop 
Mechanism:\u003C\u002Fstrong> hierarchical-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Analyzes Tiny Recursive Model activation geometry during recursive unrolling and introduces LASER, a dynamic low-rank activation compression method that cuts recursive activation memory by ~60% without statistically significant accuracy loss.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F14\u002F2026] 🌟 \u003Cstrong>Parcae: Scaling Laws For Stable Looped Language Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.12946\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.12946-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.12946\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.12946-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Hayden Prairie, Zachary Novack, Taylor Berg-Kirkpatrick, Daniel Y. 
Fu · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> objective-loss · architecture\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fx.com\u002Fhuskydogewoof\u002Fstatus\u002F2044609402553115070?s=20\">Benhao&#x27;s reading note\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces Parcae, a stable looped language model that constrains injection spectral norms to prevent instability and studies isoFLOPs-style training- and test-time scaling laws for quality gains under fixed-parameter budgets.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[04\u002F10\u002F2026] \u003Cstrong>ELT: Elastic Looped Transformers for Visual Generation\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.09168\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2604.09168-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2604.09168\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2604.09168-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Sahil Goyal, Swayam Agrawal, Gautham Govind Anil, Prateek Jain, Sujoy Paul, Aditya Kusupati · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> vision · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca 
href=\"https:\u002F\u002Fx.com\u002Fche_shr_cat\u002Fstatus\u002F2050923533199376595?s=20\">Tweet by Grigory Sapunov\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Farxiviq.substack.com\u002Fp\u002Felt-elastic-looped-transformers-for\">Grigory Sapunov&#x27;s reading notes\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces Elastic Looped Transformers for image and video generation, using weight-shared recurrent transformer blocks plus Intra-Loop Self Distillation to support any-time inference with dynamic quality-compute trade-offs from a single training run.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[03\u002F23\u002F2026] \u003Cstrong>Thinking Deeper, Not Longer: Depth-Recurrent Transformers for Compositional Generalization\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.21676\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2603.21676-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2603.21676\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2603.21676-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Hung-Hsuan Chen · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> reasoning · compositional-reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces a depth-recurrent Transformer for compositional generalization, with silent thinking, LayerScale, and identity-biased recurrence enabling stable deep latent iteration.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[03\u002F20\u002F2026] \u003Cstrong>LoopRPT: 
Reinforcement Pre-Training for Looped Language Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.19714\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2603.19714-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2603.19714\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2603.19714-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Guo Tang, Shixin Jiang, Heng Chang, Nuo Chen, Yuhan Li, Huiming Fan, Jia Li, Ming Liu, Bing Qin · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> objective-loss · training-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · RL\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Proposes LoopRPT, a reinforcement pre-training method for looped language models that assigns learning signals to latent iterations, improving accuracy-compute trade-offs and strengthening early-stage reasoning on Ouro.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[03\u002F09\u002F2026] \u003Cstrong>Adaptive Loops and Memory in Transformers: Think Harder or Know More?\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.08391\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2603.08391-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2603.08391\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2603.08391-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Markus Frey, Behzad Shomali, Ali Hamza Bashir, David Berghaus, Joachim Koehler, Mehdi 
Ali · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces transformers with adaptive per-layer looping and gated memory banks, showing that combining learned halting with extra storage improves reasoning under matched parameter and FLOP budgets.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[03\u002F09\u002F2026] \u003Cstrong>Tiny Autoregressive Recursive Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.08082\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2603.08082-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2603.08082\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2603.08082-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Paulius Rauba, Claudio Fanconi, Mihaela van der Schaar · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> hierarchical-loop · flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> algorithmic-reasoning · language-modeling\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fx.com\u002Fhuskydogewoof\u002Fstatus\u002F2032232642947494107?s=20\">Benhao&#x27;s reading note\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Studies autoregressive Tiny Recursive Models under compute-matched baselines, finding that simple 
two-step refinement helps on small algorithmic tasks while the full Autoregressive TRM shows no reliable gains.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[03\u002F05\u002F2026] \u003Cstrong>Mixture of Universal Experts: Scaling Virtual Width via Depth-Width Transformation\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.04971\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2603.04971-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2603.04971\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2603.04971-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Yilong Chen, Naibin Gu, Junyuan Shang, Zhenyu Zhang, Yuchen Feng, Jiawei Sheng, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> objective-loss · architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · efficiency · MoE\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fx.com\u002Fhuskydogewoof\u002Fstatus\u002F2031847931993608673?s=20\">Benhao&#x27;s reading note\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Proposes MOUE, which reuses a universal layer-agnostic expert pool across layers to transform depth into virtual width and improve MoE performance under fixed activation budgets.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[03\u002F05\u002F2026] \u003Cstrong>Recursive Inference Machines for Neural Reasoning\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.05234\">\u003Cimg 
alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2603.05234-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2603.05234\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2603.05234-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Mieszko Komisarczyk, Saurabh Mathur, Maurice Kraus, Sriraam Natarajan, Kristian Kersting · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> hierarchical-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> reasoning · RL\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fx.com\u002Fhuskydogewoof\u002Fstatus\u002F2033283214664515642?s=20\">Benhao&#x27;s reading note\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces Recursive Inference Machines, a recurrent reasoning framework that casts TRMs as a special case and improves ARC-AGI, Sudoku, and tabular classification by reweighting the history of loop states.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[03\u002F02\u002F2026] \u003Cstrong>AdaPonderLM: Gated Pondering Language Models with Token-Wise Adaptive Depth\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.01914\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2603.01914-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2603.01914\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2603.01914-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Shixiang Song, He Li, Zitong Wang, Boyi 
Zeng, Feichen Song, Yixuan Wang, Zhiqin John Xu, Ziwei He, Zhouhan Lin · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces AdaPonderLM, a self-supervised recurrent language model with token-wise halting gates and KV reuse, allocating more loop steps to hard tokens under a fixed compute budget.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[02\u002F12\u002F2026] \u003Cstrong>SpiralFormer: Looped Transformers Can Learn Hierarchical Dependencies via Multi-Resolution Recursion\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.11698\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2602.11698-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2602.11698\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2602.11698-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Chengting Yu, Xiaobo Shu, Yadao Wang, Yizhen Zhang, Haoyi Wu, You Wu, Rujiao Long, Ziheng Chen, Yuchi Xu, Wenbo Su, Bo Zheng · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> hierarchical-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces SpiralFormer, a looped transformer that applies shared layers under a multi-resolution recursion schedule to learn hierarchical 
dependencies more efficiently than fixed-resolution recurrent baselines.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[02\u002F11\u002F2026] \u003Cstrong>LoopFormer: Elastic-Depth Looped Transformers for Latent Reasoning via Shortcut Modulation\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.11451\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2602.11451-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2602.11451\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2602.11451-7c3aed.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Farmenjeddi\u002Floopformer\u002Fstargazers\">\u003Cimg alt=\"GitHub stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Farmenjeddi\u002Floopformer?style=social\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Ahmadreza Jeddi, Marco Ciccone, Babak Taati · ICLR 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces LoopFormer, trained on variable-length trajectories to enable budget-conditioned reasoning. 
Uses shortcut-consistency regularization to ensure stable internal trajectories across different loop depths.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[02\u002F11\u002F2026] \u003Cstrong>Prioritize the Process, Not Just the Outcome: Rewarding Latent Thought Trajectories Improves Reasoning in Looped Language Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.10520\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2602.10520-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2602.10520\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2602.10520-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Jonathan Williams, Esin Tureci · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> objective-loss · training-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces RLTT, a reinforcement-learning objective that assigns reward across the full latent thought trajectory of looped language models rather than only the final latent state.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[02\u002F09\u002F2026] \u003Cstrong>Looping Back to Move Forward: Recursive Transformers for Efficient and Flexible Large Multimodal Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.09080\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2602.09080-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2602.09080\">\u003Cimg alt=\"AlphaXiv\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2602.09080-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Ruihan Xu, Yuting Gao, Lan Wang, Jianing Li, Weihao Chen, Qingpei Guo, Ming Yang, Shiliang Zhang · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> hierarchical-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> vision · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces RecursiveVLM, a recursive multimodal transformer with a recursive connector and monotonic recursion loss that enables on-demand extra refinement under varying compute budgets.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[02\u002F09\u002F2026] \u003Cstrong>Understanding Dynamic Compute Allocation in Recurrent Transformers\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.08864\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2602.08864-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2602.08864\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2602.08864-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Ibraheem Muhammad Moosa, Suhas Lohit, Ye Wang, Moitreya Chatterjee, Wenpeng Yin · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · algorithmic-reasoning · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca 
href=\"https:\u002F\u002Fx.com\u002Fhuskydogewoof\u002Fstatus\u002F2031044736182616081?s=20\">Benhao&#x27;s reading note\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Proposes ANIRA, a recurrent Transformer framework for per-token variable-depth computation, and shows adaptive compute can align with token complexity while failing to extrapolate to longer algorithmic inputs.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[01\u002F29\u002F2026] \u003Cstrong>Depth-Recurrent Attention Mixtures: Giving Latent Reasoning the Attention it Deserves\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2601.21582\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2601.21582-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2601.21582\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2601.21582-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Jonas Knupp, Jan Hendrik Metzen, Jeremias Bohn, Georg Groh, Kristian Kersting · 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> parallel-loop · flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning · efficiency\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Community Comments:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fx.com\u002Fhuskydogewoof\u002Fstatus\u002F2031585611262386670?s=20\">Benhao&#x27;s reading note\u003C\u002Fa>\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces a modular framework combining sequence attention and depth attention for recurrent-depth models, improving FLOP-, parameter-, and memory-efficiency simultaneously.\u003C\u002Fdiv>\n  
\u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[01\u002F26\u002F2026] \u003Cstrong>ChainGPT: Dual-Reasoning Model with Recurrent Depth and Multi-Rank State Updates\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fopenreview.net\u002Fpdf?id=kdZbxizwGK\">\u003Cimg alt=\"OpenReview\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOpenReview-Paper-8E44AD.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Yunao Zheng, Xiaojie Wang, Lei Ren, Chen Wei · ICLR 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm · inference-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> language-modeling · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces ChainGPT, a dual-reasoning recurrent-depth architecture that combines multi-substep state updates and state-guided sparse attention to move reasoning into latent computation, with adaptive stopping as a supporting mechanism.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[01\u002F26\u002F2026] \u003Cstrong>MoDr: Mixture-of-Depth-Recurrent Transformers for Test-Time Reasoning\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fopenreview.net\u002Fpdf?id=9Pba4rcQbE\">\u003Cimg alt=\"OpenReview\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOpenReview-Paper-8E44AD.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Xiaojing Zhang, Haifeng Wu, Gang He, Jiyang Shen, Bochen Lyu, Zhanxing Zhu · ICLR 2026\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · inference-algorithm · training-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> 
language-modeling · reasoning · efficiency · MoE\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Introduces MoDr, which adds multi-branch routing to a depth-recurrent Transformer so looped models can explore solution paths more adaptively at test time.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[12\u002F16\u002F2025] \u003Cstrong>Universal Reasoning Model\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2512.14693\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2512.14693-b31b1b.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2512.14693\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2512.14693-7c3aed.svg\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Zitian Gao, Lynx Chen, Yihao Xiao, He Xing, Ran Tao, Haoming Luo, Joey Zhou, Bryan Dai · 2025\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Loop Mechanism:\u003C\u002Fstrong> flat-loop\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Focus:\u003C\u002Fstrong> architecture · training-algorithm\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>Domains:\u003C\u002Fstrong> algorithmic-reasoning · reasoning\u003C\u002Fdiv>\n  \u003Cdiv>\u003Cstrong>TL;DR:\u003C\u002Fstrong> Proposes URM, a Universal Transformer-based architecture with weight tying that beats standard transformers on reasoning benchmarks through iterative depth computation.\u003C\u002Fdiv>\n  \u003C\u002Fdetails>\n\n- \u003Cdetails>\n  \u003Csummary>[11\u002F11\u002F2025] \u003Cstrong>Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2511.08577\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2511.08577-b31b1b.svg\">\u003C\u002Fa> \u003Ca 
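href=\"https:\u002F\u002Fwww.alphaxiv.org\u002Fabs\u002F2511.08577\">\u003Cimg alt=\"AlphaXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAlphaXiv-2511.08577-7c3aed.svg\">\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fthu-nics\u002FTaH\u002Fstargazers\">\u003Cimg alt=\"GitHub stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fthu-nics\u002FTaH?style=social\">\u003C\u002Fa>\u003C\u002Fsummary>\n  \u003Cdiv>\u003Cstrong>Authors:\u003C\u002Fstrong> Tianyu Fu, Yichen You, Zekai","Awesome-Loop-Models is a curated list of papers and technical blogs focused on loop models, that is, architectures that reuse a shared learned internal layer, block, module, or operator within a single forward process. The project is written in Python and provides an interactive browser for searching, filtering, and exploring the listed papers and technical articles, with links to arXiv, code repositories, OpenReview, and other resources. It also provides a detailed PR submission guide that makes it easy for contributors to add new entries. The resource is suited to researchers, developers, and academics interested in loop models, and is a useful reference for understanding loop-model mechanisms, designing new architectures, and applying them.","2026-06-11 02:49:36","CREATED_QUERY"]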