[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-11625":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":14,"forks30d":14,"starsTrendScore":18,"compositeScore":19,"rankGlobal":9,"rankLanguage":9,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":9,"pushedAt":9,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":14,"starSnapshotCount":14,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},11625,"aurora-release","tilde-research\u002Faurora-release","tilde-research","Aurora optimizer release",null,"Python",144,6,32,0,3,5,45,9,2.54,"MIT License",false,"main",true,[],"2026-06-12 02:02:32","# Aurora\n\nAurora is an optimizer for non-square matrices that achieves more effective utilization of MLP neurons. Instead of using `polar(G)`, which inherits non-uniform left-singular row norms, Aurora iteratively approximates a projection onto the intersection of the row oblique and Stiefel manifolds, giving more balanced updates without sacrificing polar factor precision. 
For square matrices, Aurora reduces to the standard Muon update.\n\nSee the blog for more information:\nhttps:\u002F\u002Fblog.tilderesearch.com\u002Fblog\u002Faurora\n\nAnd on Twitter:\nhttps:\u002F\u002Fx.com\u002Ftilderesearch\u002Fstatus\u002F2052798181558370419\n\n### Code structure\n\n```text\nsrc\u002F\n├── main.py               # Entry point: training loop and CLI\n├── polar.py              # Polar factor via simple-quintic Newton-Schulz\n├── aurora.py             # Aurora update rule\n└── riemannian_aurora.py  # Riemannian Aurora: Riemannian gradient ascent on the balanced Stiefel manifold\n```\n\n### Usage\n\n```python\nfrom aurora import aurora\n\n# Inside the training loop, for each weight tensor W with gradient G\n# and a caller-managed momentum buffer m (zeros at init):\naurora(W, G, m, eta=lr, weight_decay=0.025)\n```\n\n### Hyperparameters\n\n- `pp_iterations` (default 2): number of update refinement iterations.\n  Higher values refine the update toward the row-uniform fixed point\n  at the cost of one extra polar call per parameter per iteration.\n- `pp_beta` (default 0.5): damping exponent for the row normalization step, in `(0, 1]`.\n  The default 0.5 gives undamped square-root steps; lower values damp\n  oscillation between odd\u002Feven D iterates.\n- `mu` (default 0.95), `nesterov` (default True), `weight_decay` (default\n  0.025): standard Muon \u002F SGD-momentum hyperparameters.\n\n### Utilities\n\n`polar.py` uses simple-quintic 12-iteration Newton-Schulz with coefficients. Aurora's full `aurora()` step proceeds in this order: Nesterov momentum → leverage-uniform polar → spectral aspect-ratio scale → decoupled weight decay. 
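Different Newton-Schulz iterations can be swapped in as drop-in replacements for our `polar` function.\n","Aurora is an optimizer for non-square matrices that aims to make more effective use of the neurons in multilayer perceptrons (MLPs). Its core function is to iteratively approximate a projection onto the intersection of the row oblique and Stiefel manifolds, yielding more balanced updates while preserving polar factor precision. The project is written in Python and reduces to the standard Muon update for square matrices. Aurora suits scenarios that require efficient optimization of non-square matrices, especially deep learning training where more efficient weight updates are desired. Users can also tune hyperparameters such as `pp_iterations` and `pp_beta` to control the details of the update process.",2,"2026-06-11 03:32:10","CREATED_QUERY"]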