[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-78908":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":12,"stars7d":15,"stars30d":16,"stars90d":14,"forks30d":14,"starsTrendScore":11,"compositeScore":17,"rankGlobal":8,"rankLanguage":8,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":8,"pushedAt":8,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":14,"starSnapshotCount":14,"syncStatus":26,"lastSyncTime":27,"discoverSource":28},78908,"ntkmirror","leochlon\u002Fntkmirror","leochlon",null,"Python",327,47,7,1,0,40,267,5.04,"MIT License",false,"main",true,[],"2026-06-12 02:03:49","# NTK-Mirror\n\n**Hassana Labs** — Leon Chlon ([lc574@cantab.ac.uk](mailto:lc574@cantab.ac.uk))\n\nLoRA-free forward-pass fine-tuning for Hugging Face causal language models.\n\n`ntkmirror` learns a small signed controller on top of a frozen Transformer. It\nadds no LoRA modules and makes no permanent weight edits. 
The controller is a\nsparse set of shared log-gates on decoder-layer output channels:\n\n```text\nh'_{layer, token, channel} = exp(s_{layer, channel}) h_{layer, token, channel}\n```\n\nThe gates are learned from teacher-forced examples and then attached to the same\nHugging Face model during evaluation or generation.\n\n## Install\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fleochlon\u002Fntkmirror.git\ncd ntkmirror\npip install -e .\n```\n\n## Minimal use\n\nCreate `train.jsonl`:\n\n```jsonl\n{\"prompt\":\"Question: 14 + 27 = ?\\nAnswer:\",\"completion\":\" 41\"}\n{\"prompt\":\"Question: 36 + 18 = ?\\nAnswer:\",\"completion\":\" 54\"}\n```\n\nFit a controller:\n\n```bash\nntkmirror fit \\\n  --model Qwen\u002FQwen2.5-0.5B-Instruct \\\n  --train train.jsonl \\\n  --out controller.pt\n```\n\nEvaluate it:\n\n```bash\nntkmirror eval \\\n  --model Qwen\u002FQwen2.5-0.5B-Instruct \\\n  --controller controller.pt \\\n  --eval eval.jsonl\n```\n\nGenerate with it:\n\n```bash\nntkmirror generate \\\n  --model Qwen\u002FQwen2.5-0.5B-Instruct \\\n  --controller controller.pt \\\n  --prompt \"Question: 47 + 36 = ?\\nAnswer:\"\n```\n\n## One-command demo\n\n```bash\npip install -e .\nbash examples\u002Frun_demo.sh\n```\n\nFor a smaller run:\n\n```bash\nGATES=512 STEPS=40 bash examples\u002Frun_demo.sh\n```\n\n## Python API\n\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\nfrom ntkmirror import ForwardFineTuner, load_jsonl_examples\n\nmodel_name = \"Qwen\u002FQwen2.5-0.5B-Instruct\"\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=\"auto\").cuda()\n\ntuner = ForwardFineTuner(model, tokenizer, gates=5000)\ntuner.fit(load_jsonl_examples(\"train.jsonl\"), steps=240)\ntuner.save(\"controller.pt\")\n\nprint(tuner.generate(\"Question: 47 + 36 = ?\\nAnswer:\"))\n```\n\n## Data format\n\nPreferred JSONL 
schema:\n\n```jsonl\n{\"prompt\":\"...context...\",\"completion\":\"...teacher-forced target...\"}\n```\n\nAlso accepted:\n\n```jsonl\n{\"instruction\":\"...\",\"response\":\"...\"}\n{\"question\":\"...\",\"answer\":\"...\"}\n{\"text\":\"...\"}\n```\n\n## Important defaults\n\n| Option | Default | Meaning |\n|---|---:|---|\n| `--gates` | `5000` | number of layer-channel log-gates |\n| `--steps` | `240` | AdamW steps on gate parameters only |\n| `--lr` | `5e-3` | controller learning rate |\n| `--max-log-gate` | `0.05` | bound on each signed log-gate |\n| `--layers` | `all` | decoder layers to score and gate |\n| `--score-batches` | `16` | batches used to select gates |\n\n## Compose two task controllers\n\nControllers are saved in signed log-gate coordinates, so composition is simple:\nadd the signed log-gates, clip to a safe budget, and attach the resulting\ncontroller. This is the activation-space analogue of adding task directions,\nexcept the addition happens in log-mask\u002Fmirror coordinates rather than LoRA\nweight space.\n\n```bash\nntkmirror compose \\\n  --controllers runs\u002Fgsm8k_controller.pt runs\u002Fmbpp_controller.pt \\\n  --out runs\u002Fgsm8k_plus_mbpp.pt \\\n  --report runs\u002Fcomposition_report.json\n\nntkmirror inspect \\\n  --controllers runs\u002Fgsm8k_controller.pt runs\u002Fmbpp_controller.pt runs\u002Fgsm8k_plus_mbpp.pt\n```\n\nA disjoint-task runner is included:\n\n```bash\npip install -e '.[datasets]'\nbash scripts\u002Frun_disjoint_composition.sh\n```\n\nIt builds GSM8K and MBPP JSONL subsets, fits one controller per task, composes\nthem, and evaluates base \u002F task-A \u002F task-B \u002F composed controllers on both eval\nsets. See `docs\u002Fcomposability.md`.\n\n\n## Persistent controller memory\n\nA memory item can be stored as a controller: one controller per conversation,\ndocument, user preference, task style, or procedure. 
At inference time,\n`ntkmirror` retrieves relevant items, composes their signed log-gates, and\nattaches the composed controller before generation. This injects retrieved\ncontext through the forward pass without appending the memory text to the prompt.\n\nFit-and-store a memory controller:\n\n```bash\nntkmirror memory add \\\n  --model Qwen\u002FQwen2.5-0.5B-Instruct \\\n  --store runs\u002Fmemory \\\n  --id arithmetic-carrying \\\n  --train examples\u002Fmath_train.jsonl \\\n  --text \"worked addition arithmetic with carrying\" \\\n  --tags math,arithmetic\n```\n\nOr register an existing controller:\n\n```bash\nntkmirror memory add \\\n  --store runs\u002Fmemory \\\n  --id arithmetic-carrying \\\n  --controller runs\u002Farithmetic.pt \\\n  --text \"two-digit addition with carrying: add ones, carry, then tens\"\n```\n\nRetrieve, compose, and generate:\n\n```bash\nntkmirror memory search \\\n  --store runs\u002Fmemory \\\n  --query \"solve an addition problem with carrying\"\n\nntkmirror memory generate \\\n  --model Qwen\u002FQwen2.5-0.5B-Instruct \\\n  --store runs\u002Fmemory \\\n  --query \"addition with carrying\" \\\n  --prompt \"Problem: 47 + 36 = ?\\nSolution:\"\n```\n\nTry the demo:\n\n```bash\nbash examples\u002Frun_memory_demo.sh\n```\n\nThe default retriever is a dependency-free lexical TF-IDF scorer. That is\nintentional for first-run UX: the main bottleneck in controller memory is\nretrieval quality, not controller storage. For production, replace the retriever\nwith an embedding or hybrid vector-store layer and keep the same `compose_states`\ninterface. See `docs\u002Fpersistent_memory.md`.\n\n## What this repo is not\n\nThis is the simple deployable package. It intentionally does not expose the full\nresearch harness for NTK-vector diagnostics, oracle SGD-displacement fitting, or\nmatrix-free theorem checks. 
Those are useful for papers; they make a bad first\nuser experience.\n\n## Notes for benchmark claims\n\nAlways report the base model, controller, and LoRA on the same train\u002Feval\nmanifest. For exact-answer tasks, report exact accuracy and teacher-forced NLL.\nFor system claims, report adaptation time and peak memory. See\n`docs\u002Fmethod.md` for failure modes.\n\n## Citation\n\n```bibtex\n@software{chlon2026ntkmirror,\n  author       = {Leon Chlon},\n  title        = {{NTK-Mirror: LoRA-free forward-pass fine-tuning via signed log-mask controllers}},\n  year         = {2026},\n  organization = {Hassana Labs},\n  url          = {https:\u002F\u002Fgithub.com\u002Fleochlon\u002Fntkmirror}\n}\n```\n\n## License\n\nMIT © 2026 Hassana Labs — Leon Chlon.\n","NTK-Mirror is a LoRA-free forward-pass fine-tuning tool for Hugging Face causal language models. It adapts a model by learning a small controller on top of a frozen Transformer. The controller is a sparse set of shared log-gates applied to decoder-layer output channels, and it makes no permanent changes to the model weights. The approach is aimed at scenarios that need fast, lightweight adaptation of a pretrained model to a specific task or dataset, such as natural language processing tasks in education or customer service. The project is written in Python and released under the MIT License, which makes it straightforward to integrate into existing projects.",2,"2026-06-11 03:57:17","CREATED_QUERY"]