[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72314":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},72314,"tinker-cookbook","thinking-machines-lab\u002Ftinker-cookbook","thinking-machines-lab","Post-training with Tinker",null,"Python",3452,443,26,18,0,24,60,209,72,109.94,"Apache License 2.0",false,"main",true,[],"2026-06-12 04:01:04","\u003Ch1 align=\"center\">Tinker Cookbook\u003C\u002Fh1>\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"assets\u002Ftinker-cover.png\" width=\"60%\" \u002F>\n\u003C\u002Fdiv>\n\n\u003Cdiv 
align=\"center\">\n\n[![pytest](https:\u002F\u002Fgithub.com\u002Fthinking-machines-lab\u002Ftinker-cookbook\u002Factions\u002Fworkflows\u002Fpytest.yaml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fthinking-machines-lab\u002Ftinker-cookbook\u002Factions\u002Fworkflows\u002Fpytest.yaml)\n[![pyright](https:\u002F\u002Fgithub.com\u002Fthinking-machines-lab\u002Ftinker-cookbook\u002Factions\u002Fworkflows\u002Fpyright.yaml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fthinking-machines-lab\u002Ftinker-cookbook\u002Factions\u002Fworkflows\u002Fpyright.yaml)\n[![smoke-test-recipes](https:\u002F\u002Fgithub.com\u002Fthinking-machines-lab\u002Ftinker-cookbook\u002Factions\u002Fworkflows\u002Fsmoke-test-recipes.yaml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fthinking-machines-lab\u002Ftinker-cookbook\u002Factions\u002Fworkflows\u002Fsmoke-test-recipes.yaml)\n[![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Ftinker-cookbook)](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftinker-cookbook\u002F)\n\n\u003C\u002Fdiv>\n\nWe provide two libraries for the broader community to customize their language models: `tinker` and `tinker-cookbook`.\n\n- `tinker` is a training SDK for researchers and developers to fine-tune language models. You send API requests to us and we handle the complexities of distributed training.\n- `tinker-cookbook` includes realistic examples of fine-tuning language models. It builds on the Tinker API and provides common abstractions to fine-tune language models.\n\n## Installation\n\n1. Sign up for Tinker [here](https:\u002F\u002Fauth.thinkingmachines.ai\u002Fsign-up).\n2. Once you have access, create an API key from the [console](https:\u002F\u002Ftinker-console.thinkingmachines.ai) and export it as environment variable `TINKER_API_KEY`.\n3. 
Install `tinker-cookbook` (includes the `tinker` SDK as a dependency):\n   ```bash\n   # Latest stable release from PyPI\n   uv pip install tinker-cookbook\n\n   # Or install the nightly build\n   uv pip install 'tinker-cookbook @ git+https:\u002F\u002Fgithub.com\u002Fthinking-machines-lab\u002Ftinker-cookbook.git@nightly'\n   ```\n\n## Tinker\n\nHere we introduce a few Tinker primitives — the basic components to fine-tune LLMs (see the [quickstart guide](https:\u002F\u002Ftinker-docs.thinkingmachines.ai\u002Ftinker\u002Fquickstart\u002F) for more details):\n\n```python\nimport tinker\nservice_client = tinker.ServiceClient()\ntraining_client = service_client.create_lora_training_client(\n  base_model=\"meta-llama\u002FLlama-3.2-1B\", rank=32,\n)\ntraining_client.forward_backward(...)\ntraining_client.optim_step(...)\ntraining_client.save_state(...)\ntraining_client.load_state(...)\n\nsampling_client = training_client.save_weights_and_get_sampling_client()\nsampling_client.sample(...)\n```\n\nSee [tinker_cookbook\u002Frecipes\u002Fsl_loop.py](tinker_cookbook\u002Frecipes\u002Fsl_loop.py) and [tinker_cookbook\u002Frecipes\u002Frl_loop.py](tinker_cookbook\u002Frecipes\u002Frl_loop.py) for minimal examples of using these primitives to fine-tune LLMs.\n\n### Tutorials\n\nNew to Tinker? The [`tutorials\u002F`](tutorials\u002F) directory contains 20+ progressive [marimo](https:\u002F\u002Fmarimo.io\u002F) notebooks that walk through core concepts — rendering, loss functions, completers, weight management — and advanced topics such as custom RL environments, DPO, RLHF, and weight export. Run any tutorial with `marimo edit tutorials\u002F101_hello_tinker.py`. 
See the [tutorials README](tutorials\u002FREADME.md) for the full list, or browse rendered versions on the [Tinker docs site](https:\u002F\u002Ftinker-docs.thinkingmachines.ai\u002Ftutorials).\n\nTo download the weights of any model:\n```python\nrest_client = service_client.create_rest_client()\nfuture = rest_client.get_checkpoint_archive_url_from_tinker_path(sampling_client.model_path)\nwith open(f\"model-checkpoint.tar.gz\", \"wb\") as f:\n    f.write(future.result())\n```\n\n### Tinker Cookbook\n\nBesides these primitives, we also offer **Tinker Cookbook** (a.k.a. this repo), a library of a wide range of abstractions to help you customize training environments.\n[`tinker_cookbook\u002Frecipes\u002Fsl_basic.py`](tinker_cookbook\u002Frecipes\u002Fsl_basic.py) and [`tinker_cookbook\u002Frecipes\u002Frl_basic.py`](tinker_cookbook\u002Frecipes\u002Frl_basic.py) contain minimal examples to configure supervised learning and reinforcement learning.\n\nWe also include more complete examples in the [`tinker_cookbook\u002Frecipes\u002F`](tinker_cookbook\u002Frecipes\u002F) folder:\n- **[Chat SFT](tinker_cookbook\u002Frecipes\u002Fchat_sl\u002F)**: supervised fine-tuning on conversational datasets (e.g., Tulu3).\n- **[Math RL](tinker_cookbook\u002Frecipes\u002Fmath_rl\u002F)**: reinforcement learning for mathematical reasoning with verifiable rewards.\n- **[Code RL](tinker_cookbook\u002Frecipes\u002Fcode_rl\u002F)**: RL on competitive programming with sandboxed code execution (DeepCoder replication).\n- **[Preference learning](tinker_cookbook\u002Frecipes\u002Fpreference\u002F)**: DPO and a three-stage RLHF pipeline (SFT, reward model, RL).\n- **[Distillation](tinker_cookbook\u002Frecipes\u002Fdistillation\u002F)**: on-policy and off-policy knowledge distillation with single- and multi-teacher configurations.\n- **[Tool use](tinker_cookbook\u002Frecipes\u002Fsearch_tool\u002F)**: RL for retrieval-augmented generation (Search-R1 replication).\n- 
**[Multi-agent](tinker_cookbook\u002Frecipes\u002Fmultiplayer_rl\u002F)**: multi-agent RL with self-play and cross-play.\n\nThe [recipes README](tinker_cookbook\u002Frecipes\u002FREADME.md) covers all available recipes, including Harbor RL, rubric-based grading, VLM classification, and SDFT. Each recipe includes a `README.md` with implementation details, launch commands, and expected results.\n\n### Evaluation (experimental)\n\nTinker Cookbook includes a [benchmark framework](tinker_cookbook\u002Feval\u002F) for evaluating trained models:\n\n```python\nfrom tinker_cookbook.eval.benchmarks import run_benchmarks, BenchmarkConfig\n\nresults = await run_benchmarks(\n    [\"gsm8k\", \"mmlu_pro\", \"ifeval\"],\n    sampling_client, renderer,\n    BenchmarkConfig(save_dir=\"evals\u002Fstep500\"),\n)\n```\n\nThe framework currently supports 12 benchmarks (GSM8K, MATH-500, MMLU-Pro, MMLU-Redux, GPQA, IFEval, MBPP, C-Eval, SuperGPQA, IFBench, AIME 2025, AIME 2026) with verified scores against published results, plus experimental benchmarks such as LiveCodeBench, Terminal Bench, and SWE-bench. Benchmarks can also serve as inline training evaluators via `BenchmarkEvaluator`.\n\n**Note:** Benchmark scores are sensitive to evaluation configuration — system prompts, `max_tokens`, temperature, and timeout settings can shift results significantly. We document our exact settings alongside all reported scores. This framework is under active development; feedback and contributions are welcome. 
See the [eval README](tinker_cookbook\u002Feval\u002FREADME.md) for verified scores, configuration details, and instructions for adding new benchmarks.\n\n### Documentation\n\nFor the full Tinker documentation, visit [tinker-docs.thinkingmachines.ai](https:\u002F\u002Ftinker-docs.thinkingmachines.ai).\n\n### Utilities\n\nTinker Cookbook also provides reusable building blocks:\n- [`renderers`](tinker_cookbook\u002Frenderers\u002F) — bidirectional conversion between token sequences and structured chat messages\n- [`hyperparam_utils`](tinker_cookbook\u002Fhyperparam_utils.py) — learning rate and hyperparameter scaling for LoRA training\n- [`eval`](tinker_cookbook\u002Feval\u002F) — benchmark framework and inline training evaluators (see [Evaluation](#evaluation-experimental) above)\n\n## Claude Code Skills\n\nTinker Cookbook ships with [Claude Code skills](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fclaude-code\u002Fskills) that teach Claude how to use the Tinker API. Install them so Claude can help you write training code in any project:\n\n```\n\u002Fplugin marketplace add thinking-machines-lab\u002Ftinker-cookbook\n```\n\nThen install the **tinker** plugin from the Discover tab (`\u002Fplugin` → Discover). Once installed, two skills are available:\n\n| Command | What it does |\n|---|---|\n| `\u002Ftinker:research` | Plan and run post-training experiments — SFT, RL, DPO, distillation, evaluation, hyperparameters, model selection, and more |\n| `\u002Ftinker:debug` | Diagnose slow training, hangs, output mismatches, renderer issues, and errors |\n\nSkills also trigger automatically based on context — ask Claude to \"set up SFT training\" and it will load the right skill without a slash command. Skills update automatically when the repo is updated.\n\n## Development Setup\n\n```bash\nuv sync --extra dev\npre-commit install\n```\n\nThis installs dev dependencies and registers pre-commit hooks that run `ruff` formatting and linting on every commit. 
CI enforces these checks on all pull requests.\n\n## Contributing\n\nThis project is built in the spirit of open science and collaborative development. We believe that the best tools emerge through community involvement and shared learning.\n\nWe welcome PR contributions after our private beta is over. If you have any feedback, please email us at tinker@thinkingmachines.ai.\n\n## Citation\nIf you use Tinker for your research, please cite it as:\n```\nThinking Machines Lab, 2025. Tinker. https:\u002F\u002Fthinkingmachines.ai\u002Ftinker\u002F.\n```\n\nOr use this BibTeX citation:\n```\n@misc{tml2025tinker,\n  author = {Thinking Machines Lab},\n  title = {Tinker},\n  year = {2025},\n  url = {https:\u002F\u002Fthinkingmachines.ai\u002Ftinker\u002F},\n}\n```\n","Tinker Cookbook is a library for fine-tuning language models. Built on the Tinker API, it provides practical examples and common abstractions that simplify model customization, and it supports both supervised learning and reinforcement learning. The project is written in Python, is published on PyPI, and uses automated tests to maintain code quality. It is aimed at researchers and developers who need to optimize existing language models for specific tasks, especially those who want distributed training without dealing with low-level implementation details.",2,"2026-06-11 03:41:20","high_star"]