[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-73762":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":41,"readmeContent":42,"aiSummary":43,"trendingCount":16,"starSnapshotCount":16,"syncStatus":44,"lastSyncTime":45,"discoverSource":46},73762,"AI-Research-SKILLs","Orchestra-Research\u002FAI-Research-SKILLs","Orchestra-Research","Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code\u002Fcodex\u002Fgemini agent will be an AI research agent with full horsepower. 
Maintained by Orchestra Research.","http:\u002F\u002Forchestra-research.com",null,"TeX",9518,720,48,9,0,145,370,681,435,114.57,"MIT License",false,"main",true,[27,28,29,30,31,32,33,34,35,36,37,38,39,40],"ai","ai-research","claude","claude-code","claude-skills","codex","gemini","gpt-5","grpo","huggingface","machine-leanring","megatron","skills","vllm","2026-06-12 04:01:11","# AI Research `Skills` Library\n\n> **The most comprehensive open-source skills library enabling AI agents to autonomously conduct AI research — from idea to paper**\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fpromo.gif\" alt=\"AI Research Skills Demo\" width=\"700\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg\" alt=\"License: MIT\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002F@orchestra-research\u002Fai-research-skills\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fnpm\u002Fv\u002F@orchestra-research\u002Fai-research-skills.svg\" alt=\"npm version\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fwww.orchestra-research.com\u002Fperspectives\u002Fai-research-skills\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FBlog-Read%20More-orange.svg\" alt=\"Blog Post\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fjoin.slack.com\u002Ft\u002Forchestrarese-efu1990\u002Fshared_invite\u002Fzt-3iu6gr8io-zJvpkZTPToEviQ9KFZvNSg\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSlack-Join%20Community-4A154B.svg?logo=slack\" alt=\"Slack\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fx.com\u002Forch_research\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTwitter-Follow-1DA1F2.svg?logo=x\" alt=\"Twitter\">\u003C\u002Fa>\n  \u003Ca 
href=\"https:\u002F\u002Fwww.linkedin.com\u002Fcompany\u002Forchestra-research\u002F\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLinkedIn-Follow-0A66C2.svg?logo=linkedin\" alt=\"LinkedIn\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cdiv align=\"center\">\n\n### **98 Skills Powering AI Research in 2026**\n\n\u003C\u002Fdiv>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>View All 23 Categories\u003C\u002Fb>\u003C\u002Fsummary>\n\n\u003Cdiv align=\"center\">\n\n| | | |\n|:---:|:---:|:---:|\n| **Autoresearch** (1) | **Ideation** (2) | **ML Paper Writing** (2) |\n| **Model Architecture** (5) | **Fine-Tuning** (4) | **Post-Training** (8) |\n| **Distributed Training** (6) | **Optimization** (6) | **Inference** (4) |\n| **Tokenization** (2) | **Data Processing** (2) | **Evaluation** (3) |\n| **Safety & Alignment** (4) | **Agents** (4) | **RAG** (5) |\n| **Multimodal** (7) | **Prompt Engineering** (4) | **MLOps** (3) |\n| **Observability** (2) | **Infrastructure** (3) | **Mech Interp** (4) |\n| **Emerging Techniques** (6) | **Agent-Native Research Artifact** (3) | |\n\n\u003C\u002Fdiv>\n\n\u003C\u002Fdetails>\n\n---\n\n## Table of Contents\n\n- [Our Mission](#our-mission)\n- [Path Towards AI Research Agent](#path-towards-ai-research-agent)\n- [Available AI Research Engineering Skills](#available-ai-research-engineering-skills)\n- [Demos](#demos)\n- [Skill Structure](#skill-structure)\n- [Roadmap](#roadmap)\n- [Repository Structure](#repository-structure)\n- [Use Cases](#use-cases)\n- [Contributors](#contributors)\n- [Citation](#citation)\n- [Community](#community)\n\n\n## Our Mission\n\nWe enable AI agents to **autonomously conduct AI research** — from literature survey and idea generation through experiment execution to paper writing. 
The library provides both the **research orchestration layer** (autoresearch, ideation, paper writing) and the **engineering skills** (training, evaluation, deployment) needed at each stage.\n\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fskills.png\" alt=\"AI Research Agent System\" width=\"50%\">\n  \u003Cbr>\n  \u003Cem>System diagram of an AI research agent\u003C\u002Fem>\n\u003C\u002Fp>\n\n## Path Towards AI Research Agent\n\nModern AI research requires mastering dozens of specialized tools and frameworks.\nAI researchers spend more time debugging infrastructure than testing hypotheses — slowing the pace of scientific discovery.\nWe provide a comprehensive skills library that enables AI agents to autonomously conduct the full research lifecycle — from brainstorming ideas to writing the paper.\n  - Autonomous Research - The **autoresearch** skill orchestrates the entire research workflow using a two-loop architecture, routing to domain skills as needed\n  - Specialized Expertise - Each domain skill provides deep, production-ready knowledge of a specific framework (Megatron-LM, vLLM, TRL, etc.)\n  - End-to-End Coverage - 98 skills spanning the full AI research lifecycle, from ideation and literature survey to experiments and paper writing\n  - Research-Grade Quality - Documentation sourced from official repos, real GitHub issues, and battle-tested production workflows\n\n## Available AI Research Engineering Skills\n\n**Quality over quantity**: Each skill provides comprehensive, expert-level guidance with real code examples, troubleshooting guides, and production-ready workflows.\n\n### 📦 Quick Install (Recommended)\n\n**For humans** — interactive installer with one command:\n\n```bash\nnpx @orchestra-research\u002Fai-research-skills\n```\n\n**For AI agents** — point your agent to the welcome doc and it handles the rest:\n\n```\nRead https:\u002F\u002Fwww.orchestra-research.com\u002Fai-research-skills\u002Fwelcome.md and follow the instructions to install 
and use AI Research Skills.\n```\n\nThis installs all 98 skills, loads the **autoresearch** orchestration layer, and starts autonomous research.\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>What the installer does\u003C\u002Fb>\u003C\u002Fsummary>\n\n- **Auto-detects** your installed coding agents (Claude Code, Hermes Agent, OpenCode, Cursor, Gemini CLI, etc.)\n- **Installs** skills to `~\u002F.orchestra\u002Fskills\u002F` with symlinks to each agent (falls back to copy on Windows)\n- **Offers** everything, quickstart bundle, by category, or individual skills\n- **Updates** installed skills with latest versions\n- **Uninstalls** all or selected skills\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>CLI Commands\u003C\u002Fb>\u003C\u002Fsummary>\n\n```bash\n# Interactive installer (recommended)\nnpx @orchestra-research\u002Fai-research-skills\n\n# Direct commands\nnpx @orchestra-research\u002Fai-research-skills list      # View installed skills\nnpx @orchestra-research\u002Fai-research-skills update    # Update installed skills\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Claude Code Marketplace (Alternative)\u003C\u002Fb>\u003C\u002Fsummary>\n\nInstall skill categories directly using the **Claude Code CLI**:\n\n```bash\n# Add the marketplace\n\u002Fplugin marketplace add orchestra-research\u002FAI-research-SKILLs\n\n# Install by category (23 categories available)\n\u002Fplugin install fine-tuning@ai-research-skills        # Axolotl, LLaMA-Factory, PEFT, Unsloth\n\u002Fplugin install post-training@ai-research-skills      # TRL, GRPO, OpenRLHF, SimPO, verl, slime, miles, torchforge\n\u002Fplugin install inference-serving@ai-research-skills  # vLLM, TensorRT-LLM, llama.cpp, SGLang\n\u002Fplugin install distributed-training@ai-research-skills\n\u002Fplugin install optimization@ai-research-skills\n```\n\n\u003C\u002Fdetails>\n\n### All 23 Categories (98 Skills)\n\n| Category | Skills | Included |\n|----------|--------|----------|\n| 
**Autoresearch** | **1** | **Autonomous research orchestration — central layer that manages the full lifecycle and routes to all other skills** |\n| Ideation | 2 | Research Brainstorming, Creative Thinking |\n| ML Paper Writing | 2 | ML Paper Writing (LaTeX templates, citation verification), Academic Plotting |\n| Model Architecture | 5 | LitGPT, Mamba, NanoGPT, RWKV, TorchTitan |\n| Tokenization | 2 | HuggingFace Tokenizers, SentencePiece |\n| Fine-Tuning | 4 | Axolotl, LLaMA-Factory, PEFT, Unsloth |\n| Mech Interp | 4 | TransformerLens, SAELens, pyvene, nnsight |\n| Data Processing | 2 | NeMo Curator, Ray Data |\n| Post-Training | 8 | TRL, GRPO, OpenRLHF, SimPO, verl, slime, miles, torchforge |\n| Safety | 4 | Constitutional AI, LlamaGuard, NeMo Guardrails, Prompt Guard |\n| Distributed | 6 | DeepSpeed, FSDP, Accelerate, Megatron-Core, Lightning, Ray Train |\n| Infrastructure | 3 | Modal, Lambda Labs, SkyPilot |\n| Optimization | 6 | Flash Attention, bitsandbytes, GPTQ, AWQ, HQQ, GGUF |\n| Evaluation | 3 | lm-eval-harness, BigCode, NeMo Evaluator |\n| Inference | 4 | vLLM, TensorRT-LLM, llama.cpp, SGLang |\n| MLOps | 3 | W&B, MLflow, TensorBoard |\n| Agents | 4 | LangChain, LlamaIndex, CrewAI, AutoGPT |\n| RAG | 5 | Chroma, FAISS, Pinecone, Qdrant, Sentence Transformers |\n| Prompt Eng | 4 | DSPy, Instructor, Guidance, Outlines |\n| Observability | 2 | LangSmith, Phoenix |\n| Multimodal | 7 | CLIP, Whisper, LLaVA, BLIP-2, SAM, Stable Diffusion, AudioCraft |\n| Emerging | 6 | MoE, Model Merging, Long Context, Speculative Decoding, Distillation, Pruning |\n| Agent-Native Research Artifact | 3 | ARA Compiler, Research Manager, Rigor Reviewer |\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>View All 98 Skills in Detail\u003C\u002Fb>\u003C\u002Fsummary>\n\n### 🔬 Autoresearch (1 skill) — Central Orchestration Layer\n- **[Autoresearch](0-autoresearch-skill\u002F)** - Autonomous research orchestration using a two-loop architecture (inner optimization + outer synthesis). 
Manages the full lifecycle from literature survey to paper writing, routing to all domain-specific skills. Supports Claude Code \u002Floop and OpenClaw heartbeat for continuous operation (390 lines + 3 refs)\n\n### 🏗️ Model Architecture (5 skills)\n- **[LitGPT](01-model-architecture\u002Flitgpt\u002F)** - Lightning AI's 20+ clean LLM implementations with production training recipes (462 lines + 4 refs)\n- **[Mamba](01-model-architecture\u002Fmamba\u002F)** - State-space models with O(n) complexity, 5× faster than Transformers (253 lines + 3 refs)\n- **[RWKV](01-model-architecture\u002Frwkv\u002F)** - RNN+Transformer hybrid, infinite context, Linux Foundation project (253 lines + 3 refs)\n- **[NanoGPT](01-model-architecture\u002Fnanogpt\u002F)** - Educational GPT in ~300 lines by Karpathy (283 lines + 3 refs)\n- **[TorchTitan](01-model-architecture\u002Ftorchtitan\u002F)** - PyTorch-native distributed training for Llama 3.1 with 4D parallelism\n\n### 🔤 Tokenization (2 skills)\n- **[HuggingFace Tokenizers](02-tokenization\u002Fhuggingface-tokenizers\u002F)** - Rust-based, \u003C20s\u002FGB, BPE\u002FWordPiece\u002FUnigram algorithms (486 lines + 4 refs)\n- **[SentencePiece](02-tokenization\u002Fsentencepiece\u002F)** - Language-independent, 50k sentences\u002Fsec, used by T5\u002FALBERT (228 lines + 2 refs)\n\n### 🎯 Fine-Tuning (4 skills)\n- **[Axolotl](03-fine-tuning\u002Faxolotl\u002F)** - YAML-based fine-tuning with 100+ models (156 lines + 4 refs)\n- **[LLaMA-Factory](03-fine-tuning\u002Fllama-factory\u002F)** - WebUI no-code fine-tuning (78 lines + 5 refs)\n- **[Unsloth](03-fine-tuning\u002Funsloth\u002F)** - 2x faster QLoRA fine-tuning (75 lines + 4 refs)\n- **[PEFT](03-fine-tuning\u002Fpeft\u002F)** - Parameter-efficient fine-tuning with LoRA, QLoRA, DoRA, 25+ methods (431 lines + 2 refs)\n\n### 🔬 Mechanistic Interpretability (4 skills)\n- **[TransformerLens](04-mechanistic-interpretability\u002Ftransformer-lens\u002F)** - Neel Nanda's library for mech interp 
with HookPoints, activation caching (346 lines + 3 refs)\n- **[SAELens](04-mechanistic-interpretability\u002Fsaelens\u002F)** - Sparse Autoencoder training and analysis for feature discovery (386 lines + 3 refs)\n- **[pyvene](04-mechanistic-interpretability\u002Fpyvene\u002F)** - Stanford's causal intervention library with declarative configs (473 lines + 3 refs)\n- **[nnsight](04-mechanistic-interpretability\u002Fnnsight\u002F)** - Remote interpretability via NDIF, run experiments on 70B+ models (436 lines + 3 refs)\n\n\n### 📊 Data Processing (2 skills)\n- **[Ray Data](05-data-processing\u002Fray-data\u002F)** - Distributed ML data processing, streaming execution, GPU support (318 lines + 2 refs)\n- **[NeMo Curator](05-data-processing\u002Fnemo-curator\u002F)** - GPU-accelerated data curation, 16× faster deduplication (375 lines + 2 refs)\n\n### 🎓 Post-Training (8 skills)\n- **[TRL Fine-Tuning](06-post-training\u002Ftrl-fine-tuning\u002F)** - Transformer Reinforcement Learning (447 lines + 4 refs)\n- **[GRPO-RL-Training](06-post-training\u002Fgrpo-rl-training\u002F)** (TRL) - Group Relative Policy Optimization with TRL (569 lines, **gold standard**)\n- **[OpenRLHF](06-post-training\u002Fopenrlhf\u002F)** - Full RLHF pipeline with Ray + vLLM (241 lines + 4 refs)\n- **[SimPO](06-post-training\u002Fsimpo\u002F)** - Simple Preference Optimization, no reference model needed (211 lines + 3 refs)\n- **[verl](06-post-training\u002Fverl\u002F)** - ByteDance's HybridFlow RL framework, FSDP\u002FMegatron + vLLM\u002FSGLang backends (389 lines + 2 refs)\n- **[slime](06-post-training\u002Fslime\u002F)** - THUDM's Megatron+SGLang framework powering GLM-4.x models (464 lines + 2 refs)\n- **[miles](06-post-training\u002Fmiles\u002F)** - Enterprise fork of slime with FP8, INT4, speculative RL for MoE training (315 lines + 2 refs)\n- **[torchforge](06-post-training\u002Ftorchforge\u002F)** - Meta's PyTorch-native RL with Monarch+TorchTitan+vLLM (380 lines + 2 refs)\n\n### 🛡️ Safety 
& Alignment (4 skills)\n- **[Constitutional AI](07-safety-alignment\u002Fconstitutional-ai\u002F)** - AI-driven self-improvement via principles (282 lines)\n- **[LlamaGuard](07-safety-alignment\u002Fllamaguard\u002F)** - Safety classifier for LLM inputs\u002Foutputs (329 lines)\n- **[NeMo Guardrails](07-safety-alignment\u002Fnemo-guardrails\u002F)** - Programmable guardrails with Colang (289 lines)\n- **[Prompt Guard](07-safety-alignment\u002Fprompt-guard\u002F)** - Meta's 86M prompt injection & jailbreak detector, 99%+ TPR, \u003C2ms GPU (313 lines)\n\n### ⚡ Distributed Training (6 skills)\n- **[Megatron-Core](08-distributed-training\u002Fmegatron-core\u002F)** - NVIDIA's framework for training 2B-462B param models with 47% MFU on H100 (359 lines + 4 refs)\n- **[DeepSpeed](08-distributed-training\u002Fdeepspeed\u002F)** - Microsoft's ZeRO optimization (137 lines + 9 refs)\n- **[PyTorch FSDP2](08-distributed-training\u002Fpytorch-fsdp2\u002F)** - Fully Sharded Data Parallel v2 with `fully_shard` and DTensor (231 lines + 12 refs)\n- **[Accelerate](08-distributed-training\u002Faccelerate\u002F)** - HuggingFace's 4-line distributed training API (324 lines + 3 refs)\n- **[PyTorch Lightning](08-distributed-training\u002Fpytorch-lightning\u002F)** - High-level training framework with Trainer class (339 lines + 3 refs)\n- **[Ray Train](08-distributed-training\u002Fray-train\u002F)** - Multi-node orchestration and hyperparameter tuning (399 lines + 1 ref)\n\n### 🚀 Optimization (6 skills)\n- **[Flash Attention](10-optimization\u002Fflash-attention\u002F)** - 2-4x faster attention with memory efficiency (359 lines + 2 refs)\n- **[bitsandbytes](10-optimization\u002Fbitsandbytes\u002F)** - 8-bit\u002F4-bit quantization for 50-75% memory reduction (403 lines + 3 refs)\n- **[GPTQ](10-optimization\u002Fgptq\u002F)** - 4-bit post-training quantization, 4× memory reduction, \u003C2% accuracy loss (443 lines + 3 refs)\n- **[AWQ](10-optimization\u002Fawq\u002F)** - Activation-aware 
weight quantization, 4-bit with minimal accuracy loss (310 lines + 2 refs)\n- **[HQQ](10-optimization\u002Fhqq\u002F)** - Half-Quadratic Quantization, no calibration data needed, multi-backend (370 lines + 2 refs)\n- **[GGUF](10-optimization\u002Fgguf\u002F)** - llama.cpp quantization format, K-quant methods, CPU\u002FMetal inference (380 lines + 2 refs)\n\n### 📊 Evaluation (3 skills)\n- **[lm-evaluation-harness](11-evaluation\u002Flm-evaluation-harness\u002F)** - EleutherAI's standard for benchmarking LLMs across 60+ tasks (482 lines + 4 refs)\n- **[BigCode Evaluation Harness](11-evaluation\u002Fbigcode-evaluation-harness\u002F)** - Code model benchmarking with HumanEval, MBPP, MultiPL-E, pass@k metrics (406 lines + 3 refs)\n- **[NeMo Evaluator](11-evaluation\u002Fnemo-evaluator\u002F)** - NVIDIA's enterprise platform for 100+ benchmarks across 18+ harnesses with multi-backend execution (454 lines + 4 refs)\n\n### ☁️ Infrastructure (3 skills)\n- **[Modal](09-infrastructure\u002Fmodal\u002F)** - Serverless GPU cloud with Python-native API, T4-H200 on-demand (342 lines + 2 refs)\n- **[SkyPilot](09-infrastructure\u002Fskypilot\u002F)** - Multi-cloud orchestration across 20+ providers with spot recovery (390 lines + 2 refs)\n- **[Lambda Labs](09-infrastructure\u002Flambda-labs\u002F)** - Reserved\u002Fon-demand GPU cloud with H100\u002FA100, persistent filesystems (390 lines + 2 refs)\n\n### 🔥 Inference & Serving (4 skills)\n- **[vLLM](12-inference-serving\u002Fvllm\u002F)** - High-throughput LLM serving with PagedAttention (356 lines + 4 refs, **production-ready**)\n- **[TensorRT-LLM](12-inference-serving\u002Ftensorrt-llm\u002F)** - NVIDIA's fastest inference, 24k tok\u002Fs, FP8\u002FINT4 quantization (180 lines + 3 refs)\n- **[llama.cpp](12-inference-serving\u002Fllama-cpp\u002F)** - CPU\u002FApple Silicon inference, GGUF quantization (251 lines + 3 refs)\n- **[SGLang](12-inference-serving\u002Fsglang\u002F)** - Structured generation with RadixAttention, 5-10× 
faster for agents (435 lines + 3 refs)\n\n### 🤖 Agents (4 skills)\n- **[LangChain](14-agents\u002Flangchain\u002F)** - Most popular agent framework, 500+ integrations, ReAct pattern (658 lines + 3 refs, **production-ready**)\n- **[LlamaIndex](14-agents\u002Fllamaindex\u002F)** - Data framework for LLM apps, 300+ connectors, RAG-focused (535 lines + 3 refs)\n- **[CrewAI](14-agents\u002Fcrewai\u002F)** - Multi-agent orchestration, role-based collaboration, autonomous workflows (498 lines + 3 refs)\n- **[AutoGPT](14-agents\u002Fautogpt\u002F)** - Autonomous AI agent platform, visual workflow builder, continuous execution (400 lines + 2 refs)\n\n### 🔍 RAG (5 skills)\n- **[Chroma](15-rag\u002Fchroma\u002F)** - Open-source embedding database, local\u002Fcloud, 24k stars (385 lines + 1 ref)\n- **[FAISS](15-rag\u002Ffaiss\u002F)** - Facebook's similarity search, billion-scale, GPU acceleration (295 lines)\n- **[Sentence Transformers](15-rag\u002Fsentence-transformers\u002F)** - 5000+ embedding models, multilingual, 15k stars (370 lines)\n- **[Pinecone](15-rag\u002Fpinecone\u002F)** - Managed vector database, auto-scaling, \u003C100ms latency (410 lines)\n- **[Qdrant](15-rag\u002Fqdrant\u002F)** - High-performance vector search, Rust-powered, hybrid search with filtering (493 lines + 2 refs)\n\n### 🎨 Multimodal (7 skills)\n- **[CLIP](18-multimodal\u002Fclip\u002F)** - OpenAI's vision-language model, zero-shot classification, 25k stars (320 lines)\n- **[Whisper](18-multimodal\u002Fwhisper\u002F)** - Robust speech recognition, 99 languages, 73k stars (395 lines)\n- **[LLaVA](18-multimodal\u002Fllava\u002F)** - Vision-language assistant, image chat, GPT-4V level (360 lines)\n- **[Stable Diffusion](18-multimodal\u002Fstable-diffusion\u002F)** - Text-to-image generation via HuggingFace Diffusers, SDXL, ControlNet (380 lines + 2 refs)\n- **[Segment Anything](18-multimodal\u002Fsegment-anything\u002F)** - Meta's SAM for zero-shot image segmentation with points\u002Fboxes (500 
lines + 2 refs)\n- **[BLIP-2](18-multimodal\u002Fblip-2\u002F)** - Vision-language pretraining with Q-Former, image captioning, VQA (500 lines + 2 refs)\n- **[AudioCraft](18-multimodal\u002Faudiocraft\u002F)** - Meta's MusicGen\u002FAudioGen for text-to-music and text-to-sound (470 lines + 2 refs)\n\n### 🎯 Prompt Engineering (4 skills)\n- **[DSPy](16-prompt-engineering\u002Fdspy\u002F)** - Declarative prompt programming with optimizers, Stanford NLP, 22k stars (438 lines + 3 refs)\n- **[Instructor](16-prompt-engineering\u002Finstructor\u002F)** - Structured LLM outputs with Pydantic validation, 15k stars (726 lines + 3 refs)\n- **[Guidance](16-prompt-engineering\u002Fguidance\u002F)** - Constrained generation with regex\u002Fgrammars, Microsoft Research, 18k stars (485 lines + 3 refs)\n- **[Outlines](16-prompt-engineering\u002Foutlines\u002F)** - Structured text with FSM, zero-overhead, 8k stars (601 lines + 3 refs)\n\n### 📊 MLOps (3 skills)\n- **[Weights & Biases](13-mlops\u002Fweights-and-biases\u002F)** - Experiment tracking, sweeps, artifacts, model registry (427 lines + 3 refs)\n- **[MLflow](13-mlops\u002Fmlflow\u002F)** - Model registry, tracking, deployment, autologging (514 lines + 3 refs)\n- **[TensorBoard](13-mlops\u002Ftensorboard\u002F)** - Visualization, profiling, embeddings, scalars\u002Fimages (538 lines + 3 refs)\n\n### 👁️ Observability (2 skills)\n- **[LangSmith](17-observability\u002Flangsmith\u002F)** - LLM observability, tracing, evaluation, monitoring for AI apps (422 lines + 2 refs)\n- **[Phoenix](17-observability\u002Fphoenix\u002F)** - Open-source AI observability with OpenTelemetry tracing and LLM evaluation (380 lines + 2 refs)\n\n### 🔬 Emerging Techniques (6 skills)\n- **[MoE Training](19-emerging-techniques\u002Fmoe-training\u002F)** - Mixture of Experts training with DeepSpeed, Mixtral 8x7B, 5× cost reduction (515 lines + 3 refs)\n- **[Model Merging](19-emerging-techniques\u002Fmodel-merging\u002F)** - Combine models with TIES, DARE, 
SLERP using mergekit (528 lines + 3 refs)\n- **[Long Context](19-emerging-techniques\u002Flong-context\u002F)** - Extend context windows with RoPE, YaRN, ALiBi, 32k-128k tokens (624 lines + 3 refs)\n- **[Speculative Decoding](19-emerging-techniques\u002Fspeculative-decoding\u002F)** - 1.5-3.6× faster inference with Medusa, Lookahead (379 lines)\n- **[Knowledge Distillation](19-emerging-techniques\u002Fknowledge-distillation\u002F)** - Compress models 70B→7B with MiniLLM, temperature scaling (424 lines)\n- **[Model Pruning](19-emerging-techniques\u002Fmodel-pruning\u002F)** - 50% sparsity with Wanda, SparseGPT, \u003C1% accuracy loss (417 lines)\n\n### 📝 ML Paper Writing (2 skills)\n- **[ML Paper Writing](20-ml-paper-writing\u002F)** - Write publication-ready papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM with LaTeX templates, citation verification, and writing best practices (532 lines + 5 refs)\n- **[Academic Plotting](20-ml-paper-writing\u002Facademic-plotting\u002F)** - Generate publication-quality figures for ML papers: architecture diagrams via Gemini AI and data-driven charts via matplotlib\u002Fseaborn with venue-specific styling (479 lines + 3 refs)\n\n### 💡 Ideation (2 skills)\n- **[Research Brainstorming](21-research-ideation\u002Fbrainstorming-research-ideas\u002F)** - Structured ideation frameworks for discovering high-impact research directions with 10 complementary lenses (384 lines)\n- **[Creative Thinking](21-research-ideation\u002Fcreative-thinking-for-research\u002F)** - Cognitive science frameworks (bisociation, structure-mapping, constraint manipulation) for genuinely novel research ideas (366 lines)\n\n### 🧬 Agent-Native Research Artifact (3 skills)\n- **[ARA Compiler](22-agent-native-research-artifact\u002Fcompiler\u002F)** - Compiles any research input (PDF papers, repos, experiment logs, raw notes) into a complete Agent-Native Research Artifact with claims, exploration graph, evidence, and code stubs (245 lines + 3 refs)\n- **[ARA Research 
Manager](22-agent-native-research-artifact\u002Fresearch-manager\u002F)** - Post-task research recorder that runs at session end to extract decisions, experiments, dead ends, and pivots from conversation history into the `ara\u002F` directory with user-vs-AI provenance tags (324 lines + 3 refs)\n- **[ARA Rigor Reviewer](22-agent-native-research-artifact\u002Frigor-reviewer\u002F)** - ARA Seal Level 2 semantic epistemic review scoring six dimensions of research rigor (evidence relevance, falsifiability, scope, coherence, exploration integrity, methodology) with severity-ranked findings (322 lines + 1 ref)\n\n\n\u003C\u002Fdetails>\n\n## Demos\n\nAll skills in this repo are automatically synced to [Orchestra Research](https:\u002F\u002Fwww.orchestra-research.com\u002Fresearch-skills), where you can add them to your projects with one click and use them with AI research agents.\n\n**See skills in action → [demos\u002F](demos\u002FREADME.md)**\n\nWe maintain a curated collection of demo repositories showing how to use skills for real AI research tasks:\n\n| Demo | Skills Used | What It Does |\n|------|-------------|--------------|\n| **[Norm Heterogeneity → LoRA Brittleness](demos\u002Fautoresearch-norm-heterogeneity\u002F)** | Autoresearch, ML Paper Writing, Ideation | Agent autonomously discovered that norm heterogeneity predicts fine-tuning difficulty (r=-0.99), pivoting from a null result on ETF overlaps |\n| **[RL Algorithm Brain Scan](demos\u002Fautoresearch-rl-brain-scan\u002F)** | Autoresearch, GRPO, TRL, SAELens, TransformerLens, ML Paper Writing | Agent found that DPO is a rank-1 perturbation (95.6% recovery from one SVD direction) while online RL is distributed and structure-preserving |\n| **[NeMo Eval: GPQA Benchmark](https:\u002F\u002Fgithub.com\u002FzechenzhangAGI\u002FNemo-Eval-Skill-Demo)** | NeMo Evaluator | Compare Llama 8B\u002F70B\u002F405B on graduate-level science questions |\n| **[LoRA Without Regret 
Reproduction](https:\u002F\u002Fwww.orchestra-research.com\u002Fperspectives\u002FLLM-with-Orchestra)** | GRPO, TRL | Reproduce SFT + GRPO RL experiments via prompting |\n| **[Layer-Wise Quantization Experiment](https:\u002F\u002Fgithub.com\u002FAmberLJC\u002Fllama-quantization-experiment)** | llama.cpp, GGUF | Investigate optimal layer precision allocation—early layers at Q8 achieve 1.9× compression with 1.3% perplexity loss |\n| **[Cross-Lingual Alignment Analysis](https:\u002F\u002Fgithub.com\u002FAmberLJC\u002Ffaiss-demo)** | FAISS | Quantify how well multilingual embeddings align semantic concepts across 8 languages using FAISS similarity search |\n| **[Scientific Plotting Demo](demos\u002Fscientific-plotting-demo\u002F)** | Academic Plotting | Generate publication-quality figures for the Andes QoE-aware LLM serving paper — Gemini AI architecture diagrams + matplotlib data charts (CDF, multi-panel grids, bar charts) |\n\n**Featured Demos**: Two papers produced entirely by AI agents using the **autoresearch** skill. The [Norm Heterogeneity paper](demos\u002Fautoresearch-norm-heterogeneity\u002F) demonstrates autonomous research pivoting — the agent refuted its own hypothesis and discovered a stronger finding. 
The [RL Brain Scan paper](demos\u002Fautoresearch-rl-brain-scan\u002F) demonstrates multi-skill orchestration — the agent trained RL models, analyzed internals with interpretability tools, and synthesized the insight that \"DPO is rank-1 alignment.\" Both papers were written end-to-end by the agent.\n\n## Skill Structure\n\nEach skill follows a battle-tested format for maximum usefulness:\n\n```\nskill-name\u002F\n├── SKILL.md                    # Quick reference (50-150 lines)\n│   ├── Metadata (name, description, version)\n│   ├── When to use this skill\n│   ├── Quick patterns & examples\n│   └── Links to references\n│\n├── references\u002F                 # Deep documentation (300KB+)\n│   ├── README.md              # From GitHub\u002Fofficial docs\n│   ├── api.md                 # API reference\n│   ├── tutorials.md           # Step-by-step guides\n│   ├── issues.md              # Real GitHub issues & solutions\n│   ├── releases.md            # Version history & breaking changes\n│   └── file_structure.md      # Codebase navigation\n│\n├── scripts\u002F                    # Helper scripts (optional)\n└── assets\u002F                     # Templates & examples (optional)\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Quality Standards\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 300KB+ documentation from official sources\n- Real GitHub issues & solutions (when available)\n- Code examples with language detection\n- Version history & breaking changes\n- Links to official docs\n\n\u003C\u002Fdetails>\n\n## Roadmap\n\nWe're building towards 80 comprehensive skills across the full AI research lifecycle. 
See our [detailed roadmap](docs\u002FROADMAP.md) for the complete development plan.\n\n[View Full Roadmap →](docs\u002FROADMAP.md)\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>View Detailed Statistics\u003C\u002Fb>\u003C\u002Fsummary>\n\n| Metric | Current | Target |\n|--------|---------|--------|\n| **Skills** | **87** (high-quality, standardized YAML) | 80 ✅ |\n| **Avg Lines\u002FSkill** | **420 lines** (focused + progressive disclosure) | 200-600 lines |\n| **Documentation** | **~130,000 lines** total (SKILL.md + references) | 100,000+ lines |\n| **Gold Standard Skills** | **65** with comprehensive references | 50+ |\n| **Contributors** | 1 | 100+ |\n| **Coverage** | Architecture, Tokenization, Fine-Tuning, Mechanistic Interpretability, Data Processing, Post-Training, Safety, Distributed, Optimization, Evaluation, Infrastructure, Inference, Agents, RAG, Multimodal, Prompt Engineering, MLOps, Observability, ML Paper Writing, Ideation, Autoresearch | Full Lifecycle ✅ |\n\n**Recent Progress**: npm package `@orchestra-research\u002Fai-research-skills` for one-command installation across all coding agents\n\n**Philosophy**: Quality > Quantity. 
Following [Anthropic official best practices](anthropic_official_docs\u002Fbest_practices.md) - each skill provides 200-500 lines of focused, actionable guidance with progressive disclosure.\n\n\u003C\u002Fdetails>\n\n\n\n## Repository Structure\n\n```\nclaude-ai-research-skills\u002F\n├── README.md                    ← You are here\n├── CONTRIBUTING.md              ← Contribution guide\n├── demos\u002F                       ← Curated demo gallery (links to demo repos)\n├── docs\u002F\n├── 0-autoresearch-skill\u002F        (1 skill ✓ - Autonomous research orchestration)\n├── 01-model-architecture\u002F       (5 skills ✓ - LitGPT, Mamba, RWKV, NanoGPT, TorchTitan)\n├── 02-tokenization\u002F             (2 skills ✓ - HuggingFace Tokenizers, SentencePiece)\n├── 03-fine-tuning\u002F              (4 skills ✓ - Axolotl, LLaMA-Factory, Unsloth, PEFT)\n├── 04-mechanistic-interpretability\u002F (4 skills ✓ - TransformerLens, SAELens, pyvene, nnsight)\n├── 05-data-processing\u002F          (2 skills ✓ - Ray Data, NeMo Curator)\n├── 06-post-training\u002F            (8 skills ✓ - TRL, GRPO, OpenRLHF, SimPO, verl, slime, miles, torchforge)\n├── 07-safety-alignment\u002F         (4 skills ✓ - Constitutional AI, LlamaGuard, NeMo Guardrails, Prompt Guard)\n├── 08-distributed-training\u002F     (6 skills ✓ - Megatron-Core, DeepSpeed, FSDP, Accelerate, Lightning, Ray Train)\n├── 09-infrastructure\u002F           (3 skills ✓ - Modal, SkyPilot, Lambda Labs)\n├── 10-optimization\u002F             (6 skills ✓ - Flash Attention, bitsandbytes, GPTQ, AWQ, HQQ, GGUF)\n├── 11-evaluation\u002F               (3 skills ✓ - lm-evaluation-harness, BigCode, NeMo Evaluator)\n├── 12-inference-serving\u002F        (4 skills ✓ - vLLM, TensorRT-LLM, llama.cpp, SGLang)\n├── 13-mlops\u002F                    (3 skills ✓ - Weights & Biases, MLflow, TensorBoard)\n├── 14-agents\u002F                   (4 skills ✓ - LangChain, LlamaIndex, CrewAI, AutoGPT)\n├── 15-rag\u002F                      (5 skills ✓ - 
Chroma, FAISS, Sentence Transformers, Pinecone, Qdrant)\n├── 16-prompt-engineering\u002F       (4 skills ✓ - DSPy, Instructor, Guidance, Outlines)\n├── 17-observability\u002F            (2 skills ✓ - LangSmith, Phoenix)\n├── 18-multimodal\u002F               (7 skills ✓ - CLIP, Whisper, LLaVA, Stable Diffusion, SAM, BLIP-2, AudioCraft)\n├── 19-emerging-techniques\u002F      (6 skills ✓ - MoE, Model Merging, Long Context, Speculative Decoding, Distillation, Pruning)\n├── 20-ml-paper-writing\u002F         (2 skills ✓ - ML Paper Writing with LaTeX templates, Academic Plotting)\n├── 21-research-ideation\u002F           (2 skills ✓ - Research Brainstorming, Creative Thinking)\n├── 22-agent-native-research-artifact\u002F (3 skills ✓ - ARA Compiler, Research Manager, Rigor Reviewer)\n└── packages\u002Fai-research-skills\u002F (npm package for one-command installation)\n```\n\n## Use Cases\n\n### For Researchers\n\"I need to fine-tune Llama 3 with custom data\"\n→ **03-fine-tuning\u002Faxolotl\u002F** - YAML configs, 100+ model support\n\n### For ML Engineers\n\"How do I optimize inference latency?\"\n→ **12-inference-serving\u002Fvllm\u002F** - PagedAttention, batching\n\n### For Students\n\"I want to learn how transformers work\"\n→ **01-model-architecture\u002Flitgpt\u002F** - Clean implementations\n\n### For Teams\n\"We need to scale training to 100 GPUs\"\n→ **08-distributed-training\u002Fdeepspeed\u002F** - ZeRO stages, 3D parallelism\n\n## License\n\nMIT License - See [LICENSE](LICENSE) for details.\n\n**Note**: Individual skills may reference libraries with different licenses. 
Please check each project's license before use.\n\n## Citation\n\nIf you use AI Research Skills in your work or find it helpful for a publication, we'd appreciate a citation:\n\n**BibTeX**\n```bibtex\n@software{ai_research_skills,\n  title     = {AI Research Skills Library},\n  author    = {{Orchestra Research}},\n  year      = {2025},\n  url       = {https:\u002F\u002Fgithub.com\u002Forchestra-research\u002FAI-research-SKILLs},\n  note      = {Open-source skills library enabling AI agents to autonomously conduct AI research}\n}\n```\n\n**APA**\n> Orchestra Research. (2025). *AI Research Skills Library* [Computer software]. https:\u002F\u002Fgithub.com\u002Forchestra-research\u002FAI-research-SKILLs\n\n**Chicago**\n> Orchestra Research. \"AI Research Skills Library.\" GitHub, 2025. https:\u002F\u002Fgithub.com\u002Forchestra-research\u002FAI-research-SKILLs.\n\n**IEEE**\n> Orchestra Research, \"AI Research Skills Library,\" 2025. [Online]. Available: https:\u002F\u002Fgithub.com\u002Forchestra-research\u002FAI-research-SKILLs\n\n> **Tip**: You can also click **\"Cite this repository\"** in the GitHub sidebar for auto-formatted citations.\n\n## Acknowledgments\n\nBuilt with:\n- **[Claude Code](https:\u002F\u002Fwww.claude.com\u002Fproduct\u002Fclaude-code)** - AI pair programming\n- **[Skill Seeker](https:\u002F\u002Fgithub.com\u002Fyusufkaraaslan\u002FSkill_Seekers)** - Automated doc scraping\n- **Open Source AI Community** - For amazing tools and docs\n\nSpecial thanks to:\n- EleutherAI, HuggingFace, NVIDIA, Lightning AI, Meta AI, Anthropic\n- All researchers who maintain excellent documentation\n \n## Contributors\n\nThanks to all the people who have contributed to the AI Research Skills Library:\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Forchestra-research\u002FAI-research-SKILLs\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Fcontrib.rocks\u002Fimage?repo=orchestra-research\u002FAI-research-SKILLs\" \u002F>\n\u003C\u002Fa> \n\nWe 
welcome contributions from the AI research community! See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines on:\n\n- Adding new skills\n- Improving existing skills\n- Quality standards and best practices\n- Submission process\n\n## Recent Updates\n\n\u003Cdetails open>\n\u003Csummary>\u003Cb>April 2026 - v1.6.0 🧬 Agent-Native Research Artifact (ARA) — 23rd Category, 98 Skills\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🧬 **NEW CATEGORY**: `22-agent-native-research-artifact\u002F` (the 23rd category) — three skills that turn research outputs into a falsifiable, agent-traversable artifact:\n  - 🛠️ **[ARA Compiler](22-agent-native-research-artifact\u002Fcompiler\u002F)** — compiles any input (PDF papers, GitHub repos, experiment logs, raw notes) into a structured ARA with cognitive layer (claims, concepts, heuristics), physical layer (configs, code stubs), exploration graph (research DAG), and grounded evidence\n  - 📋 **[ARA Research Manager](22-agent-native-research-artifact\u002Fresearch-manager\u002F)** — post-task epilogue that scans conversation history at session end and writes decisions, experiments, dead ends, claims, heuristics, and pivots into the `ara\u002F` directory with `user` \u002F `ai-suggested` \u002F `ai-executed` \u002F `user-revised` provenance tags\n  - 🔍 **[ARA Rigor Reviewer](22-agent-native-research-artifact\u002Frigor-reviewer\u002F)** — Seal Level 2 semantic epistemic review scoring six dimensions of research rigor (evidence relevance, falsifiability, scope calibration, argument coherence, exploration integrity, methodological rigor) and emitting a severity-ranked report with a Strong Accept-to-Reject recommendation\n- 🔗 Sourced from the [Agent-Native-Research-Artifact-Init](https:\u002F\u002Fgithub.com\u002FOrchestra-Research\u002FAgent-Native-Research-Artifact-Init) reference repo, restructured to AI-research-SKILLs standards (kebab-case names, third-person descriptions, Title-Case tags, one-level-deep references)\n- 🧩 Plugin entry 
`agent-native-research-artifact` added to `.claude-plugin\u002Fmarketplace.json`; CLI category registered as `22-agent-native-research-artifact` with three individual skill entries in the npm installer\n- 🔄 Auto-syncs to Orchestra marketplace via `sync-skills.yml` on push; npm package republished as `@orchestra-research\u002Fai-research-skills@1.6.0` via `publish-npm.yml` on version bump\n- 📊 **98 total skills** across **23 categories** — full lifecycle from idea → paper → falsifiable, auditable artifact\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>March 2026 - v1.4.0 🔬 Autoresearch & 86 Skills — Full Research Lifecycle\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🔬 **NEW SKILL**: **Autoresearch** — autonomous research orchestration using a two-loop architecture (inner optimization loop + outer synthesis loop)\n- 🧠 Manages the full research lifecycle: literature survey → ideation → experiments → synthesis → paper writing\n- 🔄 Routes to all 86 domain skills automatically — agents don't need to know which skill to use\n- ⏰ Mandatory `\u002Floop` (Claude Code) and cron job (OpenClaw) for continuous autonomous operation\n- 📊 Generates research presentations (HTML\u002FPDF) with optimization trajectory plots for human review\n- 📝 Findings.md as persistent project memory across sessions with \"Lessons and Constraints\" tracking\n- 🗂️ Structured workspace: research-state.yaml, findings.md, research-log.md, literature\u002F, experiments\u002F, src\u002F, data\u002F, to_human\u002F\n- 📄 **Two demo papers produced by autoresearch**: [Norm Heterogeneity → LoRA Brittleness](demos\u002Fautoresearch-norm-heterogeneity\u002F) and [RL Algorithm Brain Scan](demos\u002Fautoresearch-rl-brain-scan\u002F)\n- 🚀 WELCOME.md for cold-start agent bootstrap — one URL to go from zero to autonomous research\n- 📦 npm v1.4.x with Windows symlink fallback, all 22 categories installable\n- 🤖 **Supported agents**: Claude Code, Hermes Agent, OpenCode, OpenClaw, Cursor, Codex, Gemini 
CLI, Qwen Code\n- 📊 **87 total skills** across **22 categories** — complete research lifecycle coverage\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>February 2026 - v0.15.0 🛡️ Prompt Guard & 83 Skills\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🛡️ **NEW SKILL**: Prompt Guard - Meta's 86M prompt injection & jailbreak detector\n- ⚡ 99%+ TPR, \u003C1% FPR, \u003C2ms GPU latency, multilingual (8 languages)\n- 🔒 3 workflows: user input filtering, third-party data filtering, batch RAG processing\n- 📊 **83 total skills** across 20 categories\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>January 2026 - v0.14.0 📦 npm Package & 82 Skills\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 📦 **NEW**: `npx @orchestra-research\u002Fai-research-skills` - One-command installation for all coding agents\n- 🤖 **Supported agents**: Claude Code, OpenCode, Cursor, Codex, Gemini CLI, Qwen Code\n- ✨ Interactive installer with category\u002Findividual skill selection\n- 🔄 Update installed skills, selective uninstall\n- 📊 **82 total skills** (5 new skills: verl, slime, miles, and torchforge in Post-Training, plus TorchTitan in Model Architecture)\n- 🏗️ Megatron-Core moved to Distributed Training category\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>January 2026 - v0.13.0 📝 ML Paper Writing & Demos Gallery\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 📝 **NEW CATEGORY**: ML Paper Writing (20th category, 77th skill)\n- 🎯 Write publication-ready papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM\n- 📚 Writing philosophy from top researchers (Neel Nanda, Farquhar, Gopen & Swan, Lipton, Perez)\n- 🔬 Citation verification workflow - never hallucinate references\n- 📄 LaTeX templates for 6 major conferences\n- 🎪 **NEW**: Curated demos gallery (`demos\u002F`) showcasing skills in action\n- 🔗 Demo repos: NeMo Evaluator benchmark, LoRA Without Regret reproduction\n- 📖 936-line comprehensive SKILL.md with 4 workflows\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>January 2026 - v0.12.0 📊 
NeMo Evaluator SDK\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 📊 **NEW SKILL**: NeMo Evaluator SDK for enterprise LLM benchmarking\n- 🔧 NVIDIA's evaluation platform with 100+ benchmarks from 18+ harnesses (MMLU, HumanEval, GSM8K, safety, VLM)\n- ⚡ Multi-backend execution: local Docker, Slurm HPC, Lepton cloud\n- 📦 Container-first architecture for reproducible evaluation\n- 📝 454 lines SKILL.md + 4 comprehensive reference files (~48KB documentation)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>December 2025 - v0.11.0 🔬 Mechanistic Interpretability\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🔬 **NEW CATEGORY**: Mechanistic Interpretability (4 skills)\n- 🔍 TransformerLens skill: Neel Nanda's library for mech interp with HookPoints, activation caching, circuit analysis\n- 🧠 SAELens skill: Sparse Autoencoder training and analysis for feature discovery, monosemanticity research\n- ⚡ pyvene skill: Stanford's causal intervention library with declarative configs, DAS, activation patching\n- 🌐 nnsight skill: Remote interpretability via NDIF, run experiments on 70B+ models without local GPUs\n- 📝 ~6,500 new lines of documentation across 16 files\n- **76 total skills** (filling the missing 04 category slot)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>November 25, 2025 - v0.10.0 🎉 70 Skills Complete!\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🎉 **ROADMAP COMPLETE**: Reached 70-skill milestone!\n- 🚀 Added 4 skills: Lambda Labs, Segment Anything (SAM), BLIP-2, AudioCraft\n- ☁️ Lambda Labs skill: Reserved\u002Fon-demand GPU cloud with H100\u002FA100, persistent filesystems, 1-Click Clusters\n- 🖼️ SAM skill: Meta's Segment Anything for zero-shot image segmentation with points\u002Fboxes\u002Fmasks\n- 👁️ BLIP-2 skill: Vision-language pretraining with Q-Former, image captioning, VQA\n- 🎵 AudioCraft skill: Meta's MusicGen\u002FAudioGen for text-to-music and text-to-sound generation\n- 📝 ~10,000 new lines of documentation across 12 files\n- **70 total 
skills** (100% roadmap complete!)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>November 25, 2025 - v0.9.0\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🚀 Added 2 infrastructure skills: Modal, SkyPilot\n- ☁️ Modal skill: Serverless GPU cloud with Python-native API, T4-H200 on-demand, auto-scaling\n- 🌐 SkyPilot skill: Multi-cloud orchestration across 20+ providers with spot recovery\n- ✨ New Infrastructure category (2 skills - serverless GPU and multi-cloud orchestration)\n- 📝 ~2,500 new lines of documentation across 6 files\n- **66 total skills** (94% towards 70-skill target)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>November 25, 2025 - v0.8.0\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🚀 Added 5 high-priority skills: HQQ, GGUF, Phoenix, AutoGPT, Stable Diffusion\n- ⚡ HQQ skill: Half-Quadratic Quantization without calibration data, multi-backend support\n- 📦 GGUF skill: llama.cpp quantization format, K-quant methods, CPU\u002FMetal inference\n- 👁️ Phoenix skill: Open-source AI observability with OpenTelemetry tracing and LLM evaluation\n- 🤖 AutoGPT skill: Autonomous AI agent platform with visual workflow builder\n- 🎨 Stable Diffusion skill: Text-to-image generation via Diffusers, SDXL, ControlNet, LoRA\n- 📝 ~9,000 new lines of documentation across 15 files\n- **64 total skills** (91% towards 70-skill target)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>November 25, 2025 - v0.7.0\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🚀 Added 5 high-priority skills: PEFT, CrewAI, Qdrant, AWQ, LangSmith\n- ✨ New Observability category with LangSmith for LLM tracing and evaluation\n- 🎯 PEFT skill: Parameter-efficient fine-tuning with LoRA, QLoRA, DoRA, 25+ methods\n- 🤖 CrewAI skill: Multi-agent orchestration with role-based collaboration\n- 🔍 Qdrant skill: High-performance Rust vector search with hybrid filtering\n- ⚡ AWQ skill: Activation-aware 4-bit quantization with minimal accuracy loss\n- 📝 ~8,000 new lines of documentation across 
15 files\n- **59 total skills** (84% towards 70-skill target)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>November 15, 2025 - v0.6.0\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 📊 Added 3 comprehensive MLOps skills: Weights & Biases, MLflow, TensorBoard\n- ✨ New MLOps category (3 skills - experiment tracking, model registry, visualization)\n- 📝 ~10,000 new lines of documentation across 13 files\n- 🔧 Comprehensive coverage: experiment tracking, hyperparameter sweeps, model registry, profiling, embeddings visualization\n- **54 total skills** (77% towards 70-skill target)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>November 12, 2025 - v0.5.0\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🎯 Added 4 comprehensive prompt engineering skills: DSPy, Instructor, Guidance, Outlines\n- ✨ New Prompt Engineering category (4 skills - DSPy, Instructor, Guidance, Outlines)\n- 📝 ~10,000 new lines of documentation across 16 files\n- 🔧 Comprehensive coverage: declarative programming, structured outputs, constrained generation, FSM-based generation\n- **47 total skills** (67% towards 70-skill target)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>November 9, 2025 - v0.4.0\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🤖 Added 11 comprehensive skills: LangChain, LlamaIndex, Chroma, FAISS, Sentence Transformers, Pinecone, CLIP, Whisper, LLaVA\n- ✨ New Agents category (2 skills - LangChain, LlamaIndex)\n- 🔍 New RAG category (4 skills - Chroma, FAISS, Sentence Transformers, Pinecone)\n- 🎨 New Multimodal category (3 skills - CLIP, Whisper, LLaVA)\n- 📝 ~15,000 new lines of documentation\n- **43 total skills** (61% towards 70-skill target)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>November 8, 2025 - v0.3.0\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🚀 Added 8 comprehensive skills: TensorRT-LLM, llama.cpp, SGLang, GPTQ, HuggingFace Tokenizers, SentencePiece, Ray Data, NeMo Curator\n- ⚡ Completed Inference & Serving category (4\u002F4 
skills)\n- 🔤 New Tokenization category (2 skills)\n- 📊 New Data Processing category (2 skills)\n- 📝 9,617 new lines of documentation across 30 files\n- **32 total skills** (45% towards 70-skill target)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>November 6, 2025 - v0.2.0\u003C\u002Fb>\u003C\u002Fsummary>\n\n- Added 10 skills from GitHub (Megatron-Core, Lightning, Ray Train, etc.)\n- Improved skill structure with comprehensive references\n- Created strategic roadmap to 70 skills\n- Added contribution guidelines\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>November 3, 2025 - v0.1.0\u003C\u002Fb>\u003C\u002Fsummary>\n\n- 🎉 Initial release with 5 fine-tuning skills\n\n\u003C\u002Fdetails>\n\n## Community\n\nJoin our community to stay updated, ask questions, and connect with other AI researchers:\n\n- **[SkillEvolve Meta-Skill](https:\u002F\u002Fgithub.com\u002FSkill-Evolve\u002Fmeta-skill)** - Connect your agent to the collective intelligence of the community. 
Captures techniques discovered during sessions and shares them back as curated skills.\n- **[Slack Community](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Forchestrarese-efu1990\u002Fshared_invite\u002Fzt-3iu6gr8io-zJvpkZTPToEviQ9KFZvNSg)** - Chat with the team and other users\n- **[Twitter\u002FX](https:\u002F\u002Fx.com\u002Forch_research)** - Follow for updates and announcements\n- **[LinkedIn](https:\u002F\u002Fwww.linkedin.com\u002Fcompany\u002Forchestra-research\u002F)** - Connect professionally\n\n## Star History\n\n\u003Ca href=\"https:\u002F\u002Fstar-history.com\u002F#orchestra-research\u002FAI-research-SKILLs&Date\">\n \u003Cpicture>\n   \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=orchestra-research\u002FAI-research-SKILLs&type=Date&theme=dark\" \u002F>\n   \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=orchestra-research\u002FAI-research-SKILLs&type=Date\" \u002F>\n   \u003Cimg alt=\"Star History Chart\" src=\"https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=orchestra-research\u002FAI-research-SKILLs&type=Date\" \u002F>\n \u003C\u002Fpicture>\n\u003C\u002Fa>\n","AI Research SKILLs is a comprehensive open-source library that provides research and engineering skills for any AI model, enabling Claude Code, Codex, and Gemini agents to conduct AI research autonomously at full capacity. The project contains 98 skills covering the entire research workflow, from idea generation and model training to paper writing, and supports core functions such as automated research, model architecture design, fine-tuning, distributed training, and optimization. It is written in TeX and released under the MIT License. It suits scenarios that need to accelerate AI research or extend the capabilities of existing AI systems, such as academic research institutions and corporate R&D departments.",2,"2026-06-11 03:47:15","high_star"]