[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-85116":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":14,"stars7d":14,"stars30d":14,"stars90d":14,"forks30d":14,"starsTrendScore":14,"compositeScore":15,"rankGlobal":9,"rankLanguage":9,"license":16,"archived":17,"fork":17,"defaultBranch":18,"hasWiki":19,"hasPages":17,"topics":20,"createdAt":9,"pushedAt":9,"updatedAt":29,"readmeContent":30,"aiSummary":9,"trendingCount":14,"starSnapshotCount":14,"syncStatus":12,"lastSyncTime":31,"discoverSource":32},85116,"agentic-engineering-handbook","keyuchen21\u002Fagentic-engineering-handbook","keyuchen21","The definitive OpenAI, Claude, MCP, Harness, Evals, and Production Agent Systems learning roadmap.",null,"Python",63,2,51,0,37.43,"MIT License",false,"main",true,[21,22,23,24,25,26,27,28],"agentic-engineering","agents","ai-agents","anthropic","claude-code","llm","mcp","openai","2026-06-15 10:04:28","# Agentic Engineering Handbook\n\n> The definitive OpenAI, Anthropic, Google, MCP, Harness, Evals, and Production Agent Systems learning roadmap.\n\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg)](LICENSE)\n[![Last Updated](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLast%20Updated-2026--06--09-blue.svg)](#)\n\nIf this repository helps you, consider giving it a ⭐\n\n---\n\n## Why This Repository?\n\nThe AI industry has entered the **Agentic Era**. 
Building production-grade AI systems now requires mastering agents, tool use, MCP, memory, long-running workflows, coding agents, agent harnesses, evals, and safety — but the knowledge is scattered across OpenAI blogs, Anthropic engineering posts, SDK docs, cookbooks, and research papers.\n\nThis repository consolidates **129 curated resources** into one structured learning roadmap.\n\n**The goal: Become a world-class Agentic Engineer.**\n\n---\n\n## Learning Roadmap\n\n### Phase 0 — Agent Loop From Scratch\n\nIf you treat Claude Code as a coding CLI, many capabilities can feel like magic: it reads files, runs commands, edits code, delegates work, and stays oriented during complex tasks.\n\nFrom an engineering perspective, the core is much simpler:\n\n**model + tools + one loop.**\n\nUnderstanding that loop makes the rest of the system easier to reason about:\n\n- When the agent should plan first, and when it should act immediately\n- Why an explicit todo list reduces drift in longer tasks\n- Why subagents improve exploration while protecting the main context\n- How skills, MCP, and hooks each add capability around the same core loop\n\nThese pages are based on the upstream English Markdown tutorials from [shareAI-lab\u002Fmini-claude-code](https:\u002F\u002Fgithub.com\u002FshareAI-lab\u002Fmini-claude-code), with added Study Notes and inline source code for this handbook.\n\n| Step | Page | Code |\n|------|------|------|\n| v0 | [Bash is All You Need](tutorials\u002Fagent-loop\u002Fv0-bash-is-all-you-need.md) | [v0_bash_agent.py](tutorials\u002Fagent-loop\u002Fv0_bash_agent.py) |\n| v1 | [Model as Agent](tutorials\u002Fagent-loop\u002Fv1-model-as-agent.md) | [v1_basic_agent.py](tutorials\u002Fagent-loop\u002Fv1_basic_agent.py) |\n| v2 | [Structured Planning](tutorials\u002Fagent-loop\u002Fv2-structured-planning.md) | [v2_todo_agent.py](tutorials\u002Fagent-loop\u002Fv2_todo_agent.py) |\n| v3 | [Subagent 
Mechanism](tutorials\u002Fagent-loop\u002Fv3-subagent-mechanism.md) | [v3_subagent.py](tutorials\u002Fagent-loop\u002Fv3_subagent.py) |\n| v4 | [Skills Mechanism](tutorials\u002Fagent-loop\u002Fv4-skills-mechanism.md) | [v4_skills_agent.py](tutorials\u002Fagent-loop\u002Fv4_skills_agent.py) |\n\nSupporting files are included in the same folder: `requirements.txt`, `.env.example`, `v0_bash_agent_mini.py`, and `skills\u002F`.\n\n---\n\n### Phase 1 — Agent Foundations\n\n> Build shared vocabulary for workflow vs agent, tool loop, handoff, guardrails.\n\n#### Key Mental Models\n\n**Should I build an agent?** (4-question checklist from Barry Zhang's talk)\n\n| Question | If No → Workflow | If Yes → Agent |\n|----------|-----------------|----------------|\n| Is the task complex enough? | Decision tree is fully mappable | Ambiguous problem space |\n| Is the task valuable enough? | \u003C$0.10 per run | >$1 per run, cost doesn't matter |\n| Are all core capabilities doable? | Weak links break the chain | Model handles every step well |\n| Is error cost low & detectable? | High cost + hard to detect → human-in-the-loop | Errors caught by tests\u002FCI |\n\n**Think like the agent.** Most failures come from designing with a human perspective. Put yourself inside the agent's context window: you only see ~10K–20K tokens (system prompt + tool descriptions + recent observations). 
Ask: does the agent have enough information to act correctly at each step?\n\n→ Source: [How We Build Effective Agents](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=D7_ipDqhtwk)\n\n#### Read First\n\n| # | Title | Vendor |\n|---|-------|--------|\n| 1 | [Prompt guidance](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Fprompt-guidance) | OpenAI |\n| 2 | [Function Calling](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Ffunction-calling) | OpenAI |\n| 3 | [Tool use overview](https:\u002F\u002Fplatform.claude.com\u002Fdocs\u002Fen\u002Fagents-and-tools\u002Ftool-use\u002Foverview) | Anthropic |\n| 4 | [Function calling - Gemini API](https:\u002F\u002Fai.google.dev\u002Fgemini-api\u002Fdocs\u002Ffunction-calling) | Google |\n| 5 | [Building effective agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fbuilding-effective-agents) | Anthropic |\n| 6 | [New tools for building agents](https:\u002F\u002Fopenai.com\u002Findex\u002Fnew-tools-for-building-agents\u002F) | OpenAI |\n| 7 | [Agents SDK overview](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Fagents) | OpenAI |\n\n#### Then Read\n\n| Title | Vendor |\n|-------|--------|\n| [How We Build Effective Agents: Barry Zhang, Anthropic](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=D7_ipDqhtwk) | Anthropic |\n| [Phistory — Claude Code & Codex CLI System Prompt Diff History](https:\u002F\u002Fphistory.cc\u002F) | Community |\n| [System Prompts](https:\u002F\u002Fplatform.claude.com\u002Fdocs\u002Fen\u002Frelease-notes\u002Fsystem-prompts) | Anthropic |\n| [OpenAI Agents SDK examples](https:\u002F\u002Fopenai.github.io\u002Fopenai-agents-python\u002Fexamples\u002F) | OpenAI |\n| [Structured Outputs for Multi-Agent Systems](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fstructured_outputs_multi_agent) | OpenAI |\n\n#### Build Exercise\n\nBuild a customer service\u002Fticket triage agent: router → 
specialist → evaluator, with all outputs constrained by structured schemas.\n\n---\n\n### Phase 2 — MCP & Tool Ecosystem\n\n> Understand MCP server\u002Fclient, remote vs local, tool loading, approval, connector boundaries.\n\n#### Read First\n\n| # | Title | Vendor |\n|---|-------|--------|\n| 1 | [Introducing the Model Context Protocol](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fmodel-context-protocol) | Anthropic |\n| 2 | [MCP and Connectors](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Ftools-connectors-mcp) | OpenAI |\n| 3 | [Building MCP servers for ChatGPT Apps and API integrations](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fmcp) | OpenAI |\n\n#### Then Read\n\n| Title | Vendor |\n|-------|--------|\n| [Code execution with MCP: Building more efficient agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fcode-execution-with-mcp) | Anthropic |\n| [Model Context Protocol - Codex](https:\u002F\u002Fdevelopers.openai.com\u002Fcodex\u002Fmcp) | OpenAI |\n| [OpenAI Docs MCP](https:\u002F\u002Fdevelopers.openai.com\u002Flearn\u002Fdocs-mcp) | OpenAI |\n| [Build your ChatGPT UI](https:\u002F\u002Fdevelopers.openai.com\u002Fapps-sdk\u002Fbuild\u002Fchatgpt-ui) | OpenAI |\n\n#### Build Exercise\n\nBuild a read-only repo\u002Fdocs MCP server, then create an eval to verify the agent correctly cites documentation.\n\n---\n\n### Phase 3 — Context, Memory & Skills\n\n> Learn to control context window, short\u002Flong-term memory, skills\u002Fplugins, CLAUDE.md\u002FAGENTS.md.\n\n#### Read First\n\n| # | Title | Vendor |\n|---|-------|--------|\n| 1 | [Effective context engineering for AI agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Feffective-context-engineering-for-ai-agents) | Anthropic |\n| 2 | [Equipping agents for the real world with Agent Skills](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fequipping-agents-for-the-real-world-with-agent-skills) | Anthropic |\n| 3 | 
[Agent Skills Specification](https:\u002F\u002Fagentskills.io\u002Fspecification) | Agent Skills |\n| 4 | [Agent Skills](https:\u002F\u002Fplatform.claude.com\u002Fdocs\u002Fen\u002Fagents-and-tools\u002Fagent-skills\u002Foverview) | Anthropic |\n| 5 | [Skills](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Ftools-skills) | OpenAI |\n| 6 | [Building Reliable Agents with Memory and Compaction](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fagents_sdk\u002Fbuilding_reliable_agents_memory_compaction) | OpenAI |\n\n#### Then Read\n\n| Title | Vendor |\n|-------|--------|\n| [Custom instructions with AGENTS.md - Codex](https:\u002F\u002Fdevelopers.openai.com\u002Fcodex\u002Fguides\u002Fagents-md) | OpenAI |\n| [Best practices for Claude Code](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fclaude-code-best-practices) | Anthropic |\n| [Agent Skills - Codex](https:\u002F\u002Fdevelopers.openai.com\u002Fcodex\u002Fskills) | OpenAI |\n| [Skills in OpenAI API](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fskills_in_api) | OpenAI |\n\n#### Build Exercise\n\nImplement the same task as a Skill\u002FPlugin, then measure accuracy and token cost across three variants: no skill, long prompt, and skill-based.\n\n---\n\n### Phase 4 — Harness & Long-Running Agents\n\n> Master agent runtime: event stream, thread, tool execution, state, sandbox, approval, recovery.\n\n#### Read First\n\n| # | Title | Vendor |\n|---|-------|--------|\n| 1 | [Unrolling the Codex agent loop](https:\u002F\u002Fopenai.com\u002Findex\u002Funrolling-the-codex-agent-loop\u002F) | OpenAI |\n| 2 | [Unlocking the Codex harness: how we built the App Server](https:\u002F\u002Fopenai.com\u002Findex\u002Funlocking-the-codex-harness\u002F) | OpenAI |\n| 3 | [Effective harnesses for long-running agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Feffective-harnesses-for-long-running-agents) | Anthropic |\n\n#### Then 
Read\n\n| Title | Vendor |\n|-------|--------|\n| [The next evolution of the Agents SDK](https:\u002F\u002Fopenai.com\u002Findex\u002Fthe-next-evolution-of-the-agents-sdk\u002F) | OpenAI |\n| [Using PLANS.md for multi-hour problem solving](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Farticles\u002Fcodex_exec_plans) | OpenAI |\n| [Harness design for long-running application development](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fharness-design-long-running-apps) | Anthropic |\n| [Scaling Managed Agents: Decoupling the brain from the hands](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fmanaged-agents) | Anthropic |\n\n#### Build Exercise\n\nBuild a mini coding harness: plan file, shell tool, apply patch, test gate, event log, and resume capability.\n\n---\n\n### Phase 5 — Coding & Workspace Agents\n\n> Compare Codex vs Claude Code product\u002FSDK forms; learn multi-agent, IDE, workspace collaboration.\n\n#### Read First\n\n| # | Title | Vendor |\n|---|-------|--------|\n| 1 | [Introducing Codex](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-codex\u002F) | OpenAI |\n| 2 | [Best practices for Claude Code](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fclaude-code-best-practices) | Anthropic |\n| 3 | [Enabling Claude Code to work more autonomously](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fenabling-claude-code-to-work-more-autonomously) | Anthropic |\n\n#### Then Read\n\n| Title | Vendor |\n|-------|--------|\n| [Introducing the Codex app](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-the-codex-app\u002F) | OpenAI |\n| [Introducing workspace agents in ChatGPT](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-workspace-agents-in-chatgpt\u002F) | OpenAI |\n| [Apple's Xcode now supports Claude Agent SDK](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fapple-xcode-claude-agent-sdk) | Anthropic |\n| [Building Consistent Workflows with Codex CLI & Agents 
SDK](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fcodex\u002Fcodex_mcp_agents_sdk\u002Fbuilding_consistent_workflows_codex_cli_agents_sdk) | OpenAI |\n\n#### Build Exercise\n\nRun both OpenAI\u002FCodex and Claude Code style workflows on the same repo: issue → plan → patch → tests → PR summary.\n\n---\n\n### Phase 6 — Evals, Safety & Production\n\n> Build pre\u002Fpost-launch eval loop, trace loop, safety boundaries, permissions, regression monitoring.\n\n#### Read First\n\n| # | Title | Vendor |\n|---|-------|--------|\n| 1 | [Demystifying evals for AI agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fdemystifying-evals-for-ai-agents) | Anthropic |\n| 2 | [Testing Agent Skills Systematically with Evals](https:\u002F\u002Fdevelopers.openai.com\u002Fblog\u002Feval-skills) | OpenAI |\n| 3 | [Build an Agent Improvement Loop with Traces, Evals, and Codex](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fagents_sdk\u002Fagent_improvement_loop) | OpenAI |\n\n#### Then Read\n\n| Title | Vendor |\n|-------|--------|\n| [Running Codex safely at OpenAI](https:\u002F\u002Fopenai.com\u002Findex\u002Frunning-codex-safely\u002F) | OpenAI |\n| [How we contain Claude across products](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fhow-we-contain-claude) | Anthropic |\n| [Evals API Use-case - MCP Evaluation](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fevaluation\u002Fuse-cases\u002Fmcp_eval_notebook) | OpenAI |\n| [Measuring AI agent autonomy in practice](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fmeasuring-agent-autonomy) | Anthropic |\n\n#### Build Exercise\n\nBuild a smoke\u002Fmacro eval suite for your agent: task success rate, tool misuse, prompt injection resistance, latency, cost, and human approval count.\n\n---\n\n## Full Reading Table\n\n> **Priority guide:** P0 = must-read (architectural\u002Fconceptual), P1 = highly useful (implementation detail), P2 = 
optional context (background\u002Freleases).\n\n| Priority | Title | Vendor | Topic | Key Idea | Date |\n|----------|-------|--------|-------|----------|------|\n| P0 | [OpenAI for Developers in 2025](https:\u002F\u002Fdevelopers.openai.com\u002Fblog\u002Fopenai-for-developers-2025) | OpenAI | Agents; MCP; Platform | Annual overview: systematic walkthrough of Responses API, Agents SDK, AgentKit, Codex, MCP, Apps SDK, and AGENTS.md. | 2025-12-30 |\n| P0 | [New tools for building agents](https:\u002F\u002Fopenai.com\u002Findex\u002Fnew-tools-for-building-agents\u002F) | OpenAI | Agents; Responses API; Tools | Key starting point for OpenAI's agent platform: Responses API, built-in web\u002Ffile\u002Fcomputer tools, Agents SDK, tracing\u002Fobservability. | 2025-03-11 |\n| P0 | [Introducing AgentKit](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-agentkit\u002F) | OpenAI | Agents; Evals; AgentKit | AgentKit, expanded evals, agent RFT: the official agent toolchain from prototype to production. | 2025-10-06 |\n| P0 | [Prompt guidance](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Fprompt-guidance) | OpenAI | Prompting; Models; Agent UX | Official model-specific prompting guidance for outcome-first prompts, reasoning effort, preambles, and validation rules in tool-heavy workflows. | Current docs |\n| P0 | [System Prompts](https:\u002F\u002Fplatform.claude.com\u002Fdocs\u002Fen\u002Frelease-notes\u002Fsystem-prompts) | Anthropic | System prompts; Claude; Behavior | Claude web\u002Fmobile system prompt release notes; useful for studying production prompting patterns and behavioral scaffolding. | Current docs |\n| P0 | [Agents SDK overview](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Fagents) | OpenAI | Agents; SDK | Official SDK entry point: concepts and boundaries of agent, tool, handoff, guardrail, and tracing. 
| Current docs |\n| P0 | [Introducing the Model Context Protocol](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fmodel-context-protocol) | Anthropic | MCP; Standards | The origin article for MCP: an open standard connecting AI assistants to data, tools, and systems. | 2024-11-25 |\n| P0 | [Building effective agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fbuilding-effective-agents) | Anthropic | Agents; Patterns; Frameworks | Essential agent primer: workflow vs agent, prompt\u002Ftool\u002Fretrieval, orchestrator-worker, evaluator-optimizer patterns. | 2024-12-19 |\n| P0 | [New tools and features in the Responses API](https:\u002F\u002Fopenai.com\u002Findex\u002Fnew-tools-and-features-in-the-responses-api\u002F) | OpenAI | MCP; Responses API; Tools | Responses API extended to remote MCP servers, image\u002Fcode\u002Ffile tools; see how OpenAI integrates MCP into its runtime. | 2025-05-21 |\n| P0 | [MCP and Connectors](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Ftools-connectors-mcp) | OpenAI | MCP; Connectors; Responses API | Official guide to connecting remote MCP servers and connectors; includes approvals and security considerations. | Current docs |\n| P0 | [Building MCP servers for ChatGPT Apps and API integrations](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fmcp) | OpenAI | MCP; ChatGPT Apps; API | Official guide to writing MCP servers: supply tools\u002Fknowledge to ChatGPT Apps, deep research, and API integrations. | Current docs |\n| P0 | [Building a Deep Research MCP Server](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fdeep_research_api\u002Fhow_to_build_a_deep_research_mcp_server\u002Freadme) | OpenAI | MCP; Deep research | Minimal implementation of a search\u002Ffetch MCP server for Deep Research. 
| 2025-06-25 |\n| P0 | [Model Context Protocol - Codex](https:\u002F\u002Fdevelopers.openai.com\u002Fcodex\u002Fmcp) | OpenAI | MCP; Codex | How Codex CLI\u002FIDE connects to MCP servers, adding Figma, browser, docs, and internal tool context to agents. | Current docs |\n| P0 | [Introducing Codex](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-codex\u002F) | OpenAI | Agents; Coding; Sandbox | Cloud-based software engineering agent: parallel tasks, repo sandbox, running tests\u002Flinters\u002Ftype checkers, producing auditable evidence. | 2025-05-16 |\n| P0 | [Unrolling the Codex agent loop](https:\u002F\u002Fopenai.com\u002Findex\u002Funrolling-the-codex-agent-loop\u002F) | OpenAI | Harness; Agent loop; Codex | How Codex CLI chains prompt, tool schema, MCP tools, Responses API, and context management into an agent loop. | 2026-01-23 |\n| P0 | [Unlocking the Codex harness: how we built the App Server](https:\u002F\u002Fopenai.com\u002Findex\u002Funlocking-the-codex-harness\u002F) | OpenAI | Harness; Codex App Server; JSON-RPC | Core harness article: Codex core, App Server, JSON-RPC, streaming progress, approval, diff, and thread management. | 2026-02-04 |\n| P0 | [From model to agent: Equipping the Responses API with a computer environment](https:\u002F\u002Fopenai.com\u002Findex\u002Fequip-responses-api-computer-environment\u002F) | OpenAI | Harness; Responses API; Sandbox | Responses API + shell tool + hosted containers form the agent runtime; essential for understanding the model-to-agent execution environment. | 2026-03-10 |\n| P0 | [Harness engineering: leveraging Codex in an agent-first world](https:\u002F\u002Fopenai.com\u002Findex\u002Fharness-engineering\u002F) | OpenAI | Harness; Agent-first engineering | Design product code, tests, CI, docs, and observability to be agent-readable\u002Fexecutable; learn agent-first repo organization. 
| 2026-02-11 |\n| P0 | [The next evolution of the Agents SDK](https:\u002F\u002Fopenai.com\u002Findex\u002Fthe-next-evolution-of-the-agents-sdk\u002F) | OpenAI | Harness; Agents SDK; MCP; Skills | Agents SDK harness becomes more complete: memory, sandbox orchestration, Codex-like filesystem tools, MCP, skills, AGENTS.md. | 2026-04-15 |\n| P0 | [Building Consistent Workflows with Codex CLI & Agents SDK](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fcodex\u002Fcodex_mcp_agents_sdk\u002Fbuilding_consistent_workflows_codex_cli_agents_sdk) | OpenAI | MCP; Codex; Agents SDK | Codex CLI as an MCP server integrated with Agents SDK; real multi-agent dev workflow. | 2025-10-01 |\n| P0 | [Building Reliable Agents with Memory and Compaction](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fagents_sdk\u002Fbuilding_reliable_agents_memory_compaction) | OpenAI | Memory; Compaction; Reliability | Memory and compaction design for long-context\u002Fmulti-turn agents. | 2026-05-01 |\n| P0 | [Build an Agent Improvement Loop with Traces, Evals, and Codex](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fagents_sdk\u002Fagent_improvement_loop) | OpenAI | Evals; Traces; Self-improvement | Connect traces, evals, and Codex fixes into an agent improvement loop. | 2026-05-12 |\n| P0 | [Eval Driven System Design - From Prototype to Production](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fevals) | OpenAI | Evals; Production | Use evals as the driving force for system design; ideal for moving agents from demo to production. | 2025-06-02 |\n| P0 | [Testing Agent Skills Systematically with Evals](https:\u002F\u002Fdevelopers.openai.com\u002Fblog\u002Feval-skills) | OpenAI | Evals; Skills; Agents | Systematically test agent skills with evals; establish quality gates before skill release. 
| 2026-01-22 |\n| P0 | [Evals API Use-case - MCP Evaluation](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fevaluation\u002Fuse-cases\u002Fmcp_eval_notebook) | OpenAI | MCP; Evals | Evaluate QA\u002Fretrieval capabilities with MCP tools; ideal for building an MCP regression suite. | 2025-06-09 |\n| P0 | [Running Codex safely at OpenAI](https:\u002F\u002Fopenai.com\u002Findex\u002Frunning-codex-safely\u002F) | OpenAI | Safety; Sandbox; Codex | How OpenAI runs Codex internally: sandbox, approvals, network policy, agent-native telemetry. | 2026-05-20 |\n| P0 | [Building Governed AI Agents - A Practical Guide to Agentic Scaffolding](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | Governance; Guardrails; Agents | Governed agent scaffolding: permissions, guardrails, auditing, and organizational policies. | 2026-02-23 |\n| P0 | [Macro Evals for Agentic Systems](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | Evals; Agentic systems | Evaluate agents at the end-to-end\u002Fmacro level, not just individual step outputs. | 2026-05-19 |\n| P0 | [Best practices for Claude Code](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fclaude-code-best-practices) | Anthropic | Coding agents; Claude Code | Claude Code methodology: verification loop, explore-plan-code, CLAUDE.md, permissions, MCP, subagents, context management. | 2025-04-18 |\n| P0 | [How we built our multi-agent research system](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fmulti-agent-research-system) | Anthropic | Agents; Multi-agent; Research | Claude Research multi-agent architecture: planner + parallel research agents + synthesis; production multi-agent experience. 
| 2025-06-13 |\n| P0 | [Writing effective tools for AI agents - with AI agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fwriting-tools-for-agents) | Anthropic | Tools; MCP; Evals | Tool quality determines agent quality: tool descriptions, context budget, eval, and letting Claude optimize its own tools. | 2025-09-11 |\n| P0 | [Effective context engineering for AI agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Feffective-context-engineering-for-ai-agents) | Anthropic | Context; Agents | Context is the agent's core resource: selection, compression, isolation, persistence, and context pollution control. | 2025-09-29 |\n| P0 | [Enabling Claude Code to work more autonomously](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fenabling-claude-code-to-work-more-autonomously) | Anthropic | Claude Code; Agent SDK; Subagents | Claude Agent SDK, subagents, hooks, background tasks, checkpoints, and other autonomous coding agent capabilities. | 2025-09-29 |\n| P0 | [Equipping agents for the real world with Agent Skills](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fequipping-agents-for-the-real-world-with-agent-skills) | Anthropic | Skills; Agents | Agent Skills as modular capability packages: instructions, resources, scripts — reducing context burden and improving reliability. | 2025-10-16 |\n| P0 | [Agent Skills](https:\u002F\u002Fplatform.claude.com\u002Fdocs\u002Fen\u002Fagents-and-tools\u002Fagent-skills\u002Foverview) | Anthropic | Skills; Claude; Progressive disclosure | Official Claude Agent Skills docs: modular instructions, metadata, scripts, resources, and on-demand loading across Claude products. | Current docs |\n| P0 | [Skills](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Ftools-skills) | OpenAI | Skills; API; Shell environments | Official OpenAI API guide for uploading, managing, and attaching reusable Skills to hosted and local shell environments. 
| Current docs |\n| P0 | [Agent Skills Specification](https:\u002F\u002Fagentskills.io\u002Fspecification) | Agent Skills | Skills; Specification; Progressive disclosure | Complete skill package format: SKILL.md frontmatter, optional scripts\u002Freferences\u002Fassets, file references, and validation. | Current docs |\n| P0 | [Code execution with MCP: Building more efficient agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fcode-execution-with-mcp) | Anthropic | MCP; Code execution; Context | Key article on MCP scale challenges: reduce token overhead with code execution\u002Fon-demand tools; learn progressive disclosure. | 2025-11-04 |\n| P0 | [Introducing advanced tool use on Claude Developer Platform](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fadvanced-tool-use) | Anthropic | Tools; MCP; Advanced tool use | Tool search, deferred loading, programmatic tool calling; solving context pollution from large numbers of MCP tools. | 2025-11-24 |\n| P0 | [Effective harnesses for long-running agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Feffective-harnesses-for-long-running-agents) | Anthropic | Harness; Long-running agents | Essential harness reading: working across multiple context windows, task logging, external state, agent self-recovery. | 2025-11-26 |\n| P0 | [Demystifying evals for AI agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fdemystifying-evals-for-ai-agents) | Anthropic | Evals; Agents | Agent evals are more complex than static evals: multi-turn, tools, state changes, creative solutions, failure taxonomy. | 2026-01-09 |\n| P0 | [Measuring AI agent autonomy in practice](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fmeasuring-agent-autonomy) | Anthropic | Agents; Autonomy; Measurement | Quantify agent autonomy using metrics like task duration and supervision needs; ideal for building autonomy benchmarks. 
| 2026-02-18 |\n| P0 | [Harness design for long-running application development](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fharness-design-long-running-apps) | Anthropic | Harness; Application development | Harness design patterns for delegating long-running app development tasks to agents; compare with OpenAI Codex harness. | 2026-03-24 |\n| P0 | [Scaling Managed Agents: Decoupling the brain from the hands](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fmanaged-agents) | Anthropic | Managed agents; Harness | Decouple the model brain from execution hands\u002Fharness, keeping interfaces stable as the harness evolves. | 2026-04-08 |\n| P0 | [How we contain Claude across products](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fhow-we-contain-claude) | Anthropic | Safety; Containment; Agents | Blast radius of powerful agent releases, human-in-the-loop, and containment strategies. | 2026-05-25 |\n| P1 | [Structured Outputs for Multi-Agent Systems](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fstructured_outputs_multi_agent) | OpenAI | Agents; Multi-agent; Structured outputs | Use strict schemas to constrain structured messages and handoffs between multiple agents. | 2024-08-06 |\n| P1 | [Introducing computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002F3-5-models-and-computer-use) | Anthropic | Agents; Computer use | Claude computer use beta starting point: the model uses a computer via screenshots and actions. | 2024-10-22 |\n| P1 | [Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fswe-bench-sonnet) | Anthropic | Agents; Coding; Evals | SWE-bench agent scaffolding article: same model performance strongly depends on harness\u002Fscaffolding. 
| 2025-01-06 |\n| P1 | [Introducing Operator](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-operator\u002F) | OpenAI | Agents; Computer use; Safety | Early product form of browser-based agents: model clicks, types, and executes tasks on web pages, emphasizing user confirmation and safety boundaries. | 2025-01-23 |\n| P1 | [Computer-Using Agent](https:\u002F\u002Fopenai.com\u002Findex\u002Fcomputer-using-agent\u002F) | OpenAI | Agents; Computer use | Understand how CUA combines vision, mouse\u002Fkeyboard actions, and environment feedback into an agent loop; compare with Claude computer use. | 2025-01-23 |\n| P1 | [Claude 3.7 Sonnet and Claude Code](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fclaude-3-7-sonnet) | Anthropic | Agents; Coding; Claude Code | Early release of Claude Code, marking Claude's entry into the agentic coding tool space. | 2025-02-24 |\n| P1 | [The think tool: Enabling Claude to stop and think in complex tool use situations](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fclaude-think-tool) | Anthropic | Tools; Reasoning; Agents | Give the model an explicit think tool in complex tool-use chains; learn tool design for policy-heavy\u002Fmulti-step decisions. | 2025-03-20 |\n| P1 | [Evaluating Agents with Langfuse](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | Evals; Agents | Observe and evaluate Agents SDK runs with Langfuse; learn tracing\u002Feval workflows. | 2025-03-31 |\n| P1 | [Parallel Agents with the OpenAI Agents SDK](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fagents_sdk\u002Fparallel_agents) | OpenAI | Agents; Parallelism; Agents SDK | Parallel agent patterns: decompose tasks, execute in parallel, aggregate results. 
| 2025-05-01 |\n| P1 | [Multi-Agent Portfolio Collaboration with OpenAI Agents SDK](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fagents_sdk\u002Fmulti-agent-portfolio-collaboration\u002Fmulti_agent_portfolio_collaboration) | OpenAI | Agents; Multi-agent; Portfolio | Multi-agent collaboration business example: research, analysis, combined output. | 2025-05-28 |\n| P1 | [MCP-Powered Agentic Voice Framework](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | MCP; Voice; Agents | Voice agent + MCP paradigm: real-time interaction, tool extension, task execution. | 2025-06-17 |\n| P1 | [Deep Research API with the Agents SDK](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | Agents; Deep research; Agents SDK | Integrate Deep Research API into Agents SDK workflows. | 2025-06-25 |\n| P1 | [Desktop Extensions: One-click MCP server installation for Claude Desktop](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fdesktop-extensions) | Anthropic | MCP; Claude Desktop; Packaging | Package local MCP servers as one-click install extensions; learn MCP distribution\u002Finstallation\u002Flocal permission issues. | 2025-06-26 |\n| P1 | [Building a Supply-Chain Copilot with OpenAI Agent SDK and Databricks MCP Servers](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | MCP; Agents; Databricks | Enterprise data platform MCP + Agent SDK business agent example. | 2025-07-08 |\n| P1 | [Introducing ChatGPT agent: bridging research and action](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-chatgpt-agent\u002F) | OpenAI | Agents; ChatGPT; Computer use | End-user-facing ChatGPT agent: combining research, browser, computer use, file\u002Fslide capabilities. 
| 2025-07-17 |\n| P1 | [ChatGPT agent System Card](https:\u002F\u002Fopenai.com\u002Findex\u002Fchatgpt-agent-system-card\u002F) | OpenAI | Agents; Safety; Evals | Learn pre-launch risk classification, evaluation, permissions, human confirmation, and abuse prevention for agent products. | 2025-07-17 |\n| P1 | [Context Engineering - Short-Term Memory Management with Sessions](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | Context; Sessions; Agents | How short-term memory\u002Fsession state affects agent reliability. | 2025-09-09 |\n| P1 | [Introducing upgrades to Codex](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-upgrades-to-codex\u002F) | OpenAI | Agents; Coding; IDE | Codex evolves from research preview to daily dev tool: CLI, IDE, web\u002Fmobile collaboration, and more independent task execution. | 2025-09-15 |\n| P1 | [Introducing Claude Sonnet 4.5](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fclaude-sonnet-4-5) | Anthropic | Agents; Claude Agent SDK; Computer use | Sonnet 4.5 emphasizes coding, complex agents, computer use, with simultaneous Agent SDK launch. | 2025-09-29 |\n| P1 | [Introducing apps in ChatGPT and the new Apps SDK](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-apps-in-chatgpt\u002F) | OpenAI | MCP; Apps; ChatGPT | Apps SDK extends UI and tool server via MCP; entry point for understanding the ChatGPT app \u002F MCP app ecosystem. | 2025-10-06 |\n| P1 | [Build your ChatGPT UI](https:\u002F\u002Fdevelopers.openai.com\u002Fapps-sdk\u002Fbuild\u002Fchatgpt-ui) | OpenAI | MCP; Apps SDK; UI | Build custom UI components that turn structured MCP tool results into interactive ChatGPT app interfaces. | Current docs |\n| P1 | [Codex is now generally available](https:\u002F\u002Fopenai.com\u002Findex\u002Fcodex-now-generally-available\u002F) | OpenAI | Agents; Coding; Codex SDK | Codex GA, Slack integration, Codex SDK, admin tools; see how coding agents enter enterprise management. 
| 2025-10-06 |\n| P1 | [Using PLANS.md for multi-hour problem solving](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Farticles\u002Fcodex_exec_plans) | OpenAI | Codex; Long-running; Planning | ExecPlan files and cross-context task management for multi-hour coding-agent work. | 2025-10-07 |\n| P1 | [Beyond permission prompts: making Claude Code more secure and autonomous](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fbeyond-permission-prompts) | Anthropic | Safety; Permissions; Claude Code | From simple permission prompts to fine-grained security policies, reducing autonomous mode risk and interruptions. | 2025-10-20 |\n| P1 | [Introducing Aardvark: OpenAI's agentic security researcher](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-aardvark\u002F) | OpenAI | Agents; Security | Security-domain agent form: continuous scanning, issue verification, fix suggestions; later integrated as Codex Security. | 2025-10-30 |\n| P1 | [Build a coding agent with GPT 5.1](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | Agents; Coding | Build a coding agent from scratch: understand file editing, command execution, loops, and verification. | 2025-11-13 |\n| P1 | [OpenAI co-founds Agentic AI Foundation](https:\u002F\u002Fopenai.com\u002Findex\u002Fagentic-ai-foundation\u002F) | OpenAI | MCP; Standards; AGENTS.md | MCP, AGENTS.md, and agent standards enter the Linux Foundation\u002FAAIF context; understand ecosystem standardization. | 2025-12-09 |\n| P1 | [Donating MCP and establishing the Agentic AI Foundation](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fdonating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation) | Anthropic | MCP; Standards; AAIF | Anthropic donates MCP to Linux Foundation\u002FAAIF; read alongside OpenAI's AAIF article. 
| 2025-12-09 |\n| P1 | [Context Engineering for Personalization - Long-Term Memory Notes](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | Context; Long-term memory; Agents | How long-term memory serves as agent personalization\u002Fstate management. | 2026-01-05 |\n| P1 | [Supercharging Codex with JetBrains MCP at Skyscanner](https:\u002F\u002Fdevelopers.openai.com\u002Fblog\u002Fskyscanner-codex-jetbrains-mcp) | OpenAI | MCP; Codex; IDE | Real IDE\u002FMCP case study: how Codex CLI accesses IDE context and dev tools via JetBrains MCP. | 2026-01-11 |\n| P1 | [Designing AI-resistant technical evaluations](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002FAI-resistant-technical-evaluations) | Anthropic | Evals; Technical hiring | How strong agents continuously break technical evaluations; relevant to benchmark contamination prevention and eval design. | 2026-01-21 |\n| P1 | [Inside OpenAI's in-house data agent](https:\u002F\u002Fopenai.com\u002Findex\u002Finside-our-in-house-data-agent\u002F) | OpenAI | Agents; Data; Memory | Internal data agent case study: memory, Codex, data context, reliability; learn enterprise knowledge\u002Fdata agents. | 2026-01-29 |\n| P1 | [Introducing the Codex app](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-the-codex-app\u002F) | OpenAI | Agents; Coding; Multi-agent | Desktop command center for agents: multi-threaded\u002Fparallel long tasks, project-level agent workflows. | 2026-02-02 |\n| P1 | [Apple's Xcode now supports Claude Agent SDK](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fapple-xcode-claude-agent-sdk) | Anthropic | Claude Agent SDK; Xcode; MCP | Embed Claude Agent SDK in Xcode: harness, subagents, background tasks, plugins, MCP. 
| 2026-02-03 |\n| P1 | [Quantifying infrastructure noise in agentic coding evals](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Finfrastructure-noise) | Anthropic | Evals; Coding agents; Infrastructure | Environment configuration significantly impacts scores in agentic coding evals; control infrastructure noise in both production and benchmarks. | 2026-02-05 |\n| P1 | [Building a C compiler with a team of parallel Claudes](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fbuilding-c-compiler) | Anthropic | Multi-agent; Coding; Long-running | Parallel Claude teams completing large engineering tasks; learn multi-agent division of labor, coordination, and long-running execution. | 2026-02-05 |\n| P1 | [Codex Security: now in research preview](https:\u002F\u002Fopenai.com\u002Findex\u002Fcodex-security-now-in-research-preview\u002F) | OpenAI | Agents; Security; Codex | Productization of an agentic security researcher: vulnerability discovery, verification, fix suggestions, reducing triage noise. | 2026-03-06 |\n| P1 | [Eval awareness in Claude Opus 4.6's BrowseComp performance](https:\u002F\u002Fwww.anthropic.com\u002Fengineering) | Anthropic | Evals; Agent awareness | Risk of models recognizing\u002Fadapting to evaluations; relevant to agent benchmark credibility discussions. | 2026-03-06 |\n| P1 | [How we built Claude Code auto mode: a safer way to skip permissions](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fclaude-code-auto-mode) | Anthropic | Safety; Permissions; Autonomy | Claude Code auto mode risk classification, allow\u002Fblock rules, exception handling, and security testing. | 2026-03-25 |\n| P1 | [Migrate a Legacy Codebase with Sandbox Agents](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | Agents; Sandbox; Evals | Sandbox agent evaluation and execution patterns in large legacy code migrations. 
| 2026-04-07 |\n| P1 | [Codex for (almost) everything](https:\u002F\u002Fopenai.com\u002Findex\u002Fcodex-for-almost-everything\u002F) | OpenAI | Agents; Codex; MCP; Plugins | Codex app expanded to Windows\u002FmacOS, computer use, in-app browser, memory, plugins, MCP servers. | 2026-04-16 |\n| P1 | [Computer Use Agents in Daytona Sandboxes](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fagents_sdk\u002Fcomputer_use_with_daytona\u002Fcomputer_use_with_daytona) | OpenAI | Computer use; Sandbox; Agents | Computer-use agents and sandbox runtimes; compare with Operator\u002FCUA\u002FClaude computer use. | 2026-04-19 |\n| P1 | [Introducing workspace agents in ChatGPT](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-workspace-agents-in-chatgpt\u002F) | OpenAI | Agents; Workspace; Governance | Workspace agents: shared agents, permissions, tools, memory, safeguards; ideal for team collaboration agent design. | 2026-04-22 |\n| P1 | [Building workspace agents in ChatGPT to complete repeatable, end-to-end work](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | Workspace agents; ChatGPT | Practical workspace agents for repeatable end-to-end team workflows. | 2026-04-22 |\n| P1 | [Speeding up agentic workflows with WebSockets in the Responses API](https:\u002F\u002Fopenai.com\u002Findex\u002Fspeeding-up-agentic-workflows-with-websockets\u002F) | OpenAI | Agents; Latency; Responses API | Optimize latency by treating agentic rollouts as long-lived connections\u002Ftasks; learn production agent transport and caching. | 2026-05-01 |\n| P1 | [Agents for financial services](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Ffinance-agents) | Anthropic | Agents; Finance; MCP | Ten ready-to-run agent templates, Claude Code\u002FCowork plugins, Managed Agents cookbooks, MCP app. 
| 2026-05-05 |\n| P1 | [Migrate from the Claude Agent SDK to the OpenAI Agents SDK](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fagents_sdk\u002Fmigrate-from-claude-agent-sdk\u002Freadme) | OpenAI | Agents SDK; Migration | Compare Claude Agent SDK and OpenAI Agents SDK from a migration perspective; ideal for dual-stack learning. | 2026-05-07 |\n| P1 | [Building a safe, effective sandbox to enable Codex on Windows](https:\u002F\u002Fopenai.com\u002Findex\u002Fbuilding-codex-windows-sandbox\u002F) | OpenAI | Safety; Sandbox; Codex | Coding agent sandbox design on Windows: file access, network restrictions, approval tradeoffs. | 2026-05-13 |\n| P1 | [Building self-improving tax agents with Codex](https:\u002F\u002Fopenai.com\u002Findex\u002Fbuilding-self-improving-tax-agents-with-codex\u002F) | OpenAI | Agents; Evals; Self-improvement | Combine production traces, expert feedback, Codex loop, and eval infrastructure into self-improving business agents. | 2026-05-27 |\n| P1 | [SchemaFlow: Agentic Database Change Impact Analysis, SQL Generation, and Eval Guardrails](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Ftopic\u002Fagents) | OpenAI | Evals; SQL; Agent guardrails | Guardrail and eval-guardrail examples for data\u002FSQL agents. | 2026-06-05 |\n| P1 | [Agents SDK quickstart](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Fagents\u002Fquickstart) | OpenAI | Agents; SDK | Quickly build a minimal agent; understand the code patterns of run, tool, and handoff. | Current docs |\n| P1 | [OpenAI Agents SDK examples](https:\u002F\u002Fopenai.github.io\u002Fopenai-agents-python\u002Fexamples\u002F) | OpenAI | Agents SDK; Patterns; Examples | Practical examples for agent patterns, MCP, memory, guardrails, approvals, handoffs, and streaming. 
| Current docs |\n| P1 | [MCP Apps compatibility in ChatGPT](https:\u002F\u002Fdevelopers.openai.com\u002Fapps-sdk\u002Fmcp-apps-in-chatgpt) | OpenAI | MCP; Apps SDK; UI | Understand MCP Apps UI standards, iframe\u002Fbridge, and compatibility between ChatGPT and other hosts. | Current docs |\n| P1 | [Use Codex with the Agents SDK](https:\u002F\u002Fdevelopers.openai.com\u002Fcodex\u002Fguides\u002Fagents-sdk) | OpenAI | MCP; Codex; Agents SDK | Use Codex as an MCP server for other agents to call; ideal for multi-agent dev workflows. | Current docs |\n| P1 | [Agent approvals and security - Codex](https:\u002F\u002Fdevelopers.openai.com\u002Fcodex\u002Fagent-approvals-security) | OpenAI | Safety; Approvals; Codex | Official reference for Codex approval modes, sandbox, network access; read alongside OpenAI\u002FAnthropic safety articles. | Current docs |\n| P1 | [Agent Skills - Codex](https:\u002F\u002Fdevelopers.openai.com\u002Fcodex\u002Fskills) | OpenAI | Codex; Skills; Plugins | Skills\u002FPlugins as reusable workflow packages; compare with Anthropic Agent Skills. | Current docs |\n| P1 | [Skills in OpenAI API](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Fskills_in_api) | OpenAI | Skills; OpenAI API | Cookbook example for using Skills in the OpenAI API and connecting skill bundles to agent workflows. | Current docs |\n| P1 | [Custom instructions with AGENTS.md - Codex](https:\u002F\u002Fdevelopers.openai.com\u002Fcodex\u002Fguides\u002Fagents-md) | OpenAI | AGENTS.md; Context | How AGENTS.md provides persistent project specifications for agents; establish repo-level agent contracts. | Current docs |\n| P1 | [Agents SDK integrations and observability](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Fagents\u002Fintegrations-observability) | OpenAI | Observability; MCP; Tracing | Tracing, MCP integration, provider\u002Fobservability; essential for production agent debugging. 
| Current docs |\n| P1 | [Secure MCP Tunnel](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Fsecure-mcp-tunnels) | OpenAI | MCP; Security; Private tools | Securely expose private\u002Fintranet MCP servers to supported OpenAI surfaces; ideal for enterprise deployment. | Current docs |\n| P1 | [How Claude Code works](https:\u002F\u002Fcode.claude.com\u002Fdocs\u002Fen\u002Fhow-claude-code-works) | Anthropic | Claude Code; Agentic loop; Harness | Under-the-hood architecture of Claude Code: the agentic loop (gather context → act → verify), built-in tool categories, context window management, and extension points. | Current docs |\n| P0 | [learn-claude-code](https:\u002F\u002Fgithub.com\u002FshareAI-lab\u002Flearn-claude-code) | Community | Harness; Agent loop; Tools; Context | Hands-on 20-lesson tutorial building a Claude Code–like agent harness from scratch: agent loop, tool integration, context compaction, multi-agent coordination, permissions, MCP plugins. | 2026 |\n| P0 | [Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems](https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.14228) | Academic | Agent architecture; Claude Code; Design space | Deep technical analysis of Claude Code's architecture: agentic loop, permission system, context compaction, extensibility (MCP\u002Fplugins\u002Fskills\u002Fhooks), subagent delegation, and comparison with open-source alternatives. | 2026-04-14 |\n| P0 | [Function Calling](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Fdocs\u002Fguides\u002Ffunction-calling) | OpenAI | Tools; Function calling; API | Official guide to function\u002Ftool calling: define functions with JSON schemas, handle model tool calls, execute and return results. 
| Current docs |\n| P0 | [Tool use overview](https:\u002F\u002Fplatform.claude.com\u002Fdocs\u002Fen\u002Fagents-and-tools\u002Ftool-use\u002Foverview) | Anthropic | Tools; Tool use; API | Connect Claude to external tools and APIs: client vs server tools, the agentic loop, strict schema conformance, and when Claude decides to call tools. | Current docs |\n| P0 | [Function calling - Gemini API](https:\u002F\u002Fai.google.dev\u002Fgemini-api\u002Fdocs\u002Ffunction-calling) | Google | Tools; Function calling; API | Enable Gemini models to connect with external tools via function calling: single-turn, multi-turn, parallel, and sequential function chains. | Current docs |\n| P2 | [Orchestrating Agents: Routines and Handoffs (archived)](https:\u002F\u002Fdevelopers.openai.com\u002Fcookbook\u002Fexamples\u002Forchestrating_agents) | OpenAI | Agents; Handoffs; Orchestration | Historical cookbook for routines and handoffs; useful conceptually, but archived and not the current recommended implementation path. | 2024-10-10 |\n| P2 | [Introducing Contextual Retrieval](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fcontextual-retrieval) | Anthropic | Context; Retrieval; RAG | Not agent-specific, but important for agent RAG\u002Fcontext: prepend context to chunks before retrieval to improve recall. | 2024-09-19 |\n| P2 | [Developing a computer use model](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fdeveloping-computer-use) | Anthropic | Computer use; Agents | More technical explanation of how the computer-use model moves the mouse, clicks, types, and reads screen feedback. | 2024-10-22 |\n| P2 | [Introducing Claude 4](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fclaude-4) | Anthropic | Agents; Coding; Long-running | Overview of Claude Opus\u002FSonnet 4 capabilities: coding, advanced reasoning, agent workflows. 
| 2025-05-22 |\n| P2 | [Claude for Financial Services](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fclaude-for-financial-services) | Anthropic | Agents; Connectors; Finance | Vertical industry agent\u002Fconnector productization case; understand data, permissions, and tool integration in finance. | 2025-07-15 |\n| P2 | [Advancing Claude for Financial Services](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fadvancing-claude-for-financial-services) | Anthropic | Agents; Skills; Finance | Claude for Excel, real-time data connectors, pre-built Agent Skills for vertical industry productization. | 2025-10-27 |\n| P2 | [Introducing GPT-5.3-Codex](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-gpt-5-3-codex\u002F) | OpenAI | Agents; Coding model; Evals | Codex-native model and long-running coding\u002Fterminal\u002Fagentic benchmarks; understand how model capabilities serve the harness. | 2026-02-05 |\n| P2 | [Introducing OpenAI Frontier](https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-openai-frontier\u002F) | OpenAI | Agents; Enterprise; Governance | Enterprise AI coworker\u002Fagent platform: shared context, onboarding, permissions, guardrails, governance. | 2026-02-10 |\n| P2 | [Introducing Claude Sonnet 4.6](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fclaude-sonnet-4-6) | Anthropic | Agents; Planning; Computer use | Sonnet 4.6 emphasizes coding, computer use, long-context reasoning, agent planning. | 2026-02-17 |\n| P2 | [Introducing Claude Opus 4.6](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fclaude-opus-4-6) | Anthropic | Agents; Long-running; Tool use | Model release perspective on long-running tasks, agentic harness, subagents, and tool call capabilities. 
| 2026-02-25 |\n| P2 | [Introducing Claude Opus 4.7](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fclaude-opus-4-7) | Anthropic | Agents; Long-running; Coding | Stronger software engineering and long-running task performance; track how model capabilities impact agent workloads. | 2026-04-16 |\n| P2 | [An update on recent Claude Code quality reports](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fapril-23-postmortem) | Anthropic | Reliability; Claude Code; Agent SDK | Postmortem on Claude Code\u002FAgent SDK quality regression; learn agent product operations and regression control. | 2026-04-23 |\n| P2 | [Introducing Claude Opus 4.8](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fclaude-opus-4-8) | Anthropic | Agents; Dynamic workflows; Long-running | Dynamic workflows, hundreds of parallel subagents, long-running agentic tasks — latest model\u002Fproduct direction. | 2026-05-28 |\n| P2 | [Codex for every role, tool, and workflow](https:\u002F\u002Fopenai.com\u002Findex\u002Fcodex-for-every-role-tool-workflow\u002F) | OpenAI | Agents; Codex; Plugins | Codex expands from development to knowledge work: role-specific plugins, Sites, annotations, parallel workflows. | 2026-06-02 |\n| P2 | [Codex is becoming a productivity tool for everyone](https:\u002F\u002Fopenai.com\u002Findex\u002Fcodex-for-knowledge-work\u002F) | OpenAI | Agents; Knowledge work | Usage data shows how non-developers use Codex for reports, spreadsheets, research, automation, and lightweight tools. | 2026-06-02 |\n| P2 | [OpenAI Docs MCP](https:\u002F\u002Fdevelopers.openai.com\u002Flearn\u002Fdocs-mcp) | OpenAI | MCP; Docs; Context | Official OpenAI docs MCP server; connect docs directly to local agents\u002FIDEs. | Current docs |\n| P2 | [Codex SDK](https:\u002F\u002Fdevelopers.openai.com\u002Fcodex\u002Fsdk) | OpenAI | Codex SDK; Automation | Programmatically control Codex in CI\u002FCD or internal tools; embed coding agents into existing workflows. 
| Current docs |\n| P2 | [When AI builds itself](https:\u002F\u002Fwww.anthropic.com\u002Finstitute\u002Frecursive-self-improvement) | Anthropic | Agents; Recursive self-improvement; Safety | How AI systems accelerate their own development through recursive self-improvement; three possible futures and the need for verifiable coordination. | 2026-05 |\n\n---\n\n## Who Is This For?\n\n- AI Engineers\n- Agent Engineers\n- LLM Engineers\n- Platform Engineers\n- Research Engineers\n- AI Startup Founders\n\n---\n\n## Contributing\n\nContributions are welcome. If you find:\n\n- New OpenAI resources\n- New Anthropic resources\n- MCP updates\n- Agent evaluation frameworks\n- Production engineering articles\n\nPlease open a pull request.\n\n---\n\n## Vision\n\n> The goal of this project is to become the **System Design Primer** for Agentic Engineering.\n\nIf you're serious about building production AI agents, start here.\n\n---\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=keyuchen21\u002Fagentic-engineering-handbook&type=Date)](https:\u002F\u002Fstar-history.com\u002F#keyuchen21\u002Fagentic-engineering-handbook&Date)\n\n---\n\n## License\n\n[MIT](LICENSE)\n","2026-06-15 02:30:02","CREATED_QUERY"]