[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-84027":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":9,"languages":9,"totalLinesOfCode":9,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":14,"stars7d":15,"stars30d":15,"stars90d":13,"forks30d":13,"starsTrendScore":16,"compositeScore":17,"rankGlobal":9,"rankLanguage":9,"license":9,"archived":18,"fork":18,"defaultBranch":19,"hasWiki":20,"hasPages":18,"topics":21,"createdAt":9,"pushedAt":9,"updatedAt":22,"readmeContent":23,"aiSummary":9,"trendingCount":13,"starSnapshotCount":13,"syncStatus":24,"lastSyncTime":25,"discoverSource":26},84027,"start-ai-engineering","louisfb01\u002Fstart-ai-engineering","louisfb01","A complete guide to start and improve in AI engineering in 2026 without ANY background in the field and stay up-to-date with the latest news and state-of-the-art techniques!",null,137,17,3,0,6,29,41,70.17,false,"main",true,[],"2026-06-12 04:01:42","# Start AI Engineering in 2026 - Build real AI systems, mostly for free!\n\n## A complete guide to start and improve in AI engineering in 2026 without ANY background in the field and stay up-to-date with the latest news and state-of-the-art techniques!\n\nThis guide is intended for anyone with zero or a small background in programming, AI, or machine learning who wants to become a strong AI engineer in 2026. It is organized by how you like to learn: videos, articles, books, docs, courses, and real projects.\n\nThere is no single correct order to follow, but a classic path is from top to bottom. If you dislike books, skip them. If you do not want to follow an online course, skip that too. With enough motivation, projects, and repetition, you can absolutely learn this field.\n\nMost resources listed here are free. 
Paid resources are clearly labelled, and some paid course and book links are affiliate links that support this guide at no extra cost to you. Thank you, and have fun learning!\n\nDon't be afraid to repeat videos, learn from multiple sources, and build messy projects. Repetition and debugging are where the real learning happens.\n\nMaintainer: [louisfb01](https:\u002F\u002Fgithub.com\u002Flouisfb01), also active on [YouTube](https:\u002F\u002Fwww.youtube.com\u002F@whatsai), [the What's AI Podcast](https:\u002F\u002Fwww.louisbouchard.ai\u002Fpodcast\u002F), and [my personal newsletter](https:\u002F\u002Flouisbouchard.substack.com\u002F) if you want to see and hear more about AI.\n\n[![X: @Whats_AI](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FX-@Whats_AI-000000?logo=x&logoColor=white)](https:\u002F\u002Fx.com\u002FWhats_AI)\n[![LinkedIn: Louis-François Bouchard](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLinkedIn-Louis--Fran%C3%A7ois%20Bouchard-0A66C2?logo=linkedin&logoColor=white)](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fwhats-ai\u002F)\n[![YouTube: What's AI](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FYouTube-What's%20AI-FF0000?logo=youtube&logoColor=white)](https:\u002F\u002Fwww.youtube.com\u002F@WhatsAI)\n\n***Tag Louis-François Bouchard on [X](https:\u002F\u002Fx.com\u002FWhats_AI) or [LinkedIn](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fwhats-ai\u002F) if you share this guide, and feel free to suggest additions through pull requests.***\n\n**If this guide helps you, please star the repo and share it. That is the main way other builders find it.**\n\n### Want to know what this guide is about? 
Start with this video:\n\n[\u003Cimg src=\"assets\u002Fai-engineering-foundations.webp\" width=\"512\"\u002F>](https:\u002F\u002Fyoutu.be\u002FljOwBCdiHmg)\n\nWatch [AI Engineering Foundations: What Developers Actually Need to Know Today](https:\u002F\u002Fyoutu.be\u002FljOwBCdiHmg) first, then [subscribe to What's AI](https:\u002F\u002Fwww.youtube.com\u002Fc\u002FWhatsAI?sub_confirmation=1) for more AI engineering videos.\n\n*This guide is updated throughout 2026 as the stack moves.*\n\n----\n\n## Table of Contents\n\n* [Prerequisites and learning path](#prerequisites)\n* [Start with short YouTube and video introductions](#youtubevideos)\n* [Books and long-form reading](#readers)\n* [Online courses](#courses)\n* [Practice and projects](#practice)\n* [Prompting and structured outputs](#prompting)\n* [Reasoning models and test-time compute](#reasoning)\n* [Context engineering and long context](#context)\n* [Retrieval-Augmented Generation (RAG)](#rag)\n* [Embeddings, rerankers, and vector databases](#vectors)\n* [Tools, MCP, and computer use](#tools)\n* [Workflows, agents, and multi-agent systems](#agents)\n* [Evaluations, observability, and harnesses](#evals)\n* [Fine-tuning and data curation](#finetuning)\n* [Multimodal and document understanding](#multimodal)\n* [Voice agents and realtime AI](#voice)\n* [Deployment, inference, and open-weight models](#deployment)\n* [AI coding agents and developer tools](#codingagents)\n* [AI safety, security, and guardrails](#aiethics)\n* [Communities, subreddits, and Discords](#communities)\n* [Newsletters, podcasts, and blogs](#moreresources)\n* [People to follow](#peopletofollow)\n* [How to find an AI engineering job](#findajob)\n* [Learn more and build more with AI](#domore)\n\n----\n\n## Prerequisites and learning path\u003Ca name=\"prerequisites\">\u003C\u002Fa>\n\nBefore you start collecting resources, keep the goal clear: this guide is for becoming a better AI engineer, not merely a better agentic coder.\n\n### Quick LLM 
and coding-agent warning\n\nCoding agents like Codex, Claude Code, Cursor, and similar tools can write code, scaffold apps, and speed up almost every step. You should use them. But AI engineering is the judgment layer behind the work: deciding what to build, what architecture fits, how to evaluate it, where it will fail, and whether it is reliable enough to ship.\n\nThis guide is not about outsourcing your thinking to an agent. It is about using those tools while building the foundations, taste, and decision-making ability to become a true AI engineer.\n\n### What AI engineering means in 2026\n\nIn 2026, AI engineering goes well past prompting. You need context engineering, Retrieval-Augmented Generation (RAG), tools and the Model Context Protocol (MCP), workflow and agent design, evaluations, observability, harnesses, deployment, security, and a working understanding of reasoning models.\n\nThat is also why this guide, and our courses, prioritize learning by building. I learned AI engineering by building, and I now interview and hire AI engineers for consulting work at Towards AI, so this guide is biased toward the decision-making skills I actually look for. You can learn a lot alone with coding agents, but structure and expert feedback help you turn projects into true expertise instead of a pile of fragile demos.\n\n### Suggested learning path\n\nThere is no single correct order. If you want a default path, I would do this:\n\n1. Watch a few foundational videos to pick up vocabulary and intuition.\n2. Pick one free course and one framework whose docs you commit to reading end to end.\n3. Pick one or two books to build a solid foundation you can return to when the tools change.\n4. Optionally take one or two advanced applied courses with real projects, especially if you want a structured path before breaking things on your own.\n5. Build two or three small but real projects that break in interesting ways.\n6. 
Add evaluations, tracing, and deployment before you call anything production-ready.\n\nAfter that, you should have the foundations of a solid AI engineer ready for many entry-level or transition roles. Most importantly, keep learning and keep an open mind. This field changes fast, and the best engineers stay curious instead of getting religious about one model, framework, or workflow.\n\n### Difficulty guide\n\nResources use compact markers from 1️⃣ to 🔟. 1️⃣ means absolute beginner, like an intro Python course; 3️⃣ is beginner-friendly AI vocabulary; 5️⃣ is practical builder material you can apply in a project; 7️⃣ is production engineering depth; 9️⃣ is advanced systems or research; and 🔟 is the kind of senior-level paper or technique you may want to revisit after you have shipped a few systems. Lower numbers first, scars later.\n\n### Personalize this roadmap with an AI agent\n\nYou can use this guide with your favorite AI agent. Paste the prompt below into Codex, Claude Code, ChatGPT, Cursor, or another assistant, then tell it how you like to learn:\n\n```text\nUse this repo as my AI engineering roadmap: https:\u002F\u002Fgithub.com\u002Flouisfb01\u002Fstart-ai-engineering\n\nCreate a personalized learning plan for me. First ask about my background, coding level, available time, budget, preferred learning style, and goals. Then choose the most relevant resources from the repo, explain why you picked them, order them from easiest to hardest, and turn them into a weekly plan with projects, checkpoints, and what I should be able to build after each stage.\n```\n\n### If you are brand new to code\n\n* 1️⃣ [Learn Python](https:\u002F\u002Fwww.learnpython.org\u002F) - Free interactive tutorial to learn Python fundamentals if you have never touched the language.\n* 1️⃣ [AI Python for Beginners](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fai-python-for-beginners\u002F) - DeepLearning.AI. 
Free short course from Andrew Ng's team, lighter on-ramp than a full bootcamp.\n* 2️⃣ [Python Fundamentals + CS Concepts — A One-Stop Starter Class](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=_uRb5wlFhyw&list=PLO4GrDnQanVfCtcyuJn6zZpwGgoNkAYFp) - Louis-François Bouchard, What's AI. Free playlist covering Python fundamentals and core computer science concepts in one place. The right starting point if you want a single resource before jumping into LLM development.\n* 2️⃣ [Beginner Python for AI Engineering](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fpython-for-genai?ref=1f9b29) - Towards AI. An LLM-native Python course for people who want to go straight to building with LLMs, not through six months of classical scripting first. *(Paid, $149)*\n\nIf you already know some Python, you can jump into the rest of this guide. You do not need a mathematics PhD or deep research background. You do need basic Python, comfort reading docs, willingness to debug messy systems, and enough curiosity to build things that break. The last point matters more than people expect.\n\n----\n\n## Start with short YouTube and video introductions\u003Ca name=\"youtubevideos\">\u003C\u002Fa>\n\nVideo is still the fastest way to pick up vocabulary and mental models.\n\n### Start here for AI engineering judgment\n\n* 4️⃣ [AI Engineering Foundations: What Developers Actually Need to Know Today](https:\u002F\u002Fyoutu.be\u002FljOwBCdiHmg) - Louis-François Bouchard. A one-hour webinar on what AI engineers need to know today: how LLMs work, their limitations, when to use prompting, RAG, workflows, or agents, and why evaluations and security matter before production.\n\n### Foundational explainer videos\n\n* 2️⃣ [How AI Works in Super Simple Terms](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=q-BiW5srMFQ) - StatQuest with Josh Starmer. The gentlest possible on-ramp: how AI like ChatGPT works explained through a super simple example with no heavy math. 
Start here if any of the other foundational videos feel overwhelming.\n* 2️⃣ [Mastering AI Jargon - Your Guide to OpenAI & LLM Terms](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=q4G6X09NEu4) - Louis-François Bouchard. A practical glossary for the terms you keep seeing around OpenAI, GPT, LLMs, prompting, and generative AI.\n* 3️⃣ [Intro to Large Language Models](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=zjkBMFhNj_g) - Andrej Karpathy. One hour. Still the cleanest high-level tour of what an LLM is and how it works.\n* 4️⃣ [AI Fundamentals for Builders - Understand transformers and fix LLM limitations](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=R5_udqy1L4s) - Louis-François Bouchard. A builder-focused session on transformer intuition, common LLM limitations, and the techniques used to work around them.\n* 5️⃣ [A Hackers' Guide to Language Models](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=jkrNMKz9pWU) - Jeremy Howard, fast.ai. 90 minutes, practical and builder-oriented, assumes you can code.\n* 6️⃣ [Deep Dive into LLMs like ChatGPT](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=7xTGNNLPyMI) - Andrej Karpathy. 2025. Three and a half hours covering the full LLM training and inference stack, free. The single best investment if you only watch one long video this year.\n\n### YouTube channels worth subscribing to\n\n* 2️⃣ [StatQuest with Josh Starmer](https:\u002F\u002Fwww.youtube.com\u002F@statquest) - Josh Starmer. The clearest visual explanations of ML and neural network concepts on YouTube. Ideal for building solid intuition about how transformers, attention, and training actually work before you start building.\n* 3️⃣ [3Blue1Brown](https:\u002F\u002Fwww.youtube.com\u002F@3blue1brown) - Grant Sanderson. Visual math and deep learning intuition. 
The neural networks and attention series are widely considered the best visual introductions to these concepts.\n* 3️⃣ [DeepLearning.AI](https:\u002F\u002Fwww.youtube.com\u002F@Deeplearningai) - Andrew Ng's official channel. Free recorded short courses on prompting, RAG, agents, evals, and more. Most of the DeepLearning.AI short courses land here first.\n* 3️⃣ [IBM Technology](https:\u002F\u002Fwww.youtube.com\u002F@IBMTechnology) - Clear concept explainers on LLMs, RAG, agents, and enterprise AI. Good for quickly getting up to speed on a new concept with no background noise.\n* 3️⃣ [Tech With Tim](https:\u002F\u002Fwww.youtube.com\u002F@TechWithTim) - Tim Ruscica. 1.89M subscribers. Beginner-to-intermediate coding and AI projects in Python. Strong for learners who want to build working things (AI games, assistants, chatbots, small ML projects) alongside the theory.\n* 4️⃣ [What's AI](https:\u002F\u002Fwww.youtube.com\u002F@WhatsAI) - Practical AI engineering explainers from Louis-François Bouchard. Useful for RAG, agents, MCP, evals, and learning how to reason about the stack instead of only chasing tools.\n* 4️⃣ [Hugging Face](https:\u002F\u002Fwww.youtube.com\u002F@HuggingFace) - Official tutorials across the open-source AI ecosystem. Covers fine-tuning, inference, datasets, and new model releases.\n* 5️⃣ [LangChain](https:\u002F\u002Fwww.youtube.com\u002F@LangChain) - Official channel for LangChain and LangGraph. Tutorial-first videos on agents, workflows, and graph-based orchestration.\n* 5️⃣ [Jeremy Howard](https:\u002F\u002Fwww.youtube.com\u002F@howardjeremyp) - fast.ai co-founder. Practical, builder-oriented, strong on software craft and AI-assisted coding.\n* 5️⃣ [Two Minute Papers](https:\u002F\u002Fwww.youtube.com\u002F@TwoMinutePapers) - Károly Zsolnai-Fehér. Short, enthusiastic summaries of AI research papers. 
Good for staying aware of what is being published without reading every paper.\n* 5️⃣ [Bycloud](https:\u002F\u002Fwww.youtube.com\u002F@bycloudAI) - Weekly video essays on AI news and research, aimed at builders.\n* 6️⃣ [Andrej Karpathy](https:\u002F\u002Fwww.youtube.com\u002F@AndrejKarpathy) - Former Tesla AI and OpenAI. Best long-form explanations of how LLMs actually work, with essential mental models for anyone building on top of them.\n* 7️⃣ [Umar Jamil](https:\u002F\u002Fwww.youtube.com\u002F@umarjamilai) - Line-by-line implementations of transformers, vision-language models, and LoRA. Strong for understanding what is happening inside a model when you are debugging or fine-tuning.\n* 8️⃣ [Yannic Kilcher](https:\u002F\u002Fwww.youtube.com\u002F@YannicKilcher) - In-depth walkthroughs of new research papers. Essential for staying current with model releases and understanding what papers actually claim vs. what they prove.\n\nPodcasts and longer listening are collected in the Newsletters, podcasts, and blogs section below.\n\n----\n\n## Books and long-form reading\u003Ca name=\"readers\">\u003C\u002Fa>\n\nIf you prefer reading to watching, this path can take you very far, especially since these books focus on actual coding and building.\n\n### Books worth your time\n\n* 5️⃣ [Building LLMs for Production](https:\u002F\u002Famzn.to\u002F4dZ0Mtz) - Towards AI. 465 pages covering prompting, RAG, fine-tuning, reliability, and shipping. Used as an internal reference manual in many companies. The [Academy e-book version](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fbuildingllmsforproduction?ref=1f9b29) is also available. *(Paid, $29 e-book)*\n* 5️⃣ [Hands-On Large Language Models](https:\u002F\u002Fwww.oreilly.com\u002Flibrary\u002Fview\u002Fhands-on-large-language\u002F9781098150952\u002F) - Jay Alammar and Maarten Grootendorst. Visual, code-first companion that pairs well with Chip Huyen's book. 
*(Paid)*\n* 5️⃣ [Prompt Engineering for LLMs](https:\u002F\u002Fwww.oreilly.com\u002Flibrary\u002Fview\u002Fprompt-engineering-for\u002F9781098156145\u002F) - John Berryman and Albert Ziegler. Written by GitHub Copilot engineers, with useful field-tested patterns. *(Paid)*\n* 6️⃣ [LLM Engineer's Handbook](https:\u002F\u002Famzn.to\u002F4x5pakJ) - Paul Iusztin and Maxime Labonne. Production-focused, built around a real end-to-end project. Pairs with the companion [code repo](https:\u002F\u002Fgithub.com\u002FPacktPublishing\u002FLLM-Engineers-Handbook). *(Paid)*\n* 7️⃣ [AI Engineering](https:\u002F\u002Fwww.oreilly.com\u002Flibrary\u002Fview\u002Fai-engineering\u002F9781098166298\u002F) - Chip Huyen. The most-read book on O'Reilly for this space. Strong on system design, evaluation, and when each technique earns its place. *(Paid)*\n* 8️⃣ [Build a Large Language Model (From Scratch)](https:\u002F\u002Fwww.manning.com\u002Fbooks\u002Fbuild-a-large-language-model-from-scratch) - Sebastian Raschka. Foundations and intuition. Code a GPT-style LLM from scratch in PyTorch, no libraries that hide the internals. The right book for developers who want to move past calling APIs and actually understand transformers, tokenization, attention, and fine-tuning. Pairs with the companion [LLMs-from-scratch repo](https:\u002F\u002Fgithub.com\u002Frasbt\u002FLLMs-from-scratch). *(Paid)*\n\n### Free long-form explainers that still hold up\n\n* 4️⃣ [The Illustrated Transformer](https:\u002F\u002Fjalammar.github.io\u002Fillustrated-transformer\u002F) - Jay Alammar. The classic visual reference for the transformer architecture. Worth having open when reading about attention, tokenization, or embedding layers.\n* 5️⃣ [Prompt Engineering](https:\u002F\u002Flilianweng.github.io\u002Fposts\u002F2023-03-15-prompt-engineering\u002F) - Lilian Weng. 
The cleanest overview of prompting techniques from a research perspective.\n* 5️⃣ [Patterns for Building LLM-based Systems & Products](https:\u002F\u002Feugeneyan.com\u002Fwriting\u002Fllm-patterns\u002F) - Eugene Yan. Seven patterns that almost every shipped LLM product ends up using.\n* 6️⃣ [The Illustrated Retrieval Transformer](https:\u002F\u002Fjalammar.github.io\u002Fillustrated-retrieval-transformer\u002F) - Jay Alammar. Useful for intuition on how retrieval-style architectures differ from pure decoder-only models.\n* 7️⃣ [LLM Powered Autonomous Agents](https:\u002F\u002Flilianweng.github.io\u002Fposts\u002F2023-06-23-agent\u002F) - Lilian Weng, OpenAI. Still the reference post on agent design, planning, memory, and tool use.\n* 7️⃣ [The State of LLMs 2025](https:\u002F\u002Fmagazine.sebastianraschka.com\u002Fp\u002Fstate-of-llms-2025) - Sebastian Raschka's year-end synthesis of how the stack actually moved.\n* 8️⃣ [Why We Think](https:\u002F\u002Flilianweng.github.io\u002Fposts\u002F2025-05-01-thinking\u002F) - Lilian Weng on test-time compute and why reasoning models work.\n\n### Essential 2025-2026 articles on AI engineering\n\nA curated short list of valuable long-form articles from 2025-2026. All are substantial reads (10+ minutes) that reward a full sitting. Topic-specific articles are in their respective sections below.\n\n* 4️⃣ [Here's how I use LLMs to help me write code](https:\u002F\u002Fsimonwillison.net\u002F2025\u002FMar\u002F11\u002Fusing-llms-for-code\u002F) - Simon Willison's personal workflow, written for other practitioners. The most-shared write-up on actually working with coding agents.\n* 5️⃣ [Your AI Product Needs Evals](https:\u002F\u002Fhamel.dev\u002Fblog\u002Fposts\u002Fevals\u002F) - Hamel Husain. The canonical starting point for why evals matter and how to begin.\n* 6️⃣ [A Field Guide to Rapidly Improving AI Products](https:\u002F\u002Fhamel.dev\u002Fblog\u002Fposts\u002Ffield-guide\u002F) - Hamel Husain. 
An end-to-end playbook for going from \"it kinda works\" to a real product. Pairs evals with error analysis and data flywheels. The single best article on *improving* an AI product once it exists.\n* 6️⃣ [Building Effective AI Agents](https:\u002F\u002Fwww.anthropic.com\u002Fresearch\u002Fbuilding-effective-agents) - Anthropic. The reference post on when to use a workflow and when autonomy actually pays its way. Widely treated as required reading.\n* 6️⃣ [Harness Engineering: The Missing Layer Behind AI Agents](https:\u002F\u002Fwww.louisbouchard.ai\u002Fharness-engineering\u002F) - Louis-François Bouchard. The layer between prompt engineering and a working agent: tools, permissions, state, retries, checkpoints, guardrails, and evals. Explains why harnesses, not models, separate demos from products.\n* 6️⃣ [Agents](https:\u002F\u002Fhuyenchip.com\u002F2025\u002F01\u002F07\u002Fagents.html) - Chip Huyen. A long-form primer on agent design, planning, and tool use. One of the most-shared agent posts of 2025.\n* 7️⃣ [Context Engineering for LLMs: Build Reliable, Production-Ready RAG Systems](https:\u002F\u002Fpub.towardsai.net\u002Fcontext-engineering-4a17018c41cf) - A full walkthrough of chunking, hybrid retrieval (BM25 + dense), reranking, and token budgeting. Practical enough to take a RAG prototype to production.\n* 7️⃣ [Effective harnesses for long-running agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Feffective-harnesses-for-long-running-agents) - Anthropic. 
Scaffolding for hour-long agent runs: checkpoints, state, and recovery patterns.\n* 7️⃣ [Agent Observability and Evaluation: A 2026 Developer's Guide](https:\u002F\u002Fpub.towardsai.net\u002Fagent-observability-and-evaluation-a-2026-developers-guide-to-building-reliable-ai-agents-f4547e4beb14) - Divy Yadav's long-form piece on why most teams still have no evals, what to instrument first, and how to close the feedback loop between traces and fixes.\n* 7️⃣ [12-Factor Agents](https:\u002F\u002Fgithub.com\u002Fhumanlayer\u002F12-factor-agents) - Dex Horthy. Widely cited production-agent checklist covering state, tools, context, and reliability. Referenced across most 2025-2026 agent engineering discussions.\n* 7️⃣ [Systematically Improving RAG](https:\u002F\u002Fjxnl.co\u002Fwriting\u002F2024\u002F05\u002F22\u002Fsystematically-improving-your-rag\u002F) - Jason Liu. A disciplined iteration playbook for RAG, from evals to metadata to user feedback loops. Still the reference piece for RAG consultants.\n* 8️⃣ [How we built our multi-agent research system](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fbuilt-multi-agent-research-system) - Anthropic. Real architecture behind a shipped multi-agent product, including the tradeoffs and failure modes you only see in production.\n* 8️⃣ [The Lethal Trifecta for AI Agents](https:\u002F\u002Fsimonwillison.net\u002F2025\u002FJun\u002F16\u002Fthe-lethal-trifecta\u002F) - Simon Willison. Private data, untrusted content, and external communication: the combination every agent builder needs to internalize before shipping.\n* 8️⃣ [How to Fine-Tune LLMs in 2025 with Hugging Face](https:\u002F\u002Fwww.philschmid.de\u002Ffine-tune-llms-in-2025) - Philipp Schmid. The single best recent how-to on modern fine-tuning.\n\nArticles from Anthropic, OpenAI, and individual practitioners (Shreya Shankar, Paul Iusztin, and others) are also referenced in the topic-specific sections below. 
Start with the topic you care about most and work outward.\n\nFor ongoing reading, rotate between practitioner blogs, official engineering posts, the [Towards AI publication on Medium](https:\u002F\u002Fpub.towardsai.net\u002F), and the [Towards AI Newsletter](https:\u002F\u002Fnewsletter.towardsai.net\u002F) instead of relying on one source.\n\n### A reading loop that actually works\n\nA common mistake is reading ten articles on the same topic and building nothing. A better loop is: read one conceptual article, read one official docs page, build one tiny version yourself, then reread the article once you have scars. The second pass hits very differently.\n\n----\n\n## Online courses\u003Ca name=\"courses\">\u003C\u002Fa>\n\nIf you want more structure, courses are the fastest route through this material.\n\n### Deep, end-to-end programs\n\n* 2️⃣ [AI for Work](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fai-business-professionals?ref=1f9b29) - Towards AI. 15 modules for non-developers who want to actually use AI at work. No coding required. *(Paid, $399)*\n* 3️⃣ [10-Hour LLM Fundamentals](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fllm-primer?ref=1f9b29) - Towards AI. Compact video-first crash course covering when to use prompting, RAG, fine-tuning, or agents. Useful before going deep. *(Paid, $199)*\n* 5️⃣ [Full Stack AI Engineering](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fbeginner-to-advanced-llm-dev?ref=1f9b29) - Towards AI's flagship program. 90+ lessons across prompting, RAG, fine-tuning, tools, agents, and deployment, built around one production capstone. Designed for people who want a full developer path to AI engineering. *(Paid, $349)*\n* 7️⃣ [Agentic AI Engineering](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fagent-engineering?ref=1f9b29) - Towards AI. 
34 lessons and two production agents (a research agent and a writing workflow), covering context engineering, evaluations, observability, containers, and deployment. For people who already ship LLM apps and want to specialize. *(Paid, $499)*\n\n### Free docs-heavy paths\n\n* 4️⃣ [Hugging Face LLM Course](https:\u002F\u002Fhuggingface.co\u002Flearn\u002Fllm-course\u002Fchapter1\u002F1) - Free. The best free structured path through tokenization, fine-tuning, and modern transformers.\n* 4️⃣ [Anthropic Academy](https:\u002F\u002Fwww.anthropic.com\u002Flearn) - Free. Includes an [Introduction to MCP](https:\u002F\u002Fanthropic.skilljar.com\u002Fintroduction-to-model-context-protocol).\n* 5️⃣ [Hugging Face Agents Course](https:\u002F\u002Fhuggingface.co\u002Flearn\u002Fagents-course\u002F) - Free. Walks through agents, tools, and orchestration using open-source models.\n* 5️⃣ [Hugging Face MCP Course](https:\u002F\u002Fhuggingface.co\u002Flearn\u002Fmcp-course\u002F) - Free. Builds both client and server sides of MCP from scratch.\n* 5️⃣ [LangChain Academy](https:\u002F\u002Facademy.langchain.com\u002F) - Free. The official path through LangChain and LangGraph.\n\n### Useful DeepLearning.AI short courses (free)\n\n* 3️⃣ [ChatGPT Prompt Engineering for Developers](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fchatgpt-prompt-engineering-for-developers\u002F) - Andrew Ng and Isa Fulford. Free. The prompting short course most teams already assume you have done.\n* 4️⃣ [Building Systems with the ChatGPT API](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fbuilding-systems-with-chatgpt\u002F) - Free. Multi-step chains, moderation, and evals at a beginner level.\n* 5️⃣ [Improving Accuracy of LLM Applications](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fimproving-accuracy-of-llm-applications\u002F) - Free. 
Practical methods for moving from 70% to 95% accuracy.\n* 6️⃣ [Agent Skills with Anthropic](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fagent-skills-with-anthropic\u002F) - Free. Agent skills, the Anthropic way.\n* 6️⃣ [Agent Memory: Building Memory-Aware Agents](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fagent-memory-building-memory-aware-agents\u002F) - Free. Short, focused course on memory architectures.\n* 6️⃣ [A2A: The Agent2Agent Protocol](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fa2a-the-agent2agent-protocol\u002F) - Free. Google's Agent2Agent protocol explained by its designers.\n* 6️⃣ [Semantic Caching for AI Agents](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fsemantic-caching-for-ai-agents\u002F) - Free. Cutting cost and latency through caching strategies.\n* 6️⃣ [NVIDIA NeMo Agent Toolkit: Making Agents Reliable](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fnvidia-nat-making-agents-reliable\u002F) - Free. Guardrails and reliability at scale.\n* 6️⃣ [Building Coding Agents with Tool Execution](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fbuilding-coding-agents-with-tool-execution\u002F) - Free. 
The core loop behind modern coding agents.\n\nSeveral DeepLearning.AI courses are listed in the topic sections below instead of here: `AI Agents in LangGraph` (under Agents), `Automated Testing for LLMOps` (under Evaluations), `Red Teaming LLM Applications` (under AI Safety), `Efficient Inference with SGLang` (under Deployment), and `Document AI: From OCR to Agentic Doc Extraction` (under Multimodal).\n\n### Which course to pick from the Towards AI offerings\n\n* 2️⃣ No Python yet: [Beginner Python for AI Engineering](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fpython-for-genai?ref=1f9b29) first.\n* 2️⃣ Non-technical and want to use AI at work: [AI for Work](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fai-business-professionals?ref=1f9b29).\n* 3️⃣ Want a quick overview first: start with [10-Hour LLM Fundamentals](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fllm-primer?ref=1f9b29).\n* 4️⃣ Want the whole stack from nothing: start with the [Get it all! From Novice to Expert Bundle](https:\u002F\u002Facademy.towardsai.net\u002Fbundles\u002Fget-it-all?ref=1f9b29).\n* 5️⃣ Python-comfortable and want the full developer path: start with [Full Stack AI Engineering](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fbeginner-to-advanced-llm-dev?ref=1f9b29).\n* 5️⃣ Want free and docs-heavy: pair the [Hugging Face LLM Course](https:\u002F\u002Fhuggingface.co\u002Flearn\u002Fllm-course\u002Fchapter1\u002F1) with [LangChain Academy](https:\u002F\u002Facademy.langchain.com\u002F).\n* 7️⃣ Already shipped basic LLM apps and want to specialize: go to [Agentic AI Engineering](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fagent-engineering?ref=1f9b29).\n\n----\n\n## Practice and projects\u003Ca name=\"practice\">\u003C\u002Fa>\n\nReading and watching will only take you so far. 
You become an AI engineer by building systems that fail in expensive and educational ways.\n\n[\u003Cimg src=\"https:\u002F\u002Fimg.youtube.com\u002Fvi\u002FD89pj9cqUm4\u002Fmaxresdefault.jpg\" width=\"512\"\u002F>](https:\u002F\u002Fyoutu.be\u002FD89pj9cqUm4)\n\nWatch [What I Look For When Hiring AI Engineers](https:\u002F\u002Fyoutu.be\u002FD89pj9cqUm4) before you start your first serious project. I share how I evaluate AI engineering candidates, why decision-making matters more than polished agent-generated output, and what kinds of practice projects actually teach useful skills.\n\n### Good first projects\n\n* 4️⃣ A document question-answering assistant with citations and a real eval set.\n* 4️⃣ A customer support workflow with tools and structured outputs.\n* 5️⃣ A research assistant that plans, searches, reads, and writes a short brief.\n* 5️⃣ A coding helper scoped to one narrow internal task.\n* 5️⃣ A multimodal invoice or receipt parser with validation.\n* 6️⃣ [Designing Real-World AI Agents Workshop](https:\u002F\u002Fgithub.com\u002Fiusztinpaul\u002Fdesigning-real-world-ai-agents-workshop) - Paul Iusztin's hands-on workshop for building a Deep Research Agent plus a LinkedIn Writing Workflow as MCP servers. It includes code, slides, video, evaluation patterns, and an `implement_yourself\u002F` path designed to be rebuilt with agentic coding tools instead of copied.\n* 6️⃣ A small agent that plans, acts, checks, and retries within a budget.\n\n### Reference repos and tutorials\n\n* 3️⃣ [OpenAI Cookbook](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fopenai-cookbook) - Official recipes in notebook form. The quickest path to a working example of most common tasks.\n* 4️⃣ [Google Gemini Cookbook](https:\u002F\u002Fgithub.com\u002Fgoogle-gemini\u002Fcookbook) - Google. 
Gemini-flavored equivalent covering multimodal, long context, and tool use.\n* 4️⃣ [LlamaIndex Starter Tutorial](https:\u002F\u002Fdevelopers.llamaindex.ai\u002Fpython\u002Fframework\u002Fgetting_started\u002Fstarter_example\u002F) and [Understanding LlamaIndex](https:\u002F\u002Fdevelopers.llamaindex.ai\u002Fpython\u002Fframework\u002Funderstanding\u002F) - The fastest path from zero to a working RAG pipeline.\n* 4️⃣ [AI Engineering Cheatsheets](https:\u002F\u002Fgithub.com\u002Flouisfb01\u002Fai-engineering-cheatsheets) - Louis-François Bouchard's decision tables and playbooks for choosing approaches.\n* 5️⃣ [Pydantic AI docs](https:\u002F\u002Fai.pydantic.dev\u002F) - Type-safe agent framework from the Pydantic team.\n* 6️⃣ [DSPy Tutorials](https:\u002F\u002Fdspy.ai\u002Ftutorials\u002F) - Tutorials for the DSPy approach of compiling prompts as programs.\n* 6️⃣ [Designing Real-World AI Agents Workshop](https:\u002F\u002Fgithub.com\u002Fiusztinpaul\u002Fdesigning-real-world-ai-agents-workshop) - Build and run a multi-agent system with MCP servers, evaluator-optimizer loops, grounded search, structured outputs, and LLM-as-judge evaluation.\n* 7️⃣ [Paul Iusztin's hands-on-llms repo](https:\u002F\u002Fgithub.com\u002Fiusztinpaul\u002Fhands-on-llms) - End-to-end production project with training, serving, and monitoring.\n\nFramework docs for agent-oriented libraries (LangGraph, CrewAI, AutoGen, Agno) live in the Agents section below.\n\n### Questions to force yourself to answer on every project\n\n* Why is this prompt, tool, or architecture chosen?\n* Where and how will it fail?\n* How will I evaluate it, offline and online?\n* What will I log and inspect when it misbehaves?\n* What is the cheapest design that still clears the bar?\n* Is an agent actually the right choice here, or is a workflow enough?\n\nIf you cannot answer those, keep building.\n\n----\n\n## Prompting and structured outputs\u003Ca name=\"prompting\">\u003C\u002Fa>\n\nPrompting still matters in 
2026. The useful version is not clever tricks. It is writing reliable contracts for non-deterministic systems.\n\n### Subtopics to cover\n\nClear task framing, output contracts, structured outputs and JSON schemas, few-shot examples, grounding and citations, verification loops, tool-use instructions, completion criteria, and prompt versioning.\n\n### Best resources\n\n* 3️⃣ [OpenAI Prompt Engineering Guide](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Fprompt-engineering) - Official, up-to-date, API-centric.\n* 3️⃣ [Anthropic Prompt Engineering Overview](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fprompt-engineering\u002Foverview) - Claude-specific advice, generalizes well.\n* 3️⃣ [Anthropic Prompt Library](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fprompt-library\u002Flibrary) - Anthropic's curated library of battle-tested prompts.\n* 3️⃣ [Learn Prompting](https:\u002F\u002Flearnprompting.org\u002F) - Free, community-maintained reference covering beginner to advanced prompting.\n* 5️⃣ [OpenAI GPT-5 Prompting Guide](https:\u002F\u002Fcookbook.openai.com\u002Fexamples\u002Fgpt-5\u002Fgpt-5_prompting_guide) - Model-specific prompting advice from the OpenAI team.\n* 5️⃣ [Anthropic: Increase Output Consistency](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Ftest-and-evaluate\u002Fstrengthen-guardrails\u002Fincrease-consistency) - Techniques for reducing output drift across runs.\n* 5️⃣ [Instructor: structured outputs with Pydantic](https:\u002F\u002Fpython.useinstructor.com\u002F) - Jason Liu's library for turning free-form LLM outputs into typed Python objects.\n* 6️⃣ [Structured Data Extraction from Unstructured Content Using LLM Schemas](https:\u002F\u002Fsimonwillison.net\u002F2025\u002FFeb\u002F28\u002Fllm-schemas\u002F) - Simon Willison's approach to schema-first extraction.\n\nTreat prompts as code you version, interfaces you test, and product decisions you revisit. 
That framing is more useful than any list of prompting tricks.\n\n----\n\n## Reasoning models and test-time compute\u003Ca name=\"reasoning\">\u003C\u002Fa>\n\nReasoning models (OpenAI o-series, Anthropic Claude with extended thinking, Google Gemini Pro with thinking, DeepSeek R-models, Qwen reasoning variants) behave differently from standard chat models. They reward different prompting and break in different ways.\n\n### Subtopics to cover\n\nWhen reasoning models help, when they hurt, how to set thinking budgets, how to structure input for a thinking model, extended thinking and tool use together, and cost\u002Flatency tradeoffs.\n\n### Best resources\n\n* 4️⃣ [Towards AI Newsletter issues](https:\u002F\u002Fnewsletter.towardsai.net\u002F) - Weekly coverage of major reasoning model releases with benchmarks and opinion.\n* 5️⃣ [Anthropic: Prompt caching](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fprompt-caching) - Usually where reasoning costs get controlled in production.\n* 6️⃣ [Anthropic: Building with extended thinking](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fextended-thinking) - Official docs on how to use Claude's thinking mode correctly.\n* 7️⃣ [OpenAI: Run long horizon tasks with Codex](https:\u002F\u002Fdevelopers.openai.com\u002Fblog\u002Frun-long-horizon-tasks-with-codex) - Long-running reasoning workflows in practice.\n* 7️⃣ [OpenAI: Unrolling the Codex agent loop](https:\u002F\u002Fopenai.com\u002Findex\u002Funrolling-the-codex-agent-loop\u002F) - Inside the loop that a reasoning agent actually runs.\n* 7️⃣ [The State of LLMs 2025](https:\u002F\u002Fmagazine.sebastianraschka.com\u002Fp\u002Fstate-of-llms-2025) - Sebastian Raschka's overview of how reasoning models reshaped the stack.\n* 8️⃣ [Why We Think](https:\u002F\u002Flilianweng.github.io\u002Fposts\u002F2025-05-01-thinking\u002F) - Lilian Weng on the theory behind test-time compute.\n\nRule of thumb for 2026: 
reach for a reasoning model when the task genuinely requires multi-step planning, verification, or tool use. For simple classification, extraction, or short answers, a cheaper standard model still wins on cost and latency.\n\n----\n\n## Context engineering and long context\u003Ca name=\"context\">\u003C\u002Fa>\n\nContext engineering is one of the most important 2026 skills. The model is only as good as what you put in its context and how you stage it.\n\n### Subtopics to cover\n\nWhat belongs in context and what does not, context windows and context rot, message history management, memory versus retrieval, compaction and summaries, working files and scratchpads, repo-level instructions such as AGENTS.md or CLAUDE.md, and context handoffs between runs.\n\n### Best resources\n\n* 4️⃣ [Anthropic: Context windows](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fcontext-windows) - Official docs with practical guidance on context limits and caching.\n* 5️⃣ [Context engineering](https:\u002F\u002Fsimonwillison.net\u002F2025\u002FJun\u002F27\u002Fcontext-engineering\u002F) and [How to Fix Your Context](https:\u002F\u002Fsimonwillison.net\u002F2025\u002FJun\u002F29\u002Fhow-to-fix-your-context\u002F) - Simon Willison. The two posts that gave the field its current vocabulary.\n* 5️⃣ [Lost in the Middle](https:\u002F\u002Fpub.towardsai.net\u002Flost-in-the-middle-629b20d86152) - How attention drops inside long contexts and what it means for your prompt design.\n* 6️⃣ [Effective context engineering for AI agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Feffective-context-engineering-for-ai-agents) - Anthropic. 
How the Claude team thinks about context as a first-class design surface.\n* 6️⃣ [Harness Engineering](https:\u002F\u002Fwww.louisbouchard.ai\u002Fharness-engineering\u002F) - Louis-François Bouchard on the scaffolding around the model that controls what enters and leaves context.\n* 7️⃣ [Context engineering for LLMs: Production-Ready RAG Systems](https:\u002F\u002Fpub.towardsai.net\u002Fcontext-engineering-4a17018c41cf) - Chunking, retrieval, reranking, and token budgeting for real systems.\n* 7️⃣ [Jason Liu's Context Engineering Series](https:\u002F\u002Fjxnl.co\u002Fwriting\u002F2025\u002F08\u002F28\u002Fcontext-engineering-index\u002F) - Consulting-flavored write-up from enterprise projects.\n\nMost people try to fix bad systems by stuffing more tokens into the prompt. That usually makes results worse. The better habit is to be intentional about which instructions are permanent, which data is retrieved on demand, which state gets externalized into files or tools, and when to reset the context entirely.\n\n----\n\n## Retrieval-Augmented Generation (RAG)\u003Ca name=\"rag\">\u003C\u002Fa>\n\nRAG is still a core technique. The naive \"stuff some chunks into the prompt\" version is no longer enough.\n\n### Subtopics to cover\n\nChunking strategies, embeddings, vector search, hybrid search with BM25, reranking, citations and provenance, metadata filtering, query rewriting, corrective RAG, retrieval quality evaluation, agentic retrieval, and knowing when RAG is the wrong answer.\n\n### Best resources\n\n* 4️⃣ [Why RAG Is Not Training Your AI](https:\u002F\u002Fwww.louisbouchard.ai\u002Fwhy-rag-is-not-training-your-ai\u002F) - Louis-François Bouchard on the mental model most builders get wrong.\n* 4️⃣ [LlamaIndex Introduction to RAG](https:\u002F\u002Fdevelopers.llamaindex.ai\u002Fpython\u002Fframework\u002Funderstanding\u002Frag\u002F) - Official docs. 
The cleanest free path to a working RAG system.\n* 4️⃣ [Pinecone RAG guide](https:\u002F\u002Fwww.pinecone.io\u002Flearn\u002Fretrieval-augmented-generation\u002F) - Vendor-written but solid introduction with diagrams.\n* 5️⃣ [Is RAG Still Needed in the Era of Long Context LLMs?](https:\u002F\u002Fpub.towardsai.net\u002Fis-rag-still-needed-in-the-era-of-long-context-llms-3d89907ce624) - Clear framework for when long context replaces RAG and when it does not.\n* 6️⃣ [Contextual Retrieval in AI Systems](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fcontextual-retrieval) - Anthropic's prompt-cached contextual chunking pattern with measured quality gains.\n* 6️⃣ [Hybrid Search RAG That Actually Works](https:\u002F\u002Fpub.towardsai.net\u002Fhybrid-search-rag-that-actually-works-bm25-vectors-reranking-in-python-0c02ade0799d) - Production-ready code combining BM25, vectors, and reranking.\n* 7️⃣ [Context Engineering, Not Retrieval: Why Your Agentic RAG Fails in Production](https:\u002F\u002Fpub.towardsai.net\u002Fcontext-engineering-not-retrieval-why-your-agentic-rag-fails-in-production-39093f0e5025) - April 2026. The gap between prototype and production is almost always a context problem, not a retrieval problem. 
Practical diagnosis for teams that have tuned embeddings for months and still see failures.\n* 7️⃣ [Why Most RAGs Stay POCs — How to Take Your Data Pipelines to Production](https:\u002F\u002Fpub.towardsai.net\u002Fwhy-most-rags-stay-pocs-how-to-take-your-data-pipelines-to-production-4ac01fe9f9e3) - Why prototype RAG systems stall before production, and how to structure data pipelines (Databricks Asset Bundles, Python Wheel artifacts, Clean Architecture) so they actually ship and stay maintainable.\n* 7️⃣ [Vectorless RAG: Your RAG Pipeline Doesn't Need a Vector Database](https:\u002F\u002Fpub.towardsai.net\u002Fvectorless-rag-your-rag-pipeline-doesnt-need-a-vector-database-0a0839feabd9) - For structured documents like contracts and financial reports, building a hierarchical JSON tree and letting the LLM navigate it can beat embeddings-plus-vector-DB. No chunking, no vector DB, fully traceable citations.\n* 7️⃣ [Systematically Improving RAG](https:\u002F\u002Fjxnl.co\u002Fwriting\u002F2024\u002F05\u002F22\u002Fsystematically-improving-your-rag\u002F) - Jason Liu's playbook for RAG iteration.\n* 8️⃣ [Evolve or perish: The new RAG paradigm](https:\u002F\u002Fwww.decodingai.com\u002Fp\u002Fevolve-or-perish-the-new-rag-paradigm) - Paul Iusztin on where RAG is heading.\n\nDo not stop at \"uploaded PDF, got answer.\" Build one serious RAG app with citations, retrieval debugging, considered chunking choices, metadata filters, an eval set, and a way to inspect misses. 
That is where the real learning happens.\n\n----\n\n## Embeddings, rerankers, and vector databases\u003Ca name=\"vectors\">\u003C\u002Fa>\n\nGood retrieval depends on the pieces around the model.\n\n### Embedding models and rerankers\n\n* 4️⃣ [Cohere Embed and Rerank](https:\u002F\u002Fdocs.cohere.com\u002Fdocs\u002Fembeddings) - Strong general-purpose production choice with multilingual support.\n* 4️⃣ [Voyage AI](https:\u002F\u002Fdocs.voyageai.com\u002F) - Domain-specific embeddings (finance, legal, medical) plus the `rerank-2` reranker.\n* 4️⃣ [Jina Embeddings](https:\u002F\u002Fjina.ai\u002Fembeddings\u002F) and [Jina Reranker](https:\u002F\u002Fjina.ai\u002Freranker) - Competitive multilingual options, strong on long documents.\n* 4️⃣ [Nomic Embed](https:\u002F\u002Fhome.nomic.ai\u002Fembed) - Strong open-source option with Apache 2.0 licensing.\n* 5️⃣ [Hugging Face MTEB Leaderboard](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fmteb\u002Fleaderboard) - Community leaderboard for picking an embedding model by task.\n\n### Vector databases\n\n* 4️⃣ [Qdrant docs](https:\u002F\u002Fqdrant.tech\u002Fdocumentation\u002F) - Fast, production-ready, open source, free managed tier.\n* 4️⃣ [Weaviate docs](https:\u002F\u002Fweaviate.io\u002Fdevelopers\u002Fweaviate) - Open source with built-in hybrid search and RAG modules.\n* 4️⃣ [LanceDB docs](https:\u002F\u002Flancedb.github.io\u002Flancedb\u002F) - Embedded, Python-first, no server needed. Great for local RAG prototypes.\n* 4️⃣ [Pinecone](https:\u002F\u002Fdocs.pinecone.io\u002F) - Managed serverless, the most common enterprise default.\n* 4️⃣ [pgvector](https:\u002F\u002Fgithub.com\u002Fpgvector\u002Fpgvector) - Vector search inside Postgres. 
Best choice when you already have Postgres and want to avoid a second system.\n* 4️⃣ [Chroma](https:\u002F\u002Fdocs.trychroma.com\u002F) - Light, simple, good for prototypes and tutorials.\n\n### Good practitioner write-ups\n\n* 7️⃣ [Inside Vector Databases: Engineering High-Dimensional Search](https:\u002F\u002Fpub.towardsai.net\u002Finside-vector-databases-engineering-high-dimensional-search-for-modern-ai-systems-704c2efe99e9) - How HNSW and IVF actually work.\n\n----\n\n## Tools, MCP, and computer use\u003Ca name=\"tools\">\u003C\u002Fa>\n\nIf prompting was the first phase of AI apps, and tools the second, then in 2026 MCP and structured tool ecosystems are part of the default stack.\n\n### Subtopics to cover\n\nFunction and tool calling, tool schemas, tool selection and retries, permissions and safety boundaries, tool result formatting, MCP clients and servers, web search and code execution tools, computer use, and authentication against external systems.\n\n### Best resources\n\n* 5️⃣ [Anthropic Tool use overview](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Ftool-use\u002Foverview) - The cleanest reference for function calling with Claude.\n* 5️⃣ [Introducing the Model Context Protocol](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fmodel-context-protocol) - Original announcement, still the best one-page summary.\n* 5️⃣ [Model Context Protocol Getting Started](https:\u002F\u002Fmodelcontextprotocol.io\u002Fdocs\u002Fgetting-started\u002Fintro) - Official MCP docs.\n* 5️⃣ [Hugging Face MCP Course](https:\u002F\u002Fhuggingface.co\u002Flearn\u002Fmcp-course\u002F) - Free course covering the client and server implementation.\n* 5️⃣ [Introduction to Model Context Protocol](https:\u002F\u002Fanthropic.skilljar.com\u002Fintroduction-to-model-context-protocol) - Anthropic's own short course, free.\n* 6️⃣ [Anthropic Web search 
tool](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Ftool-use\u002Fweb-search-tool) and [Code execution tool](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fagents-and-tools\u002Ftool-use\u002Fcode-execution-tool) - Built-in tools that remove most of the glue you used to write.\n* 6️⃣ [Anthropic Skills](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fequipping-agents-for-the-real-world-with-agent-skills) and [Agent Skills open standard](https:\u002F\u002Fagentskills.io\u002F) - The skills primitive: reusable markdown instructions Claude loads at the right moment. First-class in Claude.ai, Claude Code, and the API in 2026, now an open standard used across multiple agent platforms.\n* 6️⃣ [Anthropic Computer use](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fcomputer-use) - Letting a model control a screen and a keyboard inside sandboxed environments.\n* 6️⃣ [MCP Architecture overview](https:\u002F\u002Fmodelcontextprotocol.io\u002Fdocs\u002Flearn\u002Farchitecture), [Server concepts](https:\u002F\u002Fmodelcontextprotocol.io\u002Fdocs\u002Flearn\u002Fserver-concepts), [Build an MCP server](https:\u002F\u002Fmodelcontextprotocol.io\u002Fdocs\u002Fdevelop\u002Fbuild-server), and [Build an MCP client](https:\u002F\u002Fmodelcontextprotocol.io\u002Fdocs\u002Fdevelop\u002Fbuild-client) - Full reference for both sides of the protocol.\n* 7️⃣ [Writing effective tools for agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fwriting-tools-for-agents) - Anthropic. 
Practical guide to tool schemas, descriptions, and error handling.\n* 7️⃣ [Code execution with MCP](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fcode-execution-with-mcp) - Anthropic's pattern for composing MCP servers through code instead of long tool lists.\n* 7️⃣ [Model Context Protocol (MCP): Why Every AI Developer Needs MCP in 2026](https:\u002F\u002Fpub.towardsai.net\u002Fmodel-context-protocol-mcp-why-every-ai-developer-needs-mcp-in-2026-e68d39a49417) - Why MCP replaces ad-hoc REST integrations: decoupled Host\u002FClient\u002FServer architecture, why it scales better than direct API wiring, and what it means for maintaining AI applications across provider changes.\n* 8️⃣ [Model Context Protocol has Prompt Injection Security Problems](https:\u002F\u002Fsimonwillison.net\u002F2025\u002FApr\u002F9\u002Fmcp-prompt-injection\u002F) - Simon Willison. Read this before you deploy an MCP server that touches private data.\n\n### Search APIs worth knowing\n\nAgents that need to search the web rarely call raw Google or Bing. These are the APIs most production stacks use:\n\n* 4️⃣ [Tavily](https:\u002F\u002Fdocs.tavily.com\u002F) - Purpose-built search API for LLM agents with content extraction and summarization.\n* 4️⃣ [Exa](https:\u002F\u002Fdocs.exa.ai\u002F) - Semantic search API with neural retrieval over the web.\n* 4️⃣ [Brave Search API](https:\u002F\u002Fbrave.com\u002Fsearch\u002Fapi\u002F) - Privacy-focused web search, common choice for agent stacks that need independent indexing.\n\nThe model is not your system. 
The tool layer is where most real capability and most real risk both live.\n\n----\n\n## Workflows, agents, and multi-agent systems\u003Ca name=\"agents\">\u003C\u002Fa>\n\nThis is where hype gets loud and engineering judgment becomes valuable.\n\n### Subtopics to cover\n\nWorkflow versus agent, single agent versus multi-agent, ReAct and tool loops, routing and orchestration, planning and reflection, human-in-the-loop, state and memory, failure modes, and when to avoid autonomy altogether.\n\n### Best resources\n\n* 5️⃣ [AI Agents in LangGraph](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fai-agents-in-langgraph\u002F) - Harrison Chase and DeepLearning.AI. Free. The cleanest intro to graph-based agents.\n* 5️⃣ [LangGraph docs](https:\u002F\u002Fdocs.langchain.com\u002Foss\u002Fpython\u002Flanggraph\u002Foverview) - Official graph-based orchestration docs for long-running, stateful agents.\n* 5️⃣ [LlamaIndex Workflows](https:\u002F\u002Fdevelopers.llamaindex.ai\u002Fpython\u002Fllamaagents\u002Fworkflows\u002F) - LlamaIndex's event-driven workflow system.\n* 5️⃣ [CrewAI](https:\u002F\u002Fdocs.crewai.com\u002F), [AutoGen](https:\u002F\u002Fmicrosoft.github.io\u002Fautogen\u002Fstable\u002F), and [Agno](https:\u002F\u002Fdocs.agno.com\u002F) - Framework docs for three of the main alternatives.\n* 6️⃣ [Building Effective AI Agents](https:\u002F\u002Fwww.anthropic.com\u002Fresearch\u002Fbuilding-effective-agents) - Anthropic. 
The reference post on agent vs workflow design.\n* 6️⃣ [Stop Building Agent Demos](https:\u002F\u002Fwww.louisbouchard.ai\u002Fstop-building-agent-demos\u002F) - Louis-François Bouchard on the demo-to-production gap.\n* 6️⃣ [Agents and Workflows](https:\u002F\u002Fwww.louisbouchard.ai\u002Fagents-and-workflows\u002F) - Louis-François Bouchard on when multi-agent is overengineering.\n* 6️⃣ [What Makes an AI Agent Actually Agentic?](https:\u002F\u002Fpub.towardsai.net\u002Fwhat-makes-an-ai-agent-actually-agentic-building-beyond-the-basics-with-langgraph-cf73c659d753) - What separates a real agent from a workflow wearing an LLM hat: autonomy, memory, and resilience. Walks through refactoring a hardcoded LangGraph assistant into a ReAct-based agent with SQLite checkpointing and layered, context-aware error handling.\n* 6️⃣ [Agent Architecture Guide](https:\u002F\u002Fgithub.com\u002Flouisfb01\u002Fai-engineering-cheatsheets\u002Fblob\u002Fmain\u002FAgent_Architecture_Guide.md) - Louis-François Bouchard's 13-question decision framework for agent design.\n* 7️⃣ [LLM Powered Autonomous Agents](https:\u002F\u002Flilianweng.github.io\u002Fposts\u002F2023-06-23-agent\u002F) - Lilian Weng. The reference post that defined the field.\n* 7️⃣ [Agents](https:\u002F\u002Fhuyenchip.com\u002F2025\u002F01\u002F07\u002Fagents.html) - Chip Huyen's long-form primer on agent design, planning, and tool use. One of the most-shared agent posts of 2025.\n* 7️⃣ [12-Factor Agents](https:\u002F\u002Fgithub.com\u002Fhumanlayer\u002F12-factor-agents) - Dex Horthy's widely-cited production-agent checklist covering state, tools, context, and reliability. 
Heavily referenced across 2025-2026 agent engineering discussions.\n* 7️⃣ [Creating an Advanced AI Agent From Scratch with Python in 2026](https:\u002F\u002Fpub.towardsai.net\u002Fcreating-an-advanced-ai-agent-from-scratch-with-python-in-2026-part-1-ce74a23f6514) - Modular architecture over framework lock-in: a flexible tool system, provider-agnostic LLM wrapper, and a ReAct-based agent orchestrator with Pydantic for type-safe tool execution. Lets you swap models and tools without touching the core loop.\n* 7️⃣ [The Two Things Every Reliable Agent Needs](https:\u002F\u002Fpub.towardsai.net\u002Fthe-two-things-every-reliable-agent-needs-ec3c2621cce7) - A framework centered on memory-first design and an anti-Goodhart scoreboard: treat memory as a core system with defined forms, functions, and dynamics, and evaluate with adversarial metrics across full episodes so agents solve the actual problem instead of gaming a proxy.\n* 7️⃣ [LangChain Middleware: The Missing Layer Between Your Agent and Production](https:\u002F\u002Fpub.towardsai.net\u002Flangchain-middleware-the-missing-layer-between-your-agent-and-production-b7a5b8cba4c2) - LangChain's new middleware system pulls operational concerns (summarization, human approval, retries, token tracking, dynamic routing, tool monitoring, context injection) out of agent logic and into a dedicated layer. Covers decorator vs class-style hooks, ordering rules, custom state schemas, and five production patterns.\n* 7️⃣ [Google's A2A Protocol using LangGraph: Build Agent Systems That Actually Communicate](https:\u002F\u002Fpub.towardsai.net\u002Fgoogles-a2a-protocol-using-langgraph-build-agent-systems-that-actually-communicate-2b8ee488f808) - Divy Yadav. Practical deep-dive into Agent2Agent: Agent Cards for discovery, structured task lifecycles, HTTP messaging, and how A2A complements (not competes with) MCP. 
Covers real production failure modes — timeout handling, context mismatch, authentication drift — with a LangGraph implementation walkthrough.\n* 7️⃣ [Agentic AI Engineering](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fagent-engineering?ref=1f9b29) - Towards AI's deep dive with two shipped agents as capstones. *(Paid)*\n* 8️⃣ [How we built our multi-agent research system](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Fbuilt-multi-agent-research-system) - Anthropic. Real architecture behind a shipped multi-agent product.\n* 8️⃣ [Building Production Text-to-SQL for 70,000+ Tables: OpenAI's Data Agent Architecture](https:\u002F\u002Fpub.towardsai.net\u002Fbuilding-production-text-to-sql-for-70-000-tables-openais-data-agent-architecture-bcd695990d55) - How OpenAI built an internal data agent for its own data warehouse. Goes beyond naive query generation: six layers of context (table usage patterns, human annotations, business logic extracted from code), plus a closed-loop validation step where the agent profiles results, catches its own errors, and repairs queries. The real lesson — agent effectiveness depends on the richness of context, not the model.\n\nMost teams should start with a workflow. Add autonomy only where it clearly buys something. That saves token spend, latency, debugging pain, and a lot of regret.\n\n----\n\n## Evaluations, observability, and harnesses\u003Ca name=\"evals\">\u003C\u002Fa>\n\nThe layer most people skip and rediscover the hard way.\n\n### Subtopics to cover\n\nGolden datasets, rule-based checks, LLM-as-a-judge, regression testing, traces and spans, prompt versioning, error analysis, offline evaluations and online monitoring, harness design, and testability of agent behavior.\n\n### Best resources\n\n* 5️⃣ [Your job is to deliver code you have proven to work](https:\u002F\u002Fsimonwillison.net\u002F2025\u002FDec\u002F18\u002Fcode-proven-to-work\u002F) - Simon Willison. 
Less about tooling, more about the right mental model for this work.\n* 5️⃣ [Your AI Product Needs Evals](https:\u002F\u002Fhamel.dev\u002Fblog\u002Fposts\u002Fevals\u002F) and [LLM Evals FAQ](https:\u002F\u002Fhamel.dev\u002Fblog\u002Fposts\u002Fevals-faq\u002F) - Hamel Husain. The canonical starting point.\n* 5️⃣ [Automated Testing for LLMOps](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fautomated-testing-llmops\u002F) - DeepLearning.AI short course, free. CI-style testing for LLM-powered apps.\n* 5️⃣ [Ragas](https:\u002F\u002Fdocs.ragas.io\u002Fen\u002Fstable\u002F) - Open-source RAG evaluation library.\n* 5️⃣ [LangSmith](https:\u002F\u002Fdocs.langchain.com\u002Flangsmith\u002Fhome) and [LangSmith Evaluation](https:\u002F\u002Fdocs.langchain.com\u002Flangsmith\u002Fevaluation) - Hosted tracing and eval tooling from LangChain.\n* 5️⃣ [Braintrust](https:\u002F\u002Fwww.braintrust.dev\u002Fdocs\u002Fstart) - Commercial eval and observability platform popular with teams that want structured experiment tracking.\n* 5️⃣ [Arize Phoenix](https:\u002F\u002Farize.com\u002Fdocs\u002Fphoenix) - Open-source observability for LLM applications.\n* 5️⃣ [Pydantic AI and Logfire](https:\u002F\u002Flogfire.pydantic.dev\u002Fdocs\u002F) - Type-safe agent framework and observability tool from the Pydantic team.\n* 6️⃣ [Harness Engineering: The Missing Layer Behind AI Agents](https:\u002F\u002Fwww.louisbouchard.ai\u002Fharness-engineering\u002F) - Louis-François Bouchard on why harnesses, not models, are what separates production from prototype.\n* 6️⃣ [Harness engineering](https:\u002F\u002Fopenai.com\u002Findex\u002Fharness-engineering\u002F) - OpenAI's framing of the same layer for coding agents.\n* 6️⃣ [Testing Agent Skills Systematically with Evals](https:\u002F\u002Fdevelopers.openai.com\u002Fblog\u002Feval-skills) - OpenAI on building evals for agent skills.\n* 6️⃣ [Effective harnesses for long-running 
agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Feffective-harnesses-for-long-running-agents) - Anthropic. Scaffolding for hour-long agent runs.\n* 6️⃣ [A Field Guide to Rapidly Improving AI Products](https:\u002F\u002Fhamel.dev\u002Fblog\u002Fposts\u002Ffield-guide\u002F) - Hamel Husain's end-to-end playbook for going from \"it kinda works\" to a real product, pairing evals with error analysis and data flywheels.\n* 6️⃣ [Task-Specific LLM Evals that Do & Don't Work](https:\u002F\u002Feugeneyan.com\u002Fwriting\u002Fevals\u002F) and [Evaluating LLM-Evaluators](https:\u002F\u002Feugeneyan.com\u002Fwriting\u002Fllm-evaluators\u002F) - Eugene Yan on where LLM-as-judge helps and where it misleads.\n* 6️⃣ [In Defense of AI Evals, for Everyone](https:\u002F\u002Fwww.sh-reya.com\u002Fblog\u002Fin-defense-ai-evals\u002F) and [Data Flywheels for LLM Applications](https:\u002F\u002Fwww.sh-reya.com\u002Fblog\u002Fai-engineering-flywheel\u002F) - Shreya Shankar on why evals are a product skill, not a research skill.\n* 7️⃣ [Agent Observability and Evaluation: A 2026 Developer's Guide](https:\u002F\u002Fpub.towardsai.net\u002Fagent-observability-and-evaluation-a-2026-developers-guide-to-building-reliable-ai-agents-f4547e4beb14) - Divy Yadav. One of the most complete recent write-ups.\n* 7️⃣ [MLflow Observability for Generative AI: A Deep Dive with Text2SQL + RAG + WebSearch using LangGraph](https:\u002F\u002Fpub.towardsai.net\u002Fmlflow-observability-for-generative-ai-a-deep-dive-with-text2sql-rag-websearch-using-langgraph-2430c502adfa) - MLflow's native tracing applied to a real LangGraph e-commerce agent. 
Every node is instrumented with spans, traces, and cost-tracking decorators, which shows what hierarchical trace trees actually look like for a production agentic pipeline, not just HTTP latency timestamps.\n* 8️⃣ [Inspect AI](https:\u002F\u002Finspect.aisi.org.uk\u002F) - UK AI Safety Institute's open-source framework for building LLM evals, used in frontier safety research and increasingly in production.\n\nIf you cannot tell whether your system is improving, you are not engineering yet; you are moving vibes around.\n\n----\n\n## Fine-tuning and data curation\u003Ca name=\"finetuning\">\u003C\u002Fa>\n\nFine-tuning still matters, but in 2026 it is no longer the first hammer most teams reach for. Reasoning models, prompt caching, long context, and cheap high-quality base models shifted the tradeoff.\n\n### Subtopics to cover\n\nWhen prompting is enough, when RAG is enough, when supervised fine-tuning helps, synthetic data generation, dataset cleaning and formatting, preference optimization and reinforcement fine-tuning, Low-Rank Adaptation (LoRA) and Weight-Decomposed Low-Rank Adaptation (DoRA), domain adaptation, and cost\u002Fmaintenance tradeoffs.\n\n### Best resources\n\n* 5️⃣ [Building LLMs for Production](https:\u002F\u002Famzn.to\u002F4dZ0Mtz) - Towards AI. The fine-tuning chapters alone are worth the price for most teams. The [Academy e-book version](https:\u002F\u002Facademy.towardsai.net\u002Fcourses\u002Fbuildingllmsforproduction?ref=1f9b29) is also available. *(Paid)*\n* 5️⃣ [Hugging Face smol fine-tuning course](https:\u002F\u002Fhuggingface.co\u002Flearn\u002Fsmol-course\u002Funit1\u002F3) - Free, code-first walkthrough. 
Fine-tuning small models hands-on.\n* 5️⃣ [OpenAI model optimization guide](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Fmodel-optimization) - Official docs for API-level fine-tuning and distillation.\n* 6️⃣ [Hugging Face PEFT docs](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Fpeft\u002F) - The official library for LoRA and related methods.\n* 6️⃣ [Using and Finetuning Pretrained Transformers](https:\u002F\u002Fmagazine.sebastianraschka.com\u002Fp\u002Fusing-and-finetuning-pretrained-transformers) - Sebastian Raschka's reference post.\n* 7️⃣ [How to Fine-Tune LLMs in 2025 with Hugging Face](https:\u002F\u002Fwww.philschmid.de\u002Ffine-tune-llms-in-2025) - Philipp Schmid. Single best recent how-to on modern fine-tuning.\n* 7️⃣ [LoRA vs Full Fine-Tuning](https:\u002F\u002Fpub.towardsai.net\u002Fllm-fine-tuning-lora-vs-full-fine-tuning-a-comparison-3aa1c1a0dc4d) - Florin Andrei's side-by-side comparison on real tasks.\n* 7️⃣ [What SFT, DPO, RLHF, and RAG Actually Do in an AI Agent](https:\u002F\u002Fpub.towardsai.net\u002Fwhat-sft-dpo-rlhf-and-rag-actually-do-in-an-ai-agent-d5b8daf0aedb) - Shenggang Li anchors each technique to a customer-support scenario: SFT for tone and task format, RAG for business facts at inference, DPO for choosing between two valid replies, RLHF when the problem runs deeper than any single answer. A clean decision framework for picking the right fix.\n* 8️⃣ [Improving LoRA: Implementing DoRA from Scratch](https:\u002F\u002Fmagazine.sebastianraschka.com\u002Fp\u002Flora-and-dora-from-scratch) - Sebastian Raschka on the LoRA successor.\n\nOnly fine-tune after you understand the baseline and have evals. 
Otherwise you are tuning toward a blurry target.\n\n----\n\n## Multimodal and document understanding\u003Ca name=\"multimodal\">\u003C\u002Fa>\n\nMany real AI products need to read images, parse PDFs, work with screenshots, or combine text and visuals.\n\n### Subtopics to cover\n\nVision inputs, document layout understanding beyond Optical Character Recognition (OCR), multimodal prompting, image-grounded extraction, and table and chart extraction.\n\n### Best resources\n\n* 4️⃣ [Anthropic Vision docs](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fvision) - Claude-specific vision API and prompting guidance.\n* 4️⃣ [OpenAI vision guide](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Fvision) - Official OpenAI vision reference.\n* 4️⃣ [Google Gemini multimodal capabilities](https:\u002F\u002Fai.google.dev\u002Fgemini-api\u002Fdocs) - Gemini native multimodal, strong on long documents and video.\n* 5️⃣ [Docling](https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002F) - IBM's open-source document extraction toolkit with layout and table reconstruction. Free.\n* 5️⃣ [Document AI: From OCR to Agentic Doc Extraction](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fdocument-ai-from-ocr-to-agentic-doc-extraction\u002F) - LandingAI short course with Andrew Ng. 
Free.\n* 5️⃣ [LlamaIndex Structured Prediction](https:\u002F\u002Fdevelopers.llamaindex.ai\u002Fpython\u002Fframework\u002Funderstanding\u002Fextraction\u002Fstructured_prediction\u002F) - Schema-first extraction from documents and images.\n* 8️⃣ [Multimodal Large Language Models: Architectures, Training, and Real-World Applications](https:\u002F\u002Fpub.towardsai.net\u002Fmultimodal-large-language-models-architectures-training-and-real-world-applications-02155bf974c3) - Technical overview of MLLMs: modular versus monolithic architectures, alignment and fusion layers between encoders and LLM backbones, the three-stage training pipeline (modality alignment, joint pretraining, instruction tuning), and applications from document understanding to autonomous GUI agents.\n\nGood first project ideas: invoice extraction with validation, a receipt parser with structured outputs, a screenshot-to-action assistant, or a research workflow that extracts and cites figures from PDFs.\n\n----\n\n## Voice agents and realtime AI\u003Ca name=\"voice\">\u003C\u002Fa>\n\nVoice became table stakes for many products in 2025-2026. Low-latency turn-taking and realtime multimodal APIs now compete with traditional text chat.\n\n### Subtopics to cover\n\nSpeech-to-text and text-to-speech selection, turn-taking and barge-in, session management, latency budgeting, tool use inside a voice turn, and when voice beats text.\n\n### Best resources\n\n* 4️⃣ [Anthropic voice guidance](https:\u002F\u002Fdocs.anthropic.com\u002F) - Pairs Claude with an external speech pipeline (ElevenLabs, Deepgram, etc.).\n* 4️⃣ [ElevenLabs docs](https:\u002F\u002Felevenlabs.io\u002Fdocs) - Production voice cloning and streaming text-to-speech.\n* 4️⃣ [Deepgram](https:\u002F\u002Fdevelopers.deepgram.com\u002Fdocs) - Low-latency speech-to-text.\n* 5️⃣ [OpenAI Realtime API](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Frealtime) - The primary realtime reference for most teams. 
Native speech-to-speech with tool use.\n* 5️⃣ [Gemini Live API](https:\u002F\u002Fai.google.dev\u002Fgemini-api\u002Fdocs\u002Flive) - Google's realtime multimodal endpoint.\n* 5️⃣ [Pipecat](https:\u002F\u002Fdocs.pipecat.ai\u002F) - Open-source voice agent framework. Free.\n* 5️⃣ [LiveKit Agents](https:\u002F\u002Fdocs.livekit.io\u002Fagents\u002F) - Realtime agent infrastructure with strong WebRTC support.\n\n----\n\n## Deployment, inference, and open-weight models\u003Ca name=\"deployment\">\u003C\u002Fa>\n\nThis is where \"my notebook works\" becomes \"my product survives real users and traffic.\"\n\n### Subtopics to cover\n\nApplication Programming Interface (API) deployment, containers, concurrency, OpenAI-compatible serving, prompt and KV cache use, vLLM and other inference servers, local models and privacy tradeoffs, cost and latency and throughput tradeoffs, self-hosted versus serverless, and reliability, scaling, and rollbacks.\n\n### Serving and inference\n\n* 4️⃣ [Ollama](https:\u002F\u002Follama.com\u002F) and [Ollama docs](https:\u002F\u002Fdocs.ollama.com\u002F) - The easiest way to run open models locally.\n* 4️⃣ [LM Studio](https:\u002F\u002Flmstudio.ai\u002F) - Graphical User Interface (GUI) for local inference, good for non-developers.\n* 6️⃣ [vLLM docs](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002F) and [vLLM Quickstart](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Fstable\u002Fgetting_started\u002Fquickstart\u002F) - UC Berkeley's high-throughput inference server. 
De facto standard for self-hosting.\n* 6️⃣ [SGLang](https:\u002F\u002Fgithub.com\u002Fsgl-project\u002Fsglang) - Structured generation and batching, strong for constrained outputs.\n* 6️⃣ [Text Generation Inference (TGI)](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Ftext-generation-inference) - Hugging Face's production-ready serving stack.\n* 6️⃣ [llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp) - Central Processing Unit (CPU) and edge inference with GGUF quantization. The main path to running models on laptops.\n* 6️⃣ [Efficient Inference with SGLang](https:\u002F\u002Fwww.deeplearning.ai\u002Fcourses\u002Fefficient-inference-with-sglang-text-and-image-generation) - DeepLearning.AI short course, free.\n\n### Cloud GPU and managed inference\n\n* 4️⃣ [RunPod](https:\u002F\u002Fdocs.runpod.io\u002F) - Low-cost on-demand Graphics Processing Unit (GPU) rental.\n* 4️⃣ [Together AI](https:\u002F\u002Fdocs.together.ai\u002F) - Fast managed inference for open-weight models.\n* 4️⃣ [Fireworks AI](https:\u002F\u002Fdocs.fireworks.ai\u002F) - Another leading managed inference provider.\n* 4️⃣ [Groq](https:\u002F\u002Fconsole.groq.com\u002Fdocs) - Language Processing Unit (LPU) hardware for very low-latency serving.\n* 4️⃣ [Cerebras](https:\u002F\u002Finference-docs.cerebras.ai\u002F) - Wafer-scale inference, fastest tokens per second on certain models.\n* 5️⃣ [Modal docs](https:\u002F\u002Fmodal.com\u002Fdocs\u002Fguide) and [Developing with LLMs on Modal](https:\u002F\u002Fmodal.com\u002Fdocs\u002Fguide\u002Fdeveloping-with-llms) - Serverless GPU compute with a clean Python interface.\n* 7️⃣ [BentoML docs](https:\u002F\u002Fdocs.bentoml.com\u002F), the [LLM Inference Handbook](https:\u002F\u002Fbentoml.com\u002Fllm\u002F), [OpenAI-compatible API guide](https:\u002F\u002Fbentoml.com\u002Fllm\u002Fllm-inference-basics\u002Fopenai-compatible-api), [Serverless vs. 
self-hosted](https:\u002F\u002Fbentoml.com\u002Fllm\u002Fllm-inference-basics\u002Fserverless-vs-self-hosted-llm-inference), and [Inference optimization](https:\u002F\u002Fbentoml.com\u002Fllm\u002Finference-optimization) - Thorough free handbook on inference economics.\n\n### LLM gateways and routing layers\n\nMost production stacks sit one layer above the provider to handle fallbacks, rate limits, cost tracking, and per-request model selection:\n\n* 5️⃣ [LiteLLM](https:\u002F\u002Fdocs.litellm.ai\u002F) - Open-source proxy and Python SDK that lets you call 100+ LLM providers through a unified OpenAI-compatible interface. De facto standard for multi-provider applications.\n* 5️⃣ [OpenRouter](https:\u002F\u002Fopenrouter.ai\u002Fdocs) - Hosted router with a single API across hundreds of models, including preview access to models before they hit of",2,"2026-06-11 04:12:06","CREATED_QUERY"]