[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-76062":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":8,"language":10,"languages":8,"totalLinesOfCode":8,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":14,"stars7d":15,"stars30d":16,"stars90d":14,"forks30d":14,"starsTrendScore":17,"compositeScore":18,"rankGlobal":8,"rankLanguage":8,"license":8,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":8,"pushedAt":8,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":14,"starSnapshotCount":14,"syncStatus":15,"lastSyncTime":26,"discoverSource":27},76062,"ai-engineer-roadmap","v9ai\u002Fai-engineer-roadmap","v9ai",null,"https:\u002F\u002Fai-engineer-roadmap.xyz","TypeScript",176,5,101,0,2,52,1,42.53,false,"main",true,[],"2026-06-12 04:01:20","\u003Cdiv align=\"center\">\n\n\u003Cbr \u002F>\n\n# 🧠 &nbsp; AI Engineering\n\n### From Zero to Production AI Engineer\n\n**A hands-on learning platform that takes engineers from transformer internals to shipping production AI systems.**\n\n108 deeply-researched lessons across 15 categories — RAG, agents, evals, fine-tuning, prompting — wired together with semantic search, AI audio narration, an interactive knowledge graph, a RAG tutor, and per-learner mastery analytics.\n\n\u003Cbr 
\u002F>\n\n[![Next.js](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FNext.js-15-000000?style=for-the-badge&logo=next.js&logoColor=white)](https:\u002F\u002Fnextjs.org\u002F)\n[![TypeScript](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTypeScript-strict-3178C6?style=for-the-badge&logo=typescript&logoColor=white)](https:\u002F\u002Fwww.typescriptlang.org\u002F)\n[![LangGraph](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLangGraph-5_graphs-1C3C3C?style=for-the-badge&logo=langchain&logoColor=white)](https:\u002F\u002Flangchain-ai.github.io\u002Flanggraph\u002F)\n[![Postgres](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FNeon-pgvector-336791?style=for-the-badge&logo=postgresql&logoColor=white)](https:\u002F\u002Fneon.tech\u002F)\n[![Vercel](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FVercel-deployed-000000?style=for-the-badge&logo=vercel&logoColor=white)](https:\u002F\u002Fvercel.com\u002F)\n\n\u003Cbr \u002F>\n\n**[🚀 Quick Start](#-quick-start)** &nbsp;·&nbsp; **[✨ Features](#-features)** &nbsp;·&nbsp; **[🧱 Stack](#-stack)** &nbsp;·&nbsp; **[🏗 Architecture](#-architecture)** &nbsp;·&nbsp; **[🛠 Dev](#-dev)**\n\n\u003Csub>**108** lessons &nbsp;·&nbsp; **15** categories &nbsp;·&nbsp; **5** LangGraph graphs &nbsp;·&nbsp; **22** DB tables &nbsp;·&nbsp; **10** course evaluators\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n## ✨ Features\n\n| | Feature | What it does |\n|:--:|---|---|\n| 📚 | **108 lessons, 15 categories** | Curriculum from transformer internals → RAG → agents → evals → production, prerequisite-ordered. |\n| 🔎 | **Semantic + full-text search** | `Cmd+K` instant search over Postgres FTS *and* pgvector cosine similarity. |\n| 🎧 | **Audio narration** | TTS audio per lesson (Rust pipeline → Cloudflare R2) with per-user resume positions in D1. |\n| 🕸️ | **Knowledge graph** | Concepts linked by `prerequisite` \u002F `builds_on` \u002F `related` edges, rendered as an explorable graph. 
|\n| 🤖 | **AI tutor chat** | RAG chat grounded in lesson content, with intent routing and checkpointed threads. |\n| 📈 | **Mastery analytics** | Bayesian Knowledge Tracing (`mastery \u002F transit \u002F slip \u002F guess`) per learner per lesson. |\n| ✍️ | **Self-authoring** | A 5-pass LangGraph writer (research → outline → draft → review → revise) generates new lessons behind a quality gate. |\n| 🎓 | **Course reviewer** | 10 expert evaluators score & rank external AI courses concurrently. |\n\n## 🚀 Quick Start\n\n> **Prerequisites:** Node 22.x · pnpm · a Neon Postgres database\n\n```bash\npnpm install\ncp .env.example .env.local   # fill in the keys (see Environment below)\n\npnpm db:push                 # sync schema to Neon\npnpm seed                    # seed 108 lessons from content\u002F*.md\npnpm dev                     # → http:\u002F\u002Flocalhost:3006  🎉\n```\n\nThat's the full app — no separate backend server. Chat calls DeepSeek directly from the Next route (set `DEEPSEEK_API_KEY`); article \u002F prep \u002F flashcard \u002F course-review generation are offline Rust bins (see [AI generation](#ai-generation-rust-bins)).\n\n## 🧱 Stack\n\n| Layer | Technology |\n|---|---|\n| **Framework** | Next.js 15 (App Router, Turbopack) |\n| **Database** | Neon PostgreSQL + pgvector, Drizzle ORM |\n| **UI** | Radix UI Themes |\n| **AI \u002F LLM** | OpenAI · DeepSeek |\n| **AI backend** | Rust `axum` LangGraph service (`crates\u002Fml\u002Fserver`) — 6 graphs (`chat`, `app_prep`, `memorize_generate`, `article_generate`, `course_review`, `fetch_courses`); `chat` does SQLite + LanceDB RAG, the rest are stateless LLM orchestration |\n| **Storage** | SQLite (`data\u002Fknowledge.db` content, `data\u002Fcourses.db` courses) + LanceDB (vectors) · Cloudflare R2 (audio) · D1 (per-user playback state) |\n| **Deployment** | Vercel (frontend) + the Rust `knowledge-server` binary (backend) |\n\n## 🏗 Architecture\n\n```mermaid\ngraph TD\n    Browser --> Next[\"Next.js on 
Vercel\u003Cbr\u002F>pages · API routes · server actions\"]\n    Next --> Adapter[\"data.ts adapter\"]\n    Adapter -->|\"DATA_SOURCE=db\"| DB[(\"Neon Postgres\u003Cbr\u002F>+ pgvector + checkpoints\")]\n    Adapter -->|\"DATA_SOURCE=fs\"| FS[\"content\u002F*.md\"]\n    Next -->|\"BACKEND_URL + bearer\"| Rust[\"Rust knowledge-server :7860\u003Cbr\u002F>6 LangGraph graphs\"]\n    Rust --> DeepSeek[\"DeepSeek API\"]\n    Rust --> SQLite[(\"SQLite + LanceDB\u003Cbr\u002F>knowledge.db · courses.db\")]\n    Next --> R2[\"Cloudflare R2\u003Cbr\u002F>audio files\"]\n    Next --> D1[\"Cloudflare D1\u003Cbr\u002F>audio progress\"]\n```\n\n**Request paths:** lesson pages read through `data.ts` (DB or filesystem) and pull related lessons via pgvector cosine similarity. Chat does FTS + vector retrieval in Next.js, then POSTs snippets + history to the Rust `knowledge-server`, which merges them with its own SQLite + LanceDB retrieval and calls DeepSeek (stateless — history is supplied by the caller).\n\n## 🔀 LangGraph Pipelines\n\n- **Content generation** (`article_generate`) — research → outline → draft → review → revise, with a conditional revision loop (max 2) gated on word count, code blocks, cross-refs, ≥5 xyflow diagrams, and mandatory sections. 
Run via the `gen-article` Rust bin (`pnpm generate:rust`) or over `\u002Fruns\u002Fwait`.\n- **RAG chat** (`chat`) — SQLite lexical + LanceDB vector retrieval merged with caller snippets → format context → one DeepSeek call.\n- **Course review** (`course_review`) — 10 expert evaluators run concurrently, then a weighted aggregator computes score + verdict.\n\n## 🗂 Project Layout\n\n```\napp\u002F                  Next.js App Router (lessons, AWS hub, applications, coursework, problems, api\u002F*)\ncomponents\u002F           React components (search, audio-player, toc, …)\ncontent\u002F              Markdown lesson files\nsrc\u002Fdb\u002F               Neon client + Drizzle schema (22 tables)\nsrc\u002Flib\u002F              backend-client (typed POST \u002Fruns\u002Fwait)\nlib\u002F                  data.ts adapter, db queries, r2.ts, d1.ts, server actions\ncrates\u002Fml\u002F            Rust workspace — knowledge-server (6 graphs, gen-article,\n                      seed-topic-courses), core (seed\u002Fexport), audio-guide\nscripts\u002F              seed, scrape, review-courses, e2e\nsql\u002F · migrations\u002F    Neon setup + D1 migrations\n```\n\n## 🛠 Dev\n\n```bash\npnpm dev                       # start on :3006\npnpm db:push \u002F db:studio       # sync schema \u002F open Drizzle Studio\npnpm seed \u002F seed:courses       # seed lessons \u002F Udemy catalog\npnpm scrape:udemy              # scrape AI\u002FML Udemy topics → external_courses\n\npnpm generate \u003Cslug>           # generate a lesson via LangGraph\npnpm generate:dry \u003Cslug>       # preview without saving\npnpm generate:batch            # generate all missing lessons\npnpm review:courses            # batch-review unreviewed courses\n\npnpm backend:rust              # run knowledge-server on :7860 (Rust)\npnpm backend:rust:index        # (re)build the LanceDB section index\npnpm generate:rust \u003Cargs>      # gen-article bin (research→…→finalize)\npnpm seed:courses \u003Cargs>       # 
seed-topic-courses bin → data\u002Fcourses.db\npnpm audio:meta \u003Cargs>         # markdown → AudioMeta JSON (deterministic)\n\npnpm backend:test              # cargo test (knowledge-server + audio-guide)\npnpm test:e2e                  # smoke the running server\n```\n\n### AI generation (Rust bins)\n\nThere is **no long-running backend server**. Chat is a direct DeepSeek call in\n`app\u002Fapi\u002Fchat\u002Froute.ts` (via `lib\u002Fchat-llm.ts` — 1:1 port of the old Rust\n`chat` prompt; needs `DEEPSEEK_API_KEY` in the Next env \u002F Vercel). All other AI\ngeneration runs offline as one-shot `aer-ml` bins that write Neon \u002F SQLite:\n\n```bash\npnpm prep:loop -- --slug \u003Cslug>     # interview prep (deepseek-loop agent → Neon)\npnpm prep:rust  \u002F prep:owner:rust   # prep variants (static artifact \u002F owner)\npnpm prep:memorize -- --slug \u003Cslug> # flashcards → concepts + applications row\npnpm review:courses [--dry-run]     # course_review → data\u002Fcourses.db\npnpm generate:rust -- --slug \u003Cs>    # article generation\n```\n\nEnv for the bins: `DEEPSEEK_API_KEY` (monorepo-root `.env`) and, for the ones\nthat persist, `DATABASE_URL` (auto-exported by the npm scripts).\nCourse data is scraped\u002Freviewed into `data\u002Fcourses.db` and surfaced to the\nfrontend as JSON via `pnpm export:content` (Rust → `data\u002Fcontent\u002F*.json`).\n\n### Local prep generation (DB-backed)\n\nThe application **\u002Fprep** page reads `interviewQuestions` from Neon. In prod\nthat DB-backed path is dormant (no Rust backend is deployed; the public page\nfalls back to the committed `data\u002Fapp-prep\u002F\u003Cslug>.json` seed). 
To regenerate\n*real* prep and push it into the row — which, since `DATABASE_URL` is the\nshared Neon, also updates the live owner view — one **full-Rust** command does\ngenerate → validate → persist:\n\n```bash\npnpm prep:loop                 # ECB SSM Cockpit Developer (default slug)\npnpm prep:loop -- --slug \u003Cslug>            # another application\npnpm prep:loop -- --slug \u003Cslug> --no-db    # regenerate + validate only\n```\n\n`prep:loop` runs the `gen-app-prep-loop` Rust bin (`crates\u002Fml`). It: (1) runs\na `deepseek-loop` agent **in-process** (`deepseek::run`, only the builtin\n`Read`\u002F`Write` tools, `AcceptEdits`) that reads the job description from\n`data\u002Fapp-prep\u002F\u003Cslug>.json` and rewrites the artifact's prep fields; (2)\nvalidates the result in Rust — exactly 4 `## ` sections, `techStack` a JSON\nstring whose every `category` ∈ `app_prep::CATEGORIES`, valid `relevance`; (3)\nconnects to Neon over Postgres (`sqlx`) and updates the `applications` row\n(`interviewQuestions`\u002F`techStack`, plus `jobDescription` backfill if the\nrow had none), then reads back to confirm. It mutates the **live** row — not a\ndry run; the agent step and the validation gate must both pass first, so Neon\nis untouched on failure.\n\nThe npm script exports `DEEPSEEK_API_KEY` from the monorepo-root\n`\u002FUsers\u002Fvadimnicolai\u002FPublic\u002Fai-apps\u002F.env` and `DATABASE_URL` from `.env.local`\n(the Rust process does **not** load `.env*`); `>1` rows for a slug →\n`--user-id \u003Cid>`. (`prep:rust` — the `gen-app-prep` bin, static artifact only —\nand `backend:rust:local` remain valid alternatives. 
`pnpm test:app-prep` still\nexists as the seed-loader \u002F public-render contract test.)\n\n### Environment\n\n```env\nDATABASE_URL=             # Neon connection string\nOPENAI_API_KEY=\nDEEPSEEK_API_KEY=         # required by the Next runtime for \u002Fapi\u002Fchat + the AI bins\nNEXT_PUBLIC_DATA_SOURCE=  # \"db\" | \"fs\"\nNEXT_PUBLIC_R2_DOMAIN=    # audio CDN domain\nR2_ACCOUNT_ID= R2_ACCESS_KEY_ID= R2_SECRET_ACCESS_KEY= R2_BUCKET_NAME=\nCLOUDFLARE_ACCOUNT_ID= CLOUDFLARE_AUDIO_D1_ID= CLOUDFLARE_D1_API_TOKEN=\n```\n\n---\n\n\u003Cdiv align=\"center\">\n\n### Built with Next.js 15 · LangGraph · Neon pgvector · Cloudflare\n\n\u003Csub>From zero to production AI engineer — one lesson at a time.\u003C\u002Fsub>\n\n[⬆ Back to top](#--ai-engineering)\n\n\u003C\u002Fdiv>\n","ai-engineer-roadmap is a hands-on learning platform for AI engineers that takes them from zero to building production-grade AI systems. The project is written in TypeScript on the Next.js framework and combines semantic search, AI audio narration, an interactive knowledge graph, a RAG tutor, and personalized learning analytics. It suits engineers who want to understand and practice the AI engineering workflow in depth, covering everything from transformer internals to retrieval-augmented generation (RAG), agents, evaluation, and production deployment. Through 108 carefully designed learning modules, users can systematically build their own AI knowledge base and tech stack.","2026-06-11 03:54:21","CREATED_QUERY"]