[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-71383":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":15,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":45,"readmeContent":46,"aiSummary":47,"trendingCount":16,"starSnapshotCount":16,"syncStatus":48,"lastSyncTime":49,"discoverSource":50},71383,"Awesome-Prompt-Engineering","promptslab\u002FAwesome-Prompt-Engineering","promptslab","This repository contains a hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM etc ","https:\u002F\u002Fdiscord.gg\u002Fm88xfYMbK6",null,"TypeScript",6018,696,93,7,0,28,119,21,102.53,"Apache License 2.0",false,"main",true,[26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44],"chatgpt","chatgpt-api","deep-learning","few-shot-learning","gpt","gpt-3","machine-learning","openai","prompt","prompt-based-learning","prompt-engineering","prompt-generator","prompt-learning","prompt-toolkit","prompt-tuning","promptengineering","text-to-image","text-to-speech","text-to-video","2026-06-12 04:01:00","\u003Ch2 align=\"center\">Awesome Prompt Engineering 🧙‍♂️\u003C\u002Fh2>\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"650\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fpromptslab\u002FAwesome-Prompt-Engineering\u002Fmain\u002F_source\u002Fprompt.png\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  A hand-curated collection of resources for Prompt Engineering and Context Engineering — covering papers, tools, models, APIs, benchmarks, courses, and communities for working with Large 
Language Models.\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\nhttps:\u002F\u002Fpromptslab.github.io\n  \u003C\u002Fp>\n \u003Ch4 align=\"center\">\n  \n  ```\n     Master Prompt Engineering. Join the Course at https:\u002F\u002Fpromptslab.github.io\n  ```\n  \u003Ca href=\"https:\u002F\u002Fawesome.re\">\u003Cimg src=\"https:\u002F\u002Fawesome.re\u002Fbadge.svg\" alt=\"Awesome\" \u002F>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fpromptslab\u002FAwesome-Prompt-Engineering\u002Fblob\u002Fmain\u002FLICENSE\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache_2.0-blue.svg\" alt=\"License\" \u002F>\u003C\u002Fa>\n  \u003Ca href=\"http:\u002F\u002Fmakeapullrequest.com\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-welcome-brightgreen.svg?style=flat-square\" alt=\"PRs Welcome\" \u002F>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002Fm88xfYMbK6\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-Community-orange\" alt=\"Community\" \u002F>\u003C\u002Fa>\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLast%20Updated-February%202026-brightgreen\" alt=\"Last Updated\" \u002F>\n\u003C\u002Fp>\n\n---\n\n## 🚀 Start Here\n\nNew to prompt engineering? Follow this path:\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"1000\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fpromptslab\u002FAwesome-Prompt-Engineering\u002Frefs\u002Fheads\u002Fmain\u002F_source\u002Fmain.jpg\">\n\u003C\u002Fp>\n\n1. **Learn the basics** → [ChatGPT Prompt Engineering for Developers](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fchatgpt-prompt-engineering-for-developers\u002F) (free, ~90 min)\n2. **Read the guide** → [Prompt Engineering Guide by DAIR.AI](https:\u002F\u002Fwww.promptingguide.ai\u002F) (open-source, comprehensive)\n3. 
**Study provider docs** → [OpenAI Prompt Engineering Guide](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Fprompt-engineering) · [Anthropic Prompt Engineering Guide](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fprompt-engineering\u002Foverview)\n4. **Understand where the field is heading** → [Anthropic: Effective Context Engineering for AI Agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Feffective-context-engineering-for-ai-agents)\n5. **Read the research** → [The Prompt Report](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.06608) — taxonomy of 58+ prompting techniques from 1,500+ papers\n\n---\n\n## Table of Contents\n\n- [Papers](#papers)\n  - [Major Surveys](#major-surveys)\n  - [Prompt Optimization and Automatic Prompting](#prompt-optimization-and-automatic-prompting)\n  - [Prompt Compression](#prompt-compression)\n  - [Reasoning Advances](#reasoning-advances)\n  - [In-Context Learning](#in-context-learning)\n  - [Agentic Prompting and Multi-Agent Systems](#agentic-prompting-and-multi-agent-systems)\n  - [Multimodal Prompting](#multimodal-prompting)\n  - [Structured Output and Format Control](#structured-output-and-format-control)\n  - [Prompt Injection and Security](#prompt-injection-and-security)\n  - [Applications of Prompt Engineering](#applications-of-prompt-engineering)\n  - [Text-to-Image Generation](#text-to-image-generation)\n  - [Text-to-Music\u002FAudio Generation](#text-to-musicaudio-generation)\n  - [Foundational Papers (Pre-2024)](#foundational-papers-pre-2024)\n- [Tools and Code](#tools-and-code)\n  - [Prompt Management and Testing](#prompt-management-and-testing)\n  - [LLM Evaluation Tools](#llm-evaluation-tools)\n  - [Agent Frameworks](#agent-frameworks)\n  - [Prompt Optimization Tools](#prompt-optimization-tools)\n  - [Red Teaming and Prompt Security](#red-teaming-and-prompt-security)\n  - [MCP (Model Context Protocol)](#mcp-model-context-protocol)\n  - [Vibe Coding and 
AI Coding Assistants](#vibe-coding-and-ai-coding-assistants)\n    - [CLI-Based Coding Agents](#cli-based-coding-agents)\n    - [AI Code Editors \u002F IDEs](#ai-code-editors--ides)\n    - [IDE Extensions \u002F Plugins](#ide-extensions--plugins)\n    - [AI Coding Platforms \u002F Cloud Agents](#ai-coding-platforms--cloud-agents)\n    - [Open-Source Coding Agent Frameworks](#open-source-coding-agent-frameworks)\n  - [Other Notable Repositories](#other-notable-repositories)\n- [APIs](#apis)\n- [Datasets and Benchmarks](#datasets-and-benchmarks)\n- [Models](#models)\n- [AI Content Detectors](#ai-content-detectors)\n- [Books](#books)\n- [Courses](#courses)\n- [Tutorials and Guides](#tutorials-and-guides)\n- [Videos](#videos)\n- [Communities](#communities)\n- [Autonomous Research & Self-Improving Agents](#autonomous-research--self-improving-agents)\n- [How to Contribute](#how-to-contribute)\n\n---\n\n## Papers\n📄\n\n### Major Surveys\n\n- [The Prompt Report: A Systematic Survey of Prompting Techniques](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.06608) [2024] — Most comprehensive survey: taxonomy of 58 text and 40 multimodal prompting techniques from 1,500+ papers. 
Co-authored with OpenAI, Microsoft, Google, Stanford.\n- [A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.07927) [2024] — 44 techniques across application areas with per-task performance summaries.\n- [A Survey of Prompt Engineering Methods in LLMs for Different NLP Tasks](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.12994) [2024] — 39 prompting methods across 29 NLP tasks.\n- [A Survey of Automatic Prompt Engineering: An Optimization Perspective](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.11560) [2025] — Formalizes auto-PE methods as discrete\u002Fcontinuous\u002Fhybrid optimization problems.\n- [Efficient Prompting Methods for Large Language Models: A Survey](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.01077) [2024] — Survey of efficiency-oriented prompting (compression, optimization, APE) for reducing compute and latency.\n- [Navigate through Enigmatic Labyrinth: A Survey of Chain of Thought Reasoning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.15402) [2023, ACL 2024] — Systematic CoT survey.\n- [Demystifying Chains, Trees, and Graphs of Thoughts](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.14295) [2024] — Unified framework for multi-prompt reasoning topologies.\n- [Towards Goal-oriented Prompt Engineering for Large Language Models: A Survey](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.14043) [2024] — Focuses on prompts designed around explicit task goals.\n- [Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning LLMs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.09567) [2025] — Distinguishes Long CoT from Short CoT in o1\u002FR1-era models.\n\n### Prompt Optimization and Automatic Prompting\n\n- [OPRO: Large Language Models as Optimizers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.03409) [2023, ICLR 2024] — Uses LLMs as optimizers via meta-prompts; optimized prompts outperform human-designed ones by up to 50% on BBH.\n- 
[DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.03714) [2023, ICLR 2024] — Framework for programming (not prompting) LLMs with automatic prompt optimization.\n- [MIPRO: Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.11695) [2024, EMNLP 2024] — Bayesian optimization for multi-stage LM programs; up to 13% accuracy gains.\n- [TextGrad: Automatic \"Differentiation\" via Text](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.07496) [2024] — Treats compound AI systems as computation graphs with textual feedback as gradients. Published in Nature.\n- [EvoPrompt](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.08532) [2023, ACL 2024] — Evolutionary algorithm approach for automatically optimizing discrete prompts.\n- [Meta Prompting for AI Systems](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.11482) [2023, ICLR 2024 Workshop] — Example-agnostic structural templates formalized using category theory.\n- [Prompt Engineering a Prompt Engineer (PE²)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.05661) [2024, ACL Findings] — Uses LLMs to meta-prompt themselves, refining prompts with step-by-step templates to significantly improve reasoning.\n- [Large Language Models Are Human-Level Prompt Engineers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.01910) [2022] — Automatic prompt generation via APE.\n- [Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.03668) [2023]\n- [SPO: Self-Supervised Prompt Optimization](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.06855) [2025] — Competitive performance at 1–6% of the cost of prior methods.\n\n### Prompt Compression\n\n- [LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.12968) [2024, ACL 2024] — 3x–6x faster than 
LLMLingua with GPT-4 data distillation.\n- [LongLLMLingua](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.06839) [2023, ACL 2024] — Question-aware compression for long contexts; 21.4% performance boost with 4x fewer tokens.\n- [Prompt Compression for Large Language Models: A Survey](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.12388) [2024] — Comprehensive survey of hard and soft prompt compression methods.\n\n### Reasoning Advances\n\n- [Scaling LLM Test-Time Compute Optimally](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.03314) [2024] — Shows optimal test-time compute allocation can outperform 14x larger models.\n- [DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.12948) [2025] — Pure RL-trained reasoning model matching o1; open-source with distilled variants.\n- [s1: Simple Test-Time Scaling](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.19393) [2025] — SFT on just 1,000 examples creates competitive reasoning model via \"budget forcing.\"\n- [Reasoning Language Models: A Blueprint](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.11223) [2025] — Systematic framework organizing reasoning LM approaches.\n- [Demystifying Long Chain-of-Thought Reasoning in LLMs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.03373) [2025] — Analyzes long CoT behavior in modern reasoning models.\n- [Graph of Thoughts: Solving Elaborate Problems with LLMs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.09687) [2023, AAAI 2024] — Models thoughts as arbitrary graphs; 62% quality improvement over ToT on sorting.\n- [Tree of Thoughts: Deliberate Problem Solving with LLMs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.10601) [2023, NeurIPS 2023] — Tree search over reasoning paths.\n- [Everything of Thoughts](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.04254) [2023] — Integrates CoT, ToT, and external solvers via MCTS.\n- [Skeleton-of-Thought](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.15337) [2023] 
— Parallel decoding via answer skeleton generation for up to 2.69x speedup.\n- [Chain of Thought Prompting Elicits Reasoning in Large Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2201.11903) [2022] — The foundational CoT paper.\n- [Self-Consistency Improves Chain of Thought Reasoning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.11171) [2022] — Aggregating multiple CoT outputs for reliability.\n- [Large Language Models are Zero-Shot Reasoners](https:\u002F\u002Farxiv.org\u002Fabs\u002F2205.11916) [2022] — \"Let's think step by step\" as a zero-shot reasoning trigger.\n- [ReAct: Synergizing Reasoning and Acting in Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2210.03629) [2022] — Interleaving reasoning and tool use.\n\n### In-Context Learning\n\n- [Many-Shot In-Context Learning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.11018) [2024, NeurIPS 2024 Spotlight] — Significant gains scaling ICL to hundreds\u002Fthousands of examples; introduces Reinforced and Unsupervised ICL.\n- [Many-Shot In-Context Learning in Multimodal Foundation Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.09798) [2024] — Scales multimodal ICL to ~2,000 examples across 14 datasets.\n- [Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?](https:\u002F\u002Farxiv.org\u002Fabs\u002F2202.12837) [2022]\n- [Fantastically Ordered Prompts and Where to Find Them](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.08786) [2021] — Overcoming few-shot prompt order sensitivity.\n- [Calibrate Before Use: Improving Few-Shot Performance of Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2102.09690) [2021]\n\n### Agentic Prompting and Multi-Agent Systems\n\n- [Agentic Large Language Models: A Survey](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.23037) [2025] — Comprehensive survey organizing agentic LLMs by reasoning, acting, and interacting capabilities.\n- [Large Language Model based Multi-Agents: A Survey of Progress and 
Challenges](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.01680) [2024] — Covers profiling, communication, and growth mechanisms.\n- [Multi-Agent Collaboration Mechanisms: A Survey of LLMs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.06322) [2025] — Reviews debate and cooperation strategies in LLM-based multi-agent systems.\n- [AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.08155) [2023] — Microsoft's foundational multi-agent framework paper.\n- [ToolLLM: Facilitating Large Language Models to Master 16000+ Real-World APIs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.16789) [2023, ICLR 2024] — Trains LLMs to use massive real-world API collections.\n- [SWE-bench: Can Language Models Resolve Real-World GitHub Issues?](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.06770) [2023, ICLR 2024] — The benchmark driving agentic coding progress.\n- [AgentBench: Evaluating LLMs as Agents](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.03688) [2023, ICLR 2024] — Benchmark across 8 environments.\n- [PAL: Program-aided Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.10435) [2023] — Offloading computation to code interpreters.\n\n### Multimodal Prompting\n\n- [Visual Prompting in Multimodal Large Language Models: A Survey](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.15310) [2024] — First comprehensive survey on visual prompting methods in MLLMs.\n- [Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.11441) [2023] — Visual markers dramatically improve visual grounding.\n- [A Comprehensive Survey and Guide to Multimodal Large Language Models in Vision-Language Tasks](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.06284) [2024] — Covers text, image, video, audio MLLMs.\n- [Multimodal Chain-of-Thought Reasoning in Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.00923) [2023]\n- [From Prompt Engineering 
to Prompt Craft](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.13422) [2024] — Design-research view of prompt \"craft\" for diffusion models.\n\n### Structured Output and Format Control\n\n- [Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of LLMs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.02442) [2024] — Examines how constraining outputs to structured formats impacts reasoning performance.\n- [Batch Prompting: Efficient Inference with LLM APIs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.08721) [2023]\n- [Structured Prompting: Scaling In-Context Learning to 1,000 Examples](https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.06713) [2022]\n\n### Prompt Injection and Security\n\n- [Formalizing and Benchmarking Prompt Injection Attacks and Defenses](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.12815) [2023, USENIX Security 2024] — Formal framework with systematic evaluation of 5 attacks and 10 defenses across 10 LLMs.\n- [The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.13208) [2024] — OpenAI's priority-level training for injection defense.\n- [AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.13352) [2024] — Realistic agent scenario benchmark.\n- [InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated LLM Agents](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.02691) [2024]\n- [SecAlign: Defending Against Prompt Injection with Preference Optimization](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.05451) [2024] — DPO-based defense.\n- [WASP: Benchmarking Web Agent Security Against Prompt Injection](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.18575) [2025] — Security benchmark for web\u002Fcomputer-use agents.\n- [Many-Shot Jailbreaking](https:\u002F\u002Fwww.anthropic.com\u002Fresearch\u002Fmany-shot-jailbreaking) [2024] — Scaling harmful examples in 
long-context windows enables jailbreaking (Anthropic Technical Report).\n- [Constitutional AI: Harmlessness from AI Feedback](https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.08073) [2022]\n- [Ignore Previous Prompt: Attack Techniques For Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.09527) [2022]\n- [Artificial Intelligence and Cybersecurity: Documented Risks, Enterprise Guardrails, and Emerging Threats in 2024–2025](https:\u002F\u002Fwww.ijfmr.com\u002Fresearch-paper.php?id=62200) [2025] — Survey of real prompt-injection incidents with practical governance prompt patterns.\n\n### Applications of Prompt Engineering\n\n- [Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.04205) [2023]\n- [Legal Prompt Engineering for Multilingual Legal Judgement Prediction](https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.02199) [2023]\n- [Conversing with Copilot: Exploring Prompt Engineering for Solving CS1 Problems](https:\u002F\u002Farxiv.org\u002Fabs\u002F2210.15157) [2022]\n- [Commonsense-Aware Prompting for Controllable Empathetic Dialogue Generation](https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.01441) [2023]\n- [PLACES: Prompting Language Models for Social Conversation Synthesis](https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.03269) [2023]\n- [Medical Image Segmentation Using Transformer Encoders and Prompt-Based Learning: A Systematic Review](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F11313186\u002F) [2025]\n- [TableRAG: A Retrieval Augmented Generation Framework for Heterogeneous Document Reasoning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.10380) [2025] — SQL-based interface preserving tabular structure for multi-hop queries.\n\n### Text-to-Image Generation\n\n- [A Taxonomy of Prompt Modifiers for Text-To-Image Generation](https:\u002F\u002Farxiv.org\u002Fabs\u002F2204.13988) [2022]\n- [Design Guidelines for Prompt Engineering Text-to-Image Generative 
Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.06977) [2021]\n- [High-Resolution Image Synthesis with Latent Diffusion Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2112.10752) [2021]\n- [DALL·E: Creating Images from Text](https:\u002F\u002Farxiv.org\u002Fabs\u002F2102.12092) [2021]\n- [Investigating Prompt Engineering in Diffusion Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.15462) [2022]\n\n### Text-to-Music\u002FAudio Generation\n\n- [MusicLM: Generating Music From Text](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.11325) [2023]\n- [ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2302.04456) [2023]\n- [AudioLM: A Language Modeling Approach to Audio Generation](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2209.03143) [2023]\n- [Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2301.12661.pdf) [2023]\n\n### Foundational Papers (Pre-2024)\n\nThese papers established the core concepts that modern prompt engineering builds on:\n\n- [Language Models are Few-Shot Learners (GPT-3)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2005.14165) [2020] — Demonstrated few-shot prompting at scale.\n- [Prefix-Tuning: Optimizing Continuous Prompts for Generation](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.00190) [2021]\n- [The Power of Scale for Parameter-Efficient Prompt Tuning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.08691) [2021]\n- [Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm](https:\u002F\u002Farxiv.org\u002Fabs\u002F2102.07350) [2021]\n- [Show Your Work: Scratchpads for Intermediate Computation with Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2112.00114) [2021]\n- [Generated Knowledge Prompting for Commonsense Reasoning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2110.08387) [2021]\n- [Making Pre-trained Language Models Better Few-shot 
Learners](https:\u002F\u002Faclanthology.org\u002F2021.acl-long.295) [2021]\n- [AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.15980) [2020]\n- [How Can We Know What Language Models Know?](https:\u002F\u002Fdirect.mit.edu\u002Ftacl\u002Farticle\u002Fdoi\u002F10.1162\u002Ftacl_a_00324\u002F96460\u002F) [2020]\n- [A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT](https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.11382) [2023]\n- [Synthetic Prompting: Generating Chain-of-Thought Demonstrations for LLMs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.00618) [2023]\n- [Progressive Prompts: Continual Learning for Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.12314) [2023]\n- [Successive Prompting for Decomposing Complex Questions](https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.04092) [2022]\n- [Decomposed Prompting: A Modular Approach for Solving Complex Tasks](https:\u002F\u002Farxiv.org\u002Fabs\u002F2210.02406) [2022]\n- [PromptChainer: Chaining Large Language Model Prompts through Visual Programming](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.06566) [2022]\n- [Ask Me Anything: A Simple Strategy for Prompting Language Models](https:\u002F\u002Fpaperswithcode.com\u002Fpaper\u002Fask-me-anything-a-simple-strategy-for) [2022]\n- [Prompting GPT-3 To Be Reliable](https:\u002F\u002Farxiv.org\u002Fabs\u002F2210.09150) [2022]\n- [On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.08061) [2022]\n\n---\n\n## Tools and Code\n🔧\n\n### Prompt Management and Testing\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **Promptfoo** | Open-source CLI for testing, evaluating, and red-teaming LLM prompts. YAML configs, CI\u002FCD integration, adversarial testing. 
~9K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fpromptfoo\u002Fpromptfoo) |\n| **Promptify** | Solve NLP problems with LLMs and easily generate prompts for different NLP tasks for popular generative models such as GPT and PaLM. | [GitHub](https:\u002F\u002Fgithub.com\u002Fpromptslab\u002FPromptify) |\n| **Agenta** | Open-source LLM developer platform for prompt management, evaluation, human feedback, and deployment. | [GitHub](https:\u002F\u002Fgithub.com\u002FAgenta-AI\u002Fagenta) |\n| **PromptLayer** | Version, test, and monitor every prompt and agent with robust evals, tracing, and regression sets. | [Website](https:\u002F\u002Fpromptlayer.com\u002F) |\n| **Helicone** | Production prompt monitoring and optimization platform. | [Website](https:\u002F\u002Fhelicone.ai\u002F) |\n| **LangGPT** | Framework for structured and meta-prompt design. 10K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Flanggpt\u002FLangGPT) |\n| **ChainForge** | Visual toolkit for building, testing, and comparing LLM prompt responses without code. | [GitHub](https:\u002F\u002Fgithub.com\u002Fianarawjo\u002FChainForge) |\n| **LMQL** | A query language for LLMs making complex prompt logic programmable. | [GitHub](https:\u002F\u002Fgithub.com\u002Feth-sri\u002Flmql) |\n| **Promptotype** | Platform for developing, testing, and managing structured LLM prompts. | [Website](https:\u002F\u002Fwww.promptotype.io) |\n| **PromptPanda** | AI-powered prompt management system for streamlining prompt workflows. | [Website](https:\u002F\u002Fpromptpanda.io) |\n| **Promptimize AI** | Browser extension to automatically improve user prompts for any AI model. | [Website](https:\u002F\u002Fpromptimize.ai) |\n| **PROMPTMETHEUS** | Web-based \"Prompt Engineering IDE\" for iteratively creating and running prompts. | [Website](https:\u002F\u002Fpromptmetheus.com) |\n| **Better Prompt** | Test suite for LLM prompts before pushing to production. 
| [GitHub](https:\u002F\u002Fgithub.com\u002Fkrrishdholakia\u002Fbetterprompt) |\n| **OpenPrompt** | Open-source framework for prompt-learning research. | [GitHub](https:\u002F\u002Fgithub.com\u002Fthunlp\u002FOpenPrompt) |\n| **Prompt Source** | Toolkit for creating, sharing, and using natural language prompts. | [GitHub](https:\u002F\u002Fgithub.com\u002Fbigscience-workshop\u002Fpromptsource) |\n| **Prompt Engine** | NPM utility library for creating and maintaining prompts for LLMs (Microsoft). | [GitHub](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fprompt-engine) |\n| **PromptInject** | Framework for quantitative analysis of LLM robustness to adversarial prompt attacks. | [GitHub](https:\u002F\u002Fgithub.com\u002Fagencyenterprise\u002FPromptInject) |\n| **LynxPrompt** | Self-hostable platform for managing AI IDE config files (.cursorrules, CLAUDE.md, copilot-instructions.md). Web UI, REST API, CLI, and federated blueprint marketplace for 30+ AI coding assistants. | [GitHub](https:\u002F\u002Fgithub.com\u002FGeiserX\u002FLynxPrompt) |\n| **flompt** | Visual AI prompt builder that decomposes prompts into 12 semantic blocks (role, context, constraints, examples, etc.) and compiles them into optimized XML. Browser extension for ChatGPT\u002FClaude\u002FGemini, and MCP server for Claude Code agents. Free, open-source. | [Website](https:\u002F\u002Fflompt.dev) |\n\n### LLM Evaluation Tools\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **DeepEval** | Open-source evaluation framework covering RAG, agents, and conversations with CI\u002FCD integration. ~7K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepeval) |\n| **Ragas** | RAG evaluation with knowledge-graph-based test set generation and 30+ metrics. ~8K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fexplodinggradients\u002Fragas) |\n| **LangSmith** | LangChain's platform for debugging, testing, evaluating, and monitoring LLM applications. 
| [Website](https:\u002F\u002Fsmith.langchain.com\u002F) |\n| **Langfuse** | Open-source LLM observability with tracing, prompt management, and human annotation. ~7K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Flangfuse\u002Flangfuse) |\n| **Braintrust** | End-to-end AI evaluation platform, SOC2 Type II certified. | [Website](https:\u002F\u002Fwww.braintrust.dev\u002F) |\n| **Arize AI \u002F Phoenix** | Real-time LLM monitoring with drift detection and tracing. | [GitHub](https:\u002F\u002Fgithub.com\u002FArize-ai\u002Fphoenix) |\n| **TruLens** | Evaluating and explaining LLM apps; tracks hallucinations, relevance, groundedness. | [GitHub](https:\u002F\u002Fgithub.com\u002Ftruera\u002Ftrulens) |\n| **InspectAI** | Purpose-built for evaluating agents against benchmarks (UK AISI). | [GitHub](https:\u002F\u002Fgithub.com\u002FUKGovernmentBEIS\u002Finspect_ai) |\n| **Opik** | Evaluate, test, and ship LLM applications across dev and production lifecycles. | [GitHub](https:\u002F\u002Fgithub.com\u002Fcomet-ml\u002Fopik) |\n| **EvalView** | CLI tool for testing multi-step AI agents with YAML test cases, regression detection, and production monitoring. |[GitHub](https:\u002F\u002Fgithub.com\u002Fhidai25\u002Feval-view) |\n\n### Agent Frameworks\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **LangChain \u002F LangGraph** | Most widely adopted LLM app framework; LangGraph adds graph-based multi-step agent workflows. ~100K+ \u002F ~10K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain) · [LangGraph](https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flanggraph) |\n| **CrewAI** | Role-playing AI agent orchestration with 700+ integrations. ~44K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002FcrewAIInc\u002FcrewAI) |\n| **AutoGen (AG2)** | Microsoft's multi-agent conversational framework. 
~40K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fautogen) |\n| **DSPy** | Stanford's framework for programming LLMs with automatic prompt\u002Fweight optimization. ~22K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fstanfordnlp\u002Fdspy) |\n| **OpenAI Agents SDK** | Official agent framework with function calling, guardrails, and handoffs. ~10K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fopenai-agents-python) |\n| **Semantic Kernel** | Microsoft's AI framework powering M365 Copilot; C#, Python, Java. ~24K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fsemantic-kernel) |\n| **LlamaIndex** | Data framework for RAG and agent capabilities. ~40K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Frun-llama\u002Fllama_index) |\n| **Haystack** | Open-source NLP framework with pipeline architecture for RAG and agents. ~20K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fdeepset-ai\u002Fhaystack) |\n| **Agno (formerly Phidata)** | Python agent framework with microsecond instantiation. ~20K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fagno-agi\u002Fagno) |\n| **Smolagents** | Hugging Face's minimalist code-centric agent framework (~1000 LOC). ~15K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Fsmolagents) |\n| **Pydantic AI** | Type-safe agent framework using Pydantic for structured validation. ~8K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fpydantic\u002Fpydantic-ai) |\n| **Mastra** | TypeScript AI agent framework with assistants, RAG, and observability. ~20K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fmastra-ai\u002Fmastra) |\n| **Google ADK** | Agent Development Kit deeply integrated with Gemini and Google Cloud. | [GitHub](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Fadk-python) |\n| **Strands Agents (AWS)** | Model-agnostic framework with deep AWS integrations. 
| [GitHub](https:\u002F\u002Fgithub.com\u002Fstrands-agents\u002Fsdk-python) |\n| **Langflow** | Node-based visual agent builder with drag-and-drop. ~50K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Flangflow-ai\u002Flangflow) |\n| **n8n** | Workflow automation with AI agent capabilities and 400+ integrations. ~60K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fn8n-io\u002Fn8n) |\n| **Dify** | All-in-one backend for agentic workflows with tool-using agents and RAG. | [GitHub](https:\u002F\u002Fgithub.com\u002Flanggenius\u002Fdify) |\n| **PraisonAI** | Multi-AI Agents framework with 100+ LLM support, MCP integration, and built-in memory. | [GitHub](https:\u002F\u002Fgithub.com\u002FMervinPraison\u002FPraisonAI) |\n| **Neurolink** | Multi-provider AI agent framework unifying 12+ providers with workflow orchestration. | [GitHub](https:\u002F\u002Fgithub.com\u002Fjuspay\u002Fneurolink) |\n| **Composio** | Connect 100+ tools to AI agents with zero setup. | [GitHub](https:\u002F\u002Fgithub.com\u002Fcomposiohq\u002Fcomposio) |\n\n### Prompt Optimization Tools\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **DSPy** | Multiple optimizers (MIPROv2, BootstrapFewShot, COPRO) for automatic prompt tuning. ~22K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fstanfordnlp\u002Fdspy) |\n| **TextGrad** | Automatic differentiation via text (Stanford). ~2K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fzou-group\u002Ftextgrad) |\n| **OPRO** | Google DeepMind's optimization by prompting. | [GitHub](https:\u002F\u002Fgithub.com\u002Fgoogle-deepmind\u002Fopro) |\n\n### Red Teaming and Prompt Security\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **Garak (NVIDIA)** | LLM vulnerability scanner for hallucination, injection, and jailbreaks — the \"nmap for LLMs.\" ~3K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Fgarak) |\n| **PyRIT (Microsoft)** | Python Risk Identification Tool for automated red-teaming. 
~3K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002FAzure\u002FPyRIT) |\n| **DeepTeam** | 40+ vulnerabilities, 10+ attack methods, OWASP Top 10 support. | [GitHub](https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam) |\n| **LLM Guard** | Security toolkit for LLM I\u002FO validation. ~2K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fprotectai\u002Fllm-guard) |\n| **NeMo Guardrails (NVIDIA)** | Programmable guardrails for conversational systems. ~5K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FNeMo-Guardrails) |\n| **Guardrails AI** | Define strict output formats (JSON schemas) to ensure system reliability. | [Website](https:\u002F\u002Fwww.guardrailsai.com) |\n| **Lakera** | AI security platform for real-time prompt injection detection. | [Website](https:\u002F\u002Flakera.ai\u002F) |\n| **Purple Llama (Meta)** | Open-source LLM safety evaluation including CyberSecEval. | [GitHub](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002FPurpleLlama) |\n| **GPTFuzz** | Automated jailbreak template generation achieving >90% success rates. | [GitHub](https:\u002F\u002Fgithub.com\u002Fsherdencooper\u002FGPTFuzz) |\n| **Rebuff** | Open-source tool for detection and prevention of prompt injection. | [GitHub](https:\u002F\u002Fgithub.com\u002Fprotectai\u002Frebuff) |\n| **AgentSeal** | Open-source scanner that runs 150 attack probes to test AI agents for prompt injection and extraction vulnerabilities. | [GitHub](https:\u002F\u002Fgithub.com\u002Fagentseal\u002Fagentseal) |\n\n### MCP (Model Context Protocol)\n\nMCP is an open standard developed by Anthropic (Nov 2024, donated to the Linux Foundation Dec 2025) for connecting AI assistants to external data sources and tools through a standardized interface. It has **97M+ monthly SDK downloads** and has been adopted by GitHub, Google, and most major AI providers.\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **MCP Specification** | The core protocol specification and SDKs. 
~15K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fmodelcontextprotocol\u002Fmodelcontextprotocol) |\n| **MCP Reference Servers** | Official implementations: fetch, filesystem, GitHub, Slack, Postgres. | [GitHub](https:\u002F\u002Fgithub.com\u002Fmodelcontextprotocol\u002Fservers) |\n| **FastMCP (Python)** | High-level Pythonic framework for building MCP servers. ~5K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fjlowin\u002Ffastmcp) |\n| **GitHub MCP Server** | GitHub's official MCP server for repo, issue, PR, and Actions interaction. ~15K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fgithub\u002Fgithub-mcp-server) |\n| **Awesome MCP Servers** | Curated list of 10,000+ community MCP servers. ~30K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fpunkpeye\u002Fawesome-mcp-servers) |\n| **Context7** | MCP server providing version-specific documentation to reduce code hallucination. | [GitHub](https:\u002F\u002Fgithub.com\u002Fupstash\u002Fcontext7) |\n| **GitMCP** | Creates remote MCP servers for any GitHub repo by changing the domain. | [Website](https:\u002F\u002Fgitmcp.io\u002F) |\n| **MCP Inspector** | Visual testing tool for MCP server development. | [GitHub](https:\u002F\u002Fgithub.com\u002Fmodelcontextprotocol\u002Finspector) |\n\n### Vibe Coding and AI Coding Assistants\n\n> 🟢 = Open Source · 🔵 = Commercial · 🟣 = Open Source + Commercial (open core with paid cloud\u002FAPI)\n\n#### CLI-Based Coding Agents\n\nTerminal-native agentic tools that understand your codebase and execute multi-step tasks.\n\n| Name | Description | Type | Link |\n|:-----|:-----------|:----:|:----:|\n| **Claude Code** | Anthropic's agentic coding CLI; understands full codebases and executes complex multi-step tasks via natural language. | 🔵 | [Docs](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fclaude-code) |\n| **OpenAI Codex CLI** | Open-source terminal coding agent from OpenAI; lightweight, local-first, with sandboxed code execution. 
~68K+ ⭐ | 🟣 | [GitHub](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fcodex) |\n| **Gemini CLI** | Google's open-source terminal AI agent with 1M-token context window and Google Search grounding. ~96K+ ⭐ | 🟣 | [GitHub](https:\u002F\u002Fgithub.com\u002Fgoogle-gemini\u002Fgemini-cli) |\n| **Qwen Code** | Open-source terminal AI agent optimized for Qwen3-Coder; multi-protocol support (OpenAI\u002FAnthropic\u002FGemini APIs), 1,000 free requests\u002Fday. ~21K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002FQwenLM\u002Fqwen-code) |\n| **Aider** | AI pair programming in terminal with deep Git integration; maps entire codebases and auto-commits changes. ~42K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002FAider-AI\u002Faider) |\n| **OpenCode** | Powerful open-source AI coding agent with beautiful TUI; supports nearly all AI model providers. ~120K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fopencode-ai\u002Fopencode) |\n| **Goose** | Extensible open-source AI agent from Block (Square\u002FCash App); installs, executes, edits, and tests with any LLM. ~29K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fblock\u002Fgoose) |\n| **Crush** | Glamorous agentic coding agent from Charmbracelet with multi-model support, LSP integration, and beautiful terminal UI. ~9K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fcharmbracelet\u002Fcrush) |\n| **Amazon Q Developer CLI** | Agentic chat experience in terminal from AWS; transitioning to Kiro CLI. | 🟣 | [GitHub](https:\u002F\u002Fgithub.com\u002Faws\u002Famazon-q-developer-cli) |\n| **Amp** | Sourcegraph's agentic coding tool (Cody successor); works across CLI and IDE. | 🔵 | [Website](https:\u002F\u002Fampcode.com) |\n| **Junie CLI** | JetBrains' LLM-agnostic coding agent CLI (beta 2026); supports all major model providers. 
| 🔵 | [Website](https:\u002F\u002Fwww.jetbrains.com\u002Fjunie\u002F) |\n| **Autohand Code CLI** | Self-evolving autonomous terminal coding agent with multi-provider LLM support, 40+ tools, and modular skills system. | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fautohandai\u002Fcode-cli) |\n\n#### AI Code Editors \u002F IDEs\n\nStandalone editors or IDE forks with deep AI integration.\n\n| Name | Description | Type | Link |\n|:-----|:-----------|:----:|:----:|\n| **Cursor** | Leading AI-native code editor (VS Code fork); Composer generates entire apps from natural language, agentic multi-file edits. | 🔵 | [Website](https:\u002F\u002Fcursor.com) |\n| **Windsurf** | AI-powered IDE (VS Code fork) with proprietary Cascade agent and SWE-1.5 model; acquired by Cognition AI. | 🔵 | [Website](https:\u002F\u002Fwindsurf.com) |\n| **Zed** | High-performance editor in Rust with native AI features, Zeta edit prediction, and Agent Client Protocol support. ~77K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fzed-industries\u002Fzed) |\n| **Trae** | Free AI-powered IDE from ByteDance (\"The Real AI Engineer\") with Builder Mode; provides free access to Claude, GPT-4o, and DeepSeek. | 🔵 | [Website](https:\u002F\u002Fwww.trae.ai) |\n| **Google Antigravity** | Google's agent-first IDE (VS Code fork) with Manager view for orchestrating multiple agents in parallel; powered by Gemini. | 🔵 | [Website](https:\u002F\u002Fantigravity.google) |\n| **Kiro** | AWS's spec-driven agentic AI IDE (VS Code fork); turns prompts into specs, then working code, docs, and tests. | 🔵 | [Website](https:\u002F\u002Fkiro.dev) |\n| **PearAI** | Open-source AI code editor (VS Code fork) with Continue-based chat and completions. ~40K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Ftrypear\u002Fpearai-app) |\n| **Void** | Open-source Cursor alternative (VS Code fork); any model or local hosting with change visualization. 
~28K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fvoideditor\u002Fvoid) |\n| **Melty** | Open-source chat-first AI code editor with multi-file editing and deep Git integration. ~7K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fmeltylabs\u002Fmelty) |\n| **Emdash** | Open-source agentic dev environment (YC W26) for running multiple coding agents in parallel in isolated Git worktrees. | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fgeneralaction\u002Femdash) |\n\n#### IDE Extensions \u002F Plugins\n\nPlugins for VS Code, JetBrains, Neovim, and other editors.\n\n| Name | Description | Type | Link |\n|:-----|:-----------|:----:|:----:|\n| **GitHub Copilot** | Most widely adopted AI coding assistant; inline completions, chat, and agentic coding agent across VS Code, JetBrains, Neovim. | 🔵 | [Website](https:\u002F\u002Fgithub.com\u002Ffeatures\u002Fcopilot) |\n| **Cline** | Autonomous coding agent in VS Code with human-in-the-loop approvals; file editing, terminal commands, and browser use. ~59K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fcline\u002Fcline) |\n| **Continue** | Open-source VS Code and JetBrains extension for creating custom, modular AI dev systems; any model. ~32K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fcontinuedev\u002Fcontinue) |\n| **Cody** | Sourcegraph-powered AI assistant that pulls context from local and remote codebases; VS Code, JetBrains, Visual Studio. | 🔵 | [Website](https:\u002F\u002Fsourcegraph.com\u002Fcody) |\n| **Codeium** | Free AI coding extension for 40+ IDEs with completions, chat, and search across 70+ languages. | 🟣 | [Website](https:\u002F\u002Fcodeium.com) |\n| **Amazon Q Developer** | AWS's AI coding assistant with completions, inline chat, and agent mode; deep AWS integration. 
| 🟣 | [Website](https:\u002F\u002Faws.amazon.com\u002Fq\u002Fdeveloper\u002F) |\n| **Gemini Code Assist** | Google's IDE extension powered by Gemini with completions, Next Edit Predictions, and inline diffs; free for individuals. | 🟣 | [Website](https:\u002F\u002Fcodeassist.google) |\n| **Tabnine** | Privacy-focused AI assistant trained on permissive-licensed OSS; supports all major IDEs with on-premises deployment. | 🔵 | [Website](https:\u002F\u002Fwww.tabnine.com) |\n| **Augment Code** | Enterprise AI coding assistant with 200K-token Context Engine for deep codebase understanding. | 🔵 | [Website](https:\u002F\u002Fwww.augmentcode.com) |\n| **Qodo** | AI code review and quality platform with multi-agent architecture; test generation, code review, CI\u002FCD enforcement. | 🟣 | [Website](https:\u002F\u002Fwww.qodo.ai) |\n| **CodeGeeX** | Open-source multilingual code generation model supporting 20+ languages with VS Code and JetBrains extensions. ~11K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fzai-org\u002FCodeGeeX) |\n| **Tabby** | Self-hosted open-source AI coding assistant (Copilot alternative); runs entirely on your infrastructure. ~25K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002FTabbyML\u002Ftabby) |\n\n#### AI Coding Platforms \u002F Cloud Agents\n\nBrowser-based or cloud-hosted agents that build, test, and deploy autonomously.\n\n| Name | Description | Type | Link |\n|:-----|:-----------|:----:|:----:|\n| **Devin** | First fully autonomous cloud-based AI software engineer; plans, codes, tests, and opens PRs independently. | 🔵 | [Website](https:\u002F\u002Fdevin.ai) |\n| **Replit Agent** | Cloud-native AI agent that autonomously builds, tests, and deploys full-stack apps in-browser; 50+ languages. | 🔵 | [Website](https:\u002F\u002Freplit.com\u002Fproducts\u002Fagent) |\n| **bolt.new** | AI-powered web dev agent; prompt, run, edit, and deploy full-stack apps directly in the browser via WebContainers. 
~15K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fstackblitz\u002Fbolt.new) |\n| **bolt.diy** | Community fork of bolt.new with extended features and broader LLM flexibility. ~12K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fstackblitz-labs\u002Fbolt.diy) |\n| **Lovable** | Full-stack apps from natural language with built-in Supabase, auth, and one-click deploy; fastest European startup to $20M ARR. | 🔵 | [Website](https:\u002F\u002Flovable.dev) |\n| **v0** | Vercel's AI platform for generating high-quality React\u002FNext.js UI components from natural language. | 🔵 | [Website](https:\u002F\u002Fv0.dev) |\n| **GitHub Copilot Workspace** | Cloud-based coding environment with plan, brainstorm, and repair agents; included with paid Copilot plans. | 🔵 | [Website](https:\u002F\u002Fgithubnext.com\u002Fprojects\u002Fcopilot-workspace) |\n| **Firebase Studio** | Google's agentic cloud-based development environment. | 🔵 | [Website](https:\u002F\u002Ffirebase.google.com\u002Fstudio) |\n\n#### Open-Source Coding Agent Frameworks\n\nFrameworks and research projects for building autonomous coding agents.\n\n| Name | Description | Type | Link |\n|:-----|:-----------|:----:|:----:|\n| **OpenHands** | Leading open-source platform for cloud coding agents; consistently top on SWE-bench. Formerly OpenDevin. ~69K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002FOpenHands\u002FOpenHands) |\n| **SWE-agent** | Takes a GitHub issue and automatically fixes it using a custom agent-computer interface. [NeurIPS 2024] ~19K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002FSWE-agent\u002FSWE-agent) |\n| **Open SWE** | LangChain's async cloud-hosted coding agent framework built on LangGraph with Slack\u002FLinear integration. ~8K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Fopen-swe) |\n| **Devika** | Open-source agentic software engineer; breaks down instructions, researches, and writes code. Devin alternative. 
~18K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fstitionai\u002Fdevika) |\n| **AutoCodeRover** | Autonomous program improvement combining LLMs with fault localization for GitHub issue resolution. ~2.8K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fnus-apr\u002Fauto-code-rover) |\n| **Agentless** | Simple three-phase approach (localize → repair → validate) to solving software development problems. ~2K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002FOpenAutoCoder\u002FAgentless) |\n| **Devon** | Open-source pair programmer SWE agent with code writing, planning, and research; supports Claude, GPT-4, Llama, Ollama. ~3.5K+ ⭐ | 🟢 | [GitHub](https:\u002F\u002Fgithub.com\u002Fentropy-research\u002FDevon) |\n\n### Other Notable Repositories\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **Prompt Engineering Guide (DAIR.AI)** | The definitive open-source guide and resource hub. 3M+ learners. ~55K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fdair-ai\u002FPrompt-Engineering-Guide) |\n| **Awesome ChatGPT Prompts \u002F Prompts.chat** | World's largest open-source prompt library. 1000s of prompts for all major models. | [GitHub](https:\u002F\u002Fgithub.com\u002Ff\u002Fawesome-chatgpt-prompts) |\n| **12-Factor Agents** | Principles for building production-grade LLM-powered software. ~17K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002Fhumanlayer\u002F12-factor-agents) |\n| **NirDiamant\u002FPrompt_Engineering** | 22 hands-on Jupyter Notebook tutorials. ~3K+ ⭐ | [GitHub](https:\u002F\u002Fgithub.com\u002FNirDiamant\u002FPrompt_Engineering) |\n| **Context Engineering Repository** | First-principles handbook for moving beyond prompt engineering to context design. | [GitHub](https:\u002F\u002Fgithub.com\u002Fdavidkimai\u002FContext-Engineering) |\n| **AI Agent System Prompts Library** | Collection of system prompts from production AI coding agents (Claude Code, Gemini CLI, Cline, Aider, Roo Code). 
| [GitHub](https:\u002F\u002Fgithub.com\u002Ftallesborges\u002Fagentic-system-prompts) |\n| **Awesome Vibe Coding** | Curated list of 245+ tools and resources for building software through natural language prompts. | [GitHub](https:\u002F\u002Fgithub.com\u002Ftaskade\u002Fawesome-vibe-coding) |\n| **OpenAI Cookbook** | Official recipes for prompts, tools, RAG, and evaluations. | [GitHub](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fopenai-cookbook) |\n| **Embedchain** | Framework to create ChatGPT-like bots over your dataset. | [GitHub](https:\u002F\u002Fgithub.com\u002Fembedchain\u002Fembedchain) |\n| **ThoughtSource** | Framework for the science of machine thinking. | [GitHub](https:\u002F\u002Fgithub.com\u002FOpenBioLink\u002FThoughtSource) |\n| **Promptext** | Extracts and formats code context for AI prompts with token counting. | [GitHub](https:\u002F\u002Fgithub.com\u002F1broseidon\u002Fpromptext) |\n| **Price Per Token** | Compare LLM API pricing across 200+ models. | [Website](https:\u002F\u002Fpricepertoken.com\u002F) |\n| **OpenPaw** | CLI tool (`npx pawmode`) that turns Claude Code into a personal assistant by generating system prompts (CLAUDE.md + SOUL.md) with personality, memory, and 38 skill routers. | [GitHub](https:\u002F\u002Fgithub.com\u002Fdaxaur\u002Fopenpaw) |\n| **Think Better** | Open-source CLI that permanently injects 10 structured decision frameworks (MECE, Issue Trees, Pre-Mortems) and 12 cognitive bias detectors into AI assistant prompts. Go, MIT. 
| [GitHub](https:\u002F\u002Fgithub.com\u002FHoangTheQuyen\u002Fthink-better) |\n\n---\n\n## APIs\n💻\n\n### OpenAI\n\n| Model | Context | Price (Input\u002FOutput per 1M tokens) | Key Feature |\n|:------|:--------|:-----------------------------------|:------------|\n| GPT-5.2 \u002F 5.2 Thinking | 400K | $1.75 \u002F $14 | Latest flagship, 90% cached discount, configurable reasoning |\n| GPT-5.1 | 400K | $1.25 \u002F $10 | Previous generation flagship |\n| GPT-4.1 \u002F 4.1 mini \u002F nano | 1M | $2 \u002F $8 | Best non-reasoning model, 40% faster and 80% cheaper than GPT-4o |\n| o3 \u002F o3-pro | 200K | Varies | Reasoning models with native tool use |\n| o4-mini | 200K | Cost-efficient | Fast reasoning, best on AIME at its cost class |\n| GPT-OSS-120B \u002F 20B | 128K | $0.03 \u002F $0.30 | First open-weight models, Apache 2.0 |\n\nKey features: Responses API, Agents SDK, Structured Outputs, function calling, prompt caching (90% discount), Batch API (50% discount), MCP support. [Platform Docs](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fmodels)\n\n### Anthropic (Claude)\n\n| Model | Context | Price (Input\u002FOutput per 1M tokens) | Key Feature |\n|:------|:--------|:-----------------------------------|:------------|\n| Claude Opus 4.6 | 1M (beta) | $5 \u002F $25 | Most powerful, state-of-the-art coding and agentic tasks |\n| Claude Sonnet 4.5 | 200K | $3 \u002F $15 | Best coding model, 61.4% OSWorld (computer use) |\n| Claude Haiku 4.5 | 200K | Fast tier | Near-frontier, fastest model class |\n| Claude Opus 4 \u002F Sonnet 4 | 200K | $15\u002F$75 (Opus) | Opus: 72.5% SWE-bench, Sonnet 4 powers GitHub Copilot |\n\nKey features: Extended Thinking with tool use, Computer Use, MCP (originated here), prompt caching, Claude Code CLI, available on AWS Bedrock and Google Vertex AI. 
[API Docs](https:\u002F\u002Fdocs.anthropic.com\u002F)\n\n### Google (Gemini)\n\n| Model | Context | Price (Input\u002FOutput per 1M tokens) | Key Feature |\n|:------|:--------|:-----------------------------------|:------------|\n| Gemini 3 Pro Preview | 1M | $2 \u002F $12 | Most intelligent Google model, deployed to 2B+ Search users |\n| Gemini 2.5 Pro | 1M | $1.25 \u002F $10 | Best for coding\u002Fagentic tasks, thinking model |\n| Gemini 2.5 Flash \u002F Flash-Lite | 1M | $0.30\u002F$1.50 · $0.10\u002F$0.40 | Price-performance leaders |\n\nKey features: Thinking (all 2.5+ models), Google Search grounding, code execution, Live API (real-time audio\u002Fvideo), context caching. [Google AI Studio](https:\u002F\u002Fai.google.dev\u002F)\n\n### Meta (Llama)\n\n| Model | Architecture | Context | Key Feature |\n|:------|:------------|:--------|:------------|\n| Llama 4 Scout | 109B MoE \u002F 17B active | 10M | Fits single H100, multimodal, open-weight |\n| Llama 4 Maverick | 400B MoE \u002F 17B active, 128 experts | 1M | Beats GPT-4o, open-weight |\n| Llama 3.3 70B | Dense | 128K | Matches Llama 3.1 405B |\n\nAvailable on 25+ cloud partners, Hugging Face, and inference APIs. [Llama](https:\u002F\u002Fai.meta.com\u002Fllama\u002F)\n\n### Other Notable Providers\n\n| Provider | Description | Link |\n|:---------|:-----------|:----:|\n| **Mistral AI** | Mistral Large 3 (675B MoE), Devstral 2, Ministral 3. Apache 2.0. | [Website](https:\u002F\u002Fmistral.ai) |\n| **DeepSeek** | V3.2 (671B MoE), R1 (reasoning, MIT license). $0.15\u002F$0.75 per 1M tokens. | [Website](https:\u002F\u002Fdeepseek.com) |\n| **xAI (Grok)** | Grok 4.1 Fast: 2M context, $0.20\u002F$0.50 per 1M tokens. | [Website](https:\u002F\u002Fx.ai) |\n| **Cohere** | Command A (111B, 256K context), Embed v4, Rerank 4.0. Excels at RAG. | [Website](https:\u002F\u002Fcohere.com) |\n| **Together AI** | 200+ open models with sub-100ms latency. 
| [Website](https:\u002F\u002Ftogether.ai) |\n| **Groq** | LPU hardware with ~300+ tokens\u002Fsec inference. | [Website](https:\u002F\u002Fgroq.com) |\n| **Fireworks AI** | Fast inference with HIPAA + SOC2 compliance. | [Website](https:\u002F\u002Ffireworks.ai) |\n| **OpenRouter** | Unified API for 300+ models from all providers. | [Website](https:\u002F\u002Fopenrouter.ai) |\n| **Cerebras** | Wafer-scale chips with best total response time. | [Website](https:\u002F\u002Fcerebras.ai) |\n| **Perplexity AI** | Search-augmented API with citations. | [Website](https:\u002F\u002Fperplexity.ai) |\n| **Amazon Bedrock** | Managed multi-model service with Claude, Llama, Mistral, Cohere. | [Website](https:\u002F\u002Faws.amazon.com\u002Fbedrock\u002F) |\n| **Hugging Face Inference** | Access to open models via API. | [Website](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Fapi-inference\u002Findex) |\n\n---\n\n## Datasets and Benchmarks\n💾\n\n### Major Benchmarks (2024–2026)\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **Chatbot Arena \u002F LM Arena** | 6M+ user votes for Elo-rated pairwise LLM comparisons. De facto standard for human preference. | [Website](https:\u002F\u002Flmarena.ai\u002F) |\n| **MMLU-Pro** | 12,000+ graduate-level questions across 14 domains. NeurIPS 2024 Spotlight. | [GitHub](https:\u002F\u002Fgithub.com\u002FTIGER-AI-Lab\u002FMMLU-Pro) |\n| **GPQA** | 448 \"Google-proof\" STEM questions; non-expert validators achieve only 34%. | [arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.12022) |\n| **SWE-bench Verified** | Human-validated 500-task subset for real-world GitHub issue resolution. | [Website](https:\u002F\u002Fwww.swebench.com\u002F) |\n| **SWE-bench Pro** | 1,865 tasks across 41 professional repos; best models score only ~23%. | [Leaderboard](https:\u002F\u002Fscale.com\u002Fleaderboard\u002Fswe_bench_pro_public) |\n| **Humanity's Last Exam (HLE)** | 2,500 expert-vetted questions; top AI scores only ~10–30%. 
| [Website](https:\u002F\u002Fagi.safe.ai\u002F) |\n| **BigCodeBench** | 1,140 coding tasks across 7 domains; AI achieves ~35.5% vs. 97% human success. | [Leaderboard](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fbigcode\u002Fbigcodebench-leaderboard) |\n| **LiveBench** | Contamination-resistant with frequently updated questions. | [Paper](https:\u002F\u002Fopenreview.net\u002Fforum?id=sKYHBTAxVa) |\n| **FrontierMath** | Research-level math; AI solves only ~2% of problems. | Research |\n| **ARC-AGI v2** | Abstract reasoning measuring fluid intelligence. | Research |\n| **IFEval** | Instruction-following evaluation with formatting\u002Fcontent constraints. | [arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.07911) |\n| **MLE-bench** | OpenAI's ML engineering evaluation via Kaggle-style tasks. | [GitHub](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fmle-bench) |\n| **PaperBench** | Evaluates AI's ability to replicate 20 ICML 2024 papers from scratch. | [GitHub](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fpreparedness) |\n\n### Leaderboards and Meta-Benchmarks\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **Hugging Face Open LLM Leaderboard v2** | Evaluates open models on MMLU-Pro, GPQA, IFEval, MATH. | [Leaderboard](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fopen-llm-leaderboard\u002Fopen_llm_leaderboard) |\n| **Artificial Analysis Intelligence Index v3** | Aggregates 10 evaluations. | [Website](https:\u002F\u002Fartificialanalysis.ai\u002F) |\n| **SEAL by Scale AI** | Hosts SWE-bench Pro and agentic evaluations. | [Leaderboard](https:\u002F\u002Fscale.com\u002Fleaderboard) |\n\n### Prompt and Instruction Datasets\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **P3 (Public Pool of Prompts)** | Prompt templates for 270+ NLP tasks used to train T0 and similar models. 
| [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fbigscience\u002FP3) |\n| **System Prompts Dataset** | 944 system prompt templates for agent workflows (by Daniel Rosehill, Aug 2025). | [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fdanielrosehill\u002Fsystem_prompts) |\n| **OpenAssistant Conversations (OASST)** | 161,443 messages in 35 languages with 461,292 quality ratings. | [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FOpenAssistant\u002Foasst1) |\n| **UltraChat \u002F UltraFeedback** | Large-scale synthetic instruction and preference datasets for alignment training. | HuggingFace |\n| **SoftAge Prompt Engineering Dataset** | 1,000 diverse prompts across 10 categories for benchmarking prompt performance. | HuggingFace |\n| **Text Transformation Prompt Library** | Comprehensive collection of text transformation prompts (May 2025). | HuggingFace |\n| **Writing Prompts** | ~300K human-written stories paired with prompts from r\u002FWritingPrompts. | [Kaggle](https:\u002F\u002Fwww.kaggle.com\u002Fdatasets\u002Fratthachat\u002Fwriting-prompts) |\n| **Midjourney Prompts** | Text prompts and image URLs scraped from MidJourney's public Discord. | [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fsuccinctly\u002Fmidjourney-prompts) |\n| **CodeAlpaca-20k** | 20,000 programming instruction-output pairs. | [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fsahil2801\u002FCodeAlpaca-20k) |\n| **ProPEX-RAG** | Dataset for prompt optimization in RAG workflows. | HuggingFace |\n| **NanoBanana Trending Prompts** | 1,000+ curated AI image prompts from X\u002FTwitter, ranked by engagement. | [GitHub](https:\u002F\u002Fgithub.com\u002Fjau123\u002Fnanobanana-trending-prompts) |\n\n### Red Teaming and Adversarial Datasets\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **HarmBench** | 510 harmful behaviors across standard, contextual, copyright, and multimodal categories. 
| [Website](https:\u002F\u002Fsafetyprompts.com\u002F) |\n| **JailbreakBench** | Open robustness benchmark for jailbreaking with 100 prompts. | Research |\n| **AgentHarm** | 110 malicious agent tasks across 11 harm categories. | [arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.09024) |\n| **DecodingTrust** | 243,877 prompts evaluating trustworthiness across 8 perspectives. | Research |\n| **SafetyPrompts.com** | Aggregator tracking 50+ safety\u002Fred-teaming datasets. | [Website](https:\u002F\u002Fsafetyprompts.com\u002F) |\n\n---\n\n## Models\n🧠\n\n### Frontier Models (2025–2026)\n\n| Model | Provider | Context | Key Strength |\n|:------|:---------|:--------|:-------------|\n| **GPT-5.2** | OpenAI | 400K | General intelligence, 100% AIME 2025 |\n| **Claude Opus 4.6** | Anthropic | 1M (beta) | Coding, agentic tasks, extended thinking |\n| **Gemini 3 Pro** | Google | 1M | #1 LMArena (~1500 Elo), multimodal |\n| **Grok 4.1** | xAI | 2M | #2 LMArena (1483 Elo), low hallucination |\n| **Mistral Large 3** | Mistral AI | 256K | Best open-weight (675B MoE\u002F41B active), Apache 2.0 |\n| **DeepSeek-V3.2** | DeepSeek | 128K | Best value (671B MoE\u002F37B active), MIT license |\n| **Llama 4 Maverick** | Meta | 1M | Beats GPT-4o (400B MoE\u002F17B active), open-weight |\n\n### Reasoning Models\n\n| Model | Key Detail |\n|:------|:-----------|\n| **OpenAI o3 \u002F o3-pro** | 87.7% GPQA Diamond. Native tool use. |\n| **OpenAI o4-mini** | Best AIME at its cost class with visual reasoning. |\n| **DeepSeek-R1 \u002F R1-0528** | Open-weight, RL-trained. 87.5% on AIME 2025. MIT license. |\n| **QwQ (Qwen with Questions)** | 32B reasoning model. Apache 2.0. Comparable to R1. |\n| **Gemini 2.5 Pro\u002FFlash (Thinking)** | Built-in reasoning with configurable thinking budget. |\n| **Claude Extended Thinking** | Hybrid mode with visible chain-of-thought and tool use. |\n| **Phi-4 Reasoning \u002F Plus** | 14B reasoning models rivaling much larger models. Open-weight. 
|\n| **GPT-OSS-120B** | OpenAI's open-weight with CoT. Near-parity with o4-mini. Apache 2.0. |\n\n### Notable Open-Source Models\n\n| Model | Provider | Key Detail |\n|:------|:---------|:-----------|\n| **Qwen3-235B-A22B** | Alibaba | Flagship MoE. Strong reasoning\u002Fcode\u002Fmultilingual. Apache 2.0. Most downloaded family on HuggingFace. |\n| **Gemma 3** | Google | 270M to 27B. Multimodal. 128K context. 140+ languages. |\n| **OLMo 2\u002F3** | Allen AI | Fully open (data, code, weights, logs). OLMo 2 32B surpasses GPT-3.5. Apache 2.0. |\n| **SmolLM3-3B** | Hugging Face | Outperforms Llama-3.2-3B. Dual-mode reasoning. 128K context. |\n| **Kimi K2** | Moonshot AI | 32B active. Open-weight. Tailored for coding\u002Fagentic use. |\n| **Llama 4 Scout** | Meta | 109B MoE\u002F17B active. 10M token context. Fits single H100. |\n\n### Code-Specialized Models\n\n| Model | Key Detail |\n|:------|:-----------|\n| **Qwen3-Coder (480B-A35B)** | 69.6% SWE-bench — milestone for open-source coding. 256K context. Apache 2.0. |\n| **Devstral 2 (123B)** | 72.2% SWE-bench Verified. 7x more cost-efficient than Claude Sonnet. |\n| **Codestral 25.01** | Mistral's code model. 80+ languages. Fill-in-the-Middle support. |\n| **DeepSeek-Coder-V2** | 236B MoE \u002F 21B active. 338 programming languages. |\n| **Qwen 2.5-Coder** | 7B\u002F32B. 92 programming languages. 88.4% HumanEval. Apache 2.0. 
|\n\n### Foundational Models (Historical Reference)\n\nThese models established key concepts but are largely superseded for practical use:\n\n| Model | Provider | Significance |\n|:------|:---------|:-------------|\n| GLM-130B | Tsinghua | Open bilingual English\u002FChinese LLM (2023) |\n| Falcon 180B | TII | Large open generative model (2023) |\n| Mixtral 8x7B | Mistral AI | Pioneered MoE architecture for open models (2023) |\n| GPT-NeoX-20B | EleutherAI | Early open autoregressive LLM |\n| GPT-J-6B | EleutherAI | Early open causal language model |\n\n---\n\n## AI Content Detectors\n🔎\n\n### Leading Commercial Detectors\n\n| Name | Accuracy | Key Feature | Link |\n|:-----|:---------|:------------|:----:|\n| **GPTZero** | 99% claimed | 10M+ users, #1 on G2 (2025). Detects GPT-4\u002F5, Gemini, Claude, Llama. Free tier available. | [Website](https:\u002F\u002Fgptzero.me) |\n| **Originality.ai** | 98–100% (peer-reviewed) | Consistently rated most accurate. Combines AI detection + plagiarism + fact checking. From $14.95\u002Fmonth. | [Website](https:\u002F\u002Foriginality.ai) |\n| **Turnitin AI Detection** | 98%+ on unmodified AI text | Dominant in academia. Launched AI bypasser\u002Fhumanizer detection (Aug 2025). Institutional licensing. | [Website](https:\u002F\u002Fwww.turnitin.com\u002Fsolutions\u002Ftopics\u002Fai-writing\u002F) |\n| **Copyleaks** | 99%+ claimed | Enterprise tool detecting AI in 30+ languages. LMS integrations. | [Website](https:\u002F\u002Fcopyleaks.com) |\n| **Winston AI** | 99.98% claimed | OCR for scanned documents, AI image\u002Fdeepfake detection. 11 languages. | [Website](https:\u002F\u002Fgowinston.ai) |\n| **Pangram Labs** | 99.3% (COLING 2025) | Highest score in COLING 2025 Shared Task. 100% TPR on \"humanized\" text. 97.7% adversarial robustness. 
| [Website](https:\u002F\u002Fwww.pangram.com) |\n\n### Free and Research Detectors\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **Binoculars** | Open-source research detector using cross-perplexity between two LLMs. | [arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.12070) |\n| **DetectGPT \u002F Fast-DetectGPT** | Statistical method comparing log-probabilities of original text vs. perturbations. | [arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.11305) |\n| **OpenAI Detector** | AI classifier for indicating AI-written text (OpenAI Detector Python wrapper). | [GitHub](https:\u002F\u002Fgithub.com\u002Fpromptslab\u002Fopenai-detector) |\n| **Sapling AI Detector** | Free browser-based detector (up to 2,000 chars). 97% accuracy in some studies. | [Website](https:\u002F\u002Fsapling.ai\u002F) |\n| **QuillBot AI Detector** | Free, no sign-up required. | [Website](https:\u002F\u002Fquillbot.com\u002Fai-content-detector) |\n| **Writer AI Content Detector** | Free tool with color-coded results. | [Website](https:\u002F\u002Fwriter.com\u002Fai-content-detector\u002F) |\n| **ZeroGPT** | Popular free detector evaluated in multiple academic studies. | [Website](https:\u002F\u002Fwww.zerogpt.com\u002F) |\n\n### Watermarking Approaches\n\n| Name | Description | Link |\n|:-----|:-----------|:----:|\n| **SynthID (Google DeepMind)** | Watermarking for AI text, images, and audio via statistical token sampling. Deployed in Google products. | [Website](https:\u002F\u002Fdeepmind.google\u002Ftechnologies\u002Fsynthid\u002F) |\n| **OpenAI Text Watermarking** | Developed but still experimental as of 2025. Research shows fragility concerns. | Experimental |\n\n**Important caveat:** No detector claims 100% accuracy. Mixed human\u002FAI text remains hardest to detect (50–70% accuracy). Adversarial robustness varies widely. 
The AI detection market is projected to grow from ~$2.3B (2025) to $15B by 2035.\n\n---\n\n## Books\n📖\n\n### Prompt Engineering\n\n| Title | Author(s) | Publisher | Year |\n|:------|:----------|:---------|:-----|\n| **Prompt Engineering for LLMs** | John Berryman & Albert Ziegler | O'Reilly | 2024 |\n| **Prompt Engineering for Generative AI** | James Phoenix & Mike Taylor | O'Reilly | 2024 |\n| **Prompt Engineering for LLMs** | Thomas R. Caldwell | Independent | 2025 |\n\n### LLM Application Development\n\n| Title | Author(s) | Publisher | Year |\n|:------|:----------|:---------|:-----|\n| **AI Engineering: Building Applications with Foundation Models** | Chip Huyen | O'Reilly | 2025 |\n| **Build a Large Language Model (From Scratch)** | Sebastian Raschka | Manning | 2024 |\n| **Building LLMs for Production** | Louis-François Bouchard & Louie Peters | O'Reilly | 2024 |\n| **LLM Engineer's Handbook** | Paul Iusztin & Maxime Labonne | Packt | 2024 |\n| **The Hundred-Page Language Models Book** | Andriy Burkov | Self-Published | 2025 |\n\n### AI Agents\n\n| Title | Author(s) | Publisher | Year |\n|:------|:----------|:---------|:-----|\n| **Building Applications with AI Agents** | Michael Albada | O'Reilly | 2025 |\n| **AI Agents and Applications** | Roberto Infante | Manning | 2025 |\n| **AI Agents in Action** | Micheal Lanham | Manning | 2025 |\n\n### Production, Reliability, and Security\n\n| Title | Author(s) | Publisher | Year |\n|:------|:----------|:---------|:-----|\n| **LLMs in Production** | Christopher Brousseau & Matthew Sharp | Manning | 2025 |\n| **Building Reliable AI Systems** | Rush Shahani | Manning | 2025 |\n| **The Developer's Playbook for LLM Security** | Steve Wilson | O'Reilly | 2024 |\n\n---\n\n## Courses\n👩‍🏫\n\n### Free Short Courses\n\n- [ChatGPT Prompt Engineering for Developers](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fchatgpt-prompt-engineering-for-developers\u002F) — Co-taught by Andrew Ng and OpenAI's Isa Fulford. 
The foundational starting point. (DeepLearning.AI)\n- [Building Systems with the ChatGPT API](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fbuilding-systems-with-chatgpt\u002F) — Multi-step LLM system design for production. (DeepLearning.AI)\n- [AI Agents in LangGraph](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fai-agents-in-langgraph\u002F) — Agentic dataflows with tool use and research agents. (DeepLearning.AI)\n- [Building Agentic RAG with LlamaIndex](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fbuilding-agentic-rag-with-llamaindex\u002F) — RAG research agent construction. (DeepLearning.AI)\n- [Functions, Tools and Agents with LangChain](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Ffunctions-tools-agents-langchain\u002F) — Function calling and agent building. (DeepLearning.AI)\n- [Prompt Engineering for Vision Models](https:\u002F\u002Fwww.deeplearning.ai\u002Fshort-courses\u002Fprompt-engineering-for-vision-models\u002F) — Visual prompting techniques. (DeepLearning.AI)\n\n### University and Platform Courses\n\n- [Prompt Engineering Specialization (Vanderbilt)](https:\u002F\u002Fwww.coursera.org\u002Fspecializations\u002Fprompt-engineering) — 3-course series by Dr. Jules White covering foundational to advanced PE. (Coursera)\n- [Generative AI with LLMs (DeepLearning.AI + AWS)](https:\u002F\u002Fwww.coursera.org\u002Flearn\u002Fgenerative-ai-with-llms) — LLM lifecycle, transformers, RLHF, deployment. (Coursera)\n- [Stanford CS336: Language Modeling from Scratch](https:\u002F\u002Fcs336.stanford.edu\u002F) — Build an LLM end-to-end. (Stanford, 2024–2026)\n- [MIT 6.S191: Introduction to Deep Learning](https:\u002F\u002Fintrotodeeplearning.com\u002F) — Annual course including LLMs and generative AI. 
(MIT, 2024–2026)\n- [The Complete Prompt Engineering for AI Bootcamp](https:\u002F\u002Fwww.udemy.com\u002Fcourse\u002Fprompt-engineering-for-ai\u002F) — Covers GPT-5, DSPy, LangGraph, agent architectures. 58K+ ratings. (Udemy, updated Feb 2026)\n\n### Free Platform Courses\n\n- [Google Prompting Essentials](https:\u002F\u002Fgrow.google\u002Fprompting-essentials\u002F) — 5-step prompt design, meta-prompting, Gemini. Under 6 hours.\n- [Microsoft Azure AI Fundamentals: Generative AI](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Ftraining\u002Fpaths\u002Fintroduction-generative-ai\u002F) — Free learning path covering LLMs, prompts, agents, Azure OpenAI.\n- [Hugging Face LLM Course](https:\u002F\u002Fhuggingface.co\u002Flearn\u002Fllm-course\u002Fchapter1\u002F1) — Community-driven course covering transformers, fine-tuning, building reasoning models.\n- [Hugging Face AI Agents Course](https:\u002F\u002Fhuggingface.co\u002Flearn) — Agent theory to practice. 100K+ registered students.\n\n### Learn Prompting Courses\n\n- [ChatGPT for Everyone](https:\u002F\u002Flearnprompting.org\u002Fcourses\u002Fchatgpt-for-everyone)\n- [Introduction to Prompt Engineering](https:\u002F\u002Flearnprompting.org\u002Fcourses\u002Fintroduction_to_prompt_engineering)\n- [Advanced Prompt Engineering](https:\u002F\u002Flearnprompting.org\u002Fcourses\u002Fadvanced-prompt-engineering)\n- [Introduction to Prompt Hacking](https:\u002F\u002Flearnprompting.org\u002Fcourses\u002Fintro-to-prompt-hacking)\n- [Advanced Prompt Hacking](https:\u002F\u002Flearnprompting.org\u002Fcourses\u002Fadvanced-prompt-hacking)\n- [Introduction to Generative AI Agents for Business Professionals](https:\u002F\u002Flearnprompting.org\u002Fcourses\u002Fintroduction-to-agents)\n- [AI Safety](https:\u002F\u002Flearnprompting.org\u002Fcourses\u002Fai-safety)\n\n---\n\n## Tutorials and Guides\n📚\n\n### Official Provider Guides\n\n- [OpenAI Prompt Engineering 
Guide](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Fprompt-engineering) — Comprehensive, covering GPT-4.1\u002F5 prompting, reasoning models, structured outputs, agentic workflows. Continuously updated.\n- [OpenAI GPT-4.1 Prompting Guide](https:\u002F\u002Fcookbook.openai.com\u002Farticles\u002Fgpt-4-1-prompting-guide) [2025] — Structured agent-like prompt design: goal persistence, tool integration, long-context processing.\n- [Anthropic Prompt Engineering Overview](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fprompt-engineering\u002Foverview) — Iterative prompt design, XML tags, chain-of-thought, role assignment. Includes prompt generator.\n- [Anthropic Claude 4 Best Practices](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fprompt-engineering\u002Fclaude-4-best-practices) [2025–2026] — Parallel tool execution, thinking capabilities, image processing.\n- [Anthropic: Effective Context Engineering for AI Agents](https:\u002F\u002Fwww.anthropic.com\u002Fengineering\u002Feffective-context-engineering-for-ai-agents) [2025] — The evolution from prompt eng","This project is a carefully curated resource library focused on prompt engineering for generative pre-trained transformers (such as GPT, ChatGPT, and PaLM). Its core function is to provide comprehensive material covering papers, tools, models, APIs, benchmarks, courses, and communities in support of application development with large language models. Its technical strength lies in continuously updating and summarizing the latest research and engineering practice. It suits researchers, developers, and anyone interested in artificial intelligence who wants to understand and master prompt engineering.",2,"2026-06-11 03:37:28","high_star"]