[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-3826":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":17,"rankGlobal":10,"rankLanguage":10,"license":18,"archived":19,"fork":20,"defaultBranch":21,"hasWiki":20,"hasPages":20,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":32,"readmeContent":33,"aiSummary":34,"trendingCount":15,"starSnapshotCount":15,"syncStatus":35,"lastSyncTime":36,"discoverSource":37},3826,"ai-pdf-chatbot-langchain","mayooear\u002Fai-pdf-chatbot-langchain","mayooear","AI PDF chatbot agent built with LangChain & LangGraph ","https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=OF6SolDiEwU",null,"TypeScript",16539,3226,156,0,6,45,"MIT License",true,false,"main",[23,24,25,26,27,28,29,30,31],"agents","ai","chatbot","langchain","langgraph","nextjs","openai","pdf","typescript","2026-06-12 02:00:54","# AI PDF Chatbot & Agent Powered by LangChain and LangGraph\n\nThis monorepo is a customizable template example of an AI chatbot agent that \"ingests\" PDF documents, stores embeddings in a vector database (Supabase), and then answers user queries using OpenAI (or another LLM provider) utilising LangChain and LangGraph as orchestration frameworks.\n\nThis template is also an accompanying example to the book [Learning LangChain (O'Reilly)](https:\u002F\u002Fwww.oreilly.com\u002Flibrary\u002Fview\u002Flearning-langchain\u002F9781098167271): Building AI and LLM applications with LangChain and LangGraph.\n\n> [!IMPORTANT]\n> This project is not actively maintained and is kept here for reference.\n> Please do not expect responses to new issues or pull requests.\n\n**Here's what the Chatbot UI looks like:**\n\n\u003Cimg width=\"1096\" 
alt=\"Screenshot 2025-02-20 at 05 39 55\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F3a9ddea7-b718-476b-bdae-38839be20c12\" \u002F>\n\n## Table of Contents\n\n1. [Features](#features)\n2. [Architecture Overview](#architecture-overview)\n3. [Prerequisites](#prerequisites)\n4. [Installation](#installation)\n5. [Environment Variables](#environment-variables)\n   - [Frontend Variables](#frontend-variables)\n   - [Backend Variables](#backend-variables)\n6. [Local Development](#local-development)\n   - [Running the Backend](#running-the-backend)\n   - [Running the Frontend](#running-the-frontend)\n7. [Usage](#usage)\n   - [Uploading\u002FIngesting PDFs](#uploadingingesting-pdfs)\n   - [Asking Questions](#asking-questions)\n   - [Viewing Chat History](#viewing-chat-history)\n8. [Production Build & Deployment](#production-build--deployment)\n9. [Customizing the Agent](#customizing-the-agent)\n10. [Troubleshooting](#troubleshooting)\n11. [Next Steps](#next-steps)\n\n---\n\n## Features\n\n- **Document Ingestion Graph**: Upload and parse PDFs into `Document` objects, then store vector embeddings into a vector database (we use Supabase in this example).\n- **Retrieval Graph**: Handle user questions, decide whether to retrieve documents or give a direct answer, then generate concise responses with references to the retrieved documents.\n- **Streaming Responses**: Real-time streaming of partial responses from the server to the client UI.\n- **LangGraph Integration**: Built using LangGraph’s state machine approach to orchestrate ingestion and retrieval, visualise your agentic workflow, and debug each step of the graph.  \n- **Next.js Frontend**: Allows file uploads, real-time chat, and easy extension with React components and Tailwind.\n\n---\n\n## Architecture Overview\n\n```ascii\n┌─────────────────────┐    1. 
Upload PDFs    ┌───────────────────────────┐\n│Frontend (Next.js)   │ ────────────────────> │Backend (LangGraph)       │\n│ - React UI w\u002F chat  │                      │ - Ingestion Graph         │\n│ - Upload .pdf files │ \u003C────────────────────┤   + Vector embedding via  │\n└─────────────────────┘    2. Confirmation   │     SupabaseVectorStore   │\n(storing embeddings in DB)\n\n┌─────────────────────┐    3. Ask questions  ┌───────────────────────────┐\n│Frontend (Next.js)   │ ────────────────────> │Backend (LangGraph)       │\n│ - Chat + SSE stream │                      │ - Retrieval Graph         │\n│ - Display sources   │ \u003C────────────────────┤   + Chat model (OpenAI)   │\n└─────────────────────┘ 4. Streamed answers  └───────────────────────────┘\n\n```\n- **Supabase** is used as the vector store to store and retrieve relevant documents at query time.  \n- **OpenAI** (or other LLM providers) is used for language modeling.  \n- **LangGraph** orchestrates the \"graph\" steps for ingestion, routing, and generating responses.  \n- **Next.js** (React) powers the user interface for uploading PDFs and real-time chat.\n\nThe system consists of:\n- **Backend**: A Node.js\u002FTypeScript service that contains LangGraph agent \"graphs\" for:\n  - **Ingestion** (`src\u002Fingestion_graph.ts`) - handles indexing\u002Fingesting documents\n  - **Retrieval** (`src\u002Fretrieval_graph.ts`) - question-answering over the ingested documents\n  - **Configuration** (`src\u002Fshared\u002Fconfiguration.ts`) - handles configuration for the backend api including model providers and vector stores\n- **Frontend**: A Next.js\u002FReact app that provides a web UI for users to upload PDFs and chat with the AI.\n---\n\n## Prerequisites\n\n1. **Node.js v18+** (we recommend Node v20).\n2. **Yarn** (or npm, but this monorepo is pre-configured with Yarn).\n3. 
**Supabase project** (if you plan to store embeddings in Supabase; see [Setting up Supabase](https:\u002F\u002Fsupabase.com\u002Fdocs\u002Fguides\u002Fgetting-started)).\n   - You will need:\n     - `SUPABASE_URL`\n     - `SUPABASE_SERVICE_ROLE_KEY`\n     - A table named `documents` and a function named `match_documents` for vector similarity search (see [LangChain documentation for guidance on setting up the tables](https:\u002F\u002Fjs.langchain.com\u002Fdocs\u002Fintegrations\u002Fvectorstores\u002Fsupabase\u002F)).\n4. **OpenAI API Key** (or another LLM provider’s key, supported by LangChain).\n5. **LangChain API Key** (free and optional, but highly recommended for debugging and tracing your LangChain and LangGraph applications). Learn more [here](https:\u002F\u002Fdocs.smith.langchain.com\u002Fadministration\u002Fhow_to_guides\u002Forganization_management\u002Fcreate_account_api_key).\n\n---\n\n## Installation\n\n1. **Clone** the repository:\n\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Fmayooear\u002Fai-pdf-chatbot-langchain.git\n   cd ai-pdf-chatbot-langchain\n   ```\n\n2. **Install** dependencies (from the monorepo root):\n\n   `yarn install`\n\n3. **Configure** environment variables in both backend and frontend. See the `.env.example` files for details.\n\n## Environment Variables\n\nThe project relies on environment variables to configure keys and endpoints. Each sub-project (backend and frontend) has its own `.env.example`. 
Copy these to .env and fill in your details.\n\n### Frontend Variables\n\nCreate a .env file in frontend:\n\n`cp frontend\u002F.env.example frontend\u002F.env`\n\n```\n    NEXT_PUBLIC_LANGGRAPH_API_URL=http:\u002F\u002Flocalhost:2024\n    LANGCHAIN_API_KEY=your-langsmith-api-key-here # Optional: LangSmith API key\n    LANGGRAPH_INGESTION_ASSISTANT_ID=ingestion_graph\n    LANGGRAPH_RETRIEVAL_ASSISTANT_ID=retrieval_graph\n\n    LANGCHAIN_TRACING_V2=true # Optional: Enable LangSmith tracing\n\n    LANGCHAIN_PROJECT=\"pdf-chatbot\" # Optional: LangSmith project name\n```\n\n### Backend Variables\n\nCreate a .env file in backend:\n\n`cp backend\u002F.env.example backend\u002F.env`\n\n```\n    OPENAI_API_KEY=your-openai-api-key-here\n    SUPABASE_URL=your-supabase-url-here\n    SUPABASE_SERVICE_ROLE_KEY=your-supabase-service-role-key-here\n\n    LANGCHAIN_TRACING_V2=true # Optional: Enable LangSmith tracing\n\n    LANGCHAIN_PROJECT=\"pdf-chatbot\" # Optional: LangSmith project name\n```\n\n**Explanation of Environment Variables:**\n\n-   `NEXT_PUBLIC_LANGGRAPH_API_URL`: The URL where your LangGraph backend server is running.  Defaults to `http:\u002F\u002Flocalhost:2024` for local development. \n-   `LANGCHAIN_API_KEY`: Your LangSmith API key.  This is optional, but highly recommended for debugging and tracing your LangChain and LangGraph applications.\n-   `LANGGRAPH_INGESTION_ASSISTANT_ID`: The ID of the LangGraph assistant for document ingestion. Default is `ingestion_graph`.\n-   `LANGGRAPH_RETRIEVAL_ASSISTANT_ID`: The ID of the LangGraph assistant for question answering. Default is `retrieval_graph`.\n-   `LANGCHAIN_TRACING_V2`:  Enable tracing to debug your application on the LangSmith platform.  
Set to `true` to enable.\n-   `LANGCHAIN_PROJECT`:  The name of your LangSmith project.\n-   `OPENAI_API_KEY`: Your OpenAI API key.\n-   `SUPABASE_URL`: Your Supabase URL.\n-   `SUPABASE_SERVICE_ROLE_KEY`: Your Supabase service role key.\n\n\n\n## Local Development\n\nThis monorepo uses Turborepo to manage both backend and frontend projects. You can run them separately for development.\n\n### Running the Backend\n\n1.\tNavigate to backend:\n\n```bash\ncd backend\n```\n\n2.\tInstall dependencies (already done if you ran yarn install at the root).\n\n3.\tStart LangGraph in dev mode:\n\n```bash\nyarn langgraph:dev\n```\n\nThis will launch a local LangGraph server on port 2024 by default. It should redirect you to a UI for interacting with the LangGraph server. [Langgraph studio guide](https:\u002F\u002Flangchain-ai.github.io\u002Flanggraph\u002Fconcepts\u002Flanggraph_studio\u002F)\n\n### Running the Frontend\n\n1. Navigate to frontend:\n\n```bash\ncd frontend\n```\n\n2. Start the Next.js development server:\n\n```bash\nyarn dev\n```\n\nThis will start a local Next.js development server (by default on port 3000).\n\nAccess the UI in your browser at http:\u002F\u002Flocalhost:3000.\n\n## Usage\n\nOnce both services are running:\n\n1. Use langgraph studio UI to interact with the LangGraph server and ensure the workflow is working as expected.\n\n2. Navigate to http:\u002F\u002Flocalhost:3000 to use the chatbot UI.\n\n3. Upload a small PDF document via the file upload button at the bottom of the page. This will trigger the ingestion graph to extract the text and store the embeddings in Supabase via the frontend `app\u002Fapi\u002Fingest` route.\n\t\n4. After the ingestion is complete, ask questions in the chat input.\n\n5. 
The chatbot will trigger the retrieval graph via the `app\u002Fapi\u002Fchat` route to retrieve the most relevant documents from the vector database and use the relevant PDF context (if needed) to answer.\n\n\n### Uploading\u002FIngesting PDFs\n\nClick on the paperclip icon in the chat input area.\n\nSelect one or more PDF files to upload ensuring a total of max 5, each under 10MB (you can change these threshold values in the `app\u002Fapi\u002Fingest` route).\n\nThe backend processes the PDFs, extracts text, and stores embeddings in Supabase (or your chosen vector store).\n\n### Asking Questions\n\n- Type your question in the chat input field.\n- Responses stream in real time. If the system retrieved documents, you’ll see a link to “View Sources” for each chunk of text used in the answer.\n\n### Viewing Chat History\n\n- The system creates a unique thread per user session (frontend). All messages are kept in the state for the session.\n- For demonstration purposes, the current example UI does not store the entire conversation beyond the local thread state and is not persistent across sessions. You can extend it to persist threads in a database. However, the \"ingested documents\" are persistent across sessions as they are stored in a vector database.\n\n\n## Deploying the Backend\n\nTo deploy your LangGraph agent to a cloud service, you can either use LangGraph's cloud as per this [guide](https:\u002F\u002Flangchain-ai.github.io\u002Flanggraph\u002Fcloud\u002Fquick_start\u002F?h=studio#deploy-to-langgraph-cloud) or self-host it as per this [guide](https:\u002F\u002Flangchain-ai.github.io\u002Flanggraph\u002Fhow-tos\u002Fdeploy-self-hosted\u002F).\n\n## Deploying the Frontend\nThe frontend can be deployed to any hosting that supports Next.js (Vercel, Netlify, etc.).\n\nMake sure to set relevant environment variables in your deployment environment. 
In particular, ensure `NEXT_PUBLIC_LANGGRAPH_API_URL` points to your deployed backend URL.\n\n## Customizing the Agent\n\nYou can customize the agent on the backend and frontend.\n\n### Backend\n\n- In the configuration file `src\u002Fshared\u002Fconfiguration.ts`, you can change the default configs (the vector store, k-value, and filter kwargs) shared between the ingestion and retrieval graphs. On the backend, configs can be used in each node of the graph workflow; from the frontend, you can pass a config object into the graph's client.\n- You can adjust the prompts in the `src\u002Fretrieval_graph\u002Fprompts.ts` file.\n- If you'd like to change the retrieval model, you can do so in the `src\u002Fshared\u002Fretrieval.ts` file by adding another retriever function that encapsulates the desired client for the vector store and then updating the `makeRetriever` function to return the new retriever.\n\n\n### Frontend\n\n- You can modify the file upload restrictions in the `app\u002Fapi\u002Fingest` route.\n- In `constants\u002FgraphConfigs.ts`, you can change the default config objects sent to the ingestion and retrieval graphs. These include the model provider, k value (number of source documents to retrieve), and retriever provider (i.e. vector store).\n\n\n## Troubleshooting\n\n1. `.env` Not Loaded\n   - Make sure you copied `.env.example` to `.env` in both backend and frontend.\n   - Check that your environment variables are correct and restart the dev server.\n\n2. Supabase Vector Store\n   - Ensure you have configured your Supabase instance with the `documents` table and `match_documents` function. Check the official LangChain docs on Supabase integration.\n\n3. OpenAI Errors\n   - Double-check your `OPENAI_API_KEY`. Make sure you have enough credits\u002Fquota.\n\n4. LangGraph Not Running\n   - If `yarn langgraph:dev` fails, confirm your Node version is >= 18 and that you have all dependencies installed.\n\n5. 
Network Errors\n   - The frontend must point to the correct `NEXT_PUBLIC_LANGGRAPH_API_URL`. By default, it is http:\u002F\u002Flocalhost:2024.\n\n## Next Steps\n\nIf you'd like to contribute to this project, feel free to open a pull request. Ensure it is well documented and includes tests in the test files.\n\nIf you'd like to learn more about building AI chatbots and agents with LangChain and LangGraph, check out the book [Learning LangChain (O'Reilly)](https:\u002F\u002Fwww.oreilly.com\u002Flibrary\u002Fview\u002Flearning-langchain\u002F9781098167271\u002F).\n\n","This project is an AI PDF chatbot built with LangChain and LangGraph that can process PDF documents and answer user questions. Its core features include storing vector embeddings in Supabase, generating answers with OpenAI or another LLM provider, and real-time streaming responses. It also integrates LangGraph's state machine approach to orchestrate document ingestion and retrieval, and provides a Next.js frontend for file uploads and real-time chat. The project suits scenarios that require quickly extracting information from large numbers of PDF documents or building a custom knowledge-base query system.",2,"2026-06-11 02:56:32","top_language"]