[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-5600":3},{"id":4,"name":5,"fullName":6,"owner":5,"repo":5,"description":7,"homepage":8,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":17,"compositeScore":19,"rankGlobal":9,"rankLanguage":9,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":23,"topics":24,"createdAt":9,"pushedAt":9,"updatedAt":42,"readmeContent":43,"aiSummary":44,"trendingCount":15,"starSnapshotCount":15,"syncStatus":16,"lastSyncTime":45,"discoverSource":46},5600,"postgresml","postgresml\u002Fpostgresml","Postgres with GPUs for ML\u002FAI apps.","https:\u002F\u002Fpostgresml.org",null,"Rust",6800,361,56,84,0,2,6,17,72.38,"MIT License",false,"master",true,[25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41],"ai","ann","approximate-nearest-neighbor-search","artificial-intelligence","classification","clustering","embeddings","forecasting","knn","llm","machine-learning","ml","postgres","rag","regression","sql","vector-database","2026-06-12 04:00:25","\u003Cdiv align=\"center\">\n   \u003Cpicture>\n     \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F5d5510da-6014-4cf3-849f-566050e053da\">\n     \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Faea1c38a-15bf-4270-8365-3d5e6311f5fc\">\n     \u003Cimg alt=\"Logo\" src=\"\" width=\"520\">\n   \u003C\u002Fpicture>\n\u003C\u002Fdiv>\n\n\u003Cp align=\"center\">\n   \u003Cp align=\"center\">\u003Cb>Postgres + GPUs for ML\u002FAI applications.\u003C\u002Fb>\u003C\u002Fp>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n| \u003Ca 
href=\"https:\u002F\u002Fpostgresml.org\u002Fdocs\u002F\">\u003Cb>Documentation\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fpostgresml.org\u002Fblog\">\u003Cb>Blog\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FDmyJP3qJ7U\">\u003Cb>Discord\u003C\u002Fb>\u003C\u002Fa> |\n\u003C\u002Fp>\n\n---\nWhy do ML\u002FAI in Postgres?\n\nData for ML & AI systems is inherently larger and more dynamic than the models. It's more efficient, manageable and reliable to move models to the database, rather than constantly moving data to the models.\n\n- [Getting started](#getting-started)\n    - [PostgresML Cloud](#postgresml-cloud)\n    - [Self-hosted](#self-hosted)\n    - [Ecosystem](#ecosystem)\n- [Large Language Models](#large-language-models)\n    - [Hugging Face](#hugging-face)\n    - [OpenAI and Other Providers](#openai-and-other-providers)\n- [RAG](#rag)\n    - [Chunk](#chunk)\n    - [Embed](#embed)\n    - [Rank](#rank)\n    - [Transform](#transform)\n- [Machine Learning](#machine-learning)\n\n## Architecture\n\n\u003Cdiv align=\"center\">\n   \u003Cpicture>\n     \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fe27f8bda-1fe6-49f8-b9d8-ef563e0150e5\">\n     \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F09bbed94-b73f-447b-95d9-2d4a7727c3aa\">\n     \u003Cimg alt=\"Logo\" src=\"\" width=\"784\">\n   \u003C\u002Fpicture>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\">\n\u003Cb>PostgresML is a powerful Postgres extension that seamlessly combines data storage and machine learning inference within your database\u003C\u002Fb>. 
By integrating these functionalities, PostgresML eliminates the need for separate systems and data transfers, enabling you to perform ML operations directly on your data where it resides.\n\u003C\u002Fdiv>\n\n## Features at a glance\n\n- **In-Database ML\u002FAI**: Run machine learning and AI operations directly within PostgreSQL\n- **GPU Acceleration**: Leverage GPU power for faster computations and model inference\n- **Large Language Models**: Integrate and use state-of-the-art LLMs from Hugging Face\n- **RAG Pipeline**: Built-in functions for chunking, embedding, ranking, and transforming text\n- **Vector Search**: Efficient similarity search using pgvector integration\n- **Diverse ML Algorithms**: 47+ classification and regression algorithms available\n- **High Performance**: 8-40X faster inference compared to HTTP-based model serving\n- **Scalability**: Support for millions of transactions per second and horizontal scaling\n- **NLP Tasks**: Wide range of natural language processing capabilities\n- **Security**: Enhanced data privacy by keeping models and data together\n- **Seamless Integration**: Works with existing PostgreSQL tools and client libraries\n\n# Getting started\n\nThe only prerequisite for using PostgresML is a Postgres database with our open-source `pgml` extension installed.\n\n## PostgresML Cloud\n\nOur serverless cloud is the easiest and recommended way to get started.\n\n[Sign up for a free PostgresML account](https:\u002F\u002Fpostgresml.org\u002Fsignup). 
You'll get a free database in seconds, with access to GPUs and state-of-the-art LLMs.\n\n## Self-hosted\n\nIf you don't want to use our cloud, you can self-host PostgresML with Docker.\n\n```\ndocker run \\\n    -it \\\n    -v postgresml_data:\u002Fvar\u002Flib\u002Fpostgresql \\\n    -p 5433:5432 \\\n    -p 8000:8000 \\\n    ghcr.io\u002Fpostgresml\u002Fpostgresml:2.10.0 \\\n    sudo -u postgresml psql -d postgresml\n```\n\nFor more details, take a look at our [Quick Start with Docker](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fdevelopers\u002Fquick-start-with-docker) documentation.\n\n## Ecosystem\n\nWe have a number of other tools and libraries that are specifically designed to work with PostgresML. Remember that PostgresML is a Postgres extension running inside Postgres, so you can connect with `psql` and use any of your favorite tooling and client libraries, like [psycopg](https:\u002F\u002Fwww.psycopg.org\u002Fpsycopg3\u002F), to connect and run queries.\n\n\u003Cb>PostgresML Specific Client Libraries:\u003C\u002Fb>\n- [Korvus](https:\u002F\u002Fgithub.com\u002Fpostgresml\u002Fkorvus) - Korvus is a Python, JavaScript, Rust and C search SDK that unifies the entire RAG pipeline in a single database query.\n- [postgresml-django](https:\u002F\u002Fgithub.com\u002Fpostgresml\u002Fpostgresml-django) - postgresml-django is a Python module that integrates PostgresML with Django ORM.\n\n\u003Cb>Recommended Postgres Poolers:\u003C\u002Fb>\n- [pgcat](https:\u002F\u002Fgithub.com\u002Fpostgresml\u002Fpgcat) - pgcat is a PostgreSQL pooler with sharding, load balancing and failover support.\n\n# Large language models\n\nPostgresML brings models directly to your data, eliminating the need for costly and time-consuming data transfers. 
This approach significantly enhances performance, security, and scalability for AI-driven applications.\n\nBy running models within the database, PostgresML enables:\n\n- Reduced latency and improved query performance\n- Enhanced data privacy and security\n- Simplified infrastructure management\n- Seamless integration with existing database operations\n\n## Hugging Face\n\nPostgresML supports a wide range of state-of-the-art deep learning architectures available on the Hugging Face [model hub](https:\u002F\u002Fhuggingface.co\u002Fmodels). This integration allows you to:\n\n- Access thousands of pre-trained models\n- Utilize cutting-edge NLP, computer vision, and other AI models\n- Easily experiment with different architectures\n\n## OpenAI and other providers\n\nWhile cloud-based LLM providers offer powerful capabilities, making API calls from within the database can introduce latency, security risks, and potential compliance issues. Currently, PostgresML does not directly support integration with remote LLM providers like OpenAI.\n\n# RAG\n\nPostgresML transforms your PostgreSQL database into a powerful vector database for Retrieval-Augmented Generation (RAG) applications. It leverages pgvector for efficient storage and retrieval of embeddings.\n\nOur RAG implementation is built on four key SQL functions:\n\n1. [Chunk](#chunk): Splits text into manageable segments\n2. [Embed](#embed): Generates vector embeddings from text using pre-trained models\n3. [Rank](#rank): Performs similarity search on embeddings\n4. [Transform](#transform): Applies language models for text generation or transformation\n\nFor more information on using RAG with PostgresML see our guide on [Unified RAG](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Funified-rag).\n\n## Chunk\n\nThe `pgml.chunk` function chunks documents using the specified splitter. 
This is typically done before embedding.\n\n```postgresql\npgml.chunk(\n    splitter TEXT,    -- splitter name\n    text TEXT,        -- text to chunk\n    kwargs JSON       -- optional splitter arguments (see docs)\n)\n```\n\nSee [pgml.chunk docs](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fapi\u002Fpgml.chunk) for more information.\n\n## Embed\n\nThe `pgml.embed` function generates embeddings from text using in-database models.\n\n```postgresql\npgml.embed(\n    transformer TEXT,\n    \"text\" TEXT,\n    kwargs JSONB\n)\n```\nSee [pgml.embed docs](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fapi\u002Fpgml.embed) for more information.\n\n## Rank\n\nThe `pgml.rank` function uses [Cross-Encoders](https:\u002F\u002Fwww.sbert.net\u002Fexamples\u002Fapplications\u002Fcross-encoder\u002FREADME.html) to score sentence pairs.\n\nThis is typically used as a re-ranking step when performing search.\n\n```postgresql\npgml.rank(\n    transformer TEXT,\n    query TEXT,\n    documents TEXT[],\n    kwargs JSONB\n)\n```\n\nDocs coming soon.\n\n## Transform\n\nThe `pgml.transform` function can be used to generate text.\n\n```postgresql\nSELECT pgml.transform(\n    task   => TEXT OR JSONB,     -- Pipeline initializer arguments\n    inputs => TEXT[] OR BYTEA[], -- inputs for inference\n    args   => JSONB              -- (optional) arguments to the pipeline.\n)\n```\n\nSee [pgml.transform docs](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fapi\u002Fpgml.transform) for more information.\n\nSee our [Text Generation guide](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Fllms\u002Ftext-generation) for a walkthrough of generating text.\n\n# Machine learning\n\n\u003Cb>Some highlights:\u003C\u002Fb>\n- [47+ classification and regression algorithms](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fapi\u002Fpgml.train)\n- [8-40X faster inference than 
HTTP-based model serving](https:\u002F\u002Fpostgresml.org\u002Fblog\u002Fpostgresml-is-8x-faster-than-python-http-microservices)\n- [Millions of transactions per second](https:\u002F\u002Fpostgresml.org\u002Fblog\u002Fscaling-postgresml-to-one-million-requests-per-second)\n- [Horizontal scalability](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgcat\u002F)\n\n**Training a classification model**\n\n*Training*\n```postgresql\n-- Named arguments must come after all positional arguments in Postgres.\nSELECT * FROM pgml.train(\n    'Handwritten Digit Image Classifier',\n    'classification',\n    'pgml.digits',\n    'target',\n    algorithm => 'xgboost'\n);\n```\n\n*Inference*\n```postgresql\n-- Substitute your own project name and a feature array that matches the training data.\nSELECT pgml.predict(\n    'My Classification Project',\n    ARRAY[0.1, 2.0, 5.0]\n) AS prediction;\n```\n\n## NLP\n\nThe `pgml.transform` function exposes a number of available NLP tasks.\n\nAvailable tasks are:\n- [Text Classification](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Fllms\u002Ftext-classification)\n- [Zero-Shot Classification](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Fllms\u002Fzero-shot-classification)\n- [Token Classification](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Fllms\u002Ftoken-classification)\n- [Translation](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Fllms\u002Ftranslation)\n- [Summarization](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Fllms\u002Fsummarization)\n- [Question Answering](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Fllms\u002Fquestion-answering)\n- [Text Generation](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Fllms\u002Ftext-generation)\n- [Text-to-Text Generation](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Fllms\u002Ftext-to-text-generation)\n- 
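[Fill-Mask](https:\u002F\u002Fpostgresml.org\u002Fdocs\u002Fopen-source\u002Fpgml\u002Fguides\u002Fllms\u002Ffill-mask)\n","PostgresML is an extension that integrates machine learning and AI capabilities directly into the PostgreSQL database. With GPU acceleration support, it lets users run ML\u002FAI operations efficiently inside the database, including large language models from Hugging Face, RAG pipeline processing, and vector search. The project is particularly suited to applications that need real-time predictive analytics over large datasets, such as recommendation systems, natural language processing tasks, or any intelligent application that requires fast responses. Its core advantage is that it reduces the cost and complexity of data migration while delivering inference 8 to 40 times faster than traditional HTTP-based serving, and it supports millions of transactions per second as well as horizontal scaling.","2026-06-11 03:04:16","top_language"]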