[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72276":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":16,"starSnapshotCount":16,"syncStatus":30,"lastSyncTime":31,"discoverSource":32},72276,"HippoRAG","OSU-NLP-Group\u002FHippoRAG","OSU-NLP-Group","[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. RAG + Knowledge Graphs + Personalized PageRank.","https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.14831",null,"Python",3573,366,28,19,0,8,18,71,24,91.79,"MIT License",false,"main",true,[],"2026-06-12 04:01:04","\u003Ch1 align=\"center\">HippoRAG 2: From RAG to Memory\u003C\u002Fh1>\n\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Fraw\u002Fmain\u002Fimages\u002Fhippo_brain.png\" width=\"55%\" style=\"max-width: 300px;\">\n\u003C\u002Fp>\n\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" \u002F>](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1nuelysWsXL8F5xH6q4JYJI8mvtlmeM9O#scrollTo=TjHdNe2KC81K)\n\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2502.14802 HippoRAG 2-b31b1b\" \u002F>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.14802)\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗 Dataset-HippoRAG 
2-yellow\" \u002F>](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fosunlp\u002FHippoRAG_2\u002Ftree\u002Fmain)\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2405.14831 HippoRAG 1-b31b1b\" \u002F>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.14831)\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGitHub-HippoRAG 1-blue\" \u002F>](https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Ftree\u002Flegacy)\n\n### HippoRAG 2 is a powerful memory framework for LLMs that enhances their ability to recognize and utilize connections in new knowledge—mirroring a key function of human long-term memory.\n\nOur experiments show that HippoRAG 2 improves associativity (multi-hop retrieval) and sense-making (the process of integrating large and complex contexts) in even the most advanced RAG systems, without sacrificing their performance on simpler tasks.\n\nLike its predecessor, HippoRAG 2 remains cost and latency efficient in online processes, while using significantly fewer resources for offline indexing compared to other graph-based solutions such as GraphRAG, RAPTOR, and LightRAG.\n\n\u003Cp align=\"center\">\n  \u003Cimg align=\"center\" src=\"https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Fraw\u002Fmain\u002Fimages\u002Fintro.png\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cb>Figure 1:\u003C\u002Fb> Evaluation of continual learning capabilities across three key dimensions: factual memory (NaturalQuestions, PopQA), sense-making (NarrativeQA), and associativity (MuSiQue, 2Wiki, HotpotQA, and LV-Eval). 
HippoRAG 2 surpasses other methods across all\ncategories, bringing it one step closer to true long-term memory.\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg align=\"center\" src=\"https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Fraw\u002Fmain\u002Fimages\u002Fmethodology.png\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cb>Figure 2:\u003C\u002Fb> HippoRAG 2 methodology.\n\u003C\u002Fp>\n\n#### Check out our papers to learn more:\n\n* [**HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.14831) [NeurIPS '24].\n* [**From RAG to Memory: Non-Parametric Continual Learning for Large Language Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.14802) [ICML '25].\n\n----\n\n## Installation\n\n```sh\nconda create -n hipporag python=3.10\nconda activate hipporag\npip install hipporag\n```\nSet the environment variables and activate the environment:\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1,2,3\nexport HF_HOME=\u003Cpath to Huggingface home directory>\nexport OPENAI_API_KEY=\u003Cyour openai api key>   # if you want to use OpenAI model\n\nconda activate hipporag\n```\n\n## Quick Start\n\n### OpenAI Models\n\nThis simple example illustrates how to use `hipporag` with any OpenAI model:\n\n```python\nfrom hipporag import HippoRAG\n\n# Prepare datasets and evaluation\ndocs = [\n    \"Oliver Badman is a politician.\",\n    \"George Rankin is a politician.\",\n    \"Thomas Marwick is a politician.\",\n    \"Cinderella attended the royal ball.\",\n    \"The prince used the lost glass slipper to search the kingdom.\",\n    \"When the slipper fit perfectly, Cinderella was reunited with the prince.\",\n    \"Erik Hort's birthplace is Montebello.\",\n    \"Marina was born in Minsk.\",\n    \"Montebello is a part of Rockland County.\"\n]\n\nsave_dir = 'outputs'  # Define save directory for HippoRAG objects (each LLM\u002FEmbedding model combination 
will create a new subdirectory)\nllm_model_name = 'gpt-4o-mini'  # Any OpenAI model name\nembedding_model_name = 'nvidia\u002FNV-Embed-v2'  # Embedding model name (NV-Embed, GritLM or Contriever for now)\n\n# Start up a HippoRAG instance\nhipporag = HippoRAG(save_dir=save_dir,\n                    llm_model_name=llm_model_name,\n                    embedding_model_name=embedding_model_name)\n\n# Run indexing\nhipporag.index(docs=docs)\n\n# Separate retrieval & QA\nqueries = [\n    \"What is George Rankin's occupation?\",\n    \"How did Cinderella reach her happy ending?\",\n    \"What county is Erik Hort's birthplace a part of?\"\n]\n\nretrieval_results = hipporag.retrieve(queries=queries, num_to_retrieve=2)\nqa_results = hipporag.rag_qa(retrieval_results)\n\n# Combined retrieval & QA\nrag_results = hipporag.rag_qa(queries=queries)\n\n# For evaluation\nanswers = [\n    [\"Politician\"],\n    [\"By going to the ball.\"],\n    [\"Rockland County\"]\n]\n\ngold_docs = [\n    [\"George Rankin is a politician.\"],\n    [\"Cinderella attended the royal ball.\",\n    \"The prince used the lost glass slipper to search the kingdom.\",\n    \"When the slipper fit perfectly, Cinderella was reunited with the prince.\"],\n    [\"Erik Hort's birthplace is Montebello.\",\n    \"Montebello is a part of Rockland County.\"]\n]\n\nrag_results = hipporag.rag_qa(queries=queries,\n                              gold_docs=gold_docs,\n                              gold_answers=answers)\n```\n\n#### Example (OpenAI Compatible Embeddings)\n\nTo use LLMs and embedding models served through OpenAI-compatible endpoints, pass their base URLs as follows:\n\n```python\nhipporag = HippoRAG(save_dir=save_dir,\n    llm_model_name='Your LLM Model name',\n    llm_base_url='Your LLM Model url',\n    embedding_model_name='Your Embedding model name',\n    embedding_base_url='Your Embedding model url')\n```\n\n### Local Deployment (vLLM)\n\nThis simple example illustrates how to use `hipporag` with any 
vLLM-compatible locally deployed LLM.\n\n1. Run a local [OpenAI-compatible vLLM server](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fgetting_started\u002Fquickstart.html#quickstart-online) with specified GPUs (make sure you leave enough memory for your embedding model).\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003Cpath to Huggingface home directory>\n\nconda activate hipporag  # vllm should be in this environment\n\n# Tune gpu-memory-utilization or max_model_len to fit your GPU memory, if OOM occurs\nvllm serve meta-llama\u002FLlama-3.3-70B-Instruct --tensor-parallel-size 2 --max_model_len 4096 --gpu-memory-utilization 0.95\n```\n\n2. Use `hipporag` with code very similar to the OpenAI example above:\n\n```python\nfrom hipporag import HippoRAG\n\nsave_dir = 'outputs'  # Define save directory for HippoRAG objects (each LLM\u002FEmbedding model combination will create a new subdirectory)\nllm_model_name = 'meta-llama\u002FLlama-3.3-70B-Instruct'  # Name of the model served by vLLM in step 1\nembedding_model_name = 'nvidia\u002FNV-Embed-v2'  # Embedding model name (NV-Embed, GritLM or Contriever for now)\nllm_base_url = 'http:\u002F\u002Flocalhost:8000\u002Fv1'  # Base URL of your deployed LLM (adjust host and port to your server)\n\nhipporag = HippoRAG(save_dir=save_dir,\n                    llm_model_name=llm_model_name,\n                    embedding_model_name=embedding_model_name,\n                    llm_base_url=llm_base_url)\n\n# Same Indexing, Retrieval and QA as running OpenAI models above\n```\n\n## Testing\n\nWhen contributing to HippoRAG, please run the scripts below to ensure that your changes do not cause unexpected behavior in our core modules.\n\nThese scripts test indexing, graph loading, document deletion and incremental updates to a HippoRAG object.\n\n### OpenAI Test\n\nTo test HippoRAG with an OpenAI LLM and embedding model, simply run the following. 
\nThe cost of this test will be negligible.\n\n```sh\nexport OPENAI_API_KEY=\u003Cyour openai api key>\n\nconda activate hipporag\n\npython tests_openai.py\n```\n\n### Local Test\n\nTo test locally, you must deploy a vLLM instance. We deploy the smaller 8B model `Llama-3.1-8B-Instruct` for cheaper testing.\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003Cpath to Huggingface home directory>\n\nconda activate hipporag  # vllm should be in this environment\n\n# Tune gpu-memory-utilization or max_model_len to fit your GPU memory, if OOM occurs\n# Only one GPU is visible here, so tensor-parallel-size is 1\nvllm serve meta-llama\u002FLlama-3.1-8B-Instruct --tensor-parallel-size 1 --max_model_len 4096 --gpu-memory-utilization 0.95 --port 6578\n```\n\nThen, run the following test script on a different GPU:\n\n```sh\nCUDA_VISIBLE_DEVICES=1 python tests_local.py\n```\n\n## Reproducing our Experiments\n\nTo run experiments with our code, we recommend that you clone this repository and follow the structure of the `main.py` script.\n\n### Data for Reproducibility\n\nWe evaluated several sampled datasets in our paper, some of which are already included in the `reproduce\u002Fdataset` directory of this repo. For the complete set of datasets, please visit\nour [HuggingFace dataset](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fosunlp\u002FHippoRAG_v2) and place them under `reproduce\u002Fdataset`. 
We also provide the OpenIE results for both `gpt-4o-mini` and `Llama-3.3-70B-Instruct` for our `musique` sample under `outputs\u002Fmusique`.\n\nTo test your environment is properly set up, you can use the small dataset `reproduce\u002Fdataset\u002Fsample.json` for debugging as shown below.\n\n### Running Indexing & QA\n\nInitialize the environmental variables and activate the environment:\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1,2,3\nexport HF_HOME=\u003Cpath to Huggingface home directory>\nexport OPENAI_API_KEY=\u003Cyour openai api key>   # if you want to use OpenAI model\n\nconda activate hipporag\n```\n\n### Run with OpenAI Model\n\n```sh\ndataset=sample  # or any other dataset under `reproduce\u002Fdataset`\n\n# Run OpenAI model\npython main.py --dataset $dataset --llm_base_url https:\u002F\u002Fapi.openai.com\u002Fv1 --llm_name gpt-4o-mini --embedding_name nvidia\u002FNV-Embed-v2\n```\n\n### Run with vLLM (Llama)\n\n1. As above, run a local [OpenAI-compatible vLLM server](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fgetting_started\u002Fquickstart.html#quickstart-online) with specified GPU.\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003Cpath to Huggingface home directory>\n\nconda activate hipporag  # vllm should be in this environment\n\n# Tune gpu-memory-utilization or max_model_len to fit your GPU memory, if OOM occurs\nvllm serve meta-llama\u002FLlama-3.3-70B-Instruct --tensor-parallel-size 2 --max_model_len 4096 --gpu-memory-utilization 0.95 \n```\n\n2. 
Use other GPUs to run the main program in another terminal.\n\n```sh\nexport CUDA_VISIBLE_DEVICES=2,3  # use different GPUs while the vLLM server is running\nexport HF_HOME=\u003Cpath to Huggingface home directory>\ndataset=sample\n\npython main.py --dataset $dataset --llm_base_url http:\u002F\u002Flocalhost:8000\u002Fv1 --llm_name meta-llama\u002FLlama-3.3-70B-Instruct --embedding_name nvidia\u002FNV-Embed-v2\n```\n\n#### Advanced: Run with vLLM offline batch\n\nvLLM offers an [offline batch mode](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fgetting_started\u002Fquickstart.html#offline-batched-inference) for faster inference, which can make indexing more than 3x faster than with the vLLM online server.\n\n1. Use the following command to run the main program with vLLM offline batch mode.\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1,2,3 # use all GPUs for faster offline indexing\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003Cpath to Huggingface home directory>\nexport OPENAI_API_KEY=''\ndataset=sample\n\npython main.py --dataset $dataset --llm_name meta-llama\u002FLlama-3.3-70B-Instruct --openie_mode offline --skip_graph\n```\n\n2. After the first step, the OpenIE results are saved to a file. 
Then run the vLLM online server and the main program as described in the `Run with vLLM (Llama)` section.\n\n## Debugging Note\n\n- `\u002Freproduce\u002Fdataset\u002Fsample.json` is a small dataset specifically for debugging.\n- When debugging vLLM offline mode, set `tensor_parallel_size` to `1` in `hipporag\u002Fllm\u002Fvllm_offline.py`.\n- If you want to rerun a particular experiment, remember to clear the saved files, including OpenIE results and knowledge graph, e.g.,\n\n```sh\nrm reproduce\u002Fdataset\u002Fopenie_results\u002Fopenie_sample_results_ner_meta-llama_Llama-3.3-70B-Instruct_3.json\nrm -rf outputs\u002Fsample\u002Fsample_meta-llama_Llama-3.3-70B-Instruct_nvidia_NV-Embed-v2\n```\n\n### Custom Datasets\n\nTo set up your own custom dataset for evaluation, follow the format and naming convention shown in `reproduce\u002Fdataset\u002Fsample_corpus.json` (your dataset's name should be followed by `_corpus.json`). If running an experiment with pre-defined questions, organize your query corpus according to the query file `reproduce\u002Fdataset\u002Fsample.json`, and be sure to follow our naming convention there as well.\n\nThe corpus and optional query JSON files should have the following format:\n\n#### Retrieval Corpus JSON\n\n```json\n[\n  {\n    \"title\": \"FIRST PASSAGE TITLE\",\n    \"text\": \"FIRST PASSAGE TEXT\",\n    \"idx\": 0\n  },\n  {\n    \"title\": \"SECOND PASSAGE TITLE\",\n    \"text\": \"SECOND PASSAGE TEXT\",\n    \"idx\": 1\n  }\n]\n```\n\n#### (Optional) Query JSON\n\n```json\n\n[\n  {\n    \"id\": \"sample\u002Fquestion_1.json\",\n    \"question\": \"QUESTION\",\n    \"answer\": [\n      \"ANSWER\"\n    ],\n    \"answerable\": true,\n    \"paragraphs\": [\n      {\n        \"title\": \"{FIRST SUPPORTING PASSAGE TITLE}\",\n        \"text\": \"{FIRST SUPPORTING PASSAGE TEXT}\",\n        \"is_supporting\": true,\n        \"idx\": 0\n      },\n      {\n        \"title\": \"{SECOND SUPPORTING PASSAGE TITLE}\",\n        \"text\": \"{SECOND 
SUPPORTING PASSAGE TEXT}\",\n        \"is_supporting\": true,\n        \"idx\": 1\n      }\n    ]\n  }\n]\n```\n\n#### (Optional) Chunking Corpus\n\nWhen preparing your data, you may need to chunk each passage, as longer passages may be too complex for the OpenIE process.\n\n## Code Structure\n\n```\n📦 .\n│-- 📂 src\u002Fhipporag\n│   ├── 📂 embedding_model          # Implementation of all embedding models\n│   │   ├── __init__.py             # Getter function for specific embedding model classes\n|   |   ├── base.py                 # Base embedding model class `BaseEmbeddingModel` to inherit and `EmbeddingConfig`\n|   |   ├── NVEmbedV2.py            # Implementation of NV-Embed-v2 model\n|   |   ├── ...\n│   ├── 📂 evaluation               # Implementation of all evaluation metrics\n│   │   ├── __init__.py\n|   |   ├── base.py                 # Base evaluation metric class `BaseMetric` to inherit\n│   │   ├── qa_eval.py              # Eval metrics for QA\n│   │   ├── retrieval_eval.py       # Eval metrics for retrieval\n│   ├── 📂 information_extraction  # Implementation of all information extraction models\n│   │   ├── __init__.py\n|   |   ├── openie_openai_gpt.py    # Model for OpenIE with OpenAI GPT\n|   |   ├── openie_vllm_offline.py  # Model for OpenIE with LLMs deployed offline with vLLM\n│   ├── 📂 llm                      # Classes for inference with large language models\n│   │   ├── __init__.py             # Getter function\n|   |   ├── base.py                 # Config class for LLM inference and base LLM inference class to inherit\n|   |   ├── openai_gpt.py           # Class for inference with OpenAI GPT\n|   |   ├── vllm_llama.py           # Class for inference using a local vLLM server\n|   |   ├── vllm_offline.py         # Class for inference using the vLLM API directly\n│   ├── 📂 prompts                  # Prompt templates and prompt template manager class\n|   │   ├── 📂 dspy_prompts         # Prompts for filtering\n|   │   │   ├── ...\n|   │   ├── 📂 
templates            # All prompt templates for template manager to load\n|   │   │   ├── README.md           # Documentation on using the prompt template manager and prompt template files\n|   │   │   ├── __init__.py\n|   │   │   ├── triple_extraction.py\n|   │   │   ├── ...\n│   │   ├── __init__.py\n|   |   ├── linking.py              # Instruction for linking\n|   |   ├── prompt_template_manager.py  # Implementation of prompt template manager\n│   ├── 📂 utils                    # All utility functions used across this repo (the file name indicates its relevant usage)\n│   │   ├── config_utils.py         # We use only one config across all modules and its setup is specified here\n|   |   ├── ...\n│   ├── __init__.py\n│   ├── HippoRAG.py          # Highest level class for initiating retrieval, question answering, and evaluations\n│   ├── embedding_store.py   # Storage database to load, manage and save embeddings for passages, entities and facts.\n│   ├── rerank.py            # Reranking and filtering methods\n│-- 📂 examples\n│   ├── ...\n│   ├── ...\n│-- 📜 README.md\n│-- 📜 requirements.txt   # Dependencies list\n│-- 📜 .gitignore         # Files to exclude from Git\n\n\n```\n\n## Contact\n\nQuestions or issues? 
File an issue or contact \n[Bernal Jiménez Gutiérrez](mailto:jimenezgutierrez.1@osu.edu),\n[Yiheng Shu](mailto:shu.251@osu.edu),\n[Yu Su](mailto:su.809@osu.edu),\nThe Ohio State University\n\n## Citation\n\nIf you find this work useful, please consider citing our papers:\n\n### HippoRAG 2\n```\n@misc{gutiérrez2025ragmemorynonparametriccontinual,\n      title={From RAG to Memory: Non-Parametric Continual Learning for Large Language Models}, \n      author={Bernal Jiménez Gutiérrez and Yiheng Shu and Weijian Qi and Sizhe Zhou and Yu Su},\n      year={2025},\n      eprint={2502.14802},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.14802}, \n}\n```\n\n### HippoRAG\n\n```\n@inproceedings{gutiérrez2024hipporag,\n      title={HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models}, \n      author={Bernal Jiménez Gutiérrez and Yiheng Shu and Yu Gu and Michihiro Yasunaga and Yu Su},\n      booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},\n      year={2024},\n      url={https:\u002F\u002Fopenreview.net\u002Fforum?id=hkujvAPVsg}\n}\n```\n\n## TODO:\n\n- [x] Add support for more embedding models\n- [x] Add support for embedding endpoints\n- [ ] Add support for vector database integration\n\nPlease feel free to open an issue or PR if you have any questions or suggestions.\n","HippoRAG is a novel retrieval-augmented generation (RAG) framework inspired by human long-term memory that enables large language models to continuously integrate knowledge from external documents. By combining knowledge graphs with the Personalized PageRank algorithm, the project improves multi-hop retrieval and the understanding of complex contexts while keeping cost and latency low. HippoRAG suits applications that must process large amounts of information and extract valuable connections from it, such as intelligent question-answering systems, document understanding, and knowledge management platforms. Its open-source code is written in Python and has received wide attention and support on GitHub.",2,"2026-06-11 03:41:09","high_star"]