[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72280":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":23,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":29,"readmeContent":30,"aiSummary":31,"trendingCount":16,"starSnapshotCount":16,"syncStatus":32,"lastSyncTime":33,"discoverSource":34},72280,"FlashRAG","RUC-NLPIR\u002FFlashRAG","RUC-NLPIR","⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)","https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.13576",null,"Python",3501,305,20,34,0,3,18,62.76,"MIT License",false,"main",true,[25,26,27,28],"benchmark","datasets","large-language-models","retrieval-augmented-generation","2026-06-12 04:01:04","# \u003Cdiv align=\"center\">⚡FlashRAG: A Python Toolkit for Efficient RAG Research\u003Cdiv>\n\\[ English | [中文](README_zh.md) \\]\n\u003Cdiv align=\"center\">\n\u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.13576\" target=\"_blank\">\u003Cimg src=https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-b5212f.svg?logo=arxiv>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FRUC-NLPIR\u002FFlashRAG_datasets\u002F\" target=\"_blank\">\u003Cimg src=https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20HuggingFace%20Datasets-27b3b4.svg>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fwww.modelscope.cn\u002Fdatasets\u002Fhhjinjiajie\u002FFlashRAG_Dataset\" target=\"_blank\">\u003Cimg 
src=https:\u002F\u002Fcustom-icon-badges.demolab.com\u002Fbadge\u002FModelScope%20Datasets-624aff?style=flat&logo=modelscope&logoColor=white>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fdeepwiki.com\u002FRUC-NLPIR\u002FFlashRAG\">\u003Cimg src=\"https:\u002F\u002Fdevin.ai\u002Fassets\u002Fdeepwiki-badge.png\" alt=\"DeepWiki Document\" height=\"20\"\u002F>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FRUC-NLPIR\u002FFlashRAG\u002Fblob\u002Fmain\u002FLICENSE\">\u003Cimg alt=\"License\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLICENSE-MIT-green\">\u003C\u002Fa>\n\u003Ca>\u003Cimg alt=\"Static Badge\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fmade_with-Python-blue\">\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n\u003Ch4 align=\"center\">\n\n\u003Cp>\n\u003Ca href=\"#wrench-installation\">Installation\u003C\u002Fa> |\n\u003Ca href=\"#sparkles-features\">Features\u003C\u002Fa> |\n\u003Ca href=\"#rocket-quick-start\">Quick-Start\u003C\u002Fa> |\n\u003Ca href=\"#gear-components\"> Components\u003C\u002Fa> |\n\u003Ca href=\"#art-flashrag-ui\"> FlashRAG-UI\u003C\u002Fa> |\n\u003Ca href=\"#robot-supporting-methods\"> Supporting Methods\u003C\u002Fa> |\n\u003Ca href=\"#notebook-supporting-datasets--document-corpus\"> Supporting Datasets\u003C\u002Fa> |\n\u003Ca href=\"#raised_hands-additional-faqs\"> FAQs\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003C\u002Fh4>\n\n\nFlashRAG is a Python toolkit for the reproduction and development of Retrieval Augmented Generation (RAG) research. Our toolkit includes 36 pre-processed benchmark RAG datasets and **23 state-of-the-art RAG algorithms**, including **7 reasoning-based methods** that combine reasoning ability with retrieval.\n\n\u003Cp align=\"center\">\n\u003Cimg src=\"asset\u002Fframework.jpg\">\n\u003C\u002Fp>\n\nWith FlashRAG and provided resources, you can effortlessly reproduce existing SOTA works in the RAG domain or implement your custom RAG processes and components. 
Besides, we provide an easy-to-use UI:\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F8ca00873-5df2-48a7-b853-89e7b18bc6e9\n\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F10454\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Ftrendshift.io\u002Fapi\u002Fbadge\u002Frepositories\u002F10454\" alt=\"RUC-NLPIR%2FFlashRAG | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n## :link: Navigation\n- [Features](#sparkles-features)\n- [Roadmap](#mag_right-roadmap)\n- [Changelog](#page_with_curl-changelog)\n- [Installation](#wrench-installation)\n- [Quick Start](#rocket-quick-start)\n- [Components](#gear-components)\n- [FlashRAG-UI](#art-flashrag-ui)\n- [Supporting Methods](#robot-supporting-methods)\n- [Supporting Datasets & Document Corpus](#notebook-supporting-datasets--document-corpus)\n- [Additional FAQs](#raised_hands-additional-faqs)\n- [License](#bookmark-license)\n- [Citation](#star2-citation)\n\n## :sparkles: Features\n\n- **Extensive and Customizable Framework**: Includes essential components for RAG scenarios such as retrievers, rerankers, generators, and compressors, allowing for flexible assembly of complex pipelines.\n\n- **Comprehensive Benchmark Datasets**: A collection of 36 pre-processed RAG benchmark datasets to test and validate RAG models' performances.\n\n- **Pre-implemented Advanced RAG Algorithms**: Features **23 advancing RAG algorithms** with reported results, based on our framework. 
Results can be easily reproduced under different settings.\n\n- **🚀 Reasoning-based Methods**: **NEW!** We now support **7 reasoning-based methods** that combine reasoning ability with retrieval, achieving superior performance on complex multi-hop tasks.\n\n- **Efficient Preprocessing Stage**: Simplifies the RAG workflow preparation by providing various scripts like corpus processing for retrieval, retrieval index building, and pre-retrieval of documents.\n\n- **Optimized Execution**: The library's efficiency is enhanced with tools like vLLM, FastChat for LLM inference acceleration, and Faiss for vector index management.\n\n- **Easy to Use UI**: We have developed an easy-to-use UI for quickly configuring and trying out the RAG baselines we have implemented, as well as running evaluation scripts in a visual interface.\n\n## :mag_right: Roadmap\n\nFlashRAG is still under development, with many open issues and room for improvement. We will keep updating it, and we sincerely welcome contributions to this open-source toolkit.\n\n- [x] Support OpenAI models\n- [x] Provide instructions for each component\n- [x] Integrate Sentence Transformers\n- [x] Support multimodal RAG\n- [x] Support reasoning-based methods\n- [ ] Include more RAG approaches\n- [ ] Enhance code adaptability and readability\n- [ ] Add support for API-based retriever (vLLM server)\n\n## :page_with_curl: Changelog\n[25\u002F11\u002F06] 🎯 NEW Retriever! We have integrated a Web Search Engine-based Retriever, which works with existing methods and can be enabled quickly with just a Serper API key! This enhancement significantly expands retrieval coverage and real-time capability, supporting dynamic information access and external knowledge augmentation. Experience a more flexible and powerful retrieval workflow now! 
\n\n[25\u002F08\u002F06] 🎯 **NEW!** We have added support for **Reasoning Pipeline**, which is a new paradigm that combines reasoning ability and retrieval, representing work that includes [R1-Searcher](https:\u002F\u002Fgithub.com\u002FSsmallSong\u002FR1-Searcher), [Search-R1](https:\u002F\u002Fgithub.com\u002FPeterGriffinJin\u002FSearch-R1),.... We evaluate the performance of the pipeline on various RAG benchmarks, it can achieve F1 scores close to 60 on multi hop inference datasets such as HotpotQA. See it in [**result table**](#robot-supporting-methods).\n\n[25\u002F03\u002F21] 🚀 **Major Update!** We have expanded our toolkit to support **23 state-of-the-art RAG algorithms**, including **7 reasoning-based methods** that significantly improve performance on complex reasoning tasks. This represents a major milestone in our toolkit's evolution!\n\n[25\u002F02\u002F24] 🔥🔥🔥 We have added support for **multimodal RAG**, including [**MLLMs like Llava, Qwen, InternVL**](https:\u002F\u002Fruc-nlpir.github.io\u002FFlashRAG\u002F#\u002Fzh-cn\u002Fcomponent\u002Fgenerator?id=%e5%a4%9a%e6%a8%a1%e6%80%81%e7%94%9f%e6%88%90%e5%99%a8), and various [**multimodal retrievers with Clip architecture**](https:\u002F\u002Fruc-nlpir.github.io\u002FFlashRAG\u002F#\u002Fzh-cn\u002Fcomponent\u002Fretriever?id=%e5%a4%9a%e6%a8%a1%e6%80%81%e6%a3%80%e7%b4%a2%e5%99%a8). More information can be found in our new version of arxiv article and our documentation. Try it!\n\n[25\u002F01\u002F21] Our technical paper [FlashRAG: A Python Toolkit for Efficient RAG Research](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.13576) is honored to have been accepted to the Resource Track of the 2025 **ACM Web Conference (WWW 2025)**. Please Check it out!\n\n[25\u002F01\u002F12] Introduce \u003Cstrong>FlashRAG-UI\u003C\u002Fstrong>, an easy to use interface. 
You can easily and quickly configure and experience the supported RAG methods and evaluate them on the benchmarks.\n\n[25\u002F01\u002F11] We have added support for a new method, [\u003Cu>RQRAG\u003C\u002Fu>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.00610); see it in [**reproduce_experiment**](docs\u002Foriginal_docs\u002Freproduce_experiment.md).\n\n[25\u002F01\u002F07] We now support the aggregation of multiple retrievers; see it in [**multi retriever usage**](https:\u002F\u002Fgithub.com\u002FRUC-NLPIR\u002FFlashRAG\u002Fblob\u002Fmain\u002Fdocs\u002Foriginal_docs\u002Fmulti_retriever_usage.md).\n\n[25\u002F01\u002F07] We have integrated a very flexible and lightweight corpus chunking library, [**Chonkie**](https:\u002F\u002Fgithub.com\u002Fchonkie-ai\u002Fchonkie?tab=readme-ov-file#usage), which supports various custom chunking methods (tokens, sentences, semantic, etc.). Use it in [\u003Cu>chunking doc corpus\u003C\u002Fu>](docs\u002Foriginal_docs\u002Fchunk-doc-corpus.md).\n\n[24\u002F10\u002F21] We have released a version based on the Paddle framework that supports Chinese hardware platforms. Please refer to [FlashRAG Paddle](https:\u002F\u002Fgithub.com\u002FRUC-NLPIR\u002FFlashRAG-Paddle) for details.\n\n[24\u002F10\u002F13] A new in-domain dataset and corpus, [DomainRAG](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2406.05654), has been added to the datasets. It is based on the internal enrollment data of Renmin University of China and covers seven types of tasks, which can be used for domain-specific RAG testing.\n\n[24\u002F09\u002F24] We have released a version based on the MindSpore framework that supports Chinese hardware platforms. 
Please refer to [FlashRAG MindSpore](https:\u002F\u002Fgithub.com\u002FRUC-NLPIR\u002FFlashRAG-MindSpore) for details.\n\n\u003Cdetails>\n\u003Csummary>Show more\u003C\u002Fsummary>\n\n[24\u002F09\u002F18] Due to the complexity and limitations of installing Pyserini in certain environments, we have introduced a lightweight `BM25s` package as an alternative (faster and easier to use). The retriever based on Pyserini will be deprecated in future versions. To use retriever with `bm25s`, just set `bm25_backend` to `bm25s` in config.\n\n[24\u002F09\u002F09] We add support for a new method [\u003Cu>Adaptive-RAG\u003C\u002Fu>](https:\u002F\u002Faclanthology.org\u002F2024.naacl-long.389.pdf), which can automatically select the RAG process to execute based on the type of query. See it result in [\u003Cu>result table\u003C\u002Fu>](#robot-supporting-methods).\n\n[24\u002F08\u002F02] We add support for a new method [\u003Cu>Spring\u003C\u002Fu>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.19670), significantly improve the performance of LLM by adding only a few token embeddings. See it result in [\u003Cu>result table\u003C\u002Fu>](#robot-supporting-methods).\n\n[24\u002F07\u002F17] Due to some unknown issues with HuggingFace, our original dataset link has been invalid. We have updated it. Please check the [new link](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FRUC-NLPIR\u002FFlashRAG_datasets\u002F) if you encounter any problems.\n\n[24\u002F07\u002F06] We add support for a new method: [\u003Cu>Trace\u003C\u002Fu>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.11460), which refine text by constructing a knowledge graph. 
See its [\u003Cu>results\u003C\u002Fu>](#robot-supporting-methods) and [\u003Cu>details\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fbaseline_details.md).\n\n[24\u002F06\u002F19] We added support for a new method, [\u003Cu>IRCoT\u003C\u002Fu>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.10509), and updated the [\u003Cu>result table\u003C\u002Fu>](#robot-supporting-methods).\n\n[24\u002F06\u002F15] We provide a [\u003Cu>demo\u003C\u002Fu>](.\u002Fexamples\u002Fquick_start\u002Fdemo_en.py) to perform the RAG process using our toolkit.\n\n[24\u002F06\u002F11] We have integrated `sentence transformers` in the retriever module. Now it's easier to use the retriever without setting pooling methods.\n\n[24\u002F06\u002F05] We have provided detailed documentation for reproducing existing methods (see [how to reproduce](.\u002Fdocs\u002Foriginal_docs\u002Freproduce_experiment.md) and [baseline details](.\u002Fdocs\u002Foriginal_docs\u002Fbaseline_details.md)), as well as [\u003Cu>configuration settings\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fconfiguration.md).\n\n[24\u002F06\u002F02] We have provided an introduction to FlashRAG for beginners; see [\u003Cu>an introduction to flashrag\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fintroduction_for_beginners_en.md) ([\u003Cu>中文版\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fintroduction_for_beginners_zh.md) [\u003Cu>한국어\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fintroduction_for_beginners_kr.md)).\n\n[24\u002F05\u002F31] We added support for OpenAI-series models as the generator.\n\n\u003C\u002Fdetails>\n\n## :wrench: Installation\n![PyPI - Version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fflashrag-dev) \n![PyPI - Downloads](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdw\u002Fflashrag-dev) \n![PyPI - Downloads](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Fflashrag-dev)\n\nTo get started with FlashRAG, you can simply install it with pip:\n\n```bash\npip install flashrag-dev --pre\n```\n\nOr 
you can clone it from GitHub and install (requires Python 3.10+):\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FRUC-NLPIR\u002FFlashRAG.git\ncd FlashRAG\npip install -e .\n```\n\nIf you want to use vllm, sentence-transformers or pyserini, you can install the optional dependencies:\n\n```bash\n# Install all extra dependencies (quoted so the shell does not expand the brackets)\npip install \"flashrag-dev[full]\"\n\n# Install vllm for faster speed (quoted so the shell does not treat >= as a redirect)\npip install \"vllm>=0.4.1\"\n\n# Install sentence-transformers\npip install sentence-transformers\n\n# Install pyserini for bm25\npip install pyserini\n```\n\nBecause installing `faiss` with `pip` can cause incompatibilities, install it with the following conda command instead.\n\n```bash\n# CPU-only version\nconda install -c pytorch faiss-cpu=1.8.0\n\n# GPU(+CPU) version\nconda install -c pytorch -c nvidia faiss-gpu=1.8.0\n```\n\nNote: The latest version of `faiss` cannot be installed on certain systems.\n\nFrom the official Faiss repository ([source](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffaiss\u002Fblob\u002Fmain\u002FINSTALL.md)):\n\n> - The CPU-only faiss-cpu conda package is currently available on Linux (x86_64 and arm64), OSX (arm64 only), and Windows (x86_64)\n> - faiss-gpu, containing both CPU and GPU indices, is available on Linux (x86_64 only) for CUDA 11.4 and 12.1\n\n## :rocket: Quick Start\n\n### Corpus Construction\nTo build an index, you first need to save your corpus as a `jsonl` file with each line representing a document.\n\n```jsonl\n{\"id\": \"0\", \"contents\": \"...\"}\n{\"id\": \"1\", \"contents\": \"...\"}\n```\n\nIf you want to use Wikipedia as your corpus, you can refer to our documentation [Processing Wikipedia](.\u002Fdocs\u002Foriginal_docs\u002Fprocess-wiki.md) to convert it into an indexable format.\n\n### Index Construction\n\nYou can use the following commands to build your own index.\n\n* For **dense retrieval methods**, especially popular embedding models, we use `faiss` to build the index.\n\n* For 
**sparse retrieval methods (BM25)**, we use `Pyserini` or `bm25s` to build the corpus into a Lucene inverted index. The built index contains the original documents.\n\n#### For Dense Retrieval Methods\n\nAdjust the parameters in the following command to match your own setup.\n\n```bash\npython -m flashrag.retriever.index_builder \\\n  --retrieval_method e5 \\\n  --model_path \u002Fmodel\u002Fe5-base-v2\u002F \\\n  --corpus_path indexes\u002Fsample_corpus.jsonl \\\n  --save_dir indexes\u002F \\\n  --use_fp16 \\\n  --max_length 512 \\\n  --batch_size 256 \\\n  --pooling_method mean \\\n  --faiss_type Flat \n```\n\n* ```--pooling_method```: If this parameter is not specified, we will automatically select it based on the model name and model file. However, since different embedding models use different pooling methods, **our automatic selection may not cover all of them**. To ensure accuracy, you can **specify the pooling method corresponding to the retrieval model you are using** (`mean`, `pooler`, or `cls`).\n\n* ```--instruction```: Some embedding models require additional instructions to be concatenated to the query before encoding, which can be specified here. 
Currently, we will automatically fill in the instructions for **E5** and **BGE** models, while other models need to be supplemented manually.\n\nIf the retrieval model supports the `sentence transformers` library, you can use the following code to build the index (**without considering the pooling method**).\n\n```bash\npython -m flashrag.retriever.index_builder \\\n  --retrieval_method e5 \\\n  --model_path \u002Fmodel\u002Fe5-base-v2\u002F \\\n  --corpus_path indexes\u002Fsample_corpus.jsonl \\\n  --save_dir indexes\u002F \\\n  --use_fp16 \\\n  --max_length 512 \\\n  --batch_size 256 \\\n  --pooling_method mean \\\n  --sentence_transformer \\\n  --faiss_type Flat \n```\n\n#### For Sparse Retrieval Methods (BM25)\n\nIf building a bm25 index, there is no need to specify `model_path`.\n\n##### Building Index with BM25s\n\n```bash\npython -m flashrag.retriever.index_builder \\\n  --retrieval_method bm25 \\\n  --corpus_path indexes\u002Fsample_corpus.jsonl \\\n  --bm25_backend bm25s \\\n  --save_dir indexes\u002F \n```\n\n##### Building Index with Pyserini\n\n```bash\npython -m flashrag.retriever.index_builder \\\n  --retrieval_method bm25 \\\n  --corpus_path indexes\u002Fsample_corpus.jsonl \\\n  --bm25_backend pyserini \\\n  --save_dir indexes\u002F \n```\n\n### For Sparse Neural Retrieval Methods (SPLADE)\n\n##### Install Seismic Index:\n```bash\ncurl --proto '=https' --tlsv1.2 -sSf https:\u002F\u002Fsh.rustup.rs | sh # Install Rust for compiling\npip install pyseismic-lsr # Install Seismic\n```\n\n##### Then build the index with Seismic:\n```bash\npython -m flashrag.retriever.index_builder \\ # builder\n        --retrieval_method splade \\ # Model name to trigger seismic index (splade only available)\n        --model_path retriever\u002Fsplade-v3 \\ # Local path or repository path are both supported.\n        --corpus_embedded_path data\u002Fms_marco\u002Fms_marco_embedded_corpus.jsonl \\  # Use cached embedded corpus if corpus is already available in seismic 
expected format\n        --corpus_path data\u002Fms_marco\u002Fms_marco_corpus.jsonl \\ # Corpus path in format {id, contents} jsonl file to be embedded if not already built\n        --save_dir indexes\u002F \\ # save index directory\n        --use_fp16 \\ # tell to use fp16 for splade model\n        --max_length 512 \\ # max tokens for each document\n        --batch_size 4 \\ # batch size for splade model (4-5 seems the best size for Tesla T4 16GB)\n        --n_postings 1000 \\ # seismic number of posting lists\n        --centroid_fraction 0.2 \\ # seismic centroids\n        --min_cluster_size 2 \\ # seismic min cluster\n        --summary_energy 0.4 \\ # seismic energy\n        --batched_indexing 10000000 # seismic batch\n        --nknn 32 # Optional parameter. Tell to seismic to use also knn graph. if not present seismic will work without knn graph\n```\n### Using the ready-made pipeline\n\nYou can use the pipeline class we have already built (as shown in [\u003Cu>pipelines\u003C\u002Fu>](#pipelines)) to implement the RAG process inside. In this case, you just need to configure the config and load the corresponding pipeline.\n\nFirstly, load the entire process's config, which records various hyperparameters required in the RAG process. 
You can pass settings as a YAML file, as a Python dictionary, or both.\n\nPlease note that **values passed in the dictionary take precedence over those in the file**.\n\n```python\nfrom flashrag.config import Config\n\n# Load the config from both a YAML file and a dict; dict values override the file\nconfig_dict = {'data_dir': 'dataset\u002F'}\nmy_config = Config(\n    config_file_path = 'my_config.yaml',\n    config_dict = config_dict\n)\n```\n\nWe provide comprehensive guidance on setting configurations in our [\u003Cu>configuration guidance\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fconfiguration.md).\nYou can also refer to the [\u003Cu>basic yaml file\u003C\u002Fu>](.\u002Fflashrag\u002Fconfig\u002Fbasic_config.yaml) we provide to set your own parameters.\n\nNext, load the corresponding dataset and initialize the pipeline. The components in the pipeline will be automatically loaded.\n\n```python\nfrom flashrag.utils import get_dataset\nfrom flashrag.pipeline import SequentialPipeline\nfrom flashrag.prompt import PromptTemplate\nfrom flashrag.config import Config\n\nconfig_dict = {'data_dir': 'dataset\u002F'}\nmy_config = Config(\n    config_file_path = 'my_config.yaml',\n    config_dict = config_dict\n)\nall_split = get_dataset(my_config)\ntest_data = all_split['test']\n\npipeline = SequentialPipeline(my_config)\n```\n\nYou can specify your own input prompt using `PromptTemplate`:\n\n```python\nprompt_templete = PromptTemplate(\n    my_config,\n    system_prompt = \"Answer the question based on the given document. 
Only give me the answer and do not output any other words.\\nThe following are given documents.\\n\\n{reference}\",\n    user_prompt = \"Question: {question}\\nAnswer:\"\n)\npipeline = SequentialPipeline(\n  my_config,\n  prompt_template = prompt_templete\n)\n```\n\nFinally, execute `pipeline.run` to obtain the final result.\n\n```python\noutput_dataset = pipeline.run(test_data, do_eval=True)\n```\n\nThe `output_dataset` contains the intermediate results and metric scores for each item in the input dataset.\nThe dataset with intermediate results and the overall evaluation score will also be saved to files (if `save_intermediate_data` and `save_metric_score` are specified).\n\n### Build your own pipeline!\n\nSometimes you may need a more complex RAG process; in that case you can build your own pipeline.\nYou just need to inherit `BasicPipeline`, initialize the components you need, and complete the `run` function.\n\n```python\nfrom flashrag.pipeline import BasicPipeline\nfrom flashrag.utils import get_retriever, get_generator\n\nclass ToyPipeline(BasicPipeline):\n  def __init__(self, config, prompt_template=None):\n    # Initialize the base pipeline first, then load your own components\n    super().__init__(config, prompt_template)\n\n  def run(self, dataset, do_eval=True):\n    # Complete your own process logic\n\n    # get attribute in dataset using `.`\n    input_query = dataset.question\n    ...\n    # use `update_output` to save intermediate data\n    # (`pred_answer_list` is produced by your own logic above)\n    dataset.update_output(\"pred\", pred_answer_list)\n    dataset = self.evaluate(dataset, do_eval=do_eval)\n    return dataset\n```\n\nBefore writing your pipeline, check the input and output formats of the components you need in our [\u003Cu>documentation\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fbasic_usage.md).\n\n### Just use components\n\nIf you already have your own code and only want to embed our components in it, you can refer to the [\u003Cu>basic introduction of the 
components\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fbasic_usage.md) to obtain the input and output formats of each component.\n\n## :gear: Components\n\nIn FlashRAG, we have built a series of common RAG components, including retrievers, generators, refiners, and more. Based on these components, we have assembled several pipelines to implement the RAG workflow, while also providing the flexibility to combine these components in custom arrangements to create your own pipeline.\n\n#### RAG-Components\n\n\u003Ctable>\n  \u003Cthead>\n    \u003Ctr>\n      \u003Cth>Type\u003C\u002Fth>\n      \u003Cth>Module\u003C\u002Fth>\n      \u003Cth>Description\u003C\u002Fth>\n    \u003C\u002Ftr>\n  \u003C\u002Fthead>\n  \u003Ctbody>\n    \u003Ctr>\n      \u003Ctd rowspan=\"1\">Judger\u003C\u002Ftd>\n      \u003Ctd>SKR Judger\u003C\u002Ftd>\n      \u003Ctd>Judging whether to retrieve using \u003Ca href=\"https:\u002F\u002Faclanthology.org\u002F2023.findings-emnlp.691.pdf\">SKR\u003C\u002Fa> method\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd rowspan=\"4\">Retriever\u003C\u002Ftd>\n      \u003Ctd>Dense Retriever\u003C\u002Ftd>\n      \u003Ctd>Bi-encoder models such as dpr, bge, e5, using faiss for search\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd>BM25 Retriever\u003C\u002Ftd>\n      \u003Ctd>Sparse retrieval method based on Lucene\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd>Bi-Encoder Reranker\u003C\u002Ftd>\n      \u003Ctd>Calculate matching score using bi-Encoder\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd>Cross-Encoder Reranker\u003C\u002Ftd>\n      \u003Ctd>Calculate matching score using cross-encoder\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd rowspan=\"5\">Refiner\u003C\u002Ftd>\n      \u003Ctd>Extractive Refiner\u003C\u002Ftd>\n      \u003Ctd>Refine input by extracting important context\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      
\u003Ctd>Abstractive Refiner\u003C\u002Ftd>\n      \u003Ctd>Refine input through seq2seq model\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd>LLMLingua Refiner\u003C\u002Ftd>\n      \u003Ctd>\u003Ca href=\"https:\u002F\u002Faclanthology.org\u002F2023.emnlp-main.825\u002F\">LLMLingua-series\u003C\u002Fa> prompt compressor\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd>SelectiveContext Refiner\u003C\u002Ftd>\n      \u003Ctd>\u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.06201\">Selective-Context\u003C\u002Fa> prompt compressor\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd>KG Refiner\u003C\u002Ftd>\n      \u003Ctd>Use \u003Ca href='https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.11460'>Trace\u003C\u002Fa> method to construct a knowledge graph\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd rowspan=\"4\">Generator\u003C\u002Ftd>\n      \u003Ctd>Encoder-Decoder Generator\u003C\u002Ftd>\n      \u003Ctd>Encoder-Decoder model, supporting \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.01282\">Fusion-in-Decoder (FiD)\u003C\u002Fa>\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd>Decoder-only Generator\u003C\u002Ftd>\n      \u003Ctd>Native transformers implementation\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd>FastChat Generator\u003C\u002Ftd>\n      \u003Ctd>Accelerate with \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Flm-sys\u002FFastChat\">FastChat\u003C\u002Fa>\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd>vllm Generator\u003C\u002Ftd>\n      \u003Ctd>Accelerate with \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm\">vllm\u003C\u002Fa>\u003C\u002Ftd>\n    \u003C\u002Ftr>\n  \u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n#### Pipelines\n\nReferring to a [\u003Cu>survey on retrieval-augmented generation\u003C\u002Fu>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.10997), we categorized RAG methods into four types 
based on their inference paths.\n\n- **Sequential**: Sequential execution of the RAG process, like Query-(pre-retrieval)-retriever-(post-retrieval)-generator\n- **Conditional**: Implements different paths for different types of input queries\n- **Branching**: Executes multiple paths in parallel, merging the responses from each path\n- **Loop**: Iteratively performs retrieval and generation\n\nIn each category, we have implemented corresponding common pipelines. Some pipelines correspond to published papers.\n\n\u003Ctable>\n    \u003Cthead>\n        \u003Ctr>\n            \u003Cth>Type\u003C\u002Fth>\n            \u003Cth>Module\u003C\u002Fth>\n            \u003Cth>Description\u003C\u002Fth>\n        \u003C\u002Ftr>\n    \u003C\u002Fthead>\n    \u003Ctbody>\n        \u003Ctr>\n            \u003Ctd rowspan=\"1\">Sequential\u003C\u002Ftd>\n            \u003Ctd>Sequential Pipeline\u003C\u002Ftd>\n            \u003Ctd>Linear execution of query, supporting refiner, reranker\u003C\u002Ftd>\n        \u003C\u002Ftr>\n        \u003Ctr>\n            \u003Ctd rowspan=\"1\">Conditional\u003C\u002Ftd>\n            \u003Ctd>Conditional Pipeline\u003C\u002Ftd>\n            \u003Ctd>With a judger module, distinct execution paths for various query types\u003C\u002Ftd>\n        \u003C\u002Ftr>\n        \u003Ctr>\n            \u003Ctd rowspan=\"2\">Branching\u003C\u002Ftd>\n            \u003Ctd>REPLUG Pipeline\u003C\u002Ftd>\n            \u003Ctd>Generate answer by integrating probabilities in multiple generation paths\u003C\u002Ftd>\n        \u003C\u002Ftr>\n        \u003Ctr>\n            \u003Ctd>SuRe Pipeline\u003C\u002Ftd>\n            \u003Ctd>Ranking and merging generated results based on each document\u003C\u002Ftd>\n        \u003C\u002Ftr>\n        \u003Ctr>\n            \u003Ctd rowspan=\"6\">Loop\u003C\u002Ftd>\n            \u003Ctd>Iterative Pipeline\u003C\u002Ftd>\n            \u003Ctd>Alternating retrieval and generation\u003C\u002Ftd>\n        \u003C\u002Ftr>\n        \u003Ctr>\n            
\u003Ctd>Self-Ask Pipeline\u003C\u002Ftd>\n            \u003Ctd>Decompose complex problems into subproblems using \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2210.03350\">self-ask\u003C\u002Fa> \u003C\u002Ftd>\n        \u003C\u002Ftr>\n        \u003Ctr>\n            \u003Ctd>Self-RAG Pipeline\u003C\u002Ftd>\n            \u003Ctd>Adaptive retrieval, critique, and generation\u003C\u002Ftd>\n        \u003C\u002Ftr>\n        \u003Ctr>\n            \u003Ctd>FLARE Pipeline\u003C\u002Ftd>\n            \u003Ctd>Dynamic retrieval during the generation process\u003C\u002Ftd>\n        \u003C\u002Ftr>\n        \u003Ctr>\n            \u003Ctd>IRCoT Pipeline\u003C\u002Ftd>\n            \u003Ctd>Integrate retrieval process with CoT\u003C\u002Ftd>\n        \u003C\u002Ftr>\n        \u003Ctr>\n            \u003Ctd>Reasoning Pipeline\u003C\u002Ftd>\n            \u003Ctd>Reasoning with retrieval\u003C\u002Ftd>\n        \u003C\u002Ftr>\n    \u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n## :art: FlashRAG-UI\n\u003Cp>With \u003Cstrong>FlashRAG-UI\u003C\u002Fstrong>, you can easily and quickly configure and experience the supported RAG methods through our meticulously designed visual interface, and evaluate these methods on benchmarks, making complex research work more efficient!\u003C\u002Fp>\n\n### :star2: Features\n- **One-Click Configuration Loading**\n  - You can load parameters and configuration files for various RAG methods through simple clicks, selections, and inputs.\n  - Supports preview interface for intuitive parameter settings.\n  - Provides save functionality to easily store configurations for future use.\n- **Quick Method Experience**\n  - Quickly load corpora and index files to explore the characteristics and application scenarios of various RAG methods.\n  - Supports loading and switching different components and hyperparameters, seamlessly connecting different RAG Pipelines to quickly experience their 
performance and differences!\n- **Efficient Benchmark Reproduction**\n  - Easily reproduce the built-in baseline methods and carefully collected benchmarks on FlashRAG-UI.\n  - Use cutting-edge research tools directly without complex settings, providing a smooth experience for your research work!\n  \n\u003Cdetails>\n\u003Csummary>Show more\u003C\u002Fsummary>\n\u003Ctable align=\"center\">\n  \u003Ctr>\n    \u003Ctd align=\"center\">\n      \u003Cimg src=\".\u002Fasset\u002Fdemo_en1.jpg\" alt=\"Image 1\" width=\"505\"\u002F>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Cimg src=\".\u002Fasset\u002Fdemo_en2.jpg\" alt=\"Image 2\" width=\"505\"\u002F>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\n      \u003Cimg src=\".\u002Fasset\u002Fdemo_en4.png\" alt=\"Image 3\" width=\"500\"\u002F>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Cimg src=\".\u002Fasset\u002Fdemo_en3.jpg\" alt=\"Image 4\" width=\"500\"\u002F>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\u003C\u002Fdetails>\n\n#### Experience our meticulously designed FlashRAG-UI, both user-friendly and visually appealing:\n```bash\ncd webui\npython interface.py\n```\n\n## :robot: Supporting Methods\n\nWe have implemented **23 works** with a consistent setting of:\n\n- **Generator:** LLAMA3-8B-instruct with an input length of 2048\n- **Retriever:** e5-base-v2 as the embedding model, retrieving 5 documents per query\n- **Prompt:** A consistent default prompt; the template can be found in the [\u003Cu>method details\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fbaseline_details.md).\n\nFor open-source methods, we implemented their processes using our framework. 
For methods whose authors did not provide source code, we do our best to follow the method described in the original paper.\n\nSettings and hyperparameters specific to individual methods are documented in the **Specific setting** column. For more details, please consult our [\u003Cu>reproduction guide\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Freproduce_experiment.md) and [\u003Cu>method details\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fbaseline_details.md).\n\nNote that we use a uniform setting to ensure consistency. This setting may differ from a method's original setting, so the results below can differ from those originally reported.\n\n| Method                                                                                    | Type        | NQ (EM) | TriviaQA (EM) | HotpotQA (F1) | 2Wiki (F1) | PopQA (F1) | WebQA(EM) | Specific setting                                |\n| ----------------------------------------------------------------------------------------- | ----------- | ------- | ------------- | ------------- | ---------- | ---------- | --------- | ----------------------------------------------- |\n| Naive Generation                                                                          | Sequential  | 22.6    | 55.7          | 28.4          | 33.9       | 21.7       | 18.8      |                                                 |\n| Standard RAG                                                                              | Sequential  | 35.1    | 58.9          | 35.3          | 21.0       | 36.7       | 15.7      |                                                 |\n| [AAR-contriever-kilt](https:\u002F\u002Faclanthology.org\u002F2023.acl-long.136.pdf)                     | Sequential  | 30.1    | 56.8          | 33.4          | 19.8       | 36.1       | 16.1      |                                                 |\n| 
[LongLLMLingua](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.06839)                                         | Sequential  | 32.2    | 59.2          | 37.5          | 25.0       | 38.7       | 17.5      | Compress Ratio=0.5                              |\n| [RECOMP-abstractive](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2310.04408)                                    | Sequential  | 33.1    | 56.4          | 37.5          | 32.4       | 39.9       | 20.2      |                                                 |\n| [Selective-Context](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.06201)                                     | Sequential  | 30.5    | 55.6          | 34.4          | 18.5       | 33.5       | 17.3      | Compress Ratio=0.5                              |\n| [Trace](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.11460)                                                 | Sequential  | 30.7    | 50.2          | 34.0          | 15.5       | 37.4       | 19.9      |                                                 |\n| [Spring](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.19670)                                                | Sequential  | 37.9    | 64.6          | 42.6          | 37.3       | 54.8       | 27.7      | Use Llama2-7B-chat with trained embedding table |\n| [SuRe](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.13081)                                                  | Branching   | 37.1    | 53.2          | 33.4          | 20.6       | 48.1       | 24.2      | Use provided prompt                             |\n| [REPLUG](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.12652)                                                | Branching   | 28.9    | 57.7          | 31.2          | 21.1       | 27.8       | 20.2      |                                                 |\n| [SKR](https:\u002F\u002Faclanthology.org\u002F2023.findings-emnlp.691.pdf)                               | Conditional | 33.2    | 56.0          | 32.4          | 23.4       | 31.7       
| 17.0      | Use inference-time training data                |\n| [Adaptive-RAG](https:\u002F\u002Faclanthology.org\u002F2024.naacl-long.389.pdf)                          | Conditional | 35.1    | 56.6          | 39.1          | 28.4       | 40.4       | 16.0      |                                                 |\n| [Ret-Robust](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.01558)                                            | Loop        | 42.9    | 68.2          | 35.8          | 43.4       | 57.2       | 33.7      | Use LLAMA2-13B with trained lora                |\n| [Self-RAG](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.11511)                                              | Loop        | 36.4    | 38.2          | 29.6          | 25.1       | 32.7       | 21.9      | Use trained selfrag-llama2-7B                   |\n| [FLARE](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.06983)                                                 | Loop        | 22.5    | 55.8          | 28.0          | 33.9       | 20.7       | 20.2      |                                                 |\n| [Iter-Retgen](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.15294), [ITRG](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.05149) | Loop        | 36.8    | 60.1          | 38.3          | 21.6       | 37.9       | 18.2      |                                                 |\n| [IRCoT](https:\u002F\u002Faclanthology.org\u002F2023.acl-long.557.pdf)                                   | Loop        | 33.3    | 56.9          | 41.5          | 32.4       | 45.6       | 20.7      |                                                 |\n| [RQRAG](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.00610)                                                 | Loop        | 32.6    | 52.5          | 33.5          | 35.8       | 46.4       | 26.2      | Use trained rqrag-llama2-7B                     |\n\n#### 🚀 Reasoning-based Methods (NEW!)\n\nWe now support **7 reasoning-based methods** that 
combine reasoning ability with retrieval, achieving superior performance on complex multi-hop tasks:\n\n| Method                                                                                    | Type        | NQ (EM) | TriviaQA (EM) | PopQA (EM) | HotpotQA (F1) | 2Wiki (F1) | Musique (F1) | Bamboogle (F1) | Specific setting                             |\n| ----------------------------------------------------------------------------------------- | ----------- | ------- | ------- | ------------- | ------------- | ---------- | ---------- | --------- | ----------------------------------------------- |\n| [Search-R1](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.09516) | Reasoning | 45.2 | 62.2 | 49.2 | 54.5 | 42.6 | 29.2 | 59.9 | SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-ppo |\n| [R1-Searcher](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2503.05592) | Reasoning | 36.9 | 61.6 | 42.0 | 49.0 | 49.1 | 24.7 | 57.7 | Qwen-2.5-7B-base-RAG-RL |\n| [O2-Searcher](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2505.16582) | Reasoning | 41.4 | 51.4 | 46.8 | 43.4 | 48.6 | 19.0 | 47.6 | O2-Searcher-Qwen2.5-3B-GRPO |\n| [AutoRefine](https:\u002F\u002Fwww.arxiv.org\u002Fpdf\u002F2505.11277) | Reasoning | 43.8 | 59.8 | 32.4 | 54.0 | 50.3 | 23.6 | 46.6 | AutoRefine-Qwen2.5-3B-Base |\n| [ReaRAG](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.21729) | Reasoning | 26.3 | 51.8 | 24.6 | 42.9 | 41.6 | 21.2 | 41.9 | ReaRAG-9B |\n| [CoRAG](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.21729) | Reasoning | 40.9 | 63.1 | 36.0 | 56.6 | 60.7 | 31.9 | 54.1 | CoRAG-Llama3.1-8B-MultihopQA |\n| [SimpleDeepSearcher](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2505.16834) | Reasoning | 36.1 | 61.6 | 42.0 | 49.0 | 49.1 | 24.7 | 57.7 | Qwen-7B-SimpleDeepSearcher |\n\n## :notebook: Supporting Datasets & Document Corpus\n\n### Datasets\n\nWe have collected and processed 36 datasets widely used in RAG research, pre-processing them to ensure a consistent format for ease of use. 
For certain datasets (such as Wiki-asp), we have adapted them to fit the requirements of RAG tasks according to the methods commonly used within the community. All datasets are available at [\u003Cu>Huggingface datasets\u003C\u002Fu>](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FRUC-NLPIR\u002FFlashRAG_datasets).\n\nFor each dataset, we save each split as a `jsonl` file, and each line is a dict as follows:\n\n```python\n{\n  'id': str,\n  'question': str,\n  'golden_answers': List[str],\n  'metadata': dict\n}\n```\n\nBelow is the list of datasets along with the corresponding sample sizes:\n\n| Task                      | Dataset Name    | Knowledge Source | # Train   | # Dev   | # Test |\n| ------------------------- | --------------- | ---------------- | --------- | ------- | ------ |\n| QA                        | NQ              | wiki             | 79,168    | 8,757   | 3,610  |\n| QA                        | TriviaQA        | wiki & web       | 78,785    | 8,837   | 11,313 |\n| QA                        | PopQA           | wiki             | \u002F         | \u002F       | 14,267 |\n| QA                        | SQuAD           | wiki             | 87,599    | 10,570  | \u002F      |\n| QA                        | MSMARCO-QA      | web              | 808,731   | 101,093 | \u002F      |\n| QA                        | NarrativeQA     | books and story  | 32,747    | 3,461   | 10,557 |\n| QA                        | WikiQA          | wiki             | 20,360    | 2,733   | 6,165  |\n| QA                        | WebQuestions    | Google Freebase  | 3,778     | \u002F       | 2,032  |\n| QA                        | AmbigQA         | wiki             | 10,036    | 2,002   | \u002F      |\n| QA                        | SIQA            | -                | 33,410    | 1,954   | \u002F      |\n| QA                        | CommonSenseQA   | -                | 9,741     | 1,221   | \u002F      |\n| QA                        | BoolQ           | wiki             | 
9,427     | 3,270   | \u002F      |\n| QA                        | PIQA            | -                | 16,113    | 1,838   | \u002F      |\n| QA                        | Fermi           | wiki             | 8,000     | 1,000   | 1,000  |\n| multi-hop QA              | HotpotQA        | wiki             | 90,447    | 7,405   | \u002F      |\n| multi-hop QA              | 2WikiMultiHopQA | wiki             | 15,000    | 12,576  | \u002F      |\n| multi-hop QA              | Musique         | wiki             | 19,938    | 2,417   | \u002F      |\n| multi-hop QA              | Bamboogle       | wiki             | \u002F         | \u002F       | 125    |\n| multi-hop QA              | StrategyQA      | wiki             | 2,290     | \u002F       | \u002F      |\n| Long-form QA              | ASQA            | wiki             | 4,353     | 948     | \u002F      |\n| Long-form QA              | ELI5            | Reddit           | 272,634   | 1,507   | \u002F      |\n| Long-form QA              | WikiPassageQA   | wiki             | 3,332     | 417     | 416    |\n| Open-Domain Summarization | WikiASP         | wiki             | 300,636   | 37,046  | 37,368 |\n| multiple-choice           | MMLU            | -                | 99,842    | 1,531   | 14,042 |\n| multiple-choice           | TruthfulQA      | wiki             | \u002F         | 817     | \u002F      |\n| multiple-choice           | HellaSWAG       | ActivityNet      | 39,905    | 10,042  | \u002F      |\n| multiple-choice           | ARC             | -                | 3,370     | 869     | 3,548  |\n| multiple-choice           | OpenBookQA      | -                | 4,957     | 500     | 500    |\n| multiple-choice           | QuaRTz          | -                | 2,696     | 384     | 784    |\n| Fact Verification         | FEVER           | wiki             | 104,966   | 10,444  | \u002F      |\n| Dialog Generation         | WOW             | wiki             | 63,734    | 3,054   | \u002F      |\n| 
Entity Linking            | AIDA CoNLL-YAGO | Freebase & wiki  | 18,395    | 4,784   | \u002F      |\n| Entity Linking            | WNED            | wiki             | \u002F         | 8,995   | \u002F      |\n| Slot Filling              | T-REx           | DBPedia          | 2,284,168 | 5,000   | \u002F      |\n| Slot Filling              | Zero-shot RE    | wiki             | 147,909   | 3,724   | \u002F      |\n| In-domain QA              | DomainRAG       | Web pages of RUC | \u002F         | \u002F       | 485    |\n\n### Document Corpus\n\nOur toolkit supports the jsonl format for retrieval document collections, with the following structure:\n\n```jsonl\n{\"id\":\"0\", \"contents\": \"...\"}\n{\"id\":\"1\", \"contents\": \"...\"}\n```\n\nThe `contents` key is essential for building the index. For documents that include both text and title, we recommend setting the value of `contents` to `{title}\\n{text}`. The corpus file can also contain other keys to record additional characteristics of the documents.\n\nIn academic research, Wikipedia and MS MARCO are the most commonly used retrieval document collections. For Wikipedia, we provide a [\u003Cu>comprehensive script\u003C\u002Fu>](.\u002Fdocs\u002Foriginal_docs\u002Fprocess-wiki.md) to process any Wikipedia dump into a clean corpus. 
Additionally, various processed versions of the Wikipedia corpus are available in many works, and we have listed some reference links.\n\nMS MARCO is already processed upon release and can be downloaded directly from its [\u003Cu>hosting link\u003C\u002Fu>](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FTevatron\u002Fmsmarco-passage-corpus) on Hugging Face.\n\n### Index\n\nTo make the experiments easier to replicate, we provide a preprocessed index on the ModelScope dataset page: [FlashRAG_Dataset\u002Fretrieval_corpus\u002Fwiki18_100w_e5_index.zip](https:\u002F\u002Fwww.modelscope.cn\u002Fdatasets\u002Fhhjinjiajie\u002FFlashRAG_Dataset\u002Ffile\u002Fview\u002Fmaster?id=47985&status=2&fileName=retrieval_corpus%252Fwiki18_100w_e5_index.zip).\n\nThe index was created using the e5-base-v2 retriever on our uploaded wiki18_100w dataset, which is consistent with the index used in our experiments.\n\n## :lollipop: Awesome Work using FlashRAG\n\n- [R1-Searcher](https:\u002F\u002Fgithub.com\u002FSsmallSong\u002FR1-Searcher), a method that incentivizes the search capability in LLMs via reinforcement learning\n- [ReSearch](https:\u002F\u002Fgithub.com\u002FAgent-RL\u002FReSearch), a method that learns to reason with search for LLMs via reinforcement learning\n- [AutoCoA](https:\u002F\u002Fgithub.com\u002FADaM-BJTU\u002FAutoCoA), a method that internalizes chain-of-action generation into reasoning models\n\n## :raised_hands: Additional FAQs\n\n- [How should I set different experimental parameters?](.\u002Fdocs\u002Foriginal_docs\u002Fconfiguration.md)\n- [How to build my own corpus, such as a specific segmented Wikipedia?](.\u002Fdocs\u002Foriginal_docs\u002Fprocess-wiki.md)\n- [How to index my own corpus?](.\u002Fdocs\u002Foriginal_docs\u002Fbuilding-index.md)\n- [How to reproduce supporting methods?](.\u002Fdocs\u002Foriginal_docs\u002Freproduce_experiment.md)\n- [How can I debug common RAG failure modes when using 
FlashRAG?](.\u002Fdocs\u002Frag_failure_modes_and_debug_checklist.md)\n\n## :bookmark: License\n\nFlashRAG is licensed under the [\u003Cu>MIT License\u003C\u002Fu>](.\u002FLICENSE).\n\n## :star2: Citation\n\nPlease cite our paper if it helps your research:\n\n```BibTex\n@inproceedings{FlashRAG,\n  author       = {Jiajie Jin and\n                  Yutao Zhu and\n                  Zhicheng Dou and\n                  Guanting Dong and\n                  Xinyu Yang and\n                  Chenghao Zhang and\n                  Tong Zhao and\n                  Zhao Yang and\n                  Ji{-}Rong Wen},\n  editor       = {Guodong Long and\n                  Michale Blumestein and\n                  Yi Chang and\n                  Liane Lewin{-}Eytan and\n                  Zi Helen Huang and\n                  Elad Yom{-}Tov},\n  title        = {FlashRAG: {A} Modular Toolkit for Efficient Retrieval-Augmented Generation\n                  Research},\n  booktitle    = {Companion Proceedings of the {ACM} on Web Conference 2025, {WWW} 2025,\n                  Sydney, NSW, Australia, 28 April 2025 - 2 May 2025},\n  pages        = {737--740},\n  publisher    = {{ACM}},\n  year         = {2025},\n  url          = {https:\u002F\u002Fdoi.org\u002F10.1145\u002F3701716.3715313},\n  doi          = {10.1145\u002F3701716.3715313}\n}\n```\n\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=RUC-NLPIR\u002FFlashRAG&type=Date)](https:\u002F\u002Fstar-history.com\u002F#RUC-NLPIR\u002FFlashRAG&Date)\n","FlashRAG is an efficient Python toolkit for Retrieval-Augmented Generation (RAG) research. The project provides 36 pre-processed benchmark RAG datasets and 23 state-of-the-art RAG algorithms, including 7 methods that combine reasoning ability with retrieval. Its core features include easy reproduction of existing SOTA work, implementation of custom RAG processes and components, and an easy-to-use user interface. FlashRAG is particularly suited to research scenarios or application development that need to integrate retrieval into natural language processing tasks to improve model performance.",2,"2026-06-11 03:41:10","high_star"]