[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-74248":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":16,"lastSyncTime":29,"discoverSource":30},74248,"Idea2Paper","AgentAlphaAGI\u002FIdea2Paper","AgentAlphaAGI","Idea2Paper Official Demo",null,"Python",1349,113,7,8,0,2,4,16,6,61.77,"MIT License",false,"main",true,[],"2026-06-12 04:01:14","\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fimages\u002Flogo2.png\" alt=\"logo\" width=\"650\">\n\u003C\u002Fp>\n\n\u003Cdiv align=\"center\"> \n\n[![PyPI - Python Version](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.10%2B-blue)]()\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-green)]()\n[![arXiv - Idea2Story](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2601.20833-b31b1b.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2601.20833)\n[![Stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FAgentAlphaAGI\u002FIdea2Paper?style=social)](https:\u002F\u002Fgithub.com\u002FAgentAlphaAGI\u002FIdea2Paper\u002Fstargazers)\n[![Website](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-Live_Demo-blue)](http:\u002F\u002Fpaperbuild.cn)\n\n[English](README.md) | [中文](README-zh_CN.md)\n\n\u003C\u002Fdiv>\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Ch2>📌 Table of Contents\u003C\u002Fh2>\u003C\u002Fsummary>\n\n  \u003Cbr\u002F>\n\n  - [🆕 Story2Proposal](#-story2proposal)\n  - [📄 
Idea2Paper](#-idea2paper)\n  - [💬 User Community](#-user-community)\n  - [✨ Key Features](#-key-features)\n  - [📦 Outputs](#-outputs)\n  - [🚀 Getting Started](#-getting-started)\n  - [🌐 Frontend (Local Web UI)](#-frontend-local-web-ui)\n  - [🔨 Knowledge Graph Builder](#-knowledge-graph-builder)\n  - [🤖 Anchored Multi-Agent Review](#-anchored-multi-agent-review)\n  - [📚 Files & Docs](#-files--docs)\n  - [🤝 Contributing & License](#-contributing--license)\n  - [🙏 Credits](#-credits)\n  - [👥 Contributors](#-contributors)\n  - [📑 Citation (Idea2Story)](#-citation-idea2story)\n\n\u003C\u002Fdetails>\n\n---\n\n## 🆕 Story2Proposal\n\n**AgentAlpha's latest work:** [papers\u002FStory2Proposal.pdf](papers\u002FStory2Proposal.pdf)  \n**Title:** `Story2Proposal: A Scaffold for Structured Scientific Paper Writing`\n\nStory2Proposal is AgentAlpha's latest work on structured scientific writing. It introduces a contract-governed multi-agent framework that transforms a research story into a structured scientific manuscript through a persistent shared visual contract. The system coordinates architect, writer, refiner, renderer, and evaluation agents in a generate-evaluate-adapt loop to preserve section structure, argumentative consistency, figure\u002Ftable alignment, and data fidelity across the full document lifecycle. On tasks derived from the Jericho research corpus, the paper reports stronger structural consistency, visual alignment, and overall expert evaluation than DirectChat and FARS.\n\n## 📄 Idea2Paper\n\nIdea2Paper is an end-to-end research agent framework that aims to systematically define and analyze the major stages of the contemporary research process, along with the core challenges inherent to each stage. 
Rather than treating paper writing as a monolithic generation problem, Idea2Paper explicitly decomposes scientific research into structured phases and identifies critical bottlenecks that hinder the transformation of raw ideas into coherent, submission-ready academic narratives. Through this analysis, Idea2Paper highlights that one of the most fundamental yet underexplored challenges lies in research paradigm generation—the process of converting an underspecified research idea into a logically consistent, academically grounded research story. Existing systems often struggle to produce stable and reusable research paradigms, especially when reasoning is performed entirely at runtime and under limited contextual grounding.\n\nTo address these challenges in a principled and engineering-oriented manner, Idea2Paper adopts a modular system design. Instead of immediately building a fully end-to-end writing system, the project prioritizes the construction of targeted engineering submodules that tackle specific bottlenecks in the research pipeline. As the first and core engineering submodule, Idea2Story is introduced to directly address the problem of research paradigm generation. Idea2Story focuses on transforming underspecified research ideas into complete, coherent, and submission-ready scientific narrative skeletons. 
By providing a structured research story as an intermediate representation, Idea2Story establishes a stable foundation for downstream stages such as method development, experiment design, and paper writing.\n \n> **Idea2Paper** : https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F400280248_Idea2Paper_What_Should_an_End-to-End_Research_Agent_Really_Do\n\n### Idea2Story (Core Submodule of Idea2Paper)\n\nIdea2Story introduces a pre-computation–driven framework that shifts literature understanding\nfrom runtime reasoning to offline knowledge graph construction, enabling more efficient and\nreliable autonomous scientific discovery.\n\n> **Idea2Story** : https:\u002F\u002Farxiv.org\u002Fabs\u002F2601.20833\n\n### 🧠 Core Philosophy\n- **Knowledge-Driven**: Uses ICLR data to build a comprehensive knowledge graph.\n- **Auditable Review**: Implements an anchored multi-agent review system for objective feedback.\n- **Automated Refinement**: Includes RAG deduplication and intelligent revision to enhance novelty.\n\n\u003Cdiv align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Farxiv.org\u002Fhtml\u002F2601.20833v1\u002Fx1.png\" alt=\"Idea2Paper Architecture\" width=\"800\"\u002F>\n\u003Cbr\u002F>\n\u003Cem>Idea2Story pipeline architecture (a core module within Idea2Paper)\u003C\u002Fem>\n\u003C\u002Fdiv>\n\n## 💬 User Community\n\n| WeChat Group                                                                                | Discord Channel |\n|---------------------------------------------------------------------------------------------| --- |\n| \u003Cp align=\"center\"> \u003Cimg src=\".\u002Fassets\u002Fimages\u002Fidea2paper_code.png\" width=\"200\" \u002F>\u003Cbr\u002F>  \u003C\u002Fp> | https:\u002F\u002Fdiscord.gg\u002FXfAQYRZ4kk |\n[![Website](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-Live_Demo-blue)](http:\u002F\u002Fpaperbuild.cn)\n\n## ✨ Key Features\n\n- **🕸️ Knowledge Graph**: Built from ICLR data with 
Idea\u002FPattern\u002FDomain\u002FPaper nodes.\n- **🎣 Advanced Retrieval**: Three-path retrieval (Idea\u002FDomain\u002FPaper) with two-stage ranking (Jaccard + Embedding).\n- **📝 Idea2Story Generation**: From pattern selection to story generation, anchored review, and smart correction.\n- **🤖 Anchored Multi-Agent Review**: Uses real review statistics as anchors for relative comparisons, producing deterministic and auditable 1-10 scores.\n- **📊 Comprehensive Logging**: Per-run structured logs for full reproducibility and auditing.\n\n## 📦 Outputs\n\n- 📄 `Paper-KG-Pipeline\u002Foutput\u002Ffinal_story.json`: Final structured Story (title\u002Fabstract\u002Fproblem\u002Fmethod\u002Fcontribs\u002Fexperiments).\n- 🔍 `Paper-KG-Pipeline\u002Foutput\u002Fpipeline_result.json`: Full pipeline trace (reviews, corrections, audits).\n- 📂 `log\u002Frun_...\u002F`: Structured logs for every run.\n\n## 🚀 Getting Started\n\n### Prerequisites\n- Python 3.10+\n\n### Installation\n\n```bash\npip install -r Paper-KG-Pipeline\u002Frequirements.txt\n```\n> **Note:** The embedding model is configurable via `EMBEDDING_MODEL` \u002F `EMBEDDING_API_URL` (env or `i2p_config.json`). If you switch models, rebuild novelty\u002Frecall indexes or use model-specific index directories to avoid mismatch.  \n> **Constraint:** the embedding dimension must match your index; if you switch models, rebuild indexes or use model-specific index dirs.  \n> **Recommended (auto_profile):** set `I2P_INDEX_DIR_MODE=auto_profile` to auto-map each embedding model to its own index dirs: `Paper-KG-Pipeline\u002Foutput\u002Fnovelty_index__{model}` and `...\u002Frecall_index__{model}`.  \n> Explicit `I2P_NOVELTY_INDEX_DIR` \u002F `I2P_RECALL_INDEX_DIR` (env or `i2p_config.json`) override auto_profile.  \n> **Tip (speed\u002Fstability):** set `I2P_ANCHOR_DENSIFY_ENABLE=0` to skip Adaptive Densify; otherwise Phase 3 Critic can be much slower and may fail due to strict JSON validation.  
\n> **Tip (debug):** if you repeatedly hit Critic JSON errors, set `I2P_CRITIC_STRICT_JSON=0` (or `critic.strict_json=false`) to disable strict mode and allow fallback.  \n> **Tip (LLM temperature):** per-stage temperatures are configurable via `I2P_LLM_TEMPERATURE_*` or `llm.temperature.*`; defaults preserve current behavior. Critic is usually low temp for stability, while story generation can be moderate.  \n> **Tip (Idea Packaging):** optional quality boost via pattern-guided idea packaging + double recall (default off). Enable with `I2P_IDEA_PACKAGING_ENABLE=1` or `idea.packaging_enable=true`.  \n> **Tip (Subdomain taxonomy):** optional quality boost for Path2 to reduce duplicated\u002Flong-tail subdomains. When enabled, the pipeline auto-detects and (if `I2P_INDEX_ALLOW_BUILD=1`) auto-builds `subdomain_taxonomy.json` under `recall_index_dir` (recommended: leave `I2P_SUBDOMAIN_TAXONOMY_PATH` empty). First build uses batched embeddings; you can also build manually via `Paper-KG-Pipeline\u002Fscripts\u002Ftools\u002Fbuild_subdomain_taxonomy.py`.  \n> **Supported (no code changes):** OpenAI-compatible Embeddings APIs (`\u002Fv1\u002Fembeddings`) that accept `input` as a string or a list.  
\n> **Not supported yet:** the DashScope “native” embeddings endpoint (`\u002Fapi\u002Fv1\u002Fservices\u002Fembeddings\u002F...`), which requires an adapter.\n\n### Dataset\n\n👉 **[DATA](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FAgentAlphaAGI\u002FPaper-Review-Dataset\u002Ftree\u002Fmain)** \u003Cbr>\n\nTo use the prebuilt local index, place the two folders from `paper-embedding` on Hugging Face into `Paper-KG-Pipeline\u002Foutput`: \u003Cbr>\n```text\nPaper-KG-Pipeline\u002F\n└── output\u002F\n    ├── recall_index__{model}\u002F\n    └── novelty_index__{model}\u002F\n```\nMake sure the embedding model matches the index you downloaded; otherwise errors may occur.\n\n> **Migration note (auto_profile naming change):** if you previously used provider\u002Furlhash-based dirs, you can either (A) rename the old folders to `recall_index__{model}` \u002F `novelty_index__{model}`, or (B) keep old folder names and set `I2P_RECALL_INDEX_DIR` \u002F `I2P_NOVELTY_INDEX_DIR` explicitly to those paths.\n\n\n### Configuration\n\n1. Copy `.env.example` to `.env` and fill in `LLM_API_KEY` (and optionally `LLM_PROVIDER`, `LLM_BASE_URL`).\n2. (Optional) Copy `i2p_config.example.json` to `i2p_config.json` to tweak settings.\n\n### Usage\n\n```bash\npython Paper-KG-Pipeline\u002Fscripts\u002Fidea2story_pipeline.py \"your research idea\"\n```\n\n## 🌐 Frontend (Local Web UI)\n\n> **Status:** The frontend is currently unstable. We recommend running the pipeline from the terminal for now. 
We will improve the frontend in future updates.\n\nRun a minimal local UI to launch the pipeline and view **only** high-level stage + final results (no raw logs on screen).\n\n### Start\n\n```bash\npython frontend\u002Fserver\u002Fapp.py --host 127.0.0.1 --port 8080\n```\n\nOpen in your browser:\n\n```text\nhttp:\u002F\u002F127.0.0.1:8080\u002F\n```\n\n### What you can do in the UI\n- Run the same pipeline entrypoint (`idea2story_pipeline.py`) from a web page.\n- Configure `LLM_API_KEY`, `LLM_PROVIDER`, `LLM_BASE_URL`\u002F`LLM_API_URL`, `LLM_MODEL` for the current run (not persisted by the server).\n- Toggle Novelty \u002F Verification.\n- Download the current run logs as a zip.\n\nFor more details, see `frontend\u002FREADME.md`.\n\n### 🔨 Knowledge Graph Builder\n\nThe Web UI includes a **KG Builder** page that lets you build a custom knowledge graph from your own paper dataset, instead of relying on the prebuilt ICLR graph.\n\n#### Dataset Format\n\nPrepare a JSONL file where each line is a JSON object with the following fields:\n\n| Field | Required | Description |\n|-------|----------|-------------|\n| `id` or `paper_id` | Yes | Unique paper identifier |\n| `title` | Yes | Paper title |\n| `abstract` | Yes | Paper abstract |\n| `keywords` | No | List of keywords |\n| `year` | No | Publication year |\n\nPlace the file under `Paper-KG-Pipeline\u002Fdata\u002F`. A sample dataset (`paper_reviews_dataset_iclr_sample_100.jsonl`) is included for testing.\n\n#### How to Use\n\n1. Start the server: `python frontend\u002Fserver\u002Fapp.py --host 127.0.0.1 --port 8080`\n2. Open `http:\u002F\u002F127.0.0.1:8080\u002F` and navigate to the **KG Builder** tab.\n3. Select your dataset from the dropdown.\n4. Configure LLM settings (API Key, Model, API URL).\n5. 
Click **Start Build**.\n\nThe build runs four steps in sequence:\n\n| Step | Script | Description |\n|------|--------|-------------|\n| 1 | `extract_patterns_ICLR_en_local.py` | Extract writing patterns from papers via LLM |\n| 2 | `generate_clusters.py` | Cluster patterns using SBERT + HDBSCAN |\n| 3 | `build_entity_v3.py` | Build Idea \u002F Pattern \u002F Domain \u002F Paper nodes |\n| 4 | `build_edges.py` | Build edges and export `.gpickle` graph |\n\nProgress and logs are displayed in real time on the page. Once the build completes, the output files are written to `Paper-KG-Pipeline\u002Foutput\u002F` and can be used directly by `idea2story_pipeline.py`.\n\n### Output\n\n```text\noutput\u002F\n├── final_story.json # Final generated paper story\n├── pipeline_result.json # Full pipeline results\n└── log.json # Detailed logs\n```\nCheck `final_story.json` for the result and `pipeline_result.json` for the full process.\n\n\n## 🤖 Anchored Multi-Agent Review\n\nInstead of arbitrary scores, this project uses **anchored comparisons**. We select anchor papers with known scores, ask LLMs to compare your target against these anchors (better\u002Ftie\u002Fworse), and then deterministically fit a final numeric score. This keeps the review process auditable and grounded in real-world data.\n\n## 📚 Files & Docs\n\n\n\n- **Core Code**: `Paper-KG-Pipeline\u002Fsrc\u002Fidea2paper\u002F`\n- **Documentation**:\n\n| No. 
| Document | Content | Target Audience |\n| ----- |--------------------------| ---------------- | ------- |\n| **0** | [Project Overview](Paper-KG-Pipeline\u002Fdocs\u002F00_PROJECT_OVERVIEW.md) | Overall architecture, core modules, parameter configuration, execution workflow | Everyone |\n| **1** | [Knowledge Graph Construction](docs\u002F01_KG_CONSTRUCTION.md) | Data sources, node\u002Fedge definitions, LLM enhancement, how to run | Developers |\n| **2** | [Retrieval System](docs\u002F02_RECALL_SYSTEM.md) | Three-way retrieval strategies, similarity computation, performance optimization | Developers |\n| **3** | [Idea2Story Pipeline](docs\u002F03_IDEA2STORY_PIPELINE.md) | Pattern selection, Idea fusion, story reflection, critic review | Developers |\n\n- **Review Details**: [MULTIAGENT_REVIEW.md](MULTIAGENT_REVIEW.md)\n\n## 🤝 Contributing & License\n\nWe welcome PRs and Issues! Please follow the contribution guidelines.\nLicensed under the **MIT License**.\n\n## 🙏 Credits\n\n- **Data Source**: ICLR (see KG construction docs)\n- **Inspiration**: Auditable, anchor-centered review processes.\n- **Community Support**: [agentAlpha Community](https:\u002F\u002Fagentalpha.top)\n\n## 👥 Contributors\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FAgentAlphaAGI\u002FIdea2Paper\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Fcontrib.rocks\u002Fimage?repo=AgentAlphaAGI\u002FIdea2Paper\" \u002F>\n\u003C\u002Fa>\n\n## 📑 Citation (Idea2Story)\n\nIf you find **Idea2Story** useful, please cite:\n\n```bibtex\n@misc{xu2026idea2storyautomatedpipelinetransforming,\n  title={Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives},\n  author={Tengyue Xu and Zhuoyang Qian and Gaoge Liu and Li Ling and Zhentao Zhang and Biao Wu and Shuo Zhang and Ke Lu and Wei Shi and Ziqi Wang and Zheng Feng and Yan Luo and Shu Xu and Yongjin Chen and Zhibo Feng and Zhuo Chen and Bruce Yuan and Harry Wang and Kris Chen},\n  
year={2026},\n  eprint={2601.20833},\n  archivePrefix={arXiv},\n  primaryClass={cs.CE},\n  url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2601.20833}\n}\n```\n\n\n```bibtex\n@article{xu2026idea2paper,\n  title={Idea2Paper: What Should an End-to-End Research Agent Really Do?},\n  author={Xu, Tengyue and Qian, Zhuoyang and Liu, Gaoge and Zhang, Zhentao and Ling, Li and Wu, Biao and Zhang, Shuo and Lu, Ke and Shi, Wei and Wang, Ziqi and others},\n  year={2026}\n}\n\n```\n\n\n---\n\n## 📈 Star History\n\n\u003Ca href=\"https:\u002F\u002Fstar-history.com\u002F#AgentAlphaAGI\u002FIdea2Paper&Date\">\n \u003Cpicture>\n   \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=AgentAlphaAGI\u002FIdea2Paper&type=Date&theme=dark\" \u002F>\n   \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=AgentAlphaAGI\u002FIdea2Paper&type=Date\" \u002F>\n   \u003Cimg alt=\"Star History Chart\"\n     src=\"https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=AgentAlphaAGI\u002FIdea2Paper&type=Date&v=20260130\" \u002F>\n \u003C\u002Fpicture>\n\u003C\u002Fa>\n\n\n---\n","Idea2Paper is a research agent framework that aims to systematically turn raw research ideas into structured, logically consistent, submission-ready academic papers. Its core functionality includes decomposing the scientific research process through multi-agent collaboration and offering solutions to the core challenges of each stage, for example having architect, writer, and refiner roles work together in a generate-evaluate-adapt loop to keep the sections of a document consistent and its data accurate. The project also introduces the Story2Proposal module, which further supports converting a research story into a structured scientific manuscript. The tool is particularly suited to researchers or students who need to develop a preliminary idea into a complete research paper, especially when they want to improve the efficiency and quality of paper writing. It is developed in Python and open-sourced under the MIT License.","2026-06-11 03:49:39","high_star"]