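[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82266":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":15,"stars30d":15,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":16,"rankGlobal":10,"rankLanguage":10,"license":17,"archived":18,"fork":18,"defaultBranch":19,"hasWiki":20,"hasPages":18,"topics":21,"createdAt":10,"pushedAt":10,"updatedAt":22,"readmeContent":23,"aiSummary":24,"trendingCount":15,"starSnapshotCount":15,"syncStatus":25,"lastSyncTime":26,"discoverSource":27},82266,"search-bibtex","steeliron550-ui\u002Fsearch-bibtex","steeliron550-ui","[PolyCite] A multi-agent, multi-source retrieval tool for paper citation metadata. A command-line tool for quickly scraping BibTeX information from computer science conferences and journals, optimized for LLM Multi-Agents.","",null,"TypeScript",142,4,3,0,2.1,"MIT License",false,"main",true,[],"2026-06-12 02:04:24","# PolyCite \u002F A Multi-Agent, Multi-Source Retrieval Tool for Paper Citation Metadata\n\n[**English**](README.en.md) | **中文**\n\n## Requirements Analysis\n\nIn academic writing, the quality of the reference list directly affects a paper's credibility, its review outcome, and academic integrity. The main pain points today are:\n\n1. Serious risk of hallucinated citations\n\n   When references are compiled with large language models, search engines, or from memory, \"hallucinated citations\" can appear: nonexistent papers, wrong titles, wrong authors, wrong DOIs, wrong years, and so on.\n   These problems can lead to:\n\n   - reviewers being unable to locate the cited source;\n   - the paper's reliability being questioned;\n   - at best a revision request or rejection, at worst an academic misconduct issue.\n\n2. No single data source is reliable\n\n   DBLP, Crossref, OpenAlex, arXiv, Semantic Scholar, DOI content negotiation, and other sources each have strengths and weaknesses.\n   For example:\n\n   - DBLP is high quality for computer science, but its coverage is limited;\n   - Crossref DOI records are authoritative, but fields are sometimes incomplete;\n   - arXiv suits preprints, but an entry does not necessarily match the final published version;\n   - Semantic Scholar has broad coverage, but may be rate limited or differ in metadata.\n\n   Cross-validation across sources, priority ordering, and conflict handling are therefore needed.\n\n3. 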
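Manually curating BibTeX is costly\n\n   A high-quality paper typically cites dozens or even hundreds of works. Searching entry by entry, copying BibTeX, and fixing formatting by hand is time-consuming and prone to:\n\n   - inconsistent BibTeX entry types;\n   - inconsistent citation key styles;\n   - inconsistent author name formats;\n   - inconsistent conference\u002Fjournal names;\n   - missing or wrong DOIs, URLs, and arXiv IDs;\n   - duplicate entries that are hard to spot.\n\n## Tool Overview\n\n`search-bibtex` is a standalone command-line tool that turns paper PDFs into BibTeX. It extracts the DOI, arXiv ID, title, authors, and year from a local paper PDF, queries DBLP, arXiv, Crossref, OpenAlex, DOI content negotiation, Semantic Scholar, and optional custom HTTP JSON sources, then ranks the candidates by the configured source priority and field weights. Users can pick a BibTeX entry interactively in the terminal, or use `--select-index` for non-interactive selection.\n\nThe project is distributed as multi-platform binaries and is not published to npm. The runtime code does not depend on Paperlib and does not integrate Grok search; Grok search may only be used as a research aid during development.\n\n## Features\n\n- Extracts searchable metadata from the first few pages of a paper PDF.\n- Accepts PDFs, paper title strings, titles on stdin, and updates to existing `.bib` files.\n- Queries the built-in bibliographic sources: DBLP, arXiv, Crossref, OpenAlex, DOI, Semantic Scholar.\n- Supports a declarative `config.toml` that configures source order, ranking weights, result count, parallel search, and custom HTTP JSON sources.\n- The interactive selector supports Vim-style keys and filtering; scripts can select a 0-based index directly.\n- When updating a `.bib` file, keeps the original citation key and replaces only the entry content.\n- Network failures, parse failures, empty candidate lists, and invalid configuration are reported explicitly or written to `sourceErrors`.\n- Cross-validates across multiple data sources.\n- Supports batch processing.\n\n## Installation\n\n### Download a binary\n\nBinaries are placed under `dist-bin\u002F` by platform and architecture:\n\n```text\ndist-bin\u002F\u003Cplatform-arch>\u002Fsearch-bibtex\ndist-bin\u002F\u003Cplatform-arch>\u002Fsearch-bibtex.exe\n```\n\nAdd the directory for your platform to `PATH`, or run the binary by absolute path. Running the binary does not require Node.js to be installed locally.\n\n### Build from source\n\n```bash\npnpm install\npnpm build\npnpm build:binary\n```\n\nA Makefile is also available:\n\n```bash\nmake install\nmake build\nmake binary\nmake build-binaries\n```\n\n`make binary` builds the binary for the current platform; `make build-binaries` builds all platform targets.\n\n## Quick Start\n\nShow help and the default configuration:\n\n```bash\nsearch-bibtex --help\nsearch-bibtex config-defaults\nsearch-bibtex config-template\n```\n\nExtract metadata from a PDF:\n\n```bash\nsearch-bibtex metadata paper.pdf\n```\n\nSearch a PDF and pick a candidate in a TTY; when output is redirected or piped, JSON is printed instead:\n\n```bash\nsearch-bibtex search paper.pdf \\\n  --source-priority dblp,arxiv,crossref,openalex,doi \\\n  --limit 5 \\\n  --timeout 30\n```\n\nPrint the BibTeX for candidate 0 directly:\n\n```bash\nsearch-bibtex select paper.pdf --select-index 0 --format bibtex\n```\n\nSearch by title string; multiple titles are separated by ASCII semicolons (`;`) by default:\n\n```bash\nsearch-bibtex search-title \"Self-Instruct: Aligning Language Models with Self-Generated Instructions; 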
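DFlash: Block Diffusion for Flash Speculative Decoding\"\nprintf 'Self-Instruct: Aligning Language Models with Self-Generated Instructions; DFlash: Block Diffusion for Flash Speculative Decoding' | search-bibtex search-title\n```\n\nUpdate an existing BibTeX file while keeping its citation keys:\n\n```bash\nsearch-bibtex update references.bib --in-place\nsearch-bibtex update references.bib --output updated.bib\n```\n\n## Configuration\n\nThe default configuration file path is `~\u002F.config\u002Fsearch-bibtex\u002Fconfig.toml`. If the file at the default path does not exist, the built-in defaults are used; if `--config \u003Cpath>` is passed explicitly and the file does not exist, an error is reported. Command-line arguments take precedence over the configuration file.\n\nMinimal configuration:\n\n```toml\n[search]\nlimit = 10\ntimeout_seconds = 30\nparallel = true\nsource_priority = [\"dblp\", \"arxiv\", \"crossref\", \"openalex\", \"doi\", \"semantic-scholar\"]\n\n[search.weights]\ntitle = 0.45\nauthor = 0.20\nyear = 0.10\nidentifier = 0.20\nsource = 0.05\n```\n\nFor the full configuration reference, see the [Chinese configuration docs](docs\u002FCONFIGURATION.zh-CN.md) and the [English configuration docs](docs\u002FCONFIGURATION.md).\n\n## CLI Commands\n\n| Command | Purpose |\n|---|---|\n| `config-defaults` | Prints the default search and ranking configuration as JSON. |\n| `config-template` | Prints an editable TOML configuration template. |\n| `metadata \u003Cpdf>` | Extracts metadata and query candidates from a PDF. |\n| `search \u003Cpdf>` | Searches and ranks candidates; opens the interactive selector in a TTY and prints JSON otherwise. |\n| `select \u003Cpdf>` | Searches, then selects interactively, or prints the candidate given by `--select-index`. |\n| `search-title [titles...]` | Searches candidates from title strings or stdin. |\n| `update \u003Cbibtex>` | Refreshes the entries of an existing `.bib` file and keeps citation keys. |\n\nInteractive selector keys:\n\n```text\nj \u002F Down     Move down\nk \u002F Up       Move up\ng            Jump to the first item\nG            Jump to the last item\n\u002F            Enter filter mode\nEnter        Confirm the filter or select the current candidate\nEsc          Leave filter mode or cancel selection\nq            Cancel selection\nCtrl-C       Cancel selection\n```\n\n## Typical Use Cases\n  \n  Scenario A: An author compiling references\n\n  The user already has several PDFs and wants to generate BibTeX quickly.\n\n  Workflow:\n\n  1. Provide the paper PDF;\n  2. The tool extracts the title, authors, year, and DOI;\n  3. It queries multiple data sources;\n  4. It returns candidate BibTeX entries;\n  5. The user picks the most trustworthy entry;\n  6. The BibTeX is output.\n\n  Benefits:\n\n  - less manual searching;\n  - fewer copy errors;\n  - authoritative BibTeX is preferred.\n\n  ---\n  Scenario B: Verifying AI-generated references\n  \n  The user has a batch of reference titles generated by an LLM and is worried about hallucinated citations.\n\n  Workflow:\n\n  1. The user provides a list of titles;\n  2. The tool searches for each one;\n  3. Entries with a reliable source get a BibTeX entry;\n  4. Entries that cannot be found are flagged as high risk;\n  5. 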
The user manually reviews the high-risk items.\n\n  Benefits:\n\n  - detects nonexistent or incorrect citations;\n  - keeps hallucinated citations out of the paper;\n  - lowers academic integrity risk.\n\n  ---\n  Scenario C: Refreshing an existing BibTeX file\n  \n  The user already has a .bib file, but its entries are inconsistently formatted or missing fields.\n\n  Workflow:\n\n  1. Provide references.bib;\n  2. The tool parses the title of each entry;\n  3. It queries multiple sources;\n  4. It replaces the entry content;\n  5. It keeps the original citation key;\n  6. It outputs the updated .bib.\n\n  Benefits:\n\n  - consistent citation metadata;\n  - citation keys used in the manuscript are preserved;\n  - lower cost for large-scale manual fixes.\n\n  ---\n  Scenario D: Team or CI checks on citation quality\n  \n  A team wants to check whether a .bib file contains suspicious entries before submitting a paper.\n\n  Workflow:\n\n  1. CI runs the tool;\n  2. Each entry in the .bib file is searched and verified;\n  3. Entries without a reliable source produce an error or warning;\n  4. Clearly suspicious citations are blocked from the final version.\n\n  Benefits:\n\n  - problems are caught early;\n  - establishes a citation quality gate for papers;\n  - suits team collaboration.\n\n## Tests \u002F Examples\n\nShow the default BibTeX source configuration.\n\n```bash\n> .\u002Fsearch-bibtex config-defaults\n{\n  \"sourcePriority\": [\n    \"dblp\",\n    \"arxiv\",\n    \"crossref\",\n    \"openalex\",\n    \"doi\",\n    \"semantic-scholar\"\n  ],\n  \"weights\": {\n    \"title\": 0.45,\n    \"author\": 0.2,\n    \"year\": 0.1,\n    \"identifier\": 0.2,\n    \"source\": 0.05\n  },\n  \"limit\": 10\n}\n```\n\nSearch for a single paper title.\n\n```bash\n> .\u002Fsearch-bibtex search-title \"Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling\"\nsearch-title: searching 6 source channels...\nsearch-title: 1\u002F6 source channels completed [doi]\nsearch-title: 2\u002F6 source channels completed [doi] failed [semantic-scholar]\nsearch-title: 3\u002F6 source channels completed [doi] failed [arxiv, semantic-scholar]\nsearch-title: 4\u002F6 source channels completed [crossref, doi] failed [arxiv, semantic-scholar]\nsearch-title: 5\u002F6 source channels completed [crossref, openalex, doi] failed [arxiv, semantic-scholar]\nsearch-title: 6\u002F6 source channels completed [crossref, openalex, doi] failed [dblp, arxiv, semantic-scholar]\nsearch-bibtex candidate selection\nSource issues:\n  dblp 500 HTTP 500 from https:\u002F\u002Fdblp.org\u002Fsearch\u002Fpubl\u002Fap..., arxiv 429 HTTP 429 from\n  https:\u002F\u002Fexport.arxiv.org\u002Fapi\u002Fqu..., semantic-scholar 429 HTTP 429 from\n  
https:\u002F\u002Fapi.semanticscholar.org...\nFilter: \nKeys: j\u002Fk move, g\u002FG jump, \u002F filter, Ctrl+O preview, Enter select, q cancel\n\n> [0] crossref         0.480 Tackling System and Statistical Heterogeneity for Federated Learning wi...\n  [1] openalex         0.470 Tackling System and Statistical Heterogeneity for Federated Learning wi...\n  [2] openalex         0.470 Tackling System and Statistical Heterogeneity for Federated Learning wi...\n  [3] crossref         0.210 FedCSGA: Evolutionary client selection with joint statistical and syste...\n  [4] crossref         0.207 FedDiverse: Tackling Data Heterogeneity in Federated Learning with Dive...\n  [5] crossref         0.202 Adaptive Heterogeneous Client Sampling for Federated Learning Over Wire...\n  [6] openalex         0.192 Adaptive Heterogeneous Client Sampling for Federated Learning Over Wire...\n  [7] crossref         0.182 Tackling Privacy Heterogeneity in Federated Learning\n  [8] crossref         0.180 FedClust: Tackling Data Heterogeneity in Federated Learning through Wei...\n  [9] crossref         0.178 RingSFL: An Adaptive Split Federated Learning Towards Taming Client Het...\n\nTitle: Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling\nAuthors: Bing Luo and Wenli Xiao and Shiqiang Wang and ... 
(+2 more)\nYear: 2022  Venue: IEEE INFOCOM 2022 - IEEE Conference on Computer Communications\nIDs: DOI 10.1109\u002Finfocom48880.2022.9796935\n\nBibTeX preview: compact\n@inproceedings{Luo_2022, title={Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling}, url={http:\u002F\u002Fdx.doi.org\u002F10.1109\u002Finfocom48880.2022.9796935}, DOI={10.1109\u002Finfocom48880.2022.9796935}, booktitle={IEEE INFOCOM 2022 - IEEE Conference on Computer Communications}, publisher={IEEE}, author={Luo, Bing and Xiao, Wenli and Wang, Shiqiang and Huang, Jianwei and Tassiulas, Leandros}, year={2022}, month=May, pages={1739–1748} }\n  title = {Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Cl...}\n  author = {Bing Luo and Wenli Xiao and Shiqiang Wang and Jianwei Huang and ... (+1 more)}\n  year = {2022}\n  booktitle = {IEEE INFOCOM 2022 - IEEE Conference on Computer Communications}\n  doi = {10.1109\u002Finfocom48880.2022.9796935}\n  url = {https:\u002F\u002Fdoi.org\u002F10.1109\u002Finfocom48880.2022.9796935}\n}\nsearch-bibtex selection confirmed\nTitle: Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling\nSource: crossref  Score: 0.480\nClipboard: clipboard unavailable\n\n@inproceedings{Luo_2022,\n  title = {Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling},\n  url = {http:\u002F\u002Fdx.doi.org\u002F10.1109\u002Finfocom48880.2022.9796935},\n  doi = {10.1109\u002Finfocom48880.2022.9796935},\n  booktitle = {IEEE INFOCOM 2022 - IEEE Conference on Computer Communications},\n  publisher = {IEEE},\n  author = {Luo, Bing and Xiao, Wenli and Wang, Shiqiang and Huang, Jianwei and Tassiulas, Leandros},\n  year = {2022},\n  month = May,\n  pages = {1739–1748},\n}\n```\n\nSearch for multiple paper titles.\n\n```bash\n> .\u002Fsearch-bibtex search-title \"Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive 
Client Sampling\" \"Dp-forward: Fine-tuning and inference on language models with differential privacy in forward pass\"\nsearch-title[1]: searching 6 source channels...\nsearch-title[1]: 1\u002F6 source channels completed [doi]\nsearch-title[1]: 2\u002F6 source channels completed [doi] failed [dblp]\nsearch-title[1]: 3\u002F6 source channels completed [doi] failed [dblp, arxiv]\nsearch-title[1]: 4\u002F6 source channels completed [doi] failed [dblp, arxiv, crossref]\nsearch-title[1]: 5\u002F6 source channels completed [doi] failed [dblp, arxiv, crossref, semantic-scholar]\nsearch-title[1]: 6\u002F6 source channels completed [openalex, doi] failed [dblp, arxiv, crossref, semantic-scholar]\nsearch-bibtex candidate selection\nSource issues:\n  dblp fetch failed, arxiv fetch failed, crossref fetch failed, semantic-scholar 429 HTTP\n  429 from https:\u002F\u002Fapi.semanticscholar.org...\nFilter: \nKeys: j\u002Fk move, g\u002FG jump, \u002F filter, Ctrl+O preview, Enter select, q cancel\n\n> [0] openalex         0.470 Tackling System and Statistical Heterogeneity for Federated Learning wi...\n  [1] openalex         0.470 Tackling System and Statistical Heterogeneity for Federated Learning wi...\n  [2] openalex         0.192 Adaptive Heterogeneous Client Sampling for Federated Learning Over Wire...\n  [3] openalex         0.133 FedPARL: Client Activity and Resource-Oriented Lightweight Federated Le...\n  [4] openalex         0.123 Advances and Open Problems in Federated Learning\n  [5] openalex         0.103 Federated Learning: A Survey on Enabling Technologies, Protocols, and A...\n  [6] openalex         0.102 Towards Personalized Federated Learning\n  [7] openalex         0.093 FedProto: Federated Prototype Learning across Heterogeneous Clients\n  [8] openalex         0.093 Edge Artificial Intelligence for 6G: Vision, Enabling Technologies, and...\n  [9] openalex         0.072 Pushing AI to wireless network edge: an overview on integrated sensing,...\n\nTitle: 
Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling\nAuthors: Bing Luo and Wenli Xiao and Shiqiang Wang and ... (+2 more)\nYear: 2022  Venue: IEEE INFOCOM 2022 - IEEE Conference on Computer Communications\nIDs: DOI https:\u002F\u002Fdoi.org\u002F10.1109\u002Finfocom48880.2022.9796935\n\nBibTeX preview: compact\n@inproceedings{Luo_2022, title={Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling}, url={http:\u002F\u002Fdx.doi.org\u002F10.1109\u002Finfocom48880.2022.9796935}, DOI={10.1109\u002Finfocom48880.2022.9796935}, booktitle={IEEE INFOCOM 2022 - IEEE Conference on Computer Communications}, publisher={IEEE}, author={Luo, Bing and Xiao, Wenli and Wang, Shiqiang and Huang, Jianwei and Tassiulas, Leandros}, year={2022}, month=May, pages={1739–1748} }\n  title = {Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Cl...}\n  author = {Bing Luo and Wenli Xiao and Shiqiang Wang and Jianwei Huang and ... 
(+1 more)}\n  year = {2022}\n  booktitle = {IEEE INFOCOM 2022 - IEEE Conference on Computer Communications}\n  doi = {https:\u002F\u002Fdoi.org\u002F10.1109\u002Finfocom48880.2022.9796935}\n  url = {https:\u002F\u002Fopenalex.org\u002FW4226183928}\n}\nsearch-bibtex selection confirmed\nTitle: Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling\nSource: openalex  Score: 0.470\nClipboard: clipboard unavailable\n\n@inproceedings{Luo_2022,\n  title = {Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling},\n  url = {http:\u002F\u002Fdx.doi.org\u002F10.1109\u002Finfocom48880.2022.9796935},\n  doi = {10.1109\u002Finfocom48880.2022.9796935},\n  booktitle = {IEEE INFOCOM 2022 - IEEE Conference on Computer Communications},\n  publisher = {IEEE},\n  author = {Luo, Bing and Xiao, Wenli and Wang, Shiqiang and Huang, Jianwei and Tassiulas, Leandros},\n  year = {2022},\n  month = May,\n  pages = {1739–1748},\n}search-title[2]: searching 6 source channels...\nsearch-title[2]: 1\u002F6 source channels completed [doi]\nsearch-title[2]: 2\u002F6 source channels completed [doi] failed [arxiv]\nsearch-title[2]: 3\u002F6 source channels completed [doi] failed [arxiv, semantic-scholar]\nsearch-title[2]: 4\u002F6 source channels completed [doi] failed [dblp, arxiv, semantic-scholar]\nsearch-title[2]: 5\u002F6 source channels completed [crossref, doi] failed [dblp, arxiv, semantic-scholar]\nsearch-title[2]: 6\u002F6 source channels completed [crossref, openalex, doi] failed [dblp, arxiv, semantic-scholar]\nsearch-bibtex candidate selection\nSource issues:\n  dblp 500 HTTP 500 from https:\u002F\u002Fdblp.org\u002Fsearch\u002Fpubl\u002Fap..., arxiv 429 HTTP 429 from\n  https:\u002F\u002Fexport.arxiv.org\u002Fapi\u002Fqu..., semantic-scholar 429 HTTP 429 from\n  https:\u002F\u002Fapi.semanticscholar.org...\nFilter: \nKeys: j\u002Fk move, g\u002FG jump, \u002F filter, Ctrl+O preview, Enter select, q 
cancel\n\n> [0] crossref         0.480 DP-Forward: Fine-tuning and Inference on Language Models with Different...\n  [1] openalex         0.470 DP-Forward: Fine-tuning and Inference on Language Models with Different...\n  [2] crossref         0.211 Fine-Tuning Language Models with Just Forward Passes\n  [3] crossref         0.210 Fine-Tuning Language Models with Differential Privacy through Adaptive ...\n  [4] crossref         0.207 Towards Fine-tuning Pre-trained Language Models with Integer Forward an...\n  [5] crossref         0.197 EW-Tune: A Framework for Privately Fine-Tuning Large Language Models wi...\n  [6] crossref         0.175 DP-FedLoRA: Privacy-Enhanced Federated Fine-Tuning for On-Device Large ...\n  [7] crossref         0.158 Privacy-Aware Federated Fine-Tuning of Large Pretrained Models With Jus...\n  [8] crossref         0.150 Is Differential Privacy-Enhanced Parameter-Efficient Fine-Tuning Effect...\n  [9] crossref         0.145 Extractive Fact Decomposition for Interpretable Natural Language Infere...\n\nTitle: DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass\nAuthors: Minxin Du and Xiang Yue and Sherman S. M. Chow and ... (+3 more)\nYear: 2023  Venue: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security\nIDs: DOI 10.1145\u002F3576915.3616592\n\nBibTeX preview: compact\n@inproceedings{Du_2023, series={CCS ’23}, title={DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass}, url={http:\u002F\u002Fdx.doi.org\u002F10.1145\u002F3576915.3616592}, DOI={10.1145\u002F3576915.3616592}, booktitle={Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security}, publisher={ACM}, author={Du, Minxin and Yue, Xiang and Chow, Sherman S. M. 
and Wang, Tianhao and Huang, Chenyu and Sun, Huan}, year={2023}, month=Nov, pages={2665–2679}, collection={CCS ’23} }\n  title = {DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in...}\n  author = {Minxin Du and Xiang Yue and Sherman S. M. Chow and Tianhao Wang and ... (+2 more)}\n  year = {2023}\n  booktitle = {Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security}\n  doi = {10.1145\u002F3576915.3616592}\n  url = {https:\u002F\u002Fdoi.org\u002F10.1145\u002F3576915.3616592}\n}\nsearch-bibtex selection confirmed\nTitle: DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass\nSource: crossref  Score: 0.480\nClipboard: clipboard unavailable\n\n@inproceedings{Du_2023,\n  series = {CCS ’23},\n  title = {DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass},\n  url = {http:\u002F\u002Fdx.doi.org\u002F10.1145\u002F3576915.3616592},\n  doi = {10.1145\u002F3576915.3616592},\n  booktitle = {Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security},\n  publisher = {ACM},\n  author = {Du, Minxin and Yue, Xiang and Chow, Sherman S. M. 
and Wang, Tianhao and Huang, Chenyu and Sun, Huan},\n  year = {2023},\n  month = Nov,\n  pages = {2665–2679},\n  collection = {CCS ’23},\n}@inproceedings{Luo_2022,\n  title = {Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling},\n  url = {http:\u002F\u002Fdx.doi.org\u002F10.1109\u002Finfocom48880.2022.9796935},\n  doi = {10.1109\u002Finfocom48880.2022.9796935},\n  booktitle = {IEEE INFOCOM 2022 - IEEE Conference on Computer Communications},\n  publisher = {IEEE},\n  author = {Luo, Bing and Xiao, Wenli and Wang, Shiqiang and Huang, Jianwei and Tassiulas, Leandros},\n  year = {2022},\n  month = May,\n  pages = {1739–1748},\n}\n\n@inproceedings{Du_2023,\n  series = {CCS ’23},\n  title = {DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass},\n  url = {http:\u002F\u002Fdx.doi.org\u002F10.1145\u002F3576915.3616592},\n  doi = {10.1145\u002F3576915.3616592},\n  booktitle = {Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security},\n  publisher = {ACM},\n  author = {Du, Minxin and Yue, Xiang and Chow, Sherman S. M. 
and Wang, Tianhao and Huang, Chenyu and Sun, Huan},\n  year = {2023},\n  month = Nov,\n  pages = {2665–2679},\n  collection = {CCS ’23},\n}\n```\n\nExtract paper metadata.\n\n```bash\n> .\u002Fsearch-bibtex metadata ..\u002F..\u002Ftests\u002Fpdf\u002F\"RollPacker Taming Long-Tail Rollouts for RL Post-Training with Tail Batching.pdf\"        \n{\n  \"metadata\": {\n    \"filePath\": \"\u002Fhome\u002Fwhr\u002Fprojects\u002Fsearch-bibtex\u002Ftests\u002Fpdf\u002FRollPacker Taming Long-Tail Rollouts for RL Post-Training with Tail Batching.pdf\",\n    \"pageCount\": 18,\n    \"title\": \"RollPacker: Taming Long-Tail Rollouts for RL Post-Training with Tail Batching Wei Gao\",\n    \"authors\": [\n      \"Yuheng Zhao\",\n      \"Dakai An\",\n      \"Tianyuan Wu\",\n      \"Lunxi Cao\",\n      \"Shaopan Xiong\",\n      \"Ju Huang\",\n      \"Weixun Wang\",\n      \"Siran Yang\",\n      \"Wenbo Su\",\n      \"Jiamang Wang\",\n      \"Lin Qu\",\n      \"Bo Zheng\"\n    ],\n    \"textSample\": \"RollPacker: Taming Long-Tail Rollouts for RL Post-Training with Tail Batching Wei Gao †∗ , Yuheng Zhao †∗ , Dakai An † , Tianyuan Wu † , Lunxi Cao † , Shaopan Xiong ‡ , Ju Huang ‡ , Weixun Wang ‡ , Siran Yang ‡ , Wenbo Su ‡ , Jiamang Wang ‡ , Lin Qu ‡ , Bo Zheng ‡ , Wei Wang † † HKUST ‡ Alibaba Group Abstract Reinforcement Learning (RL) is a pivotal post-training technique for enhancing the reasoning capabilities of Large Language Models (LLMs). However, synchronous RL post-training frequently suffers from significant GPU underutilization—often referred to as pipeline “bubbles”— caused by imbalanced response lengths within rollout steps. Many RL systems attempt to alleviate this problem by relax- ing synchronization, but this can compromise training accu- racy. In this paper, we introduce tail batching, a novel roll- out scheduling strategy for synchronous RL. 
Tail batching systematically consolidates prompts leading to long-tail re- sponses into a few designated “long rounds”, ensuring that the majority of rollout steps (“short rounds”) contain only balanced, short responses. By strategically reordering exe- cution, this approach dramatically reduces GPU idle time and accelerates \"\n  },\n  \"queries\": [\n    {\n      \"kind\": \"title\",\n      \"value\": \"RollPacker: Taming Long-Tail Rollouts for RL Post-Training with Tail Batching Wei Gao\",\n      \"confidence\": 0.78\n    },\n    {\n      \"kind\": \"title-author\",\n      \"value\": \"RollPacker: Taming Long-Tail Rollouts for RL Post-Training with Tail Batching Wei Gao Yuheng Zhao\",\n      \"confidence\": 0.72\n    }\n  ]\n}\n```\n\nRetrieve the BibTeX for a given paper PDF.\n\n```bash\n> .\u002Fsearch-bibtex search ..\u002F..\u002Ftests\u002Fpdf\u002F\"DP-Forward Fine-tuning and Inference on Language Models with.pdf\"\nsearch: searching 6 source channels...\nsearch: 1\u002F6 source channels completed [doi]\nsearch: 2\u002F6 source channels completed [arxiv, doi]\nsearch: 3\u002F6 source channels completed [arxiv, doi, semantic-scholar]\nsearch: 4\u002F6 source channels completed [dblp, arxiv, doi, semantic-scholar]\nsearch: 5\u002F6 source channels completed [dblp, arxiv, crossref, doi, semantic-scholar]\nsearch: 6\u002F6 source channels completed [dblp, arxiv, crossref, openalex, doi, semantic-scholar]\nsearch-bibtex candidate selection\nFilter: \nKeys: j\u002Fk move, g\u002FG jump, \u002F filter, Ctrl+O preview, Enter select, q cancel\n\n> [0] arxiv            0.990 DP-Forward: Fine-tuning and Inference on Language Models with Different...\n  [1] crossref         0.980 DP-Forward: Fine-tuning and Inference on Language Models with Different...\n  [2] openalex         0.970 DP-Forward: Fine-tuning and Inference on Language Models with Different...\n  [3] doi              0.960 DP-Forward: Fine-tuning and Inference on Language Models with Different...\n  [4] semantic-scholar 0.950 
DP-Forward: Fine-tuning and Inference on Language Models with Different...\n\nTitle: DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass\nAuthors: Minxin Du and Xiang Yue and Sherman S. M. Chow and ... (+3 more)\nYear: 2023  Venue: arXiv\nIDs: DOI 10.1145\u002F3576915.3616592  arXiv 2309.06746v2\n\nBibTeX preview: compact\n@inproceedings{Du_2023, series={CCS ’23}, title={DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass}, url={http:\u002F\u002Fdx.doi.org\u002F10.1145\u002F3576915.3616592}, DOI={10.1145\u002F3576915.3616592}, booktitle={Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security}, publisher={ACM}, author={Du, Minxin and Yue, Xiang and Chow, Sherman S. M. and Wang, Tianhao and Huang, Chenyu and Sun, Huan}, year={2023}, month=Nov, pages={2665–2679}, collection={CCS ’23} }\n  title = {DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in...}\n  author = {Minxin Du and Xiang Yue and Sherman S. M. Chow and Tianhao Wang and ... (+2 more)}\n  year = {2023}\n  booktitle = {arXiv}\n  doi = {10.1145\u002F3576915.3616592}\n  eprint = {2309.06746v2}\n  url = {https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.06746v2}\n}\nsearch-bibtex selection confirmed\nTitle: DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass\nSource: arxiv  Score: 0.990\nClipboard: clipboard unavailable\n\n@inproceedings{Du_2023,\n  series = {CCS ’23},\n  title = {DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass},\n  url = {http:\u002F\u002Fdx.doi.org\u002F10.1145\u002F3576915.3616592},\n  doi = {10.1145\u002F3576915.3616592},\n  booktitle = {Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security},\n  publisher = {ACM},\n  author = {Du, Minxin and Yue, Xiang and Chow, Sherman S. M. 
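and Wang, Tianhao and Huang, Chenyu and Sun, Huan},\n  year = {2023},\n  month = Nov,\n  pages = {2665–2679},\n  collection = {CCS ’23},\n}\n```\n\n## Developer Documentation\n\n- [Configuration (Chinese)](docs\u002FCONFIGURATION.zh-CN.md) \u002F [Configuration](docs\u002FCONFIGURATION.md)\n- [Architecture (Chinese)](docs\u002FARCHITECTURE.zh-CN.md) \u002F [Architecture](docs\u002FARCHITECTURE.md)\n- [Testing (Chinese)](docs\u002FTESTING.zh-CN.md) \u002F [Testing](docs\u002FTESTING.md)\n- [Contributing (Chinese)](CONTRIBUTING.zh-CN.md) \u002F [Contributing](CONTRIBUTING.md)\n- [Releasing (Chinese)](RELEASING.zh-CN.md) \u002F [Releasing](RELEASING.md)\n- [Changelog (Chinese)](CHANGELOG.zh-CN.md) \u002F [Changelog](CHANGELOG.md)\n\n## Limitations\n\nPDF text extraction depends on the quality of the extractable text in the file itself; scanned PDFs need OCR first. Anonymous access to Semantic Scholar may trigger rate limiting, which is shown as a source error. BibTeX style is not fully consistent across external bibliographic sources; this tool keeps the BibTeX returned by the source and only normalizes leading and trailing whitespace where necessary.\n\n## License\n\nMIT, see [LICENSE](LICENSE).\n","PolyCite is a paper citation metadata retrieval tool based on multi-agent collaboration and multi-source aggregation, designed to quickly fetch BibTeX information for computer science conferences and journals from the command line. Its core features include extracting metadata from PDF files or title strings and using DBLP, arXiv, Crossref, and other authoritative databases for cross-validation and priority ranking to ensure citation accuracy. The tool also supports a custom configuration file for adjusting search behavior, such as source order and result count, and provides interactive selection and batch processing. It suits academic writing workflows that need efficient, accurate reference curation, and is particularly suitable for researchers who want to improve reference management efficiency and reduce the risk of incorrect citations.",2,"2026-06-11 04:08:12","CREATED_QUERY"]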