[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-79883":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":14,"stars7d":14,"stars30d":14,"stars90d":14,"forks30d":14,"starsTrendScore":14,"compositeScore":15,"rankGlobal":8,"rankLanguage":8,"license":16,"archived":17,"fork":17,"defaultBranch":18,"hasWiki":19,"hasPages":17,"topics":20,"createdAt":8,"pushedAt":8,"updatedAt":21,"readmeContent":22,"aiSummary":23,"trendingCount":14,"starSnapshotCount":14,"syncStatus":24,"lastSyncTime":25,"discoverSource":26},79883,"paper_format_agent","zxyasfas\u002Fpaper_format_agent","zxyasfas",null,"Python",97,3,99,10,0,1.81,"MIT License",false,"main",true,[],"2026-06-12 02:03:55","# Paper Format Agent\r\n\r\n[中文说明](README.zh-CN.md) | English\r\n\r\n![Local-first](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flocal--first-DOCX-blue)\r\n![Content Guard](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcontent--guard-enabled-green)\r\n![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.9%2B-3776AB)\r\n![CI](https:\u002F\u002Fgithub.com\u002Fzxyasfas\u002Fpaper_format_agent\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg)\r\n![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-lightgrey)\r\n\r\nLocal-first academic paper formatting for DOCX files, packaged as both a Python tool and an agent skill.\r\n\r\nPaper Format Agent extracts formatting rules from a guide, applies deterministic DOCX repairs, and produces machine-readable plus human-readable reports. It is built for thesis, journal, and conference formatting workflows where privacy and content preservation matter.\r\n\r\n## Status\r\n\r\nThis project is a practical open-source MVP moving toward commercial readiness. 
It is suitable for demos, internal pilots, agent workflows, and synthetic benchmark development. Before paid production use, expand the regression corpus, template coverage, and object-level scoring for tables, figures, equations, footnotes, headers, and footers.\r\n\r\n## Why This Exists\r\n\r\nAcademic formatting is tedious, repetitive, and hard to review manually. This project focuses on formatting-only automation:\r\n\r\n- margins, fonts, line spacing, headings, captions, tables, and references\r\n- generated running headers and centered page-number footers\r\n- required section checks such as abstracts, keywords, and table of contents\r\n- content fingerprint guards to detect accidental academic content changes\r\n- local execution for private papers and school templates\r\n- reports that can be used by students, supervisors, reviewers, and CI\r\n\r\n## Agent Skill\r\n\r\nThis repository includes a top-level [SKILL.md](SKILL.md) and [agents\u002Fopenai.yaml](agents\u002Fopenai.yaml), so agent users can treat the repo as an installable skill.\r\n\r\nThe skill teaches an agent how to:\r\n\r\n- inspect input files safely\r\n- run the formatter in content-preserving mode\r\n- review `format_report.json`\r\n- validate changes before returning results\r\n- add new template rules with tests\r\n\r\n## Quick Start\r\n\r\n```bash\r\npip install -r requirements.txt\r\n\r\npython -m paper_format_agent.cli \\\r\n  --format-file \"format_guide.docx\" \\\r\n  --paper-file \"paper.docx\" \\\r\n  --out-dir \".\u002Foutput\" \\\r\n  --engine auto \\\r\n  --strict-required-sections\r\n```\r\n\r\nOptional GUI:\r\n\r\n```bash\r\npython run_gui.py\r\n```\r\n\r\nBatch processing:\r\n\r\n```bash\r\npython -m paper_format_agent.cli \\\r\n  --format-file \"format_guide.docx\" \\\r\n  --paper-dir \".\u002Fpapers\" \\\r\n  --out-dir \".\u002Fbatch_output\" \\\r\n  --engine python \\\r\n  --strict-required-sections\r\n```\r\n\r\nBatch mode writes one output folder per paper plus 
`batch_summary.json`, including pass rate, score averages, content-change count, and per-paper report locations.\r\n\r\n## Template Packs And Synthetic Examples\r\n\r\nThe repository includes privacy-safe template packs and synthetic examples so users can try the workflow without uploading real papers:\r\n\r\n- [templates\u002F](templates\u002F) contains JSON presets for Chinese thesis, journal article, and IEEE-style conference formatting.\r\n- [examples\u002F](examples\u002F) contains a synthetic format guide and sample reports for demos, issues, and PRs.\r\n- [docs\u002FTEMPLATE_PACKS.md](docs\u002FTEMPLATE_PACKS.md) explains the template contract and contribution checklist.\r\n\r\nTemplate files are intentionally plain JSON. They are easy to review, easy to customize locally, and safe to extend through small PRs.\r\n\r\n## Outputs\r\n\r\n| File | Purpose |\r\n| --- | --- |\r\n| `formatted_paper_v3.docx` | repaired DOCX document |\r\n| `format_rules.json` | extracted formatting rules |\r\n| `format_report.json` | machine-readable score and checks |\r\n| `format_report.html` | human-readable report |\r\n| `modify_log.json` | formatting operation log |\r\n| `engine_report.json` | Word COM \u002F LibreOffice \u002F Python post-process result |\r\n| `marker_dump.json` | optional paragraph classification dump |\r\n\r\n## Safety Model\r\n\r\nBy default, the pipeline enforces a content guard. 
Reports include:\r\n\r\n- `content_changed`\r\n- `content_guard_enforced`\r\n- `content_fingerprint_before`\r\n- `content_fingerprint_after`\r\n- `diagnostics` with severity, evidence, and suggested fixes for failed checks\r\n\r\nFor normal academic formatting, `content_changed` should be `false`.\r\n\r\n## Validation\r\n\r\n```bash\r\npython tools\u002Fvalidate_skill.py\r\npython -m unittest discover -s tests -p \"test_*.py\"\r\npython tools\u002Fcompile_check.py\r\npython tools\u002Frelease_audit.py\r\n```\r\n\r\nBefore publishing from a local workspace, also run:\r\n\r\n```bash\r\npython tools\u002Frelease_audit.py --include-local\r\n```\r\n\r\nThis optional check includes untracked and ignored local artifacts, such as generated outputs, scratch files, caches, and private document formats.\r\n\r\n## Good First PRs\n\nWe want many small, reviewable PRs. Good contribution areas:\n\r\n- Add a synthetic test for a school, journal, or conference formatting rule.\r\n- Add a new synthetic template pack in `templates\u002F`.\r\n- Improve a narrowly scoped rule extractor.\r\n- Add scoring coverage for tables, figures, references, equations, headers, or footers.\r\n- Improve report wording or diagnostics.\r\n- Add local-first integrations such as MCP, GitHub Actions, or batch processing.\r\n- Improve this repo's `SKILL.md` workflow for agent users.\r\n\nNew contributors can start from the task-ready board in\n[docs\u002FCONTRIBUTOR_TASKS.md](docs\u002FCONTRIBUTOR_TASKS.md). 
Each task lists user\npain, expected PR shape, and suggested labels.\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md), [ROADMAP.md](ROADMAP.md), and [AGENTS.md](AGENTS.md).\n\r\n## Architecture\r\n\r\n```text\r\nformat guide + paper.docx\r\n  -> rule extraction\r\n  -> paragraph type tagging\r\n  -> style application\r\n  -> numbering cleanup\r\n  -> optional engine post-process\r\n  -> scoring and reports\r\n```\r\n\r\nDetailed notes:\r\n\r\n- [docs\u002FARCHITECTURE.md](docs\u002FARCHITECTURE.md)\r\n- [docs\u002FPRODUCTION_STANDARD.md](docs\u002FPRODUCTION_STANDARD.md)\r\n- [README_V3.md](README_V3.md)\r\n\r\n## Privacy\r\n\r\nDo not commit real papers, private school templates, reviewer comments, API keys, or generated documents. Use synthetic fixtures or anonymized snippets in tests.\r\n\r\n## License\r\n\r\nMIT. See [LICENSE](LICENSE).\r\n","Paper Format Agent is a tool focused on automating academic paper formatting. It supports local-first processing of DOCX files and is provided both as a Python tool and as an agent skill. Its core functions are extracting formatting rules from a guide, applying deterministic DOCX repairs, and generating machine-readable and human-readable reports. The tool emphasizes privacy protection and content preservation, and it suits theses, journal articles, and conference papers that must strictly follow specific formatting requirements. Its integrated content fingerprint guard helps protect documents against accidental modification during formatting. It is suitable for internal pilots at academic institutions, demo projects, or as a basis for synthetic benchmark development.",2,"2026-06-11 03:58:24","CREATED_QUERY"]