[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-83421":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":24,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":10,"trendingCount":16,"starSnapshotCount":16,"syncStatus":36,"lastSyncTime":37,"discoverSource":38},83421,"paperjury","u7079256\u002Fpaperjury","u7079256","Pre-submission AI review stress-test for research papers. A Claude Code skill: review, verdict, revise, verify.","https:\u002F\u002Fu7079256.github.io\u002Fpaperjury\u002F",null,"JavaScript",126,11,7,1,0,12,70,55,3.24,"MIT License",false,"main",true,[26,27,28,29,30,31,32,33],"academic-writing","ai-agents","claude-code","latex","llm-agents","paper-review","peer-review","research-tool","2026-06-12 02:04:34","**English** · [中文](README.zh-CN.md)\n\n# PaperJury\n\n> A pre-submission AI review stress-test for research papers.\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fu7079256.github.io\u002Fpaperjury\u002Foverview.html?lang=en\">\u003Cimg alt=\"Open the live interactive overview\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOpen_the_interactive_overview-d6a14b?style=for-the-badge&logo=githubpages&logoColor=white\">\u003C\u002Fa>\n  \u003Cimg alt=\"License: MIT\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-3b3d47?style=for-the-badge\">\n\u003C\u002Fp>\n\n*Before a reviewer tears it apart, let a jury do it first.*\n\nPaperJury turns paper feedback into a closed loop: review → verdict → revise → verify. 
Instead of taking every AI suggestion at face value, it sorts each issue into one of three outcomes:\n\n- **Fixable**: safe, text-level issues that can be patched automatically.\n- **Author-required**: missing experiments, missing evidence, or research decisions that stay with you.\n- **Invalid**: reviewer misreadings or unsupported critiques that should not be applied.\n\nIt offers three modes: direct-edit, review, and auto. PaperJury is built for pre-submission self-checking. It does not replace peer review, it does not invent missing experiments, and it keeps research-level decisions with the author.\n\nInteractive overview: the [live site](https:\u002F\u002Fu7079256.github.io\u002Fpaperjury\u002Foverview.html?lang=en) (GitHub Pages), or [`docs\u002Foverview.html`](docs\u002Foverview.html) in-repo.\n\n---\n\n## 🎉 News\n\n> **🚀 2026-06-05: PaperJury's Codex-first port has shipped.**\n> Open it here: [paperjury-codex](https:\u002F\u002Fgithub.com\u002Fu7079256\u002Fpaperjury-codex).\n>\n> **🧪 Dogfood sample added:** this repo now includes a compact [dogfood sample](samples\u002Fdogfood\u002F) with before\u002Fafter PDFs and a human-verified run report.\n\n---\n\n## TODO\n\n- [ ] **Fast mode \u002F quick version.** A lower-latency, lower-token path for fast checks when you want useful triage more than full courtroom depth.\n\n---\n\n## Responsible Use\n\nPaperJury is a pre-submission self-check workflow. It does not replace the author's scientific judgment, and it does not replace peer review. It should never be used to invent experiments, fabricate results, add unsupported claims, or hide a paper's limitations.\n\nWhen an issue needs a new experiment, missing evidence, private knowledge, or a research-level decision, PaperJury routes it to the author instead of patching it automatically. 
The Fixable \u002F Author-required \u002F Invalid outcomes exist precisely so that judgment calls stay with you.\n\nThe intended use is to surface avoidable problems earlier, while you can still act on them: unclear claims, weak logical connections, unsupported wording, formatting risks, and the kind of reviewer-style concerns worth checking before submission.\n\n---\n\n## Install\n\nPaperJury is a Claude Code skill, installable in two ways. For the Codex-first port, use [paperjury-codex](https:\u002F\u002Fgithub.com\u002Fu7079256\u002Fpaperjury-codex).\n\n**Option A: Claude Code plugin (one command).** From inside Claude Code:\n\n```text\n\u002Fplugin marketplace add u7079256\u002Fpaperjury\n\u002Fplugin install paperjury@u7079256\n```\n\n**Option B: clone as a skill.** Clone the repo into the folder Claude Code reads skills from:\n\n```bash\n# macOS \u002F Linux\ngit clone https:\u002F\u002Fgithub.com\u002Fu7079256\u002Fpaperjury ~\u002F.claude\u002Fskills\u002Fpaperjury\n```\n\n```powershell\n# Windows (PowerShell)\ngit clone https:\u002F\u002Fgithub.com\u002Fu7079256\u002Fpaperjury \"$env:USERPROFILE\\.claude\\skills\\paperjury\"\n```\n\n(or under `\u003Cproject>\u002F.claude\u002Fskills\u002F` to scope it to one project). Claude Code auto-discovers it through `SKILL.md` and it shows up as the `paperjury` skill. `node` is required (the deterministic checks run on it); a LaTeX toolchain is optional (the real-compile and layout checks use it, and degrade honestly when it is absent).\n\n**For Claude \u002F coding agents:** the deep \"how to drive this\" reference is [`docs\u002FAGENT-GUIDE.md`](docs\u002FAGENT-GUIDE.md): install, the three modes and their triggers, the engine pipeline, the `auto` vs `\u002Fgoal` distinction, and how the fan-out launches, written for an agent to read. Curious about the internals? Just point Claude at that file and ask.\n\n---\n\n## What you get\n\nMost writing tools only push your paper forward: they draft and they polish. 
None of them argues the other side of your claims the way a reviewer will. PaperJury is built around that gap, in four parts.\n\n- **Adversarial by construction.** Your paper gets due process, not one pass of suggestions: N domain reviewers read the whole paper, a contestability router sends the real disputes to a two-sided trial, a jury of 5 (escalating to 12 only when it cannot reach a clear majority) deliberates under isolation, and a judge returns one of three verdicts: fix it, needs you, or no fix. A verdict can land \"no fix\", which a yes-and rewriter structurally cannot return.\n- **Closed-loop, not forward-only.** Each round is a clean re-review of the edited paper (the panel never sees the prior ledger, so a re-raised issue is real corroboration, not anchoring), and a deterministic clerk reconciles every round into one ledger until a clean round surfaces nothing new. Before any edit, fresh skeptics try to revive whatever got wrongly dropped and stress-test strong-consensus verdicts.\n- **Guardrails, not autopilot.** Safe fixes land under risk-proportional safety (frozen anchors, a per-passage edit cap, an anchor and cross-section meaning audit), always behind your sign-off. Risky edits are not applied silently; they queue for one human pass.\n- **Real compile, not just critique.** It runs an actual LaTeX build on your machine and reports true errors, undefined refs, overfull boxes, and the page count, or degrades honestly to a structural lint when no toolchain is present. 
Deterministic desk-reject checks catch the classics: anonymization leaks, margin and spacing hacks, documentclass drift, missing required sections, and page-limit overflow, checked against your project's own constraints.\n\n---\n\n## Three modes\n\n### Direct-Edit (common)\n\n- **Trigger:** describe a change in Chinese or English and have the LaTeX edited directly.\n- **Example utterances:** \"把这段改成…\", \"polish this paragraph\", \"把我对 intro 的想法写成 LaTeX\", \"tighten this\".\n- **Behavior:** no review panel; go straight to drafting the patch through the writing toolkit, with author sign-off.\n\n### Review (occasional)\n\n- **Trigger:** ask for the paper to be critiqued or hardened: review \u002F critique \u002F 审稿 \u002F 评审 \u002F mock-review, or iterating a draft to clear reviewer-raised issues.\n- **Behavior:** runs the courtroom review engine (`references\u002Freview-engine-v3.md`).\n- **Scope sub-trigger:** `full` (whole paper) or `passage` (one section \u002F paragraph \u002F claim).\n\n### Auto (unattended)\n\n- **Trigger (explicit only):** opt in via `\u002Fgoal` (or config `mode: auto`) to run the review-revise loop unattended toward a verifiable goal.\n- **Hard constraint:** **auto is never self-detected; it is explicit only.** There is no runtime signal for it, so it is entered only via a `\u002Fgoal` context or a project config `mode: auto`.\n- **Behavior:** establish the `spine` and the reviewer assignment up front (the human steps), then the engine applies safe fixes under the bounded-aggressive + edit-safety policy, queues the rest, and runs multiple rounds until it stops: on clerk convergence, or an applied-quiescence \u002F hard-limit backstop. 
See `references\u002Fauto-mode.md`.\n\n---\n\n## Usage examples: what to do when\n\nYou don't run commands; you say what you want and the skill picks the mode.\n\n**Edit one thing (the everyday case → direct-edit):**\n- \"Polish this paragraph.\" \u002F \"把这段 intro 改紧一些。\"\n- \"Turn my Chinese note for the intro into LaTeX: `\u003Cyour idea>`.\"\n- \"De-AI this paragraph.\" \u002F \"Compress this sentence to one line.\" \u002F \"Rewrite this caption.\"\n- → it drafts the LaTeX change, self-checks it, shows you the patch, and applies it after you approve. No panel.\n\n**Get the paper critiqued before submission (→ review):**\n- \"Review my paper.\" \u002F \"审稿。\" \u002F \"Mock-review this before I submit.\"\n- \"Critique just Section 3.2.\" \u002F \"review passage `\u003Cthe claim you paste>`.\"\n- \"Here are the issues a reviewer raised; iterate the draft to clear them.\"\n- → it runs the adversarial engine, surfaces the real weaknesses (separating fatal flaws from nits), and walks you through each: you give direction, it drafts fixes you authorize. Nothing changes without your sign-off.\n\n**Harden it unattended toward a goal (→ auto, needs `\u002Fgoal`):**\n- `\u002Fgoal \"harden the paper until ledger.js gate passes (0 gate-blocking major)\"`\n- → it runs the review-revise loop across many rounds on its own, applying safe fixes and queueing risky ones for one pass when you return. 
This needs the `\u002Fgoal` driver: turning on \"auto\" tool-permission and sending a normal prompt runs one round and stops; it does not loop (see [`docs\u002FAGENT-GUIDE.md`](docs\u002FAGENT-GUIDE.md) §3).\n\n**Make sure it won't get desk-rejected:**\n- \"Run the submission-readiness \u002F compliance check.\" → deterministic format screening + a compile-driven layout check.\n\nRule of thumb: **one change → just say it; want it picked apart → say \"review\"; want it run unattended → `\u002Fgoal`.**\n\n---\n\n## Engine overview\n\nThe courtroom engine is `assign-reviewers → reading-check → coverage-auditor → merge → {trial ‖ polish} → recall-audit → drafter → {edit-audit | meaning-audit} → clerk`. Generation is bounded (N holistic domain reviewers, not a per-(unit × lens) flood); adjudication is routed by contestability; edits are guarded by risk; the multi-round loop converges via a deterministic clerk. The **deterministic guards in `scripts\u002F`** run orchestrator-side via Bash between workflow calls.\n\n### Deterministic stages (orchestrator-side, Node via Bash)\n\n1. `decompose`: split manuscript into reading units, the canonical section list, and stable `passage-id`s (which prevent text drift and give jurors local context).\n2. `spine` (auto only): extract anchors, author confirm, freeze → `spine.json`.\n3. `ledger.js`: JSON ledger plus MD view; **gate = `\u002Fgoal` completion fact** (0 gate-blocking active major; author-required is gate-OK and accumulates to the human queue). CLI: init\u002Fadd\u002Fset\u002Fcount\u002Fgate\u002Fget\u002Fdocket\u002Funadjudicated\u002Frender.\n4. `journal.js`: append-only per-edit revert log (JSONL).\n5. `apply-patch.js`: atomic apply plus journal of a drafted patch, and revert (exact-once guard on `before` text).\n6. `anchor-diff.js`: locate frozen anchors; flag which anchors `need_audit` when the support region changed.\n7. 
`cross-ref.js`: edit-safety risk pre-filter: does a changed salient token in a patch appear in other passages?\n8. `compile-guard.js`: real LaTeX compile (latexmk\u002Fpdflatex) or a degraded structural-lint path with `compiled:null` (it reports when it cannot verify).\n9. `compliance-check.js`: submission-readiness A: deterministic desk-reject screening.\n\n### Semantic stages (workflow fan-out)\n\n1. `assign-reviewers`: name N subfields, instantiate N domain reviewers from the project gatekeeper core + a generated domain overlay; config-pin \u002F verifier \u002F per-slot degrade headless.\n2. `reading-check`: N holistic reviewers each read the WHOLE paper once → weaknesses (significance + kind + verbatim quote; a reviewer that cannot quote the source did not read it) + one overall_confidence + a per-section coverage report; targeted re-invoke mode for anti-skim.\n3. `coverage-auditor`: anti-skim L2: flag skimmed (reviewer, section) pairs across the coverage reports.\n4. `merge`: semantic dedup across reviewers; the workflow derives significance (MAX) \u002F kind (substantive-dominates) \u002F corroboration deterministically.\n5. `trial`: a 5-juror trial tier: whole-paper defense → independent local-context jury (with on-demand context expansion) → a deterministic majority verdict (quorum reached, one side >60%) + a judge that routes a decided-valid charge (valid-fixable vs author-required); escalate to a 12-juror tier only on no clear majority.\n6. `polish`: the track that skips the jury: batch copy-edit (mechanical) + batch light-check (minor-substantive); can escalate a misrouted major back to trial.\n7. `recall-audit`: Mode A revives wrongly-dropped charges (bias to revive); Mode B spot-checks strong-consensus majors before the edit (guards against the whole panel agreeing on the same mistake).\n8. `drafter`: minimal-edit patch for valid-fixable charges.\n9. 
`edit-audit` \u002F `meaning-audit`: the edit-safety semantic half: `edit-audit` checks a risky non-anchor edit (make-sense + cross-section alignment); `meaning-audit` is the four-state frozen-anchor + arc audit.\n10. `clerk`: the round boundary: reconcile carried open-questions against this round's edits, dedup re-raises via a deterministic passage_id + similarity merge key, and emit the deterministic convergence counts.\n\nAlso present: `review-panel.workflow.js`: a quick simple 3-lens panel (fast path).\n\n---\n\n## The three primitives: Skill + Workflow + Memory\n\n1. **Skill (entry point + methodology):** the protocol, the reviewer assignment, the consensus gate, the writing toolkit, the human gates. Detail in `references\u002Freview-engine-v3.md`, `references\u002Freviewer-personas.md`, `references\u002Fwriting-toolkit.md`.\n\n2. **Workflow (fan-out engine):** the semantic, no-human-in-the-middle steps run as Workflows (parallelism plus schema-validated output by construction). Simple panel = `workflows\u002Freview-panel.workflow.js`; the courtroom engine = `assign-reviewers → reading-check → coverage-auditor → merge → {trial ‖ polish} → recall-audit → drafter → {edit-audit | meaning-audit} → clerk`. The deterministic guards run orchestrator-side via Bash because the Workflow sandbox has no fs: `scripts\u002F` holds `decompose`, `ledger`, `journal`, `apply-patch`, `anchor-diff`, `cross-ref`, `spine`, `compile-guard`, `compliance-check`.\n\n3. **Memory (durable state + learned conventions), two layers:**\n   - **Ledger**: `LEDGER.json` resolved at runtime = the machine source of truth, plus a rendered `LEDGER.md` view; managed by `scripts\u002Fledger.js`. The live, mutable issue state across rounds and sessions. 
Schema plus status state machine: `references\u002Fledger-schema.md`.\n   - **Claude memory**: the active project's memory: stable conventions worth recalling next session (this paper's house style, venue, persona tuning).\n\n### Reviewers\n\nThe panel is N domain-expert HOLISTIC reviewers (default 3, range 2-4), assigned at runtime to the paper's subfields, all sharing a senior-reviewer gatekeeper core (harsh, precise, constructive; separate fatal flaws from fixable nits; reason across sections). When a reviewer slot cannot be confirmed (headless, unverifiable), that slot degrades to a generic gatekeeper (one bad slot never degrades the whole panel); the generic fallback lenses are:\n\n- **Theory \u002F Foundations**: definitions, proof gaps, notation, invariance\u002Foptimality\u002Fgenerality claims.\n- **Empirical \u002F Benchmark**: baseline fairness\u002Fvintage, metric correctness, dataset splits, variance, ablation coverage, cherry-picking.\n- **Applied \u002F Systems**: practicality, efficiency\u002Flatency\u002Fmemory claims, reproducibility, deployment realism, scaling.\n\n(These are an unordered tendency, not fixed slots; reviewer IDs `R1..RN` are positional, assigned by subfield order.)\n\nThe writing toolkit names (prompt bodies not shown here): `translate-to-english`, `polish-english`, `de-ai`, `compress`, `expand`, `caption`, `experiment-analysis`, `logic-check`.\n\n---\n\n## The six hard rules\n\n1. **Never edit the manuscript without explicit author sign-off.** Auto-mode carve-out: the rule HOLDS; auto satisfies it via UP-FRONT sign-off (the `spine` + reviewer-assignment confirmation plus the pre-authorized bounded-aggressive policy) plus the return queue, not per-edit sign-off.\n2. **Reviewers \u002F jurors are isolated.** Fresh eyes per round: no cross-talk, no prior-round leakage, no sight of the `ledger`. Enforced by (a) what goes into each agent's prompt AND (b) an explicit ISOLATION instruction in every reviewer-type prompt.\n3. 
**Every valid-fixable issue carries a `close_criterion`** (one concrete sentence describing what an edit must satisfy), set by the judge.\n4. **No leakage into the reviewed text.** Revision logs, back-translations, and self-check verdicts are author-side aids; they never enter the manuscript or any frozen snapshot.\n5. **Disagreement resolves through discussion, then override (logged), never a silent dismissal.**\n6. **No hardcoded paths or project files in the skill.** Resolve at runtime.\n\n---\n\n## Architecture notes\n\n- The Workflow sandbox has **no filesystem and no subprocess**; that is why all deterministic guards run orchestrator-side via Bash between workflow calls.\n- `compile-guard.js` is explicit about what it cannot verify: when it cannot truly compile, it degrades to structural lint and reports `compiled:null`.\n- Submission-readiness is cross-mode, two parts: **A** = `compliance-check.js` plus a semantic agent; **B** = a compile-driven layout loop reusing `compile-guard.js` plus Read-on-PDF.\n\nYour project files, ledger, journal, and patches stay inside your local paper project. PaperJury has no backend or server of its own, so nothing is sent to a PaperJury server. 
The review runs through your own Claude Code session, which means the model itself runs in the cloud: how your content is handled there follows the terms and settings of that Claude Code environment, not anything PaperJury adds on top.\n\n---\n\n## Roadmap\n\nWhere this is going (planned, not yet shipped):\n\n- **Reviewer personas tuned to each venue community's taste.** CVPR, ACL, and NeurIPS reviewers do not critique the same way; the goal is a reviewer that carries each community's expectations, beyond the current three-family style context.\n- **Vision-based layout verification**: compile, render, and check the visual layout (column overflow, figure placement), not just the compile log.\n- **Automatic venue detection** from your `.cls` \u002F template.\n- **Validation of the engine on real papers at scale.**\n\n---\n\n## File and path reference\n\n- Engine protocol (every orchestrator seam): `references\u002Freview-engine-v3.md`\n- Auto protocol: `references\u002Fauto-mode.md`\n- Personas \u002F writing toolkit \u002F methodology: `references\u002Freviewer-personas.md`, `references\u002Fwriting-toolkit.md`, `references\u002Fmethodology.md`\n- Ledger schema + status machine: `references\u002Fledger-schema.md`\n- Submission compliance: `references\u002Fsubmission-compliance.md`\n- Design rationale: `docs\u002FREVIEW_ENGINE_V3_DESIGN.md`\n- Scripts dir: `scripts\u002F` (decompose, ledger, journal, apply-patch, anchor-diff, cross-ref, spine, compile-guard, compliance-check)\n- Workflows dir: `workflows\u002F` (assign-reviewers, reading-check, coverage-auditor, merge, trial, polish, recall-audit, drafter, edit-audit, meaning-audit, clerk, review-panel)\n\n---\n\n## Credits\n\nThe spine and anti-drift design (the anchor logic-transfer audit, the claim register, and the minimal-edit, intent-preserving revision policy) is inspired by [PaperSpine](https:\u002F\u002Fgithub.com\u002FWUBING2023\u002FPaperSpine), a motivation-driven paper drafting and rewriting skill. 
PaperSpine is a forward generate\u002Frewrite tool with no adversarial loop; PaperJury borrows its anchoring idea and its \"deterministic scripts for checkable steps, model agents for judgment\" mechanism, then adds the adversarial courtroom review engine on top.\n",2,"2026-06-11 04:11:07","CREATED_QUERY"]