[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-76375":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":12,"openIssues":13,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":13,"stars7d":13,"stars30d":14,"stars90d":13,"forks30d":13,"starsTrendScore":13,"compositeScore":15,"rankGlobal":9,"rankLanguage":9,"license":9,"archived":16,"fork":16,"defaultBranch":17,"hasWiki":16,"hasPages":16,"topics":18,"createdAt":9,"pushedAt":9,"updatedAt":19,"readmeContent":20,"aiSummary":21,"trendingCount":13,"starSnapshotCount":13,"syncStatus":22,"lastSyncTime":23,"discoverSource":24},76375,"Astro_Agent","wangnengdejiamao\u002FAstro_Agent","wangnengdejiamao","End-to-end astronomy research agent: a LangGraph state machine that resolves a target, fetches multi-survey data via    a 20+ archive toolbox, runs three mandatory modeling iterations with supervisor-issued repair actions, and drafts an    ApJ-style manuscript only after an evidence-gated QA pass",null,"Python",100,8,0,45,44.36,false,"main",[],"2026-06-12 04:01:21","# Astro Agent\n\nAn end-to-end astronomy research agent that connects survey-data acquisition,\nquantitative modeling, evidence auditing, and manuscript drafting into a single\nauditable LangGraph workflow.\n\nThis repository ships **code, prompts, configs, and reproducible scripts** only.\nPrivate API keys, downloaded papers, FITS\u002Fspectra products, SQLite indexes,\nlocal knowledge-graph workspaces, and personal reports are kept out of version\ncontrol and loaded at runtime from local `.env` files.\n\n---\n\n## 1. What This Project Is\n\nAstro Agent is built around the question *\"can a science-grade research workflow\nbe expressed as a graph of deterministic, reviewable agents?\"* The system is\ntwo cooperating layers:\n\n- **`Astro_Agent\u002Fanalysis_agent`** — a LangGraph state machine that resolves a\n  target, fetches multi-survey data, runs modeling iterations, audits results\n  against physics, retrieves comparable methods from local literature, and (when\n  the QA gate clears) produces an ApJ-style manuscript with peer-review notes.\n- **`Astro_Agent\u002Fastro_toolbox`** — a domain toolbox of survey clients and\n  scientific modeling modules (spectra, photometry, light curves, SED, white\n  dwarf fitting, RV\u002Fperiod analysis, extinction, kinematic traceback, cluster\n  membership, compact-binary diagnostics).\n\nA third local-only system (a literature-derived knowledge graph) can be plugged\nin via `ASTRO_AGENT_KG_WORKSPACE`. The graph itself, its corpus, and its\nextraction pipeline are **not part of this repository** — only the navigator\nnode that queries an external workspace is shipped.\n\n---\n\n## 2. System Architecture\n\n```\n                    ┌────────────────────────────────────────┐\n                    │  CLI  ·  FastAPI server  ·  Web UI     │\n                    └───────────────────┬────────────────────┘\n                                        │\n                ┌───────────────────────▼───────────────────────┐\n                │       analysis_agent  (LangGraph state machine) │\n                │   resolve → data_fetch → memory_advisor →     │\n                │   structure_planner → rag\u002Fkg navigator →      │\n                │   method_scout → source_research →            │\n                │   iteration_1\u002F2\u002F3 → model_supervisor →        │\n                │   claude_code_delegate → qa_gate ⇄ replan →   │\n                │   drafter → paper_qc ⇄ reflexion → peer_review │\n                └───────┬─────────────────────────────┬─────────┘\n                        │                             │\n            ┌───────────▼──────────┐     ┌────────────▼────────────┐\n            │     astro_toolbox    │     │  local RAG + optional   │\n            │  (survey + modeling) │     │  KG workspace (private) │\n            └──────────────────────┘     └─────────────────────────┘\n```\n\nThe agent and the toolbox communicate only through structured JSON artifacts\nwritten to a per-run directory. This makes every step inspectable and replayable\nwithout re-running upstream nodes.\n\n---\n\n## 3. Workflow Design\n\n### 3.1 Why a state graph instead of free ReAct loops\n\nAstronomical analysis is **long-horizon, audit-heavy, and partially\ndeterministic**: target classification dictates which physical model is valid,\nand modeling claims must be backed by specific evidence. Free chain-of-thought\nagents tend to hallucinate parameter values when evidence is missing. The\nLangGraph state machine instead enforces:\n\n- explicit per-node responsibilities and typed state (`AnalysisState` TypedDict),\n- conditional edges for replan \u002F reflexion \u002F abnormal exit,\n- a hard cap of three modeling iterations and at most two replans \u002F two\n  reflexion rewrites,\n- file-system checkpoints (`01_resolved_target.json`, `02_data_fetch.json`, …)\n  that double as human-readable provenance.\n\n### 3.2 Node responsibilities\n\n| Node | Responsibility | Notes |\n|------|----------------|-------|\n| `resolve` | SIMBAD cross-identification of name ↔ RA\u002FDec | offline-tolerant via `--skip-simbad` |\n| `data_fetcher` | Parallel calls to 20+ survey clients in `astro_toolbox` | writes a unified `run_summary` |\n| `memory_advisor` | Reads a SQLite ledger of past method\u002Ftool outcomes | guides planner toward known-good paths |\n| `structure_planner` | Routes per SIMBAD class (WD \u002F sdOB \u002F CV \u002F Polar \u002F …) into spectroscopy+SED, HRD+SED photometric fallback, SED-only, or insufficient-data | each branch unlocks a different evidence set |\n| `rag_navigator` | BM25 search over a local SQLite literature index, pre-tagged with 46 instruments and 24 method families | precise on domain jargon |\n| `kg_navigator` | Optional method-transfer search over an external KG workspace | gracefully degrades when absent |\n| `method_scout` | Compares RAG\u002FKG hits to current toolbox capabilities, flags capability gaps | optionally LLM-assisted |\n| `source_research` | Per-target evidence pack: SIMBAD-linked references, exact RAG matches, HST\u002FSED\u002Fspectral-line QA | gates downstream modeling claims |\n| `iteration_1\u002F2\u002F3` | Mandatory baseline → residuals → systematics passes | each must converge or be marked non-converged |\n| `model_supervisor` | Audits residuals, grid-boundary fits, missing exports, no-spectrum claims, generates repair actions with `owner \u002F priority \u002F acceptance` | |\n| `claude_code_delegate` | Optional handoff of repair actions to a Claude Code subprocess | |\n| `qa_gate` | Routes `clear_for_draft` → drafter, `model_mismatch` → replan, otherwise → abnormal report | |\n| `drafter` | PaperOrchestra five-agent manuscript pipeline (outline \u002F plotting \u002F lit-review \u002F section-writing \u002F refinement) producing `aastex631` LaTeX | |\n| `paper_qc` | ApJ checklist: parameter table, units, citations, figures, tables | |\n| `reflexion` | QC-driven targeted rewrite, hard-capped | |\n| `peer_reviewer` | Generates four scientific-question review notes | |\n| `toolbox_evolution` | Records confirmed capability gaps and required code\u002Fdoc updates | |\n\n### 3.3 Mandatory three modeling iterations\n\n`iteration_1_baseline → iteration_2_residuals → iteration_3_systematics` is\nnon-skippable: a single best-fit number without residual diagnostics and\nsystematic checks is rejected by `qa_gate`. This is the main mechanism by\nwhich the agent refuses to publish under-supported claims.\n\n### 3.4 Model-mismatch self-heal\n\nWhen SIMBAD identifies a target as e.g. sdOB or CV but the active branch is\nwhite-dwarf fitting, `qa_gate` emits `model_mismatch`, the conditional edge\nroutes back to `structure_planner`, and a retry counter limits replanning to\ntwo attempts. If the target's pipeline is not implemented, the run terminates\nwith `replan_blocked` and a human trigger entry instead of producing a paper.\n\n### 3.5 Reflexion loop\n\n`paper_qc` failures (missing parameter table, unsupported claim, citation gap)\nfeed structured findings into `reflexion`, which performs a *targeted* rewrite\nof only the offending section rather than re-running the whole drafter. The\nloop is bounded at two rewrites.\n\n---\n\n## 4. Methodological Choices\n\n### 4.1 BM25 + rule-tagged retrieval over dense vectors\n\nAstronomy text is dense in highly specific tokens (`DA white dwarf`, `logg`,\n`Balmer lines`, `Lomb-Scargle`, `Bayestar2019`). Empirically BM25 with a\ndomain-rule pre-tagger (46 instruments × 24 method families) outperforms\ngeneric dense retrieval on method-transfer queries, while remaining cheap and\nauditable. The same store is used by both `rag_navigator` and `method_scout`.\n\n### 4.2 Filesystem-as-memory\n\nEvery node writes a numbered JSON artifact to the run directory. This:\n\n- removes framework lock-in (no proprietary checkpoint format),\n- gives a human-readable provenance chain,\n- enables `--astrotool-run \u003Cexisting_dir>` to resume without re-downloading\n  survey products,\n- makes diffing two runs a `diff` away.\n\nA separate SQLite **method-success ledger** is maintained across runs and read\nby `memory_advisor`; it stores aggregate outcomes only, not raw data.\n\n### 4.3 Evidence-gated parameter claims\n\n`source_research` produces a per-target pack that explicitly lists which\nmodeling claims are *currently supported* and which are *blocked pending\nevidence*. In `photometric_hrd_sed_fallback` (no spectra) the agent may report\nprovisional Teff \u002F radius \u002F luminosity from SED+Gaia HRD, but blocks final\nspectral type, line detections, composition, precise log g, mass, and\ncooling-age until stronger evidence is added. Drafter and paper_qc respect\nthese gates.\n\n### 4.4 Supervisor-issued repair actions\n\n`model_supervisor` does not \"fix\" results; it emits structured repair tasks\n(`owner`, `priority`, `acceptance_criterion`). Repairs are then executed\neither by the next iteration node or, optionally, by `claude_code_delegate`\ncalling Claude Code as a subprocess. This keeps science decisions and code\nchanges on separately reviewable artifacts.\n\n### 4.5 PaperOrchestra: five-agent manuscript pipeline\n\nDrafting is split into deterministic sub-roles:\n\n- **Outline Agent** — section plan and required evidence per section,\n- **Plotting Agent** — figure list keyed to evidence artifacts,\n- **Literature Review Agent** — RAG-grounded references,\n- **Section Writing Agent** — produces LaTeX per section against the outline,\n- **Content Refinement Agent** — consistency, units, claim\u002Fevidence linking.\n\nEach sub-role's prompt manifest is in\n`paper_orchestra\u002Fagents_manifest.json`; the Codex-style tool\u002Fcontext rules\n(bounded context window, structured tool I\u002FO, review-first QA) live in\n`paper_orchestra\u002Fcodex_style_guidance.json`.\n\n---\n\n## 5. Astro Toolbox\n\n### 5.1 Survey coverage\n\nSpectroscopy: SDSS DR18, DESI DR1, LAMOST DR8, HST COS\u002FSTIS, JWST\nNIRSpec\u002FMIRI, GALAH DR4, KOA\u002FKeck.\nPhotometry: SDSS *ugriz*, Gaia DR3, 2MASS, WISE, GALEX, SPHEREx.\nTime-domain: ZTF DR23, TESS, Kepler\u002FK2, Gaia epoch photometry, NEOWISE.\nX-ray: ROSAT \u002F XMM \u002F Chandra via HEASARC.\n\n### 5.2 Modeling modules\n\n`sed`, `sed_decoupled` (Lin+2025 UPK 13-c2 two-component decoupling),\n`wd_fitting` (Koester \u002F TLUSTY atmospheres + cooling track + mass-radius),\n`cooling_age`, `rv_fitting` (cross-correlation \u002F template matching),\n`period_analysis` (Lomb–Scargle + folding), `orbit_traceback` (6D phase-space\nintegration), `hr_diagram`, `cluster_membership` (kinematic + spatial χ²),\n`extinction` (Bayestar2019 \u002F SFD98), `compact_binary_report`,\n`disk_eclipse_mcmc`, `ingress_measurement`, `binary_orbit`, stellar templates.\n\n### 5.3 Single-target driver\n\n`run_single_target_all_tools.py` orchestrates the toolbox end-to-end and\nproduces the JSON `run_summary` consumed by `data_fetcher`.\n\n---\n\n## 6. Repository Layout\n\n```text\n.\n├── Astro_Agent\u002F\n│   ├── analysis_agent\u002F          # LangGraph workflow, LLM clients, QA, paper pipeline\n│   ├── astro_toolbox\u002F           # survey clients + scientific modeling modules\n│   ├── claude_code_toolbox\u002F     # optional Claude Code subprocess wrapper\n│   ├── scripts\u002F                 # ablation, prompt tuning, review, helper scripts\n│   ├── web\u002F                     # local web UI\n│   └── USAGE.md                 # extended usage notes\n├── rag_pipeline\u002F                # local literature RAG utilities\n├── start_services.sh            # local launcher\n├── stop_services.sh             # local stopper\n└── README.md\n```\n\nNote: a private literature → KG workspace can live anywhere on disk and is\nreferenced via `ASTRO_AGENT_KG_WORKSPACE`. Its construction code, corpus, and\nexports are **not** part of this repository.\n\n---\n\n## 7. Setup\n\nPython 3.10+.\n\n```bash\npython3 -m venv .venv\nsource .venv\u002Fbin\u002Factivate\npython -m pip install --upgrade pip\npython -m pip install numpy pandas scipy matplotlib astropy astroquery requests \\\n    python-dotenv pyyaml networkx fastapi uvicorn\npython -m pip install openai langgraph json-repair scikit-learn\n```\n\nOptional, depending on which modules you exercise:\n\n```bash\npython -m pip install lightkurve galpy dustmaps emcee corner sentence-transformers\n```\n\nCopy and edit local env files (never commit them):\n\n```bash\ncp Astro_Agent\u002Fanalysis_agent\u002F.env.example Astro_Agent\u002Fanalysis_agent\u002F.env\ncp Astro_Agent\u002Fastro_toolbox\u002F.env.example Astro_Agent\u002Fastro_toolbox\u002F.env\n```\n\nCommon variables:\n\n```text\nASTRO_AGENT_MODEL_PROVIDER=deepseek            # or gemini \u002F kimi\nDEEPSEEK_BASE_URL=https:\u002F\u002Fapi.deepseek.com\nDEEPSEEK_MODEL=deepseek-v4-pro\nDEEPSEEK_API_KEY=...\n\nADS_DEV_KEY=...\nGAIA_TOKEN=...\nLAMOST_TOKEN=...\nASTRO_AGENT_KG_WORKSPACE=\u002Fabsolute\u002Fpath\u002Fto\u002Fprivate\u002Fkg_workspace   # optional\n```\n\n---\n\n## 8. Running\n\nPlan-only (no downloads):\n\n```bash\npython -m Astro_Agent.analysis_agent.cli \"Gaia DR3 865415642195374464\" \\\n    --ra 232.3955 --dec 29.4672\n```\n\nFull toolbox-backed analysis:\n\n```bash\npython -m Astro_Agent.analysis_agent.cli \"Gaia DR3 865415642195374464\" \\\n    --ra 232.3955 --dec 29.4672 --execute\n```\n\nLLM-backed writing\u002Freview:\n\n```bash\npython -m Astro_Agent.analysis_agent.cli \"Gaia DR3 865415642195374464\" \\\n    --ra 232.3955 --dec 29.4672 --execute --use-llm --llm-provider deepseek\n```\n\nLocal HTTP service + web UI:\n\n```bash\npython -m uvicorn Astro_Agent.analysis_agent.server:app \\\n    --host 0.0.0.0 --port 8765 --reload\n# then open http:\u002F\u002Flocalhost:8765\u002F\n```\n\nToolbox stand-alone:\n\n```bash\npython -m Astro_Agent.astro_toolbox.run_single_target_all_tools\n```\n\n---\n\n## 9. Outputs\n\nEach run writes to `Astro_Agent\u002Foutput\u002Fanalysis_agent\u002F\u003Ctarget>_\u003Ctimestamp>\u002F`\nand contains:\n\n- numbered JSON checkpoints per node,\n- `run_summary.json` from the toolbox,\n- `source_research\u002F` evidence pack,\n- either `paper\u002F\u003Caastex631>.tex` + figures + bibliography, or\n  `abnormal_analysis_report.md` when QA blocks publishing,\n- `peer_review.md`, `toolbox_evolution.md`,\n- supervisor repair-action ledger.\n\nAll outputs are gitignored.\n\n---\n\n## 10. Data, Attribution, Safety\n\nThis repository contains orchestration code only and does not redistribute\nthird-party survey data. If you publish results derived from data fetched by\nthe toolbox, cite the original providers and follow their terms (SIMBAD\u002FCDS,\nVizieR, ADS, Gaia, SDSS, DESI, MAST, ZTF, WISE, 2MASS, GALEX, LAMOST, GALAH,\nKOA\u002FKeck, HEASARC, …).\n\nBefore pushing:\n\n```bash\ngit status --short\ngit ls-files --others --exclude-standard\nrg -n \"sk-[A-Za-z0-9_-]{20,}|BEGIN .*PRIVATE KEY|password\\s*=\" \\\n    -g '!**\u002F.git\u002F**' -g '!**\u002F.env' .\nfind . -path '.\u002F.git' -prune -o -type f -size +50M -print\n```\n\nOnly code, public prompts, public configs, docs, and lightweight examples\nshould be committed. Keep `.env`, PDFs, FITS, SQLite indexes, KG workspaces,\nprivate reports, and run artifacts local.\n\n## License\n\nResearch use. Respect the terms of the external data services and model\nproviders you configure locally.\n","Astro Agent 是一个面向白矮星、双星和多波段巡天研究的Python工具箱，它将光谱、测光、光变曲线、SED、HR图、视向速度、冷却年龄和轨道回溯等功能整合到一套统一接口中，并内置了KOA\u002FKeck LRIS数据下载与一维谱提取流程。该项目的核心功能包括通过LangGraph状态机实现从目标解析、多源数据获取、模型迭代运行到结果审计及文献对比的全流程自动化处理；同时提供了一个包含多种天文观测和科学建模模块的领域工具箱。适合用于需要高效整合并分析来自不同天文观测项目的数据的研究场景，尤其是在进行白矮星及其相关天体物理现象研究时。",2,"2026-06-11 03:55:00","CREATED_QUERY"]