[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-74093":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":12,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":14,"stars7d":15,"stars30d":16,"stars90d":13,"forks30d":13,"starsTrendScore":17,"compositeScore":18,"rankGlobal":8,"rankLanguage":8,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":20,"topics":23,"createdAt":8,"pushedAt":8,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":13,"starSnapshotCount":13,"syncStatus":27,"lastSyncTime":28,"discoverSource":29},74093,"AutoFigure-Edit","ResearAI\u002FAutoFigure-Edit","ResearAI",null,"Python",3725,254,4,0,85,338,555,255,29.22,"MIT License",false,"main",true,[],"2026-06-12 02:03:22","\u003Cdiv align=\"center\">\n\n\u003Cimg src=\"img\u002Flogo.png\" alt=\"AutoFigure-Edit Logo\" width=\"100%\"\u002F>\n\n# AutoFigure-Edit: Generating Editable Scientific Illustration\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"README.md\">English\u003C\u002Fa> | \u003Ca href=\"README_ZH.md\">中文\u003C\u002Fa>\n\u003C\u002Fp>\n\n[![arXiv 2603.06674](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2603.06674-b31b1b?style=for-the-badge&logo=arxiv&logoColor=white)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.06674)\n[![License: 
MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg?style=for-the-badge)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT)\n[![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.10%2B-blue?style=for-the-badge&logo=python&logoColor=white)](https:\u002F\u002Fwww.python.org\u002F)\n[![HuggingFace](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20HuggingFace-FigureBench-orange?style=for-the-badge)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FWestlakeNLP\u002FFigureBench)\n[![Website](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-deepscientist.cc-brightgreen?style=for-the-badge&logo=googlechrome&logoColor=white)](https:\u002F\u002Fdeepscientist.cc\u002F)\n\n\u003Cp align=\"center\">\n  \u003Cstrong>From Method Text to Editable SVG\u003C\u002Fstrong>\u003Cbr>\n  AutoFigure-Edit is the next version of AutoFigure. It turns paper method sections into fully editable SVG figures and lets you refine them in an embedded SVG editor.\n\u003C\u002Fp>\n\n[Quick Start](#-quick-start) • [Web Interface](#-web-interface) • [How It Works](#-how-it-works) • [Configuration](#-configuration) • [Citation](#-citation--license)\n\n[[`Paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.06674)]\n[[`AutoFigure`](https:\u002F\u002Fgithub.com\u002FResearAI\u002FAutoFigure)]\n[[`BibTeX`](#-citation--license)]\n\n\u003C\u002Fdiv>\n\n---\n\n\n\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F6f93deb4-9854-4f1e-8097-53b0c3378a0d\n\n\n\n\n\n## 🔥 News\n\n- **[2026.04.23]** 🚀 **AutoFigure-Edit v1.1** is now available. This release primarily adds user-supplied stage-1 figure import, official OpenAI model support including `gpt-image-2` and `gpt-5.5`, Bianxie AI \u002F `custom` OpenAI-compatible routing, and a bilingual configuration workflow. See the full [release notes](releases\u002Fv1.1.md).\n- **[2026.03.24]** 🧠 Our sister project **DeepScientist v1.5** is now officially released. 
It is a local-first open-source autonomous research system for end-to-end scientific discovery. Explore it on [GitHub](https:\u002F\u002Fgithub.com\u002FResearAI\u002FDeepScientist) or read the [ICLR 2026 paper](https:\u002F\u002Fopenreview.net\u002Fforum?id=cZFgsLq8Gs).\n- **[2026.03.11]** 📄 Our **AutoFigure-Edit** paper is now available on [arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.06674) and featured in 🤗[Hugging Face Daily Papers](https:\u002F\u002Fhuggingface.co\u002Fpapers\u002F2603.06674)! If you find our work helpful, please consider giving us an **upvote** on Hugging Face and **citing** our paper. Thank you! ❤️\n- **[2026.02.17]** 🚀 The **AutoFigure-Edit online platform** is now live! It is free for all scholars to use. Try it out at [deepscientist.cc](https:\u002F\u002Fdeepscientist.cc).\n- **[2026.01.26]** 🎉 AutoFigure has been accepted to **ICLR 2026**! You can read the paper on [arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.03828).\n\n---\n\n## 🆕 V1.1 (2026.04.23)\n\nAutoFigure-Edit v1.1 is published as tag `v1.1`. 
This release focuses on two practical workflows that were still awkward in earlier public builds: starting from a user-supplied stage-1 academic figure, and running the pipeline cleanly with official OpenAI models or OpenAI-compatible gateways.\n\n- **User-supplied stage-1 figure import:** You can now upload an existing academic raster figure, skip step 1 image generation, and continue directly from SAM + SVG reconstruction in both the web UI and CLI workflow.\n- **Official OpenAI model support:** Step 1 can now use the OpenAI Images API with `gpt-image-2`, while the OpenAI Responses path is documented and exposed for text plus multimodal SVG reconstruction with `gpt-5.5` as the default SVG model.\n- **Bianxie AI and `custom` OpenAI-compatible routing:** The CLI and web UI expose `bianxie` as a built-in compatible route with `https:\u002F\u002Fapi.bianxie.ai\u002Fv1`, while `custom` remains available for user-supplied OpenAI-compatible `\u002Fv1` base URLs. The `openai_response` route can inherit the same compatible `base_url` and `api_key` by default.\n- **Bilingual setup and onboarding:** The main page, import page, canvas, and guide now support in-page Chinese \u002F English switching, and the built-in guide explains workflow choices, fields, SAM backends, and recommended presets.\n\nFull release notes: [releases\u002Fv1.1.md](releases\u002Fv1.1.md)\n\n---\n\n## ✨ Features\n\n| Feature | Description |\n| :--- | :--- |\n| 📝 **Text-to-Figure** | Generate a draft figure directly from method text. |\n| 🧠 **SAM3 Icon Detection** | Detect icon regions from multiple prompts and merge overlaps. |\n| 🎯 **Labeled Placeholders** | Insert consistent AF-style placeholders for reliable SVG mapping. |\n| 🧩 **SVG Generation** | Produce an editable SVG template aligned to the figure. |\n| 🖥️ **Embedded Editor** | Edit the SVG in-browser using the bundled svg-edit. |\n| 📦 **Artifact Outputs** | Save PNG\u002FSVG outputs and icon crops per run. 
|\n\n---\n\n## 🎨 Gallery: Editable Vectorization & Style Transfer\n\nAutoFigure-edit introduces two breakthrough capabilities:\n\n1.  **Fully Editable SVGs (Pure Code Implementation):** Unlike raster images, our outputs are structured Vector Graphics (SVG). Every component is editable—text, shapes, and layout can be modified losslessly.\n2.  **Style Transfer:** The system can mimic the artistic style of reference images provided by the user.\n\nBelow are **9 examples** covering 3 different papers. Each paper is generated using 3 different reference styles.\n*(Each image shows: **Left** = AutoFigure Generation | **Right** = Vectorized Editable SVG)*\n\n| Paper & Style Transfer Demonstration |\n| :---: |\n| **[CycleResearcher](https:\u002F\u002Fgithub.com\u002Fzhu-minjun\u002FResearcher) \u002F [Style 1](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2510.09558)**\u003Cbr>\u003Cimg src=\"img\u002Fcase\u002F4.png\" width=\"100%\" alt=\"Paper 1 Style 1\"\u002F> |\n| **[CycleResearcher](https:\u002F\u002Fgithub.com\u002Fzhu-minjun\u002FResearcher) \u002F [Style 2](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2503.18102)**\u003Cbr>\u003Cimg src=\"img\u002Fcase\u002F5.png\" width=\"100%\" alt=\"Paper 1 Style 2\"\u002F> |\n| **[CycleResearcher](https:\u002F\u002Fgithub.com\u002Fzhu-minjun\u002FResearcher) \u002F [Style 3](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2510.14512)**\u003Cbr>\u003Cimg src=\"img\u002Fcase\u002F6.png\" width=\"100%\" alt=\"Paper 1 Style 3\"\u002F> |\n| **[DeepReviewer](https:\u002F\u002Fgithub.com\u002Fzhu-minjun\u002FResearcher) \u002F [Style 1](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2510.09558)**\u003Cbr>\u003Cimg src=\"img\u002Fcase\u002F7.png\" width=\"100%\" alt=\"Paper 2 Style 1\"\u002F> |\n| **[DeepReviewer](https:\u002F\u002Fgithub.com\u002Fzhu-minjun\u002FResearcher) \u002F [Style 2](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2503.18102)**\u003Cbr>\u003Cimg src=\"img\u002Fcase\u002F8.png\" width=\"100%\" alt=\"Paper 2 Style 2\"\u002F> |\n| 
**[DeepReviewer](https:\u002F\u002Fgithub.com\u002Fzhu-minjun\u002FResearcher) \u002F [Style 3](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2510.14512)**\u003Cbr>\u003Cimg src=\"img\u002Fcase\u002F9.png\" width=\"100%\" alt=\"Paper 2 Style 3\"\u002F> |\n| **[DeepScientist](https:\u002F\u002Fgithub.com\u002FResearAI\u002FDeepScientist) \u002F [Style 1](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2510.09558)**\u003Cbr>\u003Cimg src=\"img\u002Fcase\u002F10.png\" width=\"100%\" alt=\"Paper 3 Style 1\"\u002F> |\n| **[DeepScientist](https:\u002F\u002Fgithub.com\u002FResearAI\u002FDeepScientist) \u002F [Style 2](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2503.18102)**\u003Cbr>\u003Cimg src=\"img\u002Fcase\u002F11.png\" width=\"100%\" alt=\"Paper 3 Style 2\"\u002F> |\n| **[DeepScientist](https:\u002F\u002Fgithub.com\u002FResearAI\u002FDeepScientist) \u002F [Style 3](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2510.14512)**\u003Cbr>\u003Cimg src=\"img\u002Fcase\u002F12.png\" width=\"100%\" alt=\"Paper 3 Style 3\"\u002F> |\n\n---\n## 🚀 How It Works\n\nThe AutoFigure-edit pipeline transforms a raw generation into an editable SVG in four distinct stages:\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"img\u002Fpipeline.png\" width=\"100%\" alt=\"Pipeline Visualization: Figure -> SAM -> Template -> Final\"\u002F>\n  \u003Cbr>\n  \u003Cem>(1) Raw Generation &rarr; (2) SAM3 Segmentation &rarr; (3) SVG Layout Template &rarr; (4) Final Assembled Vector\u003C\u002Fem>\n\u003C\u002Fdiv>\n\n\u003Cbr>\n\n1.  **Generation (`figure.png`):** The LLM generates a raster draft based on the method text.\n2.  **Segmentation (`sam.png`):** SAM3 detects and segments distinct icons and text regions.\n3.  **Templating (`template.svg`):** The system constructs a structural SVG wireframe using placeholders.\n4.  
**Assembly (`final.svg`):** High-quality cropped icons and vectorized text are injected into the template.\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>View Detailed Technical Pipeline\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n\u003Cbr>\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"img\u002Fedit_method.png\" width=\"100%\" alt=\"AutoFigure-edit Technical Pipeline\"\u002F>\n\u003C\u002Fdiv>\n\nAutoFigure2’s pipeline starts from the paper’s method text and first calls a **text‑to‑image LLM** to render a journal‑style schematic, saved as `figure.png`. The system then runs **SAM3 segmentation** on that image using one or more text prompts (e.g., “icon, diagram, arrow”), merges overlapping detections by an IoU‑like threshold, and draws gray‑filled, black‑outlined labeled boxes on the original; this produces both `samed.png` (the labeled mask overlay) and a structured `boxlib.json` with coordinates, scores, and prompt sources.\n\nNext, each box is cropped from the original figure and passed through **RMBG‑2.0** for background removal, yielding transparent icon assets under `icons\u002F*.png` and `*_nobg.png`. With `figure.png`, `samed.png`, and `boxlib.json` as multimodal inputs, the LLM generates a **placeholder‑style SVG** (`template.svg`) whose boxes match the labeled regions.\n\nOptionally, the SVG is iteratively refined by an **LLM optimizer** to better align strokes, layouts, and styles, resulting in `optimized_template.svg` (or the original template if optimization is skipped). The system then compares the SVG dimensions with the original figure to compute scale factors and aligns coordinate systems. 
Finally, it replaces each placeholder in the SVG with the corresponding transparent icon (matched by label\u002FID), producing the assembled `final.svg`.\n\n**Key configuration details:**\n- **Placeholder Mode:** Controls how icon boxes are encoded in the prompt (`label`, `box`, or `none`).\n- **Optimization:** `optimize_iterations=0` allows skipping the refinement step to use the raw structure directly.\n\u003C\u002Fdetails>\n\n---\n\n## ⚡ Quick Start\n\n### Option 0: Docker Deployment Guide (Recommended)\n\nUse Docker for a reproducible one-command setup without local Python\u002FSAM3 installation.\n\n#### 0) Prerequisites\n\n- Docker Desktop (with Docker Compose v2)\n- Port `8000` available on host\n- HuggingFace access to `briaai\u002FRMBG-2.0`: https:\u002F\u002Fhuggingface.co\u002Fbriaai\u002FRMBG-2.0\n\n#### 1) Prepare `.env`\n\n```bash\n# Linux\u002FmacOS\ncp .env.example .env\n\n# Windows PowerShell\nCopy-Item .env.example .env\n```\n\nAt minimum, set this in `.env`:\n\n```bash\nHF_TOKEN=hf_xxx\n```\n\nOptional but recommended:\n\n```bash\n# SAM3 API backend (Docker default in UI is Roboflow)\nROBOFLOW_API_KEY=your_roboflow_key\n\n# Step-4 multimodal retry tuning (OpenRouter)\nOPENROUTER_MULTIMODAL_RETRIES=3\nOPENROUTER_MULTIMODAL_RETRY_DELAY=1.5\n\n# DNS override for Roboflow name-resolution issues\nDOCKER_DNS_1=223.5.5.5\nDOCKER_DNS_2=119.29.29.29\n```\n\nFor restricted networks, you can also set build mirrors:\n\n```bash\nBASE_IMAGE=docker.m.daocloud.io\u002Flibrary\u002Fpython:3.11-slim\nPIP_INDEX_URL=https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\nPIP_EXTRA_INDEX_URL=\n```\n\n#### 2) Build and start\n\n```bash\ndocker compose up -d --build\n```\n\nOpen `http:\u002F\u002Flocalhost:8000`.\n\n#### 3) Verify service health\n\n```bash\ndocker compose ps\ncurl http:\u002F\u002Flocalhost:8000\u002Fhealthz\n```\n\nExpected health response: `{\"status\":\"ok\"}`.\n\n#### 4) Daily operations\n\n```bash\n# Stream logs\ndocker compose logs -f 
autofigure-edit\n\n# Restart service\ndocker compose restart autofigure-edit\n\n# Rebuild from scratch (no cache)\ndocker compose build --no-cache\ndocker compose up -d\n\n# Stop and remove container\ndocker compose down\n```\n\n#### 5) Persistence and defaults\n\n- Persistent outputs: `.\u002Foutputs`, `.\u002Fuploads`\n- Persistent HuggingFace cache: Docker volume `hf_cache` (`\u002Fapp\u002F.cache\u002Fhuggingface`)\n- Docker\u002FWeb default SAM backend: `roboflow`\n- Default SAM prompt: `icon,person,robot,animal`\n- Current default models:\n  - `openrouter`: image `google\u002Fgemini-3.1-flash-image-preview`, svg `google\u002Fgemini-3.1-pro-preview`\n  - `bianxie`: image `gpt-image-2`, svg `gemini-3.1-pro-preview` (built-in base URL `https:\u002F\u002Fapi.bianxie.ai\u002Fv1`)\n  - `custom`: image `gemini-3.1-flash-image-preview`, svg `gemini-3.1-pro-preview` (requires your own OpenAI-compatible `\u002Fv1` base URL)\n  - `gemini`: image `gemini-3.1-flash-image-preview`, svg `gemini-3.1-pro-preview`\n  - `openai_response`: image `gpt-image-2` (step 1 fallback), svg `gpt-5.5` via Responses API\n- Optional step-1 override:\n  - `--image_provider openai`: image `gpt-image-2` via the official OpenAI Images API\n\n#### 6) Common Docker networking issues\n\n- `Temporary failure in name resolution` (Roboflow): set `DOCKER_DNS_1\u002F2` in `.env`, then `docker compose up -d --build`.\n- Cannot reach Docker Hub auth (`auth.docker.io`): set `BASE_IMAGE` and `PIP_INDEX_URL` mirrors in `.env`.\n- Optional Roboflow endpoint override:\n  - `ROBOFLOW_API_URL=\u003Cyour_reachable_roboflow_endpoint>`\n  - `ROBOFLOW_API_FALLBACK_URLS=\u003Ccomma_separated_backup_endpoints>`\n\n### Option 1: CLI\n\n```bash\n# 1) Install dependencies\npip install -r requirements.txt\n\n# 2) Install SAM3 separately (not vendored in this repo)\ngit clone https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fsam3.git\ncd sam3\npip install -e .\n```\n\n**Run:**\n\n```bash\npython autofigure2.py \\\n  
--method_file paper.txt \\\n  --output_dir outputs\u002Fdemo \\\n  --provider bianxie \\\n  --api_key YOUR_KEY\n```\n\nUse OpenAI only for step 1 image generation while keeping SVG reconstruction on the original provider:\n\n```bash\npython autofigure2.py \\\n  --method_file paper.txt \\\n  --output_dir outputs\u002Fdemo \\\n  --provider gemini \\\n  --api_key GEMINI_KEY \\\n  --image_provider openai \\\n  --image_api_key OPENAI_KEY \\\n  --image_model gpt-image-2\n```\n\nUse the OpenAI Responses API for text + multimodal SVG reconstruction:\n\n```bash\npython autofigure2.py \\\n  --method_file paper.txt \\\n  --output_dir outputs\u002Fdemo \\\n  --provider openai_response \\\n  --api_key OPENAI_KEY\n```\n\nContinue from an existing stage-1 figure and skip image generation:\n\n```bash\npython autofigure2.py \\\n  --input_figure_path .\u002Fmy_stage1_figure.png \\\n  --output_dir outputs\u002Fimport_demo \\\n  --provider openai_response \\\n  --api_key OPENAI_KEY \\\n  --svg_model gpt-5.5\n```\n\n### Option 2: Web Interface\n\n```bash\npython server.py\n```\n\nThen open `http:\u002F\u002Flocalhost:8000`.\n\n---\n\n## 🖥️ Web Interface Demo\n\nAutoFigure-edit provides a visual web interface designed for seamless generation and editing.\n\n### 1. Configuration Page\n\u003Cimg src=\"img\u002Fdemo_start.png\" width=\"100%\" alt=\"Configuration Page\" style=\"border: 1px solid #ddd; border-radius: 8px; margin-bottom: 10px;\"\u002F>\n\nOn the start page, paste your paper's method text on the left. On the right, configure your generation settings:\n*   **Provider:** Select your LLM provider (Bianxie AI, OpenRouter, Custom, Gemini, or OpenAI Responses).\n*   **Image Provider:** Optionally override **step 1 only** to use OpenAI GPT-Image.\n*   **Optimize:** Set SVG template refinement iterations (recommend `0` for standard use).\n*   **Image Size:** Available when the effective step-1 image provider is **Gemini**. 
Choose `1K`, `2K`, or `4K`.\n*   **Auto Upscale:** Enabled by default. Upscales `figure.png` to a 4K long edge (`3840px`) while preserving aspect ratio.\n*   **Reference Image:** Upload a target image to enable style transfer.\n*   **SAM3 Backend:** Choose local SAM3 or the fal.ai API (API key optional).\n\nIf you already have the first-stage raster figure, use the black button in the top-right corner:\n\n*   **I already have the stage-1 figure:** Opens a dedicated import page where you upload an existing academic figure and continue directly from SAM + SVG reconstruction.\n\n### 2. Canvas & Editor\n\u003Cimg src=\"img\u002Fdemo_canvas.png\" width=\"100%\" alt=\"Canvas Page\" style=\"border: 1px solid #ddd; border-radius: 8px; margin-bottom: 10px;\"\u002F>\n\nThe generation result loads directly into an integrated [SVG-Edit](https:\u002F\u002Fgithub.com\u002FSVG-Edit\u002Fsvgedit) canvas, allowing for full vector editing.\n*   **Status & Logs:** Check real-time progress (top-left) and view detailed execution logs (top-right button).\n*   **Artifacts Drawer:** Click the floating button (bottom-right) to expand the **Artifacts Panel**. This contains all intermediate outputs (icons, SVG templates, etc.). You can **drag and drop** any artifact directly onto the canvas for custom composition.\n*   **History:** Open the **History** button to browse saved runs from `outputs\u002F` and reopen an older result in the canvas.\n\n---\n\n## 🧩 SAM3 Installation Notes\n\nAutoFigure-edit depends on SAM3 but does **not** vendor it. Please follow the\nofficial SAM3 installation guide and prerequisites. 
The upstream repo currently\ntargets Python 3.12+, PyTorch 2.7+, and CUDA 12.6 for GPU builds.\n\nSAM3 checkpoints are hosted on Hugging Face and may require you to request\naccess and authenticate (e.g., `huggingface-cli login`) before download.\n\n- SAM3 repo: https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fsam3\n- SAM3 Hugging Face: https:\u002F\u002Fhuggingface.co\u002Ffacebook\u002Fsam3\n\n### SAM3 API Mode (No Local Install)\n\nIf you prefer not to install SAM3 locally, you can use an API backend (also supported in the Web demo). **We recommend using [Roboflow](https:\u002F\u002Froboflow.com\u002F) as it is free to use.**\n\n**Option A: fal.ai**\n\n```bash\nexport FAL_KEY=\"your-fal-key\"\npython autofigure2.py \\\n  --method_file paper.txt \\\n  --output_dir outputs\u002Fdemo \\\n  --provider bianxie \\\n  --api_key YOUR_KEY \\\n  --sam_backend fal\n```\n\n**Option B: Roboflow**\n\n```bash\nexport ROBOFLOW_API_KEY=\"your-roboflow-key\"\npython autofigure2.py \\\n  --method_file paper.txt \\\n  --output_dir outputs\u002Fdemo \\\n  --provider bianxie \\\n  --api_key YOUR_KEY \\\n  --sam_backend roboflow\n```\n\nOptional CLI flags (API):\n- `--sam_api_key` (overrides `FAL_KEY`\u002F`ROBOFLOW_API_KEY`)\n- `--sam_max_masks` (default: 32, fal.ai only)\n\n## ⚙️ Configuration\n\n### Supported LLM Providers\n\n| Provider | Base URL | Notes |\n|----------|----------|------|\n| **OpenRouter** | `openrouter.ai\u002Fapi\u002Fv1` | Supports Gemini\u002FClaude\u002Fothers |\n| **Bianxie AI** | `api.bianxie.ai\u002Fv1` | Built-in OpenAI-compatible aggregate API; supports GPT-image-2 and Gemini-3.1-Pro access for mainland China users without a foreign credit card |\n| **Custom** | `\u003Cyour-compatible-endpoint>\u002Fv1` (required) | Vendor-neutral OpenAI-compatible API |\n| **Gemini (Google)** | `generativelanguage.googleapis.com\u002Fv1beta` | Official Google Gemini API (`google-genai`) |\n| **OpenAI Responses** | `api.openai.com\u002Fv1` | Uses the official 
OpenAI Responses API for text + multimodal |\n\nCommon CLI flags:\n\n- `--method_text`, `--method_file`, or `--input_figure_path`\n- `--provider` (openrouter | bianxie | custom | gemini | openai_response)\n- `--image_provider` (openrouter | bianxie | custom | gemini | openai, optional step-1 override)\n- `--image_api_key`, `--image_base_url`\n- `--image_model`, `--svg_model`\n- `--image_size` (1K | 2K | 4K, Gemini only)\n- `--disable_auto_upscale` (disable the default 4K aspect-ratio-preserving upscale after step 1)\n- `--sam_prompt` (comma-separated prompts)\n- `--sam_backend` (local | fal | roboflow | api)\n- `--sam_api_key` (API key override; falls back to `FAL_KEY` or `ROBOFLOW_API_KEY`)\n- `--sam_max_masks` (fal.ai max masks, default 32)\n- `--merge_threshold` (0 disables merging)\n- `--optimize_iterations` (0 disables optimization)\n- `--reference_image_path` (optional)\n\n### Custom Provider \u002F Custom Base URL\n\nIf you want to use a self-hosted or third-party OpenAI-compatible endpoint, use:\n\n- `--provider custom`\n- `--base_url \u003Cyour_openai_compatible_v1_root>`\n- `--image_model \u003Cimage_model_id>`\n- `--svg_model \u003Csvg_model_id>`\n\nYou can also set `AUTOFIGURE_CUSTOM_BASE_URL` instead of passing `--base_url` every time.\n\n`base_url` must be the OpenAI-compatible `\u002Fv1` root:\n\n```text\nhttps:\u002F\u002Fyour-provider.example\u002Fv1\n```\n\nDo not pass a concrete endpoint path such as:\n\n```text\nhttps:\u002F\u002Fyour-provider.example\u002Fv1\u002Fchat\u002Fcompletions\n```\n\nFor text reasoning and SVG reconstruction, the Custom route calls:\n\n```http\nPOST \u002Fchat\u002Fcompletions\nAuthorization: Bearer \u003Capi_key>\n```\n\nText-only requests use the normal Chat Completions message shape:\n\n```json\n{\n  \"model\": \"your-text-or-svg-model\",\n  \"messages\": [{ \"role\": \"user\", \"content\": \"...\" }],\n  \"max_tokens\": 16000,\n  \"temperature\": 0.7\n}\n```\n\nMultimodal SVG reconstruction must support 
OpenAI-style `image_url` data URIs:\n\n```json\n{\n  \"role\": \"user\",\n  \"content\": [\n    { \"type\": \"text\", \"text\": \"...\" },\n    {\n      \"type\": \"image_url\",\n      \"image_url\": { \"url\": \"data:image\u002Fpng;base64,...\" }\n    }\n  ]\n}\n```\n\nThe response must return content in the standard shape:\n\n```json\n{\n  \"choices\": [\n    { \"message\": { \"content\": \"\u003Csvg ...>...\u003C\u002Fsvg>\" } }\n  ]\n}\n```\n\nThe SVG may be returned as raw `\u003Csvg>...\u003C\u002Fsvg>` or inside a markdown code block.\n\nThe built-in `bianxie` route uses the same OpenAI-compatible chat shape for text\u002FSVG reconstruction, and uses an OpenAI Images-compatible path for `gpt-image-2` step-1 image generation.\n\nFor step-1 image generation with `--image_provider custom` (or when `--provider custom` is linked to step 1), this repo currently calls `\u002Fchat\u002Fcompletions` and expects the returned message content to contain a base64 image data URI:\n\n```text\n![image](data:image\u002Fpng;base64,...)\n```\n\nor:\n\n```text\ndata:image\u002Fpng;base64,...\n```\n\nIf your provider only exposes an OpenAI Images-compatible `\u002Fimages\u002Fgenerations` route, use `--image_provider openai` for the official OpenAI Images API, or keep image generation on another supported route.\n\n### OpenAI GPT-Image for Step 1\n\nAs of April 23, 2026, OpenAI's official Images API supports `images.generate` and `images.edit` for GPT-Image models. In this repo, `--image_provider openai` uses the OpenAI Images API for step 1 only:\n\n- no reference image: `images.generate`\n- with reference image: `images.edit`\n- default model: `gpt-image-2` (override with `--image_model`)\n- API key precedence: `--image_api_key` -> `OPENAI_API_KEY` -> `--api_key`\n\n### Default 4K Upscale\n\nAfter step 1, the generated `figure.png` is upscaled by default so its long edge reaches `3840px` while preserving the original aspect ratio. 
If the generated image is already at or above a 4K long edge, the upscale step is skipped automatically.\n\nDisable it with:\n\n```bash\n--disable_auto_upscale\n```\n\n### OpenAI Responses Provider\n\nAs of April 23, 2026, OpenAI's official Responses API supports text output plus multimodal input with `input_text` and `input_image`. In this repo, `--provider openai_response` means:\n\n- text calls use `client.responses.create(...)`\n- multimodal SVG reconstruction also uses `client.responses.create(...)`\n- step 1 image generation falls back to the official OpenAI Images API unless `--image_provider` is explicitly set\n- default SVG model: `gpt-5.5` (override with `--svg_model`)\n\n### Importing an Existing Stage-1 Figure\n\nIf you already have the academic raster figure from step 1, use `--input_figure_path` to skip image generation entirely. The pipeline will normalize the imported image into `figure.png`, optionally apply the default 4K aspect-ratio-preserving upscale, and then continue from SAM segmentation and SVG reconstruction.\n\n---\n\n## 📁 Project Structure\n\n\u003Cdetails>\n\u003Csummary>Click to expand directory tree\u003C\u002Fsummary>\n\n```\nAutoFigure-edit\u002F\n├── autofigure2.py         # Main pipeline\n├── server.py              # FastAPI backend\n├── requirements.txt\n├── web\u002F                   # Static frontend\n│   ├── index.html\n│   ├── canvas.html\n│   ├── history.html\n│   ├── styles.css\n│   ├── app.js\n│   └── vendor\u002Fsvg-edit\u002F   # Embedded SVG editor\n└── img\u002F                   # README assets\n```\n\u003C\u002Fdetails>\n\n---\n\n## 🤝 Community & Support\n\n**WeChat Discussion Group**  \nScan the QR code to join our community. 
If the code is expired, please add WeChat ID `nauhcutnil` or contact `tuchuan@mail.hfut.edu.cn`.\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"img\u002Fwechat11.jpg\" width=\"200\" alt=\"WeChat 2\"\u002F>\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"img\u002Flzwechat.jpg\" width=\"200\" alt=\"WeChat 2\"\u002F>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n---\n\n## 📜 Citation & License\n\nIf you find **AutoFigure**, **AutoFigure-Edit**, or **FigureBench** helpful, please cite:\n\n```bibtex\n@inproceedings{\nzhu2026autofigure,\ntitle={AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations},\nauthor={Minjun Zhu and Zhen Lin and Yixuan Weng and Panzhong Lu and Qiujie Xie and Yifan Wei and Sifan Liu and Qiyao Sun and Yue Zhang},\nbooktitle={The Fourteenth International Conference on Learning Representations},\nyear={2026},\nurl={https:\u002F\u002Fopenreview.net\u002Fforum?id=5N3z9JQJKq}\n}\n\n@misc{lin2026autofigureeditgeneratingeditablescientific,\n      title={AutoFigure-Edit: Generating Editable Scientific Illustration}, \n      author={Zhen Lin and Qiujie Xie and Minjun Zhu and Shichen Li and Qiyao Sun and Enhao Gu and Yiran Ding and Ke Sun and Fang Guo and Panzhong Lu and Zhiyuan Ning and Yixuan Weng and Yue Zhang},\n      year={2026},\n      eprint={2603.06674},\n      archivePrefix={arXiv},\n      primaryClass={cs.CV},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.06674}, \n}\n\n@dataset{figurebench2025,\n  title = {FigureBench: A Benchmark for Automated Scientific Illustration Generation},\n  author = {WestlakeNLP},\n  year = {2025},\n  url = {https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FWestlakeNLP\u002FFigureBench}\n}\n```\n\nRepository metadata and usage guidance:\n\n- [CITATION.cff](.\u002FCITATION.cff)\n- [Citation and attribution guidance](.\u002FCITATION_AND_ATTRIBUTION.md)\n- [Name and logo usage](.\u002FTRADEMARK.md)\n\n## 🙏 Acknowledgments\n\nWe would like to thank the 
[Linux.do](https:\u002F\u002Flinux.do\u002F) community for their support.\n\nWe also thank the following sponsor for supporting this project:\n\n| Sponsor | Link | Support | Note |\n|---|---|---|---|\n| Bianxie AI Aggregate API | [https:\u002F\u002Fapi.bianxie.ai](https:\u002F\u002Fbianxieai.com\u002Fautofigure) | Provides mainland China-friendly access to GPT-image-2 and Gemini-3.1-Pro for AutoFigure-Edit users | No foreign credit card required |\n\nThis project is licensed under the MIT License - see `LICENSE` for details.\nName and logo usage are covered separately in `TRADEMARK.md`.\n\n---\n\n## More From ResearAI\n\nExplore more open-source research tools from ResearAI:\n\n| Project | What it does |\n|---|---|\n| [DeepScientist](https:\u002F\u002Fgithub.com\u002FResearAI\u002FDeepScientist) | autonomous scientific discovery system |\n| [AutoFigure](https:\u002F\u002Fgithub.com\u002FResearAI\u002FAutoFigure) | generate paper-ready figures |\n| [DeepReviewer-v2](https:\u002F\u002Fgithub.com\u002FResearAI\u002FDeepReviewer-v2) | review papers and drafts |\n| [Awesome-AI-Scientist](https:\u002F\u002Fgithub.com\u002FResearAI\u002FAwesome-AI-Scientist) | curated AI scientist landscape |\n\n\n\n---\n\n## Q&A\n\nThe optimal configuration for this project uses `gemini-3.1-flash-image-preview` from Google AI Studio [[https:\u002F\u002Faistudio.google.com\u002F](https:\u002F\u002Faistudio.google.com\u002F)] as the image generation model and `gemini-3.1-pro-preview` as the SVG conversion model. Each run costs approximately $0.50, consumes about 30,000 tokens, and takes around 20 minutes. **It is strongly recommended to use the 4K option for optimal performance**, as using 1K or 2K resolutions will result in the final generated SVG being unusually blurry.\n\n[Mainland China Notice] Gemini's Terms of Service do not permit access or usage by users in mainland China. 
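If OpenRouter throws an error, it is often because an account registered in mainland China lacks the necessary permissions to use Gemini. **It is recommended to use an OpenRouter account registered in the United States or Europe and to ensure compliant usage.**\n","AutoFigure-Edit is a tool that converts the method sections of papers into editable SVG figures. Built in Python, it generates scientific illustrations from text and provides an embedded SVG editor for further modification and refinement. The project supports multiple AI models, including OpenAI's gpt-image-2 and gpt-5.5, and offers multilingual configuration. It suits researchers who need to quickly create high-quality, customizable scientific figures while writing papers, improving their efficiency.",2,"2026-06-11 03:48:47","high_star"]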