[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-74036":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":23,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":16,"starSnapshotCount":16,"syncStatus":37,"lastSyncTime":38,"discoverSource":39},74036,"Edit-Banana","BIT-DataLab\u002FEdit-Banana","BIT-DataLab","Edit Banana: A framework for converting statistical formats into editable.","https:\u002F\u002Fwww.editbanana.net",null,"Python",5306,360,14,19,0,25,46,148,75,113.67,"GNU Affero General Public License v3.0",false,"main",[26,27,28,29,30,31,32,33],"ai","data","figure","llm","nanobanana","open-source","python","pythonprogramming","2026-06-12 04:01:12","\u003Cp align=\"center\">\n  \u003Cimg src=\"\u002Fstatic\u002Fbanana.jpg\" width=\"180\" alt=\"Edit Banana Logo\"\u002F>\n\u003C\u002Fp>\n\n\u003Ch1 align=\"center\">🍌 Edit Banana\u003C\u002Fh1>\n\u003Cp align=\"center\">\n  \u003Ca href=\"README_CN.md\">中文\u003C\u002Fa> | English\n\u003C\u002Fp>\n\u003Ch3 align=\"center\">Universal Content Re-Editor: Make the Uneditable, Editable\u003C\u002Fh3>\n\n\u003Cp align=\"center\">\nBreak free from static formats. 
Our platform empowers you to transform fixed content into fully manipulatable assets.\nPowered by SAM 3 and multimodal large models, it enables high-fidelity reconstruction that preserves the original diagram details and logical relationships.\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fwww.python.org\u002F\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.10+-3776AB?style=flat-square&logo=python&logoColor=white\" alt=\"Python\"\u002F>\u003C\u002Fa>\n  \u003Ca href=\"LICENSE\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache_2.0-2F80ED?style=flat-square&logo=apache&logoColor=white\" alt=\"License\"\u002F>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fdeveloper.nvidia.com\u002Fcuda-downloads\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGPU-CUDA%20Recommended-76B900?style=flat-square&logo=nvidia\" alt=\"CUDA\"\u002F>\u003C\u002Fa>\n  \u003Ca href=\"#-join-wechat-group\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWeChat-Join%20Group-07C160?style=flat-square&logo=wechat&logoColor=white\" alt=\"WeChat\"\u002F>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FBIT-DataLab\u002FEdit-Banana\u002Fstargazers\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FBIT-DataLab\u002FEdit-Banana?style=flat-square&logo=github\" alt=\"GitHub stars\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n---\n\n\u003Ch3 align=\"center\">Try It Now!\u003C\u002Fh3>\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fwww.editbanana.net\u002F\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🚀%20Try%20Online%20Demo-www.editbanana.net-FF6B6B?style=for-the-badge&logoColor=white\" alt=\"Try Online Demo\"\u002F>\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  👆 \u003Cb>Click above or https:\u002F\u002Fwww.editbanana.net\u002F to try Edit Banana online!\u003C\u002Fb> Upload an 
image to get \u003Cb>editable DrawIO (XML)\u003C\u002Fb> in seconds.\n\u003C\u002Fp>\n\n> [!WARNING]\n> **Please note**: Our GitHub repository currently trails behind our web-based service. For the most up-to-date features and performance, we recommend using our web platform.\n---\n## 💬 Join WeChat Group\n\nJoin our WeChat group to discuss and exchange ideas! Scan the QR code below to join:\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"\u002Fstatic\u002FwechatGroup.jpg\" width=\"70%\" alt=\"WeChat Group QR Code\"\u002F>\n  \u003Cbr\u002F>\n  \u003Cem>Scan to join the Edit Banana community\u003C\u002Fem>\n\u003C\u002Fp>\n\n> [!TIP]\n> If the QR code has expired, please submit an [Issue](https:\u002F\u002Fgithub.com\u002FBIT-DataLab\u002FEdit-Banana\u002Fissues) to request an updated one.\n\n---\n## 📑 Table of Contents\n\n- [📸 Effect Demonstration](#-effect-demonstration)\n- [🚀 Key Features](#-key-features)\n- [🛠️ Architecture Pipeline](#️-architecture-pipeline)\n- [📂 Project Structure](#-project-structure)\n- [📦 Installation & Setup](#-installation--setup)\n- [🔤 Usage](#-usage)\n- [⚙️ Configuration](#️-configuration)\n- [📌 Development Roadmap](#-development-roadmap)\n- [💬 Join WeChat Group](#-join-wechat-group)\n- [🤝 Contribution Guidelines](#-contribution-guidelines)\n- [🤩 Contributors](#-contributors)\n- [📄 License](#-license)\n- [🌟 Star History](#-star-history)\n\n---\n\n## 📸 Effect Demonstration\n\n### High-Definition Input-Output Comparison (4 Typical Scenarios)\n\nTo demonstrate the high-fidelity conversion effect, we provide one-to-one comparisons between \"original static formats\" and \"editable reconstruction results\" in 4 typical scenarios. 
All elements can be individually dragged, styled, and modified.\n\n#### Scenario 1: Figures to DrawIO\n\n| 🔒 Original Static Diagram (Input · Non-editable) | 🔓 DrawIO Reconstruction Result (Output · Fully Editable) |\n|:---:|:---:|\n| \u003Cbr>\u003Cb>Example 1: Basic Flowchart\u003C\u002Fb>\u003Cbr>\u003Cbr>\u003Cimg src=\"\u002Fstatic\u002Fdemo\u002Foriginal_1.jpg\" width=\"450\" alt=\"Original Diagram 1\" style=\"border: 1px solid #eee; border-radius: 8px;\"\u002F> | \u003Cbr>\u003Cb>✨ Editable Flowchart\u003C\u002Fb>\u003Cbr>\u003Cbr>\u003Cimg src=\"\u002Fstatic\u002Fdemo\u002Frecon_1.png\" width=\"450\" alt=\"Reconstruction Result 1\" style=\"border: 1px solid #eee; border-radius: 8px;\"\u002F> |\n| \u003Cbr>\u003Cb>Example 2: Multi-level Architecture\u003C\u002Fb>\u003Cbr>\u003Cbr>\u003Cimg src=\"\u002Fstatic\u002Fdemo\u002Foriginal_2.png\" width=\"450\" alt=\"Original Diagram 2\" style=\"border: 1px solid #eee; border-radius: 8px;\"\u002F> | \u003Cbr>\u003Cb>✨ Editable Architecture\u003C\u002Fb>\u003Cbr>\u003Cbr>\u003Cimg src=\"\u002Fstatic\u002Fdemo\u002Frecon_2.png\" width=\"450\" alt=\"Reconstruction Result 2\" style=\"border: 1px solid #eee; border-radius: 8px;\"\u002F> |\n| \u003Cbr>\u003Cb>Example 3: Technical Schematic\u003C\u002Fb>\u003Cbr>\u003Cbr>\u003Cimg src=\"\u002Fstatic\u002Fdemo\u002Foriginal_3.jpg\" width=\"450\" alt=\"Original Diagram 3\" style=\"border: 1px solid #eee; border-radius: 8px;\"\u002F> | \u003Cbr>\u003Cb>✨ Editable Schematic\u003C\u002Fb>\u003Cbr>\u003Cbr>\u003Cimg src=\"\u002Fstatic\u002Fdemo\u002Frecon_3.png\" width=\"450\" alt=\"Reconstruction Result 3\" style=\"border: 1px solid #eee; border-radius: 8px;\"\u002F> |\n| \u003Cbr>\u003Cb>Example 4: Scientific Formula\u003C\u002Fb>\u003Cbr>\u003Cbr>\u003Cimg src=\"\u002Fstatic\u002Fdemo\u002Foriginal_4.jpg\" width=\"450\" alt=\"Original Diagram 4\" style=\"border: 1px solid #eee; border-radius: 8px;\"\u002F> | \u003Cbr>\u003Cb>✨ Editable 
Formula\u003C\u002Fb>\u003Cbr>\u003Cbr>\u003Cimg src=\"\u002Fstatic\u002Fdemo\u002Frecon_4.png\" width=\"450\" alt=\"Reconstruction Result 4\" style=\"border: 1px solid #eee; border-radius: 8px;\"\u002F> |\n\n#### Scenario 2: Human in the Loop Modification\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"static\u002Fdemo\u002Fcut.gif\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ Manual repair\u003C\u002Fsub>\n\n\u003Cbr>\u003Cbr>\n\u003Cimg src=\"static\u002Fdemo\u002Fsave.gif\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ Save locally\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n> [!NOTE]\n> **✨ Conversion Highlights:**\n> 1. Preserves the layout logic, color matching, and element hierarchy of the original diagram.\n> 2. 1:1 restoration of shape stroke\u002Ffill and arrow styles (dashed lines\u002Fthickness).\n> 3. Accurate text recognition, supporting direct subsequent editing and format adjustment.\n> 4. All elements are independently selectable, supporting native DrawIO template replacement and layout optimization.\n\n## 🚀 Key Features\n\n- **Advanced Segmentation**: Using our fine-tuned **SAM 3 (Segment Anything Model 3)** for segmentation of diagram elements.\n- **Fixed Multi-Round VLM Scanning**: An extraction process guided by **Multimodal LLMs**.\n- **Text Recognition**:\n  - **Local OCR** for text localization; easy to install, runs offline.\n  - **Pix2Text** for mathematical formula recognition and **LaTeX** conversion .\n  - **Crop-Guided Strategy**: Extracts text\u002Fformula regions and sends high-res crops to the formula engine.\n\n- **User System**:\n\n  - **Registration**: New users receive **10 free credits**.\n  - **Credit System**: Pay-per-use model prevents resource abuse.\n  - **Multi-User Concurrency**: Built-in support for concurrent user sessions using a **Global Lock** mechanism for thread-safe GPU access and an **LRU Cache** (Least Recently Used) to persist image embeddings across requests, ensuring high performance and 
stability.\n\n---\n\n## 🛠️ Architecture Pipeline\n\n1.  **Input**: Image (PNG\u002FJPG\u002FBMP\u002FTIFF\u002FWebP).\n2.  **Segmentation (SAM3)**: Using our fine-tuned SAM3 mask decoder.\n3.  **Text Extraction (Parallel)**:\n    *   Local OCR (Tesseract) detects text bounding boxes.\n    *   High-res crops of text\u002Fformula regions are sent to Pix2Text for LaTeX conversion.\n4.  **DrawIO XML Generation**: Merging spatial data from SAM3 and text OCR results.\n\n---\n\n## 📂 Project Structure\n\n  \u003Cdetails>\n\n  \u003Csummary>\u003Cb>Click to expand project structure \u003C\u002Fb>\u003C\u002Fsummary>\n\n  ```text\n  Edit-Banana\u002F\n  ├── config\u002F               # Configuration files (copy config.yaml.example → config.yaml)\n  ├── flowchart_text\u002F       # OCR & Text Extraction Module (standalone entry)\n  │   ├── src\u002F\n  │   └── main.py             # OCR-only entry point\n  ├── input\u002F                # [Manual] Input images directory\n  ├── models\u002F               # [Manual] Model weights (SAM3) and optional BPE vocab\n  ├── output\u002F               # [Manual] Results directory\n  ├── sam3\u002F                 # SAM3 library (see Installation: install from facebookresearch\u002Fsam3)\n  ├── sam3_service\u002F         # SAM3 HTTP service (optional, for multi-process deployment)\n  ├── scripts\u002F              # Setup and utility scripts\n  │   ├── setup_sam3.sh       # Install SAM3 lib and copy BPE to models\u002F\n  │   ├── setup_rmbg.py       # Download RMBG model from ModelScope\n  │   └── merge_xml.py        # XML merge utilities\n  ├── main.py               # CLI entry (modular pipeline)\n  ├── server_pa.py          # FastAPI backend server\n  └── requirements.txt      # Python dependencies\n  ```\n\n  \u003C\u002Fdetails>\n\n  ---\n\n## 📦 Installation & Setup\n\nFollow these core phases to set up the project locally.\n\n### Phase 1: Environment & Base Setup\n\nConfigure your base environment and directory structure.\n\n#### 1. 
Prerequisites & Environment\n\n- **Python 3.10+** & CUDA-capable GPU (highly recommended)\n- Install PyTorch with CUDA support (e.g., for CUDA 11.8):\n\n    ```bash\n    pip install torch torchvision --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu118\n    ```\n\n#### 2. Clone Repository & Init Directories\n\n  ```bash\n  git clone https:\u002F\u002Fgithub.com\u002FBIT-DataLab\u002FEdit-Banana.git\n  cd Edit-Banana\n  mkdir -p input output sam3_output\n  ```\n\n### Phase 2: Models & Core Dependencies\n\nNext, install the required packages and download the necessary model weights (place them in `models\u002F` and do not commit them).\n\n#### 1. Base Dependencies\n\n```bash\npip install -r requirements.txt\n```\n\n#### 2. SAM3 & Model Assets\n\n- SAM3 Library & BPE: \nRun `bash scripts\u002Fsetup_sam3.sh` to install the library and copy the BPE vocab to `models\u002F`.\nVerify with:\n\n  ```bash\n  python -c \"from sam3.model_builder import build_sam3_image_model; print('OK')\"\n  ```\n\n- SAM3 Weights: Download `sam3.pt` from [ModelScope](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Ffacebook\u002Fsam3) or [Hugging Face](https:\u002F\u002Fhuggingface.co\u002Ffacebook\u002Fsam3) and place it under `models\u002Fsam3_ms`.\n\n- Text Local OCR (Tesseract): \n\n  ```bash\n  sudo apt install tesseract-ocr tesseract-ocr-chi-sim\n  ```\n\n\u003Cdetails> \u003Csummary>\u003Cb>🧩 Optional Capabilities (OCR Engine, Formula, RMBG) - Click to expand\u003C\u002Fb>\u003C\u002Fsummary>\n\n- PaddleOCR (Alternative\u002FBetter for mixed text): Use paddlepaddle==3.2.2 (avoiding 3.3.0 bug).\n\n  ```bash\n  pip install paddlepaddle==3.2.2 paddleocr\n  ```\n\n- Formula (Pix2Text): \n\n  ```bash\n  pip install pix2text onnxruntime-gpu\n  ```\n\n- Background Removal (RMBG): `pip install onnxruntime modelscope` then run `python scripts\u002Fsetup_rmbg.py`.\n\u003C\u002Fdetails>\n\n### Phase 3: Configuration & Troubleshooting\n\n#### 1. 
Final Configuration\n\nCopy the example config and adjust the asset paths:\n\n  ```bash\n  cp config\u002Fconfig.yaml.example config\u002Fconfig.yaml\n  ```\n\nEdit `config.yaml` to ensure `sam3.checkpoint_path` and `sam3.bpe_path` match your `models\u002F` locations.\n\n\u003Cdetails> \u003Csummary>\u003Cb>🛠️ Before First Run Checklist & Troubleshooting - Click to expand\u003C\u002Fb>\u003C\u002Fsummary>\n\n**Checklist**:\n\n- [ ] Config files copied and model paths set in `config.yaml`\n- [ ] SAM3 weights (`sam3.pt`) and BPE vocab placed under `models\u002F`\n- [ ] SAM3 library installed via `scripts\u002Fsetup_sam3.sh`\n- [ ] Tesseract or PaddleOCR installed\n\n**Common Issues**:\n\n- \"no kernel image is available...\": GPU architecture mismatch. Upgrade PyTorch or set `sam3.device: \"cpu\"`.\n- \"Model file not found at ...rmbg\u002F...\": RMBG is optional. To enable it, download the model with `python scripts\u002Fsetup_rmbg.py`.\n- \"PaddleOCR inference failed...\": Use `paddlepaddle==3.2.2` or fall back to Tesseract.\n\n\u003C\u002Fdetails>\n\n---\n\n## 🔤 Usage\n\n### Command Line Interface (CLI)\n\nSupports image files (PNG, JPG, BMP, TIFF, WebP). To process a single image:\n\n```bash\npython main.py -i input\u002Ftest_diagram.png\n```\n\nThe output XML will be saved in the `output\u002F` directory. For batch processing, put images in `input\u002F` and run `python main.py` without `-i`.\n\n### Run and test locally\n\n1. **One-time setup**\n\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002FBIT-DataLab\u002FEdit-Banana.git && cd Edit-Banana\n   python3 -m venv .venv && source .venv\u002Fbin\u002Factivate   # Linux\u002FmacOS; Windows: .venv\\Scripts\\activate\n   pip install torch torchvision --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu118   # or CPU build\n   pip install -r requirements.txt\n   sudo apt install tesseract-ocr tesseract-ocr-chi-sim   # OCR (or equivalent on your OS)\n   ```\n\n   Install the SAM3 library and download model weights + BPE. 
Then:\n\n   ```bash\n   mkdir -p input output\n   cp config\u002Fconfig.yaml.example config\u002Fconfig.yaml\n   # Edit config\u002Fconfig.yaml: set sam3.checkpoint_path and sam3.bpe_path to your models\u002F paths\n   ```\n\n2. **Test with CLI**\n\n   ```bash\n   # Put a diagram image in input\u002F, e.g. input\u002Ftest.png\n   python main.py -i input\u002Ftest.png\n   # Output appears under output\u002F\u003Cimage_stem>\u002F (DrawIO XML and intermediates)\n   ```\n\n3. **Optional: test the web API**\n\n   ```bash\n   python server_pa.py\n   # In another terminal:\n   curl -X POST http:\u002F\u002Flocalhost:8000\u002Fconvert -F \"file=@input\u002Ftest.png\"\n   # Or open http:\u002F\u002Flocalhost:8000\u002Fdocs and use the \u002Fconvert endpoint with a file upload\n   ```\n\n---\n\n## ⚙️ Configuration\n\nCustomize the pipeline behavior in `config\u002Fconfig.yaml`:\n\n- **sam3**: Adjust score thresholds, NMS (Non-Maximum Suppression) thresholds, max iteration loops.\n\n- **paths**: Set input\u002Foutput directories.\n\n- **dominant_color**: Fine-tune color extraction sensitivity.\n\n---\n\n## 📌 Development Roadmap\n\n| Feature Module           | Status       | Description                     |\n|--------------------------|--------------|---------------------------------|\n| Core Conversion Pipeline | ✅ Completed | Full pipeline of segmentation, reconstruction and OCR |\n| Intelligent Arrow Connection | ⚠️ In Development | Automatically associate arrows with target shapes |\n| DrawIO Template Adaptation | 📍 Planned | Support custom template import |\n| Batch Export Optimization | 📍 Planned | Batch export to DrawIO files (.drawio) |\n| Local LLM Adaptation | 📍 Planned | Support local VLM deployment, independent of APIs |\n\n---\n\n\n## 🤝 Contribution Guidelines\n\nContributions of all kinds are welcome (code submissions, bug reports, feature suggestions):\n\n1. Fork this repository\n2. Create a feature branch (`git checkout -b feature\u002Fxxx`)\n3. 
Commit your changes (`git commit -m 'feat: add xxx'`)\n4. Push to the branch (`git push origin feature\u002Fxxx`)\n5. Open a Pull Request\n\nBug Reports: [Issues](https:\u002F\u002Fgithub.com\u002FBIT-DataLab\u002FEdit-Banana\u002Fissues)\nFeature Suggestions: [Discussions](https:\u002F\u002Fgithub.com\u002FBIT-DataLab\u002FEdit-Banana\u002Fdiscussions)\n\n---\n\n\n\n## 📄 License\n\nThis project is open-source under the [Apache License 2.0](LICENSE), allowing commercial use and secondary development (with copyright notice retained).\n\n---\n\n## 🌟 Star History\n\n🌟 If this project helps you, please star it to show your support!\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=bit-datalab\u002Fedit-banana&type=date&legend=top-left)](https:\u002F\u002Fwww.star-history.com\u002F#bit-datalab\u002Fedit-banana&type=date&legend=top-left)\n","Edit Banana 是一个将统计图表格式转换为可编辑内容的框架。它利用SAM 3和多模态大模型技术，能够高保真地重建原始图表的细节和逻辑关系，使原本不可编辑的内容变得可以自由操作。项目主要使用Python语言编写，并推荐使用CUDA以获得更好的性能。适用于需要对固定格式的数据或图表进行再编辑的场景，比如学术研究、数据分析报告制作等。用户可以通过其官方网站上传图片，在几秒钟内获得可编辑的DrawIO (XML)文件，极大地提高了工作效率。",2,"2026-06-11 03:48:30","high_star"]