[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-83242":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":18,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":22,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":24,"readmeContent":25,"aiSummary":10,"trendingCount":15,"starSnapshotCount":15,"syncStatus":26,"lastSyncTime":27,"discoverSource":28},83242,"flipbook-app","imcuttle\u002Fflipbook-app","imcuttle","🎨 点击式探索的知识画册，长按图片生成带标注子图 | Flipbook Canvas — click-to-explore knowledge picture-book. Long-press any image to spawn an annotated child diagram via a pluggable multimodal pipeline (LLM + image gen + web search + OCR). 
","https:\u002F\u002Fimcuttle.github.io\u002Fflipbook-app\u002F",null,"JavaScript",250,11,1,0,10,138,52,90.24,false,"main",true,[],"2026-06-12 04:01:40","# 🎨 Flipbook Canvas\n\n**English** · [中文](.\u002FREADME.zh.md)\n\n[![Node](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FNode.js-%E2%89%A520.10-339933?logo=node.js&logoColor=white)](https:\u002F\u002Fnodejs.org\u002F)\n[![React](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FReact-18-61DAFB?logo=react&logoColor=white)](https:\u002F\u002Freact.dev\u002F)\n[![Vite](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FVite-5-646CFF?logo=vite&logoColor=white)](https:\u002F\u002Fvitejs.dev\u002F)\n[![Express](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FExpress-4-000000?logo=express&logoColor=white)](https:\u002F\u002Fexpressjs.com\u002F)\n[![TypeScript](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTypeScript-5-3178C6?logo=typescript&logoColor=white)](https:\u002F\u002Fwww.typescriptlang.org\u002F)\n[![SQLite](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSQLite-Sequelize-003B57?logo=sqlite&logoColor=white)](https:\u002F\u002Fwww.sqlite.org\u002F)\n[![Multimodal](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FMultimodal-LLM%20%C3%97%20ImageGen%20%C3%97%20WebSearch%20%C3%97%20OCR-FF6F61)](#-multimodal--mainstream-llms)\n[![PRs Welcome](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-welcome-brightgreen.svg)](https:\u002F\u002Fgithub.com\u002Fimcuttle\u002Fflipbook-app\u002Fpulls)\n[![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fimcuttle\u002Fflipbook-app?style=social)](https:\u002F\u002Fgithub.com\u002Fimcuttle\u002Fflipbook-app\u002Fstargazers)\n\n### 🔭 [**Live examples → imcuttle.github.io\u002Fflipbook-app**](https:\u002F\u002Fimcuttle.github.io\u002Fflipbook-app\u002F)\n\n> Browse fully-interactive, exported flipbooks right in your browser — click hotspots to drill in, no install needed.\n\n> ✨ Click anywhere on a generated image. 
The backend infers what you clicked,\n> searches the web when useful, generates a child diagram, and links it back.\n> **A flipbook of explorable knowledge — one click at a time.**\n\n> 💡 Inspired by and a re-implementation of the product idea behind\n> [flipbook.page](https:\u002F\u002Fflipbook.page) — credit to the original team for the\n> click-to-explore canvas concept.\n\nA long-running web product: **Express + SSE** backend, **Vite + React + TS**\nfrontend, a **pluggable multi-model image pipeline**, web-search augmented\nplanning, per-node concurrency, read-only share links, fullscreen casting and\na fully responsive mobile layout.\n\n---\n\n## ✨ Why this is fun\n\nMost \"AI drawing\" demos stop at one image. This one turns each image into a\n**playable knowledge surface**:\n\n- 🖱️ **Long-press anywhere on a picture** → the model reads what's under your\n  finger, decides whether the topic needs fresh sources, optionally hits the\n  web, then paints a brand-new annotated diagram zoomed into that concept.\n- 📚 **Encyclopedia-style output** — every node ships with a 150–220-char\n  caption and 20–40 in-image labels (place names, dates, numbers…), all\n  OCR'd back into a transparent text layer so you can drag-select and copy\n  any fragment straight off the picture.\n- 🌳 **Infinite tree of canvases** — every click spawns a child node; the\n  whole exploration tree is persisted, shareable, and replayable.\n- ⏳ **Watch it think** — a node is saved and linkable the instant you click,\n  then its title \u002F caption \u002F scene prompt **type out live**; share the link\n  and a friend on another device watches the same stream fill in.\n\n---\n\n## 📸 Screenshots\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\" width=\"50%\">\n      \u003Cimg src=\".\u002Fdocs\u002Fassets\u002Fdemo.gif\" alt=\"Click-to-explore demo\" \u002F>\n      \u003Cbr\u002F>\u003Csub>\u003Cb>Click-to-explore\u003C\u002Fb> — long-press any region to drill in\u003C\u002Fsub>\n    
\u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"50%\">\n      \u003Cimg src=\".\u002Fdocs\u002Fassets\u002Fwoodpecker.gif\" alt=\"Woodpecker walkthrough\" \u002F>\n      \u003Cbr\u002F>\u003Csub>\u003Cb>End-to-end pipeline\u003C\u002Fb> — search → planner → ImageGen → drill-down\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd colspan=\"2\" align=\"center\">\n      \u003Cimg src=\".\u002Fdocs\u002Fassets\u002Fscreenshot.png\" alt=\"Gallery and canvas\" \u002F>\n      \u003Cbr\u002F>\u003Csub>\u003Cb>Gallery + canvas\u003C\u002Fb> — every canvas is persisted, shareable, replayable\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n---\n\n## 🚀 Highlights\n\n- 🖱️ **Click-to-explore**: long-press (1 s) anywhere on a node's image. The\n  backend infers the label, decides whether to web-search, then generates a\n  child node. Spatial + semantic dedup means clicking the same region again\n  jumps straight in.\n- ⏳ **Live-streaming, linkable generating nodes**: the moment you click, the\n  child node is **persisted under its final id** and its parent hotspot links\n  to it immediately — so it's **shareable \u002F openable on any device while still\n  generating**. Its title, caption and image prompt **type out live**\n  (token-streamed via SSE), the catalog shows a **spinner row**, and a refresh\n  or cross-device open **resumes the stream from the on-disk snapshot**. On\n  failure the half-node is auto-deleted.\n- 🌫️ **Progressive image loading**: every PNG gets blur → thumbnail → medium →\n  full variants (sharp). 
Gallery cards blur-up, the canvas swaps to full-res\n  when ready — no broken-image flashes, fast first paint.\n- 🖼️ **Portrait & landscape canvases**: pick orientation per canvas (mobile\n  portrait viewports default to portrait); filter the gallery by\n  **All \u002F Landscape \u002F Portrait** with the choice synced to the URL.\n- ⚡ **Per-node parallelism**: up to **4 different spots in parallel per parent**\n  (configurable). Each in-flight click streams a phase chip\n  (`Inferring label…` → `Searching the web…` → `Generating image…`) on the\n  hotspot. Hit the cap and the cursor turns into ⌛.\n- 📖 **Encyclopedia register**: planner produces 150–220 char captions with\n  20–40 in-image text fragments — like reading a richly annotated diagram in\n  a children's encyclopedia. Long captions clamp to 2 lines with a\n  **查看更多 \u002F Show more** toggle.\n- 🌐 **Web-search augmented**: a \"decide-then-search\" gate asks the LLM whether\n  a topic benefits from up-to-date sources. When yes, results are fetched and\n  fed into the planner; sources are persisted to disk + DB and rendered as a\n  📚 hover badge over the canvas.\n- 🔁 **Resilient SSE**: Last-Event-ID replay + per-job snapshot resume — a\n  dropped connection or page refresh mid-generation reconnects and catches up\n  on everything it missed, including the in-flight typewriter.\n- 🎬 **Scene transitions**: drill-in \u002F drill-out \u002F fade animations make\n  navigation feel like a zooming flipbook rather than a page swap.\n- 🔗 **Share as preview**: any canvas → read-only `?s=\u003Ctoken>` URL. 
Viewers can\n  navigate and watch live SSE updates from in-flight generations, but cannot\n  trigger new ones.\n- 📺 **Fullscreen casting**: ⛶ requests browser fullscreen; toggle the chrome\n  (breadcrumb + caption + hint) on\u002Foff for a clean projection view.\n- 🔤 **Selectable in-image text**: every label baked into the diagram is OCR'd\n  with Apple Vision (`zh-Hans` + `en-US`) and overlaid as invisible HTML, so\n  users can drag-select and Cmd-C copy any text directly off the picture\n  while the painted pixels remain the visual ground truth.\n- 🔊 **Voice narration**: each node's title + caption is synthesised to speech\n  with **Microsoft Edge neural voices** (msedge-tts — free, no API key). Pick a\n  **character voice** per flipbook from the live Edge catalogue (filtered to the\n  UI language); the picker reads \"晓晓 · 女声\" instead of raw locale IDs.\n  Switching voices re-narrates the whole book and restarts in-flight playback.\n  **Auto-narration is on by default** (toggleable) and is bundled into exports\n  so the static site speaks offline too.\n- 📱 **Mobile responsive**: sticky top bar that pins on scroll, single-column\n  gallery, pinch-zoom image lightbox, smaller hotspots and pending bubbles.\n\n---\n\n## 🤖 Multimodal × Mainstream LLMs\n\nFlipbook Canvas is built around a **pluggable multimodal pipeline**. 
Five\nmodalities are wired end-to-end:\n\n| Modality | What it does | Pluggable into |\n|---|---|---|\n| 📝 **Text \u002F JSON LLM** | planner, click-label inference, decide-then-search verdict | any chat-completion-style model |\n| 🖼️ **Image generation** | turns a structured prompt into a 2752×1536 annotated diagram with baked-in text labels | OpenAI, Nano Banana (Gemini), Seedream\u002FSeeddance, or your own provider |\n| 🌐 **Web search** | rephrased query → top-N normalized results → planner context + 📚 sources panel | any search backend |\n| 👁️ **OCR (Apple Vision)** | `zh-Hans` + `en-US` recognition over every generated PNG, projected as a selectable HTML overlay | local, no API keys needed |\n| 🔊 **TTS (Edge neural voices)** | synthesises each node's title + caption to an mp3, per-flipbook character voice | Microsoft Edge online voices via msedge-tts, no API key |\n\nThe image layer is a **provider chain** (`IMAGE_PROVIDER=...,svg`): the first\nenabled provider wins, and `svg` is always appended last as a placeholder so the\nUI never breaks. 
Adding a new model is a single file:\n\n```js\n\u002F\u002F server\u002Fsrc\u002Fgeneration\u002Fproviders\u002F\u003Cname>.js\nexport default {\n  name: 'my-model',\n  enabled(config) { return Boolean(config.MY_API_KEY); },\n  async generate({ imagePrompt, outputDir, size, title, hash, onEvent }) {\n    \u002F\u002F call your model, write \u003Chash>.png into outputDir, push phase events\n  },\n};\n```\n\nOut of the box:\n\n| Provider | Trigger to enable | Status |\n|---|---|---|\n| `openai` | `OPENAI_API_KEY` set | 🔌 stub — implement in `providers\u002Fopenai.js` |\n| `nanobanana` | `NANOBANANA_API_KEY` or `GEMINI_API_KEY` | 🔌 stub |\n| `seeddance` | `SEEDDANCE_API_KEY` or `ARK_API_KEY` | 🔌 stub |\n| `codebuddy` | `ENABLE_CODEBUDDY=1` | ✅ reference impl (used in the demo gif) |\n| `svg` | always | ✅ fallback placeholder |\n\n> 🎯 The **reference implementation** wires the `codebuddy` CLI as a\n> subprocess driver for planner \u002F ImageGen \u002F WebSearch. Subprocess lifecycle\n> (concurrency cap, per-call timeouts, single retry, file-size sanity check on\n> generated PNGs, graceful degradation) lives in `server\u002Fsrc\u002FcodebuddyClient.js`\n> and is a useful template if you ever shell out to *any* CLI-based model.\n\n---\n\n## 🐦 Walkthrough — generating a woodpecker flipbook from zero\n\nType `啄木鸟` (woodpecker) into the top bar and watch the entire pipeline run:\ndecide-then-search → planner → ImageGen → click to drill into the tongue\nanatomy \u002F nest cavity \u002F ant-foraging zones, each spawning its own annotated\ndiagram with its own sources.\n\n---\n\n## 🗂️ Layout\n\n```\n.\n├── prompts\u002F                        # system \u002F planner \u002F click-label \u002F image-prompt \u002F decide-search\n├── scripts\u002F\n│   ├── sync-prompts.mjs\n│   ├── serve-preview.mjs           # build + serve one canvas's static preview\n│   └── example-doc-publish.mjs     # publish canvases to GitHub Pages\n├── server\u002F\n│   └── src\u002F\n│       ├── 
routes\u002F                 # canvas, click, events (SSE), assets, share\n│       ├── export\u002F                 # static-site exporter + viewer template\n│       │   ├── buildExport.js      # buildCanvasSite \u002F buildCanvasExport (zip)\n│       │   └── template\u002F           # self-contained index.html + viewer.js\u002Fcss\n│       ├── lib\u002Fzip.js              # dependency-free ZIP writer\n│       ├── generation\u002F\n│       │   ├── pipeline.js         # generateRoot + expandFromClick + per-node concurrency\n│       │   ├── decideSearch.js     # decide-then-search gate\n│       │   ├── webSearch.js        # WebSearch subprocess + result normaliser\n│       │   ├── queue.js            # PerCanvasQueue \u002F Semaphore \u002F PerKeySemaphore\n│       │   ├── planner.js \u002F clickLabel.js\n│       │   ├── image.js            # provider-chain orchestrator\n│       │   └── providers\u002F          # codebuddy, openai, nanobanana, seeddance, svg\n│       ├── db\u002F                     # Sequelize models + hydrateFromDisk\n│       ├── store\u002F                  # filesystem layer\n│       ├── sse\u002F                    # event hub\n│       └── codebuddyClient.js      # reference CLI-subprocess wrapper\n└── web\u002F                            # Vite + React + TS\n```\n\n## 💾 Storage\n\n- 📁 **Filesystem** (source of truth for big artifacts):\n  `server\u002Fdata\u002Fcanvases\u002F\u003Cid>\u002F{data\u002Ftree.json, data\u002Fnodes\u002F\u003Chash>.json, images\u002F\u003Chash>.{png,svg}, manifest.json}`.\n- 🗃️ **SQLite** (`server\u002Fdata\u002Fflipbook.sqlite`, via Sequelize): metadata index —\n  Canvases \u002F Nodes \u002F Hotspots \u002F ShareLinks \u002F Sources tables. Drives the\n  gallery, spatial dedup, share lookup, and sources hover panel. 
On boot the\n  server runs `hydrateFromDisk()` to rebuild this index if it's missing.\n\n## 🛠️ Develop\n\n```bash\nnpm install\nnpm run dev           # server on :8787 + Vite on :5173 in parallel\n```\n\nOpen http:\u002F\u002F127.0.0.1:5173.\n\nBy default `ENABLE_CODEBUDDY=0` (stub mode — fast, SVG placeholders, no LLM).\nSet `ENABLE_CODEBUDDY=1` to use the reference CLI provider for planner +\nImageGen + WebSearch:\n\n```bash\nENABLE_CODEBUDDY=1 npm run dev:server\n```\n\n> ⏱️ With the reference provider, each node takes ~70–95 s end-to-end (planner\n> ~25 s + ImageGen ~50–60 s including cold start; +5–15 s if web search runs).\n> ImageGen produces **2752×1536 PNG** (~6 MB).\n\n### Per-node parallelism\n\nUp to **4 click expansions per parent node** run in parallel; excess clicks\nqueue. Different parents and different canvases run independently. A\nper-parent write lock serializes only the short read-modify-write of the\nparent node JSON. Tunable via `MAX_PARALLEL_CLICKS_PER_NODE` (default 4).\n\n## 🔍 Web search\n\nA pre-planner gate (`decideSearch.js` + `prompts\u002Fdecide-search.md`) calls the\nLLM with the proposed subject and asks: do recent \u002F authoritative sources\nmaterially improve this node? The default leans **yes** — only clearly\nabstract \u002F timeless subjects skip search. When yes:\n\n1. The web-search backend runs with the rephrased query.\n2. Results are normalised into `{title, url, snippet, source}`.\n3. Top results are passed into the planner prompt.\n4. Sources are persisted both into `nodes\u002F\u003Chash>.json` and into the SQLite\n   `Sources` table.\n5. The frontend renders a 📚 badge near the breadcrumb. 
Hover to see a popover\n   with the source list (220 ms grace period so the popover is reachable with\n   the mouse).\n\n## 📦 Export as a standalone static site\n\nAny canvas can be exported as a **fully self-contained static site** — a\nread-only replica of the preview with all data and images inlined, openable\ndirectly from `file:\u002F\u002F` with zero network requests.\n\n- **In-app**: the `···` More menu → **Export preview** downloads a `.zip`\n  (`index.html` \u002F `viewer.js` \u002F `viewer.css` \u002F `data.js` + `images\u002F`).\n- **Serve one locally** for quick viewing in a browser:\n\n  ```bash\n  npm run serve-preview -- \u003CcanvasId> [--lang en] [--port 8088]\n  ```\n\n  Builds the static site to a temp dir, starts a tiny static HTTP server,\n  prints the URL. Ctrl-C cleans up.\n\n- **Publish to GitHub Pages** (one or more canvases → a routed gallery landing\n  page at `\u002F`, each example at `\u002F\u003CcanvasId>\u002F`):\n\n  ```bash\n  npm run example:publish -- \u003CcanvasId> [\u003CcanvasId> ...] [--lang en] [--no-push]\n  ```\n\n  Builds each canvas, regenerates the landing index, and pushes to the\n  `gh-pages` branch (accumulating — re-publishing a new id keeps the others).\n  → see the result at **https:\u002F\u002Fimcuttle.github.io\u002Fflipbook-app\u002F**.\n\nThe exported viewer mirrors the live read-only preview: image stage with\ncollision-avoiding hotspot labels, leader lines, selectable OCR text overlay,\ncaption, breadcrumb, catalog and sources — plus progressive image loading,\nscene transitions, and next-layer image prefetch. **Per-node narration mp3s are\nbundled too**, so the static site auto-narrates offline (toggleable in the top\nbar). It never calls the server.\n\n## 🔗 Share \u002F preview links\n\n- `POST \u002Fapi\u002Fcanvas\u002F:id\u002Fshare` → `{token, url}`. 
Reuses an existing token for\n  the same canvas.\n- `GET \u002Fapi\u002Fshare\u002F:token` → `{canvasId, topic, readOnly:true}`.\n- Frontend: opening `…?s=\u003Ctoken>` puts the UI in **read-only preview** mode —\n  no topic input, no clicks on the image, \"👁 Preview\" badge in the corner.\n  SSE stays connected, so a viewer watching mid-generation sees images stream\n  in real-time.\n\n## 📺 Fullscreen \u002F casting\n\n- `⛶` button in TopBar requests browser fullscreen; uses CSS-only fullscreen\n  on iOS Safari where the API isn't supported.\n- `👁` \u002F `🚫` button (visible while in fullscreen) toggles the breadcrumb +\n  caption + hint. Useful for clean projection.\n- Long-press hint is suppressed in fullscreen by default; the press still\n  works.\n\n## 🧹 Cleaning local state\n\n```bash\nnpm run clean:data    # reset server\u002Fdata (all canvases)\nnpm run clean:dist    # reset web\u002Fdist\nnpm run clean         # both\n```\n\n## 📦 Build for production\n\n```bash\nnpm run build         # builds web\u002Fdist\nnpm start             # serves web\u002Fdist + API from :8787\n```\n\n## 🌐 LAN access via a fixed domain (macOS)\n\nGive the app a stable hostname (e.g. `http:\u002F\u002Fflipbook.lan`) reachable from any\ndevice on your LAN — no port number needed. Uses **dnsmasq** (resolves the\ndomain → this machine's LAN IP) + **Caddy** (reverse-proxies `:80` to the app).\n\n```bash\nnpm run lan:up        # flipbook.lan → dev :5173 (preferred), falls back to prod :8787\nnpm run lan:down      # tear it down\n\n# custom: scripts\u002Flan-domain-setup.sh \u003Cdomain> \u003CdevPort> \u003CprodPort>\nbash scripts\u002Flan-domain-setup.sh studio.lan 5173 8787\n```\n\nThe proxy tries the **dev** port (5173) first and automatically **falls back to\nthe prod** port (8787) when dev isn't running (passive health check, 3s\nblacklist). 
So `npm run dev` and `npm start` both work behind the same domain.\n\n`lan:up` installs dnsmasq\u002Fcaddy via Homebrew if missing and needs `sudo`\n(dnsmasq binds 53, Caddy binds 80). It only configures **this** machine; to\nreach the domain from other devices, point their DNS at this machine's LAN IP\n(router DHCP DNS, per-device DNS, or a `hosts` entry — the script prints the\nexact options and your IP).\n\n## ⚙️ Configuration (env)\n\n| Var | Default | Purpose |\n|---|---|---|\n| `PORT` | 8787 | server port |\n| `HOST` | 127.0.0.1 | server bind |\n| `DATA_DIR` | `server\u002Fdata` | canvas state on disk |\n| `PROMPTS_DIR` | `prompts` | prompt files |\n| `DB_PATH` | `\u003CDATA_DIR>\u002Fflipbook.sqlite` | SQLite file |\n| `MAX_PARALLEL_CLICKS_PER_NODE` | 4 | concurrent click expansions per parent |\n| `MAX_PARALLEL_CODEBUDDY` | 20 | concurrent planner\u002FLLM subprocesses |\n| `MAX_PARALLEL_IMAGE` | 20 | concurrent image-generation jobs (separate pool from the LLM limit) |\n| `PLANNER_TIMEOUT_MS` | 90000 | per-call planner timeout |\n| `IMAGE_TIMEOUT_MS` | 180000 | per-call ImageGen timeout |\n| `WEB_SEARCH_TIMEOUT_MS` | 60000 | per-call WebSearch timeout |\n| `IMAGE_PROVIDER` | `codebuddy` | provider chain (e.g. `openai,nanobanana,svg`) |\n| `IMAGE_SIZE` | `1920x1080` | requested size (provider may pick its own) |\n| `ENABLE_CODEBUDDY` | 0 | flip to 1 to enable the reference CLI provider |\n| `ENABLE_WEB_SEARCH` | follows `ENABLE_CODEBUDDY` | force-disable with `0` |\n| `ENABLE_OCR` | 1 | run Apple Vision OCR on each generated PNG to produce a selectable text overlay; set to `0` to skip |\n| `OCR_TIMEOUT_MS` | 25000 | per-call OCR timeout |\n| `OCR_MIN_CONFIDENCE` | 0.4 | drop OCR spans below this confidence |\n| `ENABLE_AUDIO` | 1 | synthesise Edge neural-voice narration (mp3) for each node; set to `0` to skip. 
Non-blocking — failures never stop image generation |\n| `AUDIO_TIMEOUT_MS` | 30000 | per-call TTS synthesis timeout |\n\n---\n\n**English** · [中文](.\u002FREADME.zh.md)\n",2,"2026-06-11 04:10:31","CREATED_QUERY"]