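[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-84126":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":11,"openIssues":12,"contributorsCount":12,"subscribersCount":12,"size":12,"stars1d":12,"stars7d":12,"stars30d":12,"stars90d":12,"forks30d":12,"starsTrendScore":12,"compositeScore":12,"rankGlobal":9,"rankLanguage":9,"license":9,"archived":13,"fork":13,"defaultBranch":14,"hasWiki":15,"hasPages":13,"topics":16,"createdAt":9,"pushedAt":9,"updatedAt":17,"readmeContent":18,"aiSummary":9,"trendingCount":12,"starSnapshotCount":12,"syncStatus":19,"lastSyncTime":20,"discoverSource":21},84126,"xhs-","leeee-999\u002Fxhs-","leeee-999","Scrapes xhs images, comments, and text, and stores them in separate folders",null,"JavaScript",53,0,false,"main",true,[],"2026-06-12 02:04:38","# Xiaohongshu Post Collection\n\nThis folder turns Xiaohongshu recommendation posts into reference material for trip planning.\n\n## File structure\n\n- `links.json`: the list of Xiaohongshu links to process. Add new posts here.\n- `posts\u002F`: one subfolder per post, holding the screenshot, readable text, metadata, and manual notes.\n- `scripts\u002Farchive-xhs.mjs`: the semi-automatic archiving script.\n\n## Usage\n\nRun from the current trip folder:\n\n```powershell\n& 'C:\\Users\\li\\.cache\\codex-runtimes\\codex-primary-runtime\\dependencies\\node\\bin\\node.exe' '.\\xhs收集\\scripts\\archive-xhs.mjs'\n```\n\nIf the page asks you to log in or never finishes loading, run in headed mode:\n\n```powershell\n& 'C:\\Users\\li\\.cache\\codex-runtimes\\codex-primary-runtime\\dependencies\\node\\bin\\node.exe' '.\\xhs收集\\scripts\\archive-xhs.mjs' --headed\n```\n\nOnce the browser opens, log in manually and\u002For wait for the page to load; the script will save the page screenshot and visible text on a best-effort basis. Do not use it to bypass CAPTCHAs or login restrictions, or for high-frequency bulk scraping; it is meant only for archiving personal travel material.\n\n## Common situations\n\n- `安全限制 \u002F IP存在风险 \u002F 300012` (security restriction \u002F IP at risk \u002F 300012): this is risk control on the Xiaohongshu web client, which the script cannot and should not bypass. Open the post on your phone and take screenshots, or copy the body text to me, and I can continue organizing it.\n- `visible-text.txt` is very short: the web client did not load the body text. Check `screenshot.png` to decide whether to fill it in manually.\n- Images were not captured: Xiaohongshu images are often protected by dynamic scripts, so put phone screenshots directly into the corresponding post folder instead.\n\n## MediaCrawler option\n\nI have cloned `NanmiCoder\u002FMediaCrawler` into `_tools\u002FMediaCrawler` and configured it to fetch only the details of this one Xiaohongshu note:\n\n```text\n69bcb7660000000022029344\n```\n\nIt needs your own browser login session. Run:\n\n```powershell\n.\\xhs收集\\scripts\\run-mediacrawler-xhs.ps1\n```\n\nThe script opens a separate Chrome\u002FEdge profile directory. Log in to Xiaohongshu in that browser, confirm the post opens normally, then return to the terminal and press Enter.\n\nAfter a successful crawl, import the MediaCrawler output into the post folder:\n\n```powershell\n& 'C:\\Users\\li\\.cache\\codex-runtimes\\codex-primary-runtime\\dependencies\\node\\bin\\node.exe' '.\\xhs收集\\scripts\\import-mediacrawler-output.mjs'\n```\n\nAfter importing, check:\n\n- `posts\u002F01-小韩著名牛逼十八件套-心碎小芝士\u002Fmediacrawler-note.json`\n- `posts\u002F01-小韩著名牛逼十八件套-心碎小芝士\u002Fmanual-notes.md`\n- `posts\u002F01-小韩著名牛逼十八件套-心碎小芝士\u002Fsummary.md`\n\n## Suggestions for organizing each post\n\nAfter collection, focus on these files in each post folder:\n\n- `screenshot.png`: the page screenshot\n- `visible-text.txt`: the visible text on the page\n- `manual-notes.md`: if the body text cannot be captured, paste the shop names seen in the screenshot or on your phone here\n- `summary.md`: the final summary, organized as “shop name \u002F area \u002F what it is good for \u002F which day to schedule it”\n",2,"2026-06-11 04:12:20","CREATED_QUERY"]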