[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2189":3},{"id":4,"name":5,"fullName":6,"owner":5,"repo":5,"description":7,"homepage":8,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":24,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":46,"readmeContent":47,"aiSummary":48,"trendingCount":15,"starSnapshotCount":15,"syncStatus":49,"lastSyncTime":50,"discoverSource":51},2189,"ArchiveBox","ArchiveBox\u002FArchiveBox","🗃 Open source self-hosted web archiving. Takes URLs\u002Fbrowser history\u002Fbookmarks\u002FPocket\u002FPinboard\u002Fetc., saves HTML, JS, PDFs, media, and more...","https:\u002F\u002Farchivebox.io",null,"Python",27678,1533,181,197,0,6,64,219,29,44.56,"MIT License",false,"dev",true,[26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45],"archivebox","backups","bookmark-archiver","browser-bookmarks","chromium","digipres","firefox","headless-browser","internet-archiving","pinboard","pocket","python","rss","self-hosted","singlefile","warc","wayback-machine","web-archiving","wget","youtube-dl","2026-06-12 02:00:38","\u003Cdiv align=\"center\" style=\"text-align: center; width: 100%\">\n\u003Cimg src=\"https:\u002F\u002Farchivebox.io\u002Ficon.png\" height=\"90px\"\u002F>\n\u003Ch1>ArchiveBox\u003Cbr\u002F>\u003Csub>Open-source self-hosted web archiving.\u003C\u002Fsub>\u003C\u002Fh1>\n\n\u003Cbr\u002F>\n\n▶️ \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FQuickstart\">Quickstart\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fdemo.archivebox.io\">Demo\u003C\u002Fa> | \u003Ca 
href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\">GitHub\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\">Documentation\u003C\u002Fa> | \u003Ca href=\"#background--motivation\">Info & Motivation\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FWeb-Archiving-Community\">Community\u003C\u002Fa>\n\n\u003Cbr\u002F>\n\n\u003C!--\u003Ca href=\"http:\u002F\u002Fwebchat.freenode.net?channels=ArchiveBox&uio=d4\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCommunity_chat-IRC-%2328A745.svg\"\u002F>\u003C\u002Fa>-->\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fblob\u002Fdev\u002FLICENSE\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOpen_source-MIT-green.svg?logo=git&logoColor=green\"\u002F>\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fcommits\u002Fdev\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flast-commit\u002FArchiveBox\u002FArchiveBox.svg?logo=Sublime+Text&logoColor=green&label=Active\"\u002F>\u003C\u002Fa> &nbsp; \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FArchiveBox\u002FArchiveBox.svg?logo=github&label=Stars&logoColor=blue\"\u002F>\u003C\u002Fa> &nbsp; \u003Ca href=\"https:\u002F\u002Fhub.docker.com\u002Fr\u002Farchivebox\u002Farchivebox\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fdocker\u002Fpulls\u002Farchivebox\u002Farchivebox.svg?label=Docker+Pulls\"\u002F>\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Farchivebox\u002F\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Farchivebox?label=PyPI%20Installs&color=%235f7dae\"\u002F>\u003C\u002Fa> \u003Ca 
href=\"https:\u002F\u002Fchromewebstore.google.com\u002Fdetail\u002Farchivebox-exporter\u002Fhabonpimjphpdnmcfkaockjnffodikoj\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fchrome-web-store\u002Fusers\u002Fhabonpimjphpdnmcfkaockjnffodikoj?label=Chrome%20Store&color=%231973e8\"\u002F>\u003C\u002Fa>\n\n\u003C!--\u003Cpre lang=\"bash\" align=\"left\">\u003Ccode style=\"white-space: pre-line; text-align: left\" align=\"left\">\ncurl -fsSL 'https:\u002F\u002Fget.archivebox.io' | bash    # (or see pip\u002Fbrew\u002FDocker instructions below)\n\u003C\u002Fcode>\u003C\u002Fpre>-->\n\n\u003C\u002Fdiv>\n\u003Chr\u002F>\n\u003Cbr\u002F>\n\n**ArchiveBox is a self-hosted app that lets you preserve content from websites in a variety of formats.**\n\nWe aim to make your data immediately useful, and kept in formats that other programs can read directly. As output, we save standard HTML, PNG, PDF, TXT, JSON, WARC, SQLite, all guaranteed to be readable for decades to come. ArchiveBox also has a CLI, REST API, and webhooks so you can set up integrations with other services.\n\nWithout active preservation effort, everything on the internet eventually disappears or degrades.\n\n*ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data. It can be used to save copies of bookmarks, preserve evidence for legal cases, backup photos from FB\u002FInsta\u002FFlickr or media from YT\u002FSoundcloud\u002Fetc., save research papers, and more...*\n\u003Cbr\u002F>\n\n> ➡️ Get ArchiveBox with `pip install archivebox` on [Linux](#quickstart)\u002F[macOS](#quickstart), or via **[Docker](#quickstart)** ⭐️ on any OS.  
\n\n*Once installed, you can interact with it through the: [Browser Extension](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Farchivebox-browser-extension), [CLI](#usage), [self-hosted web interface](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FPublishing-Your-Archive), [Python API](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#python-shell-usage), or [filesystem](#static-archive-exporting).*\n\n\u003Cbr\u002F>\n\u003Chr\u002F>\n\u003Cbr\u002F>\n\n📥 **You can feed ArchiveBox URLs one at a time, or schedule regular imports** from your bookmarks or history, social media feeds or RSS, link-saving services like Pocket\u002FPinboard, our [Browser Extension](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Farchivebox-browser-extension), and more.  \n\u003Csub>See \u003Ca href=\"#input-formats\">Input Formats\u003C\u002Fa> for a full list of supported input formats...\u003C\u002Fsub>\n\n\u003Cbr\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F90f1ce3c-75bb-401d-88ed-6297694b76ae\" alt=\"snapshot detail page\" align=\"right\" width=\"190px\" style=\"float: right\"\u002F>\n\n**It saves snapshots of the URLs you feed it in several redundant formats.**  \nIt also detects any content featured *inside* pages & extracts it out into a folder:\n- 🌐 **HTML**\u002F**Any websites** ➡️ `original HTML+CSS+JS`, `singlefile HTML`, `screenshot PNG`, `PDF`, `WARC`, `title`, `article text`, `favicon`, `headers`, ...\n- 🎥 **Social Media**\u002F**News** ➡️ `post content TXT`, `comments`, `title`, `author`, `images`, ...\n- 🎬 **YouTube**\u002F**SoundCloud**\u002Fetc. ➡️ `MP3\u002FMP4`s, `subtitles`, `metadata`, `thumbnail`, ...\n- 💾 **Github**\u002F**Gitlab**\u002Fetc. 
links ➡️ `clone of GIT source code`, `README`, `images`, ...\n- ✨ *and more, see [Output Formats](#output-formats) below...*\n\nYou can run ArchiveBox as a Docker web app to manage these snapshots, or continue accessing the same collection using the `pip`-installed CLI, Python API, and SQLite3 APIs. \nAll the ways of using it are equivalent, and provide matching features like adding tags, scheduling regular crawls, viewing logs, and more...\n\n\u003Cbr\u002F>\n\u003Chr\u002F>\n\n🛠️ ArchiveBox uses [standard tools](#dependencies) like Chrome, [`wget`](https:\u002F\u002Fwww.gnu.org\u002Fsoftware\u002Fwget\u002F), & [`yt-dlp`](https:\u002F\u002Fgithub.com\u002Fyt-dlp\u002Fyt-dlp), and stores data in [ordinary files & folders](#archive-layout).  \n*(no complex proprietary formats, all data is readable without needing to run ArchiveBox)*\n\nThe goal is to sleep soundly knowing the part of the internet you care about will be automatically preserved in durable, easily accessible formats [for decades](#background--motivation) after it goes down.\n\n\n\u003Chr\u002F>\n\u003Cbr\u002F>\n\n\n**📦&nbsp; Install ArchiveBox using your preferred method: `docker` \u002F `pip` \u002F `apt` \u002F etc. 
([see full Quickstart below](#quickstart)).**\n\n\n\u003Cdetails>\n&nbsp; \u003Csummary>\u003Ci>Expand for quick copy-pastable install commands...\u003C\u002Fi> &nbsp; ⤵️\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\"># Option A: Get ArchiveBox with Docker Compose (recommended):\nmkdir -p ~\u002Farchivebox\u002Fdata && cd ~\u002Farchivebox\ncurl -fsSL 'https:\u002F\u002Fdocker-compose.archivebox.io' > docker-compose.yml   # edit options in this file as-needed\ndocker compose run archivebox init --install\n# docker compose run archivebox add 'https:\u002F\u002Fexample.com'\n# docker compose run archivebox help\n# docker compose up\n\u003Cbr\u002F>\n\u003Cbr\u002F>\n# Option B: Or use it as a plain Docker container:\nmkdir -p ~\u002Farchivebox\u002Fdata && cd ~\u002Farchivebox\u002Fdata\ndocker run -it -v $PWD:\u002Fdata archivebox\u002Farchivebox init --install\n# docker run -it -v $PWD:\u002Fdata archivebox\u002Farchivebox add 'https:\u002F\u002Fexample.com'\n# docker run -it -v $PWD:\u002Fdata archivebox\u002Farchivebox help\n# docker run -it -v $PWD:\u002Fdata -p 8000:8000 archivebox\u002Farchivebox\n\u003Cbr\u002F>\n\u003Cbr\u002F>\n# Option C: Or install it with your preferred pkg manager (see Quickstart below for apt, brew, and more)\npip install archivebox\nmkdir -p ~\u002Farchivebox\u002Fdata && cd ~\u002Farchivebox\u002Fdata\narchivebox init --install\n# archivebox add 'https:\u002F\u002Fexample.com'\n# archivebox help\n# archivebox server 0.0.0.0:8000\n\u003Cbr\u002F>\n\u003Cbr\u002F>\n# Option D: Or use the optional auto setup script to install it\ncurl -fsSL 'https:\u002F\u002Fget.archivebox.io' | bash\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Cbr\u002F>\n\u003Csub>Open \u003Ca href=\"http:\u002F\u002Fweb.archivebox.localhost:8000\">\u003Ccode>http:\u002F\u002Fweb.archivebox.localhost:8000\u003C\u002Fcode>\u003C\u002Fa> for the public UI and \u003Ca 
href=\"http:\u002F\u002Fadmin.archivebox.localhost:8000\">\u003Ccode>http:\u002F\u002Fadmin.archivebox.localhost:8000\u003C\u002Fcode>\u003C\u002Fa> for the admin UI ➡️\u003C\u002Fsub>\u003Cbr\u002F>\n\u003Csub>Set \u003Ccode>LISTEN_HOST\u003C\u002Fcode> to change the base domain; \u003Ccode>web.\u003C\u002Fcode> and \u003Ccode>admin.\u003C\u002Fcode> subdomains are used automatically.\u003C\u002Fsub>\n\u003C\u002Fdetails>\n\u003Cbr\u002F>\n\n\n\u003Cdiv align=\"center\" style=\"text-align: center\">\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F5a7d95f2-6977-4de6-9f08-42851a1fe1d2\" height=\"70px\" alt=\"bookshelf graphic\"> &nbsp; \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002Fb2765a33-0d1e-4019-a1db-920c7e00e20e\" height=\"75px\" alt=\"logo\" align=\"top\"\u002F> &nbsp; \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F5a7d95f2-6977-4de6-9f08-42851a1fe1d2\" height=\"70px\" alt=\"bookshelf graphic\">\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003Csmall>\u003Ca href=\"https:\u002F\u002Fdemo.archivebox.io\">Demo\u003C\u002Fa> | \u003Ca href=\"#screenshots\">Screenshots\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage\">Usage\u003C\u002Fa>\u003C\u002Fsmall>\n\u003Cbr\u002F>\n\u003Csub>. . . . . . . . . . . . . . . . . . . . . . . . . . . 
.\u003C\u002Fsub>\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F8d67382c-e0ce-4286-89f7-7915f09b930c\" width=\"22%\" alt=\"cli init screenshot\" align=\"top\">\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002Fdad2bc51-e7e5-484e-bb26-f956ed692d16\" width=\"22%\" alt=\"cli init screenshot\" align=\"top\">\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002Fe8e0b6f8-8fdf-4b7f-8124-c10d8699bdb2\" width=\"22%\" alt=\"server snapshot admin screenshot\" align=\"top\">\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002Face0954a-ddac-4520-9d18-1c77b1ec50b2\" width=\"28.6%\" alt=\"server snapshot details page screenshot\" align=\"top\"\u002F>\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003C\u002Fdiv>\n\n## Key Features\n\n- [**Free & open source**](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fblob\u002Fdev\u002FLICENSE), own your own data & maintain your privacy by self-hosting\n- [**Powerful CLI**](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#CLI-Usage) with [modular dependencies](#dependencies) and [support for Google Drive\u002FNFS\u002FSMB\u002FS3\u002FB2\u002Fetc.](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FSetting-Up-Storage)\n- [**Comprehensive documentation**](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki), [active development](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FRoadmap), and [rich community](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FWeb-Archiving-Community)\n- [**Extracts a wide variety of content out-of-the-box**](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F51): [media (yt-dlp), articles 
(readability), code (git), etc.](#output-formats)\n- [**Supports scheduled\u002Frealtime importing**](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FScheduled-Archiving) from [many types of sources](#input-formats)\n- [**Uses standard, durable, long-term formats**](#output-formats) like HTML, JSON, PDF, PNG, MP4, TXT, and WARC\n- [**Powerful CLI**](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#CLI-Usage), [**self-hosted web UI**](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#UI-Usage), [Python API](https:\u002F\u002Fdocs.archivebox.io\u002Fen\u002Fdev\u002Fapidocs\u002Farchivebox\u002Farchivebox.html) (BETA), [REST API](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F496) (ALPHA), or [desktop app](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Felectron-archivebox)\n- [**Saves all pages to archive.org as well**](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FConfiguration#save_archive_dot_org) by default for redundancy (can be [disabled](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FSecurity-Overview#stealth-mode) for local-only mode)\n- Advanced users: support for archiving [content requiring login\u002Fpaywall\u002Fcookies](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FConfiguration#chrome_user_data_dir) (see wiki security caveats!)\n- Planned: support for running [JS during archiving](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F51) to adblock, [autoscroll](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F80), [modal-hide](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F175), [thread-expand](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F345)\n\n\u003Cbr\u002F>\n\n## 🤝 Professional Integration\n\nArchiveBox is free for 
everyone to self-host, but we also provide support, security review, and custom integrations to help NGOs, governments, and other organizations [run ArchiveBox professionally](https:\u002F\u002Fzulip.archivebox.io\u002F#narrow\u002Fstream\u002F167-enterprise\u002Ftopic\u002Fwelcome\u002Fnear\u002F1191102):\n\n- **Journalists:**\n  `crawling during research`, `preserving cited pages`, `fact-checking & review`  \n- **Lawyers:**\n  `collecting & preserving evidence`, `detecting changes`, `tagging & review`  \n- **Researchers:**\n  `analyzing social media trends`, `getting LLM training data`, `crawling pipelines`\n- **Individuals:**\n  `saving bookmarks`, `preserving portfolio content`, `legacy \u002F memoirs archival`\n- **Governments:**\n  `snapshotting public service sites`, `recordkeeping compliance`\n\n> ***[Contact us](https:\u002F\u002Fzulip.archivebox.io\u002F#narrow\u002Fstream\u002F167-enterprise\u002Ftopic\u002Fwelcome\u002Fnear\u002F1191102)** if your org wants help using ArchiveBox professionally.*  \n> We offer: setup & support, CAPTCHA\u002Fratelimit unblocking, SSO, audit logging\u002Fchain-of-custody, and more  \n> *ArchiveBox is a 🏛️ 501(c)(3) [nonprofit FSP](https:\u002F\u002Fhackclub.com\u002Fhcb\u002F) and all our work supports open-source development.* \n\n\u003Cbr\u002F>\n\n\u003Cdiv align=\"center\" style=\"text-align: center\">\n\u003Cbr\u002F>\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F0db52ea7-4a2c-441d-b47f-5553a5d8fe96\" width=\"49%\" alt=\"grass\"\u002F>\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F0db52ea7-4a2c-441d-b47f-5553a5d8fe96\" width=\"49%\" alt=\"grass\"\u002F>\n\u003C\u002Fdiv>\n\n\u003Ca name=\"install\">\u003C\u002Fa>\n\n# Quickstart\n\n**🖥&nbsp; [Supported OSs](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FInstall#supported-systems):** Linux\u002FBSD, macOS, Windows (Docker) 
&nbsp; **👾&nbsp; CPUs:** `amd64` (`x86_64`), `arm64`, `arm7` \u003Csup>(raspi>=3)\u003C\u002Fsup>\u003Cbr\u002F>\n\n\u003Cbr\u002F>\n\n#### ✳️&nbsp; Easy Setup\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117447182-29758200-af0b-11eb-97bd-58723fee62ab.png\" alt=\"Docker\" height=\"28px\" align=\"top\"\u002F> \u003Ccode>docker-compose\u003C\u002Fcode>\u003C\u002Fb>  (macOS\u002FLinux\u002FWindows) &nbsp; \u003Cb>👈&nbsp; recommended\u003C\u002Fb> &nbsp; \u003Ci>(click to expand)\u003C\u002Fi>\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Ci>👍 Docker Compose is recommended for the easiest install\u002Fupdate UX + best security + all \u003Ca href=\"#dependencies\">extras\u003C\u002Fa> out-of-the-box.\u003C\u002Fi>\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003Col>\n\u003Cli>Install \u003Ca href=\"https:\u002F\u002Fdocs.docker.com\u002Fget-docker\u002F\">Docker\u003C\u002Fa> on your system (if not already installed).\u003C\u002Fli>\n\u003Cli>Download the \u003Ca href=\"https:\u002F\u002Fraw.githubusercontent.com\u002FArchiveBox\u002FArchiveBox\u002Fdev\u002Fdocker-compose.yml\" download>\u003Ccode>docker-compose.yml\u003C\u002Fcode>\u003C\u002Fa> file into a new empty directory (can be anywhere).\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">mkdir -p ~\u002Farchivebox\u002Fdata && cd ~\u002Farchivebox\n# Read and edit docker-compose.yml options as-needed after downloading\ncurl -fsSL 'https:\u002F\u002Fdocker-compose.archivebox.io' > docker-compose.yml\n\u003C\u002Fcode>\u003C\u002Fpre>\u003C\u002Fli>\n\u003Cli>Run the initial setup to create an admin user (or set ADMIN_USER\u002FPASS in docker-compose.yml)\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">docker compose run archivebox init --install\n\u003C\u002Fcode>\u003C\u002Fpre>\u003C\u002Fli>\n\u003Cli>Next steps: Start the server then login to the Web UI \u003Ca 
href=\"http:\u002F\u002F127.0.0.1:8000\">http:\u002F\u002F127.0.0.1:8000\u003C\u002Fa> ⇢ Admin.\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">docker compose up\n# completely optional, CLI can always be used without running a server\n# docker compose run [-T] archivebox [subcommand] [--help]\ndocker compose run archivebox add 'https:\u002F\u002Fexample.com'\ndocker compose run archivebox help\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ci>For more info, see \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FInstall#option-a-docker--docker-compose-setup-%EF%B8%8F\">Install: Docker Compose\u003C\u002Fa> in the Wiki. ➡️\u003C\u002Fi>\n\u003C\u002Fli>\n\u003C\u002Fol>\n\nSee \u003Ca href=\"#%EF%B8%8F-cli-usage\">below\u003C\u002Fa> for more usage examples using the CLI, Web UI, or \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#sql-shell-usage\">filesystem\u002FSQL\u002FPython\u003C\u002Fa> to manage your archive.\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117447182-29758200-af0b-11eb-97bd-58723fee62ab.png\" alt=\"Docker\" height=\"28px\" align=\"top\"\u002F> \u003Ccode>docker run\u003C\u002Fcode>\u003C\u002Fb>  (macOS\u002FLinux\u002FWindows)\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Col>\n\u003Cli>Install \u003Ca href=\"https:\u002F\u002Fdocs.docker.com\u002Fget-docker\u002F\">Docker\u003C\u002Fa> on your system (if not already installed).\u003C\u002Fli>\n\u003Cli>Create a new empty directory and initialize your collection (can be anywhere).\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">mkdir -p ~\u002Farchivebox\u002Fdata && cd ~\u002Farchivebox\u002Fdata\ndocker run -v $PWD:\u002Fdata -it archivebox\u002Farchivebox init --install\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003C\u002Fli>\n\u003Cli>Optional: 
Start the server then login to the Web UI \u003Ca href=\"http:\u002F\u002F127.0.0.1:8000\">http:\u002F\u002F127.0.0.1:8000\u003C\u002Fa> ⇢ Admin.\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">docker run -v $PWD:\u002Fdata -p 8000:8000 archivebox\u002Farchivebox\n# completely optional, CLI can always be used without running a server\n# docker run -v $PWD:\u002Fdata -it archivebox\u002Farchivebox [subcommand] [--help]\ndocker run -v $PWD:\u002Fdata -it archivebox\u002Farchivebox help\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ci>For more info, see \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FInstall#option-a-docker--docker-compose-setup-%EF%B8%8F\">Install: Docker Compose\u003C\u002Fa> in the Wiki. ➡️\u003C\u002Fi>\n\u003C\u002Fli>\n\u003C\u002Fol>\n\nSee \u003Ca href=\"#%EF%B8%8F-cli-usage\">below\u003C\u002Fa> for more usage examples using the CLI, Web UI, or filesystem\u002FSQL\u002FPython to manage your archive.\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117456282-08665e80-af16-11eb-91a1-8102eff54091.png\" alt=\"curl sh automatic setup script\" height=\"28px\" align=\"top\"\u002F> \u003Ccode>bash\u003C\u002Fcode> auto-setup script\u003C\u002Fb>  (macOS\u002FLinux)\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Col>\n\u003Cli>Install \u003Ca href=\"https:\u002F\u002Fdocs.docker.com\u002Fget-docker\u002F\">Docker\u003C\u002Fa> on your system (optional, highly recommended but not required).\u003C\u002Fli>\n\u003Cli>Run the automatic setup script.\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">curl -fsSL 'https:\u002F\u002Fget.archivebox.io' | bash\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ci>For more info, see \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FInstall#option-b-automatic-setup-script\">Install: Bare Metal\u003C\u002Fa> in 
the Wiki. ➡️\u003C\u002Fi>\n\u003C\u002Fli>\n\u003C\u002Fol>\n\nSee \u003Ca href=\"#%EF%B8%8F-cli-usage\">below\u003C\u002Fa> for more usage examples using the CLI, Web UI, or filesystem\u002FSQL\u002FPython to manage your archive.\u003Cbr\u002F>\nSee \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fblob\u002Fdev\u002Fbin\u002Fsetup.sh\">\u003Ccode>setup.sh\u003C\u002Fcode>\u003C\u002Fa> for the source code of the auto-install script.\u003Cbr\u002F>\nSee \u003Ca href=\"https:\u002F\u002Fdocs.sweeting.me\u002Fs\u002Fagainst-curl-sh\">\"Against curl | sh as an install method\"\u003C\u002Fa> blog post for my thoughts on the shortcomings of this install method.\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003C\u002Fdetails>\n\n\u003Cbr\u002F>\n\n#### 🛠&nbsp; Package Manager Setup\n\n\u003Ca name=\"Manual-Setup\">\u003C\u002Fa>\n\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117447613-ba4c5d80-af0b-11eb-8f89-1d98e31b6a79.png\" alt=\"Pip\" height=\"28px\" align=\"top\"\u002F> \u003Ccode>pip\u003C\u002Fcode>\u003C\u002Fb> (macOS\u002FLinux\u002FBSD)\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Col>\n\n\u003Cli>Install \u003Ca href=\"https:\u002F\u002Frealpython.com\u002Finstalling-python\u002F\">Python >= v3.13\u003C\u002Fa> and \u003Ca href=\"https:\u002F\u002Fnodejs.org\u002Fen\u002Fdownload\u002Fpackage-manager\u002F\">Node >= v22\u003C\u002Fa> on your system (if not already installed).\u003C\u002Fli>\n\u003Cli>Install the ArchiveBox package using \u003Ccode>pip3\u003C\u002Fcode> (or \u003Ca href=\"https:\u002F\u002Fdocs.astral.sh\u002Fuv\u002Fguides\u002Ftools\u002F#running-tools\">\u003Ccode>uvx\u003C\u002Fcode>\u003C\u002Fa>).\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">pip3 install --upgrade archivebox\narchivebox version\n# install any missing extras shown using apt\u002Fbrew\u002Fpkg\u002Fetc. 
see Wiki for instructions\n#    python@3.13 node curl wget git ripgrep ...\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ci>See the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FInstall\">Install: Bare Metal\u003C\u002Fa> Wiki for full install instructions for each OS...\u003C\u002Fi>\n\u003C\u002Fli>\n\u003Cli>Create a new empty directory and initialize your collection (can be anywhere).\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">mkdir -p ~\u002Farchivebox\u002Fdata && cd ~\u002Farchivebox\u002Fdata   # for example\narchivebox init --install   # initialize a new collection\n# (--install auto-installs and links JS dependencies: singlefile, readability, mercury, etc.)\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003C\u002Fli>\n\u003Cli>Optional: Start the server then login to the Web UI \u003Ca href=\"http:\u002F\u002F127.0.0.1:8000\">http:\u002F\u002F127.0.0.1:8000\u003C\u002Fa> ⇢ Admin.\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">archivebox server 0.0.0.0:8000\n# completely optional, CLI can always be used without running a server\n# archivebox [subcommand] [--help]\narchivebox help\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003C\u002Fli>\n\u003C\u002Fol>\n\nSee \u003Ca href=\"#%EF%B8%8F-cli-usage\">below\u003C\u002Fa> for more usage examples using the CLI, Web UI, or filesystem\u002FSQL\u002FPython to manage your archive.\u003Cbr\u002F>\n\u003Cbr\u002F>\n\u003Csub>See the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Fpip-archivebox\">\u003Ccode>pip-archivebox\u003C\u002Fcode>\u003C\u002Fa> repo for more details about this distribution.\u003C\u002Fsub>\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003C\u002Fdetails>\n\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117448075-49597580-af0c-11eb-91ba-f34fff10096b.png\" alt=\"aptitude\" height=\"28px\" align=\"top\"\u002F> 
\u003Ccode>apt\u003C\u002Fcode>\u003C\u002Fb> (Ubuntu\u002FDebian\u002Fetc.)\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Col>\n\u003Cli>Download and install the \u003Ccode>.deb\u003C\u002Fcode> package from the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Freleases\">latest release\u003C\u002Fa>.\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\"># download the .deb for your architecture (amd64 or arm64)\nARCH=\"$(dpkg --print-architecture)\"\nVERSION=\"$(curl -fsSL https:\u002F\u002Fapi.github.com\u002Frepos\u002FArchiveBox\u002FArchiveBox\u002Freleases\u002Flatest | python3 -c \"import sys,json; print(json.load(sys.stdin)['tag_name'].lstrip('v'))\")\"\ncurl -fsSL \"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Freleases\u002Flatest\u002Fdownload\u002Farchivebox_${VERSION}_${ARCH}.deb\" -o \u002Ftmp\u002Farchivebox.deb\nsudo apt install \u002Ftmp\u002Farchivebox.deb\narchivebox version                         # make sure all dependencies are installed\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003C\u002Fli>\n\u003Cli>Create a new empty directory and initialize your collection (can be anywhere).\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">mkdir -p ~\u002Farchivebox\u002Fdata && cd ~\u002Farchivebox\u002Fdata\narchivebox init --install\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Cbr\u002F>\n\u003C\u002Fli>\n\u003Cli>Optional: Start the server then login to the Web UI \u003Ca href=\"http:\u002F\u002F127.0.0.1:8000\">http:\u002F\u002F127.0.0.1:8000\u003C\u002Fa> ⇢ Admin.\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">archivebox server 0.0.0.0:8000\n# completely optional, CLI can always be used without running a server\n# archivebox [subcommand] [--help]\narchivebox help\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003C\u002Fli>\n\u003C\u002Fol>\nSee \u003Ca href=\"#%EF%B8%8F-cli-usage\">below\u003C\u002Fa> for more usage examples using the CLI, Web UI, or 
filesystem\u002FSQL\u002FPython to manage your archive.\u003Cbr\u002F>\n\u003Csub>See the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Fdebian-archivebox\">\u003Ccode>debian-archivebox\u003C\u002Fcode>\u003C\u002Fa> repo for more details about this distribution.\u003C\u002Fsub>\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117447803-f2ec3700-af0b-11eb-87d3-671d114f011d.png\" alt=\"homebrew\" height=\"28px\" align=\"top\"\u002F> \u003Ccode>brew\u003C\u002Fcode>\u003C\u002Fb> (macOS only)\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Col>\n\u003Cli>Install \u003Ca href=\"https:\u002F\u002Fbrew.sh\u002F#install\">Homebrew\u003C\u002Fa> on your system (if not already installed).\u003C\u002Fli>\n\u003Cli>Install the ArchiveBox package using \u003Ccode>brew\u003C\u002Fcode>.\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">brew tap archivebox\u002Farchivebox\nbrew install archivebox\narchivebox version                         # make sure all dependencies are installed\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ci>See the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FInstall#option-c-bare-metal-setup\">Install: Bare Metal\u003C\u002Fa> Wiki for more granular instructions for macOS... 
➡️\u003C\u002Fi>\n\u003C\u002Fli>\n\u003Cli>Create a new empty directory and initialize your collection (can be anywhere).\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">mkdir -p ~\u002Farchivebox\u002Fdata && cd ~\u002Farchivebox\u002Fdata\narchivebox init --install\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003C\u002Fli>\n\u003Cli>Optional: Start the server then login to the Web UI \u003Ca href=\"http:\u002F\u002F127.0.0.1:8000\">http:\u002F\u002F127.0.0.1:8000\u003C\u002Fa> ⇢ Admin.\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">archivebox server 0.0.0.0:8000\n# completely optional, CLI can always be used without running a server\n# archivebox [subcommand] [--help]\narchivebox help\n\u003C\u002Fcode>\u003C\u002Fpre>\u003Cbr\u002F>\n\u003C\u002Fli>\n\u003C\u002Fol>\n\nSee \u003Ca href=\"#%EF%B8%8F-cli-usage\">below\u003C\u002Fa> for more usage examples using the CLI, Web UI, or filesystem\u002FSQL\u002FPython to manage your archive.\u003Cbr\u002F>\n\u003Csub>See the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Fhomebrew-archivebox\">\u003Ccode>homebrew-archivebox\u003C\u002Fcode>\u003C\u002Fa> repo for more details about this distribution.\u003C\u002Fsub>\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F118077361-f0616580-b381-11eb-973c-ee894a3349fb.png\" alt=\"Arch\" height=\"28px\" align=\"top\"\u002F> \u003Ccode>pacman\u003C\u002Fcode> \u002F \u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F118077946-29e6a080-b383-11eb-94f0-d4871da08c3f.png\" alt=\"FreeBSD\" height=\"28px\" align=\"top\"\u002F> \u003Ccode>pkg\u003C\u002Fcode> \u002F \u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F118077861-002d7980-b383-11eb-86a7-5936fad9190f.png\" alt=\"Nix\" height=\"28px\" align=\"top\"\u002F> \u003Ccode>nix\u003C\u002Fcode> 
(Arch\u002FFreeBSD\u002FNixOS\u002Fmore)\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\n> *Warning: These are contributed by external volunteers and may lag behind the official `pip` channel.*\n\n\u003Cul>\n\u003Cli>Arch: \u003Ca href=\"https:\u002F\u002Faur.archlinux.org\u002Fpackages\u002Farchivebox\u002F\">\u003Ccode>yay -S archivebox\u003C\u002Fcode>\u003C\u002Fa> (contributed by \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fimlonghao\">\u003Ccode>@imlonghao\u003C\u002Fcode>\u003C\u002Fa>, maintained by \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fjasongodev\">\u003Ccode>@jasongodev\u003C\u002Fcode>\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cli>FreeBSD: \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox#%EF%B8%8F-easy-setup\">\u003Ccode>curl -fsSL 'https:\u002F\u002Fget.archivebox.io' | bash\u003C\u002Fcode>\u003C\u002Fa> (uses \u003Ccode>pkg\u003C\u002Fcode> + \u003Ccode>pip3\u003C\u002Fcode> under-the-hood)\u003C\u002Fli>\n\u003Cli>Nix: \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FNixOS\u002Fnixpkgs\u002Fblob\u002Fmaster\u002Fpkgs\u002Fapplications\u002Fmisc\u002Farchivebox\u002Fdefault.nix\">\u003Ccode>nix-env --install archivebox\u003C\u002Fcode>\u003C\u002Fa> (contributed by \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fsiraben\">\u003Ccode>@siraben\u003C\u002Fcode>\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cli>Guix: \u003Ca href=\"https:\u002F\u002Fpackages.guix.gnu.org\u002Fpackages\u002Farchivebox\u002F\">\u003Ccode>guix install archivebox\u003C\u002Fcode>\u003C\u002Fa> (contributed by \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Frakino\">\u003Ccode>@rakino\u003C\u002Fcode>\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cli>More: \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002Fnew\">\u003Ci>contribute another distribution...!\u003C\u002Fi>\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\nSee \u003Ca href=\"#%EF%B8%8F-cli-usage\">below\u003C\u002Fa> for usage examples using the CLI, Web UI, or 
filesystem\u002FSQL\u002FPython to manage your archive.\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003C\u002Fdetails>\n\n\u003Cbr\u002F>\n\n#### 🎗&nbsp; Other Options\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117447182-29758200-af0b-11eb-97bd-58723fee62ab.png\" alt=\"Docker\" height=\"28px\" align=\"top\"\u002F> \u003Ccode>docker\u003C\u002Fcode> + \u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117447263-4316c980-af0b-11eb-928d-eaf1292ac646.png\" alt=\"Electron\" height=\"28px\" align=\"top\"\u002F> \u003Ccode>electron\u003C\u002Fcode> Desktop App\u003C\u002Fb> (macOS\u002FLinux\u002FWindows)\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Col>\n\u003Cli>Install \u003Ca href=\"https:\u002F\u002Fdocs.docker.com\u002Fget-docker\u002F\">Docker\u003C\u002Fa> on your system (if not already installed).\u003C\u002Fli>\n\u003Cli>Download a binary release for your OS or build the native app from source\u003Cbr\u002F>\n\u003Cul>\n\u003Cli>macOS: \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Freleases\u002Fdownload\u002Fv0.6.2\u002FElectron-ArchiveBox-macOS-x64-0.6.2.app.zip\" download>\u003Ccode>ArchiveBox.app.zip\u003C\u002Fcode>\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Linux: \u003Ccode>ArchiveBox.deb\u003C\u002Fcode> (alpha: \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Felectron-archivebox#quickstart\">build manually\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cli>Windows: \u003Ccode>ArchiveBox.exe\u003C\u002Fcode> (beta: \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Felectron-archivebox#quickstart\">build manually\u003C\u002Fa>)\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003C\u002Fol>\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F575ef92f-bb3e-4a7c-a4ba-986c1fd76ecf\" width=\"320px\">\n\u003Cbr\u002F>\n\u003Ci>✨ Alpha (contributors 
wanted!)\u003C\u002Fi>: for more info, see the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Felectron-archivebox\">Electron ArchiveBox\u003C\u002Fa> repo.\n\u003Cbr\u002F>\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F0c46e949-00fe-49c8-a613-ee14501c014c\" alt=\"Self-hosting Platforms\" height=\"28px\" align=\"top\"\u002F>\u003Cb> TrueNAS \u002F UNRAID \u002F YunoHost \u002F Cloudron \u002F etc.\u003C\u002Fb> (self-hosting solutions)\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\n> *Warning: These are contributed by external volunteers and may lag behind the official `pip` channel.*\n\n\u003Cul>\n\u003Cli>\u003Cs>TrueNAS: \u003Ca href=\"https:\u002F\u002Ftruecharts.org\u002Fcharts\u002Fstable\u002Farchivebox\u002F\">Official ArchiveBox TrueChart\u003C\u002Fa> \u002F \u003Ca href=\"https:\u002F\u002Fdev.to\u002Ffinloop\u002Fsetting-up-archivebox-on-truenas-scale-1788\">Custom App Guide\u003C\u002Fa>\u003C\u002Fs> (\u003Ca href=\"https:\u002F\u002Ftruecharts.org\u002Fnews\u002Fscale-deprecation\u002F\">TrueCharts is discontinued\u003C\u002Fa>, wait for \u003Ca href=\"https:\u002F\u002Fforums.truenas.com\u002Ft\u002Fthe-future-of-electric-eel-and-apps\u002F5409\u002F\">Electric Eel\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Funraid.net\u002Fcommunity\u002Fapps?q=archivebox#r\">UnRaid\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fcommunity-scripts.github.io\u002FProxmoxVE\u002Fscripts?id=archivebox\">Proxmox\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FYunoHost-Apps\u002Farchivebox_ynh\">YunoHost\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fwww.cloudron.io\u002Fstore\u002Fio.archivebox.cloudronapp.html\">Cloudron\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca 
href=\"https:\u002F\u002Fdocs.saltbox.dev\u002Fsandbox\u002Fapps\u002Farchivebox\u002F\">Saltbox\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fportainer-templates.as93.net\u002Farchivebox\">Portainer\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fpull\u002F922\u002Ffiles#diff-00f0606e18b2618c3cc1667ca7c2b703b537af690ca71eba1330633587dcb1ee\">AppImage\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fruntipi.io\u002Fdocs\u002Fapps-available#:~:text=for%20AI%20Chats.-,ArchiveBox,Open%20source%20self%2Dhosted%20web%20archiving.,-Atuin%20Server\">Runtipi\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F986\">Umbrel\u003C\u002Fa> (need contributors...)\u003C\u002Fli>\n\n\u003Cli>More: \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002Fnew\">\u003Ci>contribute another distribution...!\u003C\u002Fi>\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\nSee \u003Ca href=\"#%EF%B8%8F-cli-usage\">below\u003C\u002Fa> for usage examples using the CLI, Web UI, or filesystem\u002FSQL\u002FPython to manage your archive.\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117448723-1663b180-af0d-11eb-837f-d43959227810.png\" alt=\"paid\" height=\"27px\" align=\"top\"\u002F> Paid hosting solutions (cloud VPS)\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Cul>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fzulip.archivebox.io\u002F#narrow\u002Fstream\u002F167-enterprise\u002Ftopic\u002Fwelcome\u002Fnear\u002F1191102\">\n \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCustom_Development-ArchiveBox.io-%231a1a1a.svg?style=flat\" height=\"22px\"\u002F>\n\u003C\u002Fa> (\u003Ca 
href=\"https:\u002F\u002Fzulip.archivebox.io\u002F#narrow\u002Fstream\u002F167-enterprise\u002Ftopic\u002Fwelcome\u002Fnear\u002F1191102\">get hosting, support, and feature customization directly from us\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fmonadical.com\">\n \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGeneral_Dev_Consulting-Monadical.com-%231a1a1a.svg?style=flat\" height=\"22px\"\u002F>\n\u003C\u002Fa> (\u003Ca href=\"https:\u002F\u002Fmonadical.com\u002Fcontact-us.html\">generalist consultancy that has ArchiveBox experience\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cbr\u002F>\nOther providers of paid ArchiveBox hosting (not officially endorsed):\u003Cbr\u002F>\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Felest.io\u002Fopen-source\u002Farchivebox\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FManaged_Hosting-Elest.io-%23193f7e.svg?style=flat\" height=\"22px\"\u002F>\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fwww.stellarhosted.com\u002Farchivebox\u002F\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSemi_Managed_Hosting-StellarHosted.com-%23193f7e.svg?style=flat\" height=\"22px\"\u002F>\u003C\u002Fa> (USD $29-250\u002Fmo, \u003Ca href=\"https:\u002F\u002Fwww.stellarhosted.com\u002Farchivebox\u002F#pricing\">pricing\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fwww.pikapods.com\u002Fpods?run=archivebox\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSemi_Managed_Hosting-PikaPods.com-%2343a047.svg?style=flat\" height=\"22px\"\u002F>\u003C\u002Fa> (from USD $2.6\u002Fmo)\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fm.do.co\u002Fc\u002Fcbc4c0c17840\">\n \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FUnmanaged_VPS-DigitalOcean.com-%232f7cf7.svg?style=flat\" height=\"22px\"\u002F>\n\u003C\u002Fa> (USD $5-50+\u002Fmo, \u003Ca 
href=\"https:\u002F\u002Fm.do.co\u002Fc\u002Fcbc4c0c17840\">🎗&nbsp; referral link\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fwww.digitalocean.com\u002Fcommunity\u002Ftutorials\u002Fhow-to-install-and-use-docker-compose-on-ubuntu-20-04\">instructions\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fwww.vultr.com\u002F?ref=7130289\">\n \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FUnmanaged_VPS-Vultr.com-%232337a8.svg?style=flat\" height=\"22px\"\u002F>\n\u003C\u002Fa> (USD $2.5-50+\u002Fmo, \u003Ca href=\"https:\u002F\u002Fwww.vultr.com\u002F?ref=7130289\">🎗&nbsp; referral link\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fwww.vultr.com\u002Fdocs\u002Finstall-docker-compose-on-ubuntu-20-04\">instructions\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Ffly.io\u002F\">\n \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FUnmanaged_App-Fly.io-%239a2de6.svg?style=flat\" height=\"22px\"\u002F>\n\u003C\u002Fa> (USD $10-50+\u002Fmo, \u003Ca href=\"https:\u002F\u002Ffly.io\u002Fdocs\u002Fhands-on\u002Fstart\u002F\">instructions\u003C\u002Fa>)\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Frailway.app\u002Ftemplate\u002F2Vvhmy\">\n \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FUnmanaged_App-Railway-%23A11BE6.svg?style=flat\" height=\"22px\"\u002F>\n\u003C\u002Fa> (USD $0-5+\u002Fmo)\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Faws.amazon.com\u002Fmarketplace\u002Fpp\u002FLinnovate-Open-Source-Innovation-Support-For-Archi\u002FB08RVW6MJ2\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FUnmanaged_VPS-AWS-%23ee8135.svg?style=flat\" height=\"22px\"\u002F>\u003C\u002Fa> (USD $60-200+\u002Fmo)\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fazuremarketplace.microsoft.com\u002Fen-us\u002Fmarketplace\u002Fapps\u002Fmeanio.archivebox?ocid=gtmrewards_whatsnewblog_archivebox_vol118\">\u003Cimg 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FUnmanaged_VPS-Azure-%237cb300.svg?style=flat\" height=\"22px\"\u002F>\u003C\u002Fa> (USD $60-200+\u002Fmo)\u003C\u002Fli>\n\u003Cbr\u002F>\n\u003Csub>\u003Ci>Referral links marked 🎗 provide $5-10 of free credit for new users and help pay for our \u003Ca href=\"https:\u002F\u002Fdemo.archivebox.io\">demo server\u003C\u002Fa> hosting costs.\u003C\u002Fi>\u003C\u002Fsub>\n\u003C\u002Ful>\n\nFor more discussion on managed and paid hosting options see here: \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F531\">Issue #531\u003C\u002Fa>.\n\n\u003C\u002Fdetails>\n\n\u003Cbr\u002F>\n\n#### ➡️&nbsp; Next Steps\n\n- Import URLs from some of the supported [Input Formats](#input-formats) or view the supported [Output Formats](#output-formats)...\n- (Optional) Create a persona and import browser cookies to archive logged-in sites: `archivebox persona create --import=chrome personal`\n- Tweak your UI or archiving behavior [Configuration](#configuration), read about some of the [Caveats](#caveats), or [Troubleshoot](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FTroubleshooting)\n- Read about the [Dependencies](#dependencies) used for archiving, the [Upgrading Process](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUpgrading-or-Merging-Archives), or the [Archive Layout](#archive-layout) on disk...\n- Or check out our full [Documentation](#documentation) or [Community Wiki](#internet-archiving-ecosystem)...\n\n\u003Cbr\u002F>\n\n### Usage\n\n#### ⚡️&nbsp; \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#cli-usage\">CLI Usage\u003C\u002Fa>\n\nArchiveBox commands can be run in a terminal [directly on your host](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#cli-usage), or via 
[Docker](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FDocker#usage-1)\u002F[Docker Compose](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FDocker#usage).  \n\u003Csup>(depending on how you chose to install it above)\u003C\u002Fsup>\n\n```bash\nmkdir -p ~\u002Farchivebox\u002Fdata   # create a new data dir anywhere\ncd ~\u002Farchivebox\u002Fdata         # IMPORTANT: cd into the directory\n\n# archivebox [subcommand] [--help]\narchivebox version\narchivebox help\n\n# equivalent: docker compose run archivebox [subcommand] [--help]\ndocker compose run archivebox help\n\n# equivalent: docker run -it -v $PWD:\u002Fdata archivebox\u002Farchivebox [subcommand] [--help]\ndocker run -it -v $PWD:\u002Fdata archivebox\u002Farchivebox help\n\n# optional: import your browser cookies into a persona for logged-in archiving\narchivebox persona create --import=chrome personal\n# supported: chrome\u002Fchromium\u002Fbrave\u002Fedge (Chromium-based only)\n# use --profile to target a specific profile (e.g. 
Default, Profile 1)\n# re-running import merges\u002Fdedupes cookies.txt (by domain\u002Fpath\u002Fname) but replaces chrome_user_data\n```\n\n#### ArchiveBox Subcommands\n\n- `archivebox` `help`\u002F`version` to see the list of available subcommands \u002F currently installed version info\n- `archivebox` `setup`\u002F`init`\u002F`config`\u002F`status`\u002F`shell`\u002F`manage` to administer your collection\n- `archivebox` `add`\u002F`schedule` to pull in fresh URLs from [bookmarks\u002Fhistory\u002FRSS\u002Fetc.](#input-formats)\n- `archivebox` `list`\u002F`update`\u002F`remove` to manage existing Snapshots in your collection\n\n\u003Cbr\u002F>\n\u003Cdetails>\n\u003Csummary>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117456282-08665e80-af16-11eb-91a1-8102eff54091.png\" alt=\"curl sh automatic setup script\" height=\"22px\" align=\"top\"\u002F> \u003Cb>CLI Usage Examples: non-Docker\u003C\u002Fb>\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">\n# make sure you have pip-installed ArchiveBox and it's available in your $PATH first  \n\u003Cbr\u002F>\n# archivebox [subcommand] [--help]\narchivebox init --install      # safe to run init multiple times (also how you update versions)\narchivebox version           # get archivebox version info + check dependencies\narchivebox help              # get list of archivebox subcommands that can be run\narchivebox add --depth=1 'https:\u002F\u002Fnews.ycombinator.com'\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ci>For more info, see our \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#cli-usage\">Usage: CLI Usage\u003C\u002Fa> wiki. 
➡️\u003C\u002Fi>\n\u003C\u002Fdetails>\n\n\u003Cbr\u002F>\n\n\u003Cdetails>\n\u003Csummary>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117447182-29758200-af0b-11eb-97bd-58723fee62ab.png\" alt=\"Docker\" height=\"22px\" align=\"top\"\u002F> \u003Cb>CLI Usage Examples: Docker Compose\u003C\u002Fb>\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">\n# make sure you have `docker-compose.yml` from the Quickstart instructions first\n\u003Cbr\u002F>\n# docker compose run archivebox [subcommand] [--help]\ndocker compose run archivebox init --install\ndocker compose run archivebox version\ndocker compose run archivebox help\ndocker compose run archivebox add --depth=1 'https:\u002F\u002Fnews.ycombinator.com'\n# to start webserver: docker compose up\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ci>For more info, see our \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FDocker#usage\">Usage: Docker Compose CLI\u003C\u002Fa> wiki. 
➡️\u003C\u002Fi>\n\u003C\u002Fdetails>\n\n\u003Cbr\u002F>\n\n\u003Cdetails>\n\u003Csummary>\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117447182-29758200-af0b-11eb-97bd-58723fee62ab.png\" alt=\"Docker\" height=\"22px\" align=\"top\"\u002F> \u003Cb>CLI Usage Examples: Docker\u003C\u002Fb>\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">\n# make sure you create and cd into a new empty directory first  \n\u003Cbr\u002F>\n# docker run -it -v $PWD:\u002Fdata archivebox\u002Farchivebox [subcommand] [--help]\ndocker run -v $PWD:\u002Fdata -it archivebox\u002Farchivebox init --install\ndocker run -v $PWD:\u002Fdata -it archivebox\u002Farchivebox version\ndocker run -v $PWD:\u002Fdata -it archivebox\u002Farchivebox help\ndocker run -v $PWD:\u002Fdata -it archivebox\u002Farchivebox add --depth=1 'https:\u002F\u002Fnews.ycombinator.com'\n# to start webserver: docker run -v $PWD:\u002Fdata -it -p 8000:8000 archivebox\u002Farchivebox\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ci>For more info, see our \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FDocker#usage-1\">Usage: Docker CLI\u003C\u002Fa> wiki. 
➡️\u003C\u002Fi>\n\u003C\u002Fdetails>\n\n\u003Cbr\u002F>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>🗄&nbsp; SQL\u002FPython\u002FFilesystem Usage\u003C\u002Fb>\u003C\u002Fsummary>\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">\narchivebox shell           # explore the Python library API in a REPL\nsqlite3 .\u002Findex.sqlite3    # run SQL queries directly on your index\nls .\u002Farchive\u002F*\u002Findex.html  # or inspect snapshot data directly on the filesystem\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ci>For more info, see our \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#python-shell-usage\">Python Shell\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#sql-shell-usage\">SQL API\u003C\u002Fa>, and \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox#archive-layout\">Disk Layout\u003C\u002Fa> wikis. ➡️\u003C\u002Fi>\n\u003C\u002Fdetails>\n\n\n\u003Cbr\u002F>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>🖥&nbsp; Web UI & API Usage\u003C\u002Fb>\u003C\u002Fsummary>\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">\n# Start the server on bare metal (pip\u002Fapt\u002Fbrew\u002Fetc):\narchivebox manage createsuperuser              # create a new admin user via CLI\narchivebox server 0.0.0.0:8000                 # start the server\n\u003Cbr\u002F>\n# Or with Docker Compose:\nnano docker-compose.yml                        # setup initial ADMIN_USERNAME & ADMIN_PASSWORD\ndocker compose up                              # start the server\n\u003Cbr\u002F>\n# Or with a Docker container:\ndocker run -v $PWD:\u002Fdata -it archivebox\u002Farchivebox archivebox manage createsuperuser\ndocker run -v $PWD:\u002Fdata -it -p 8000:8000 archivebox\u002Farchivebox\n\u003C\u002Fcode>\u003C\u002Fpre>\n\n\u003Csup>Open \u003Ca 
href=\"http:\u002F\u002Fweb.archivebox.localhost:8000\">\u003Ccode>http:\u002F\u002Fweb.archivebox.localhost:8000\u003C\u002Fcode>\u003C\u002Fa> for the public UI and \u003Ca href=\"http:\u002F\u002Fadmin.archivebox.localhost:8000\">\u003Ccode>http:\u002F\u002Fadmin.archivebox.localhost:8000\u003C\u002Fcode>\u003C\u002Fa> for the admin UI ➡️\u003C\u002Fsup>\u003Cbr\u002F>\n\u003Csup>Set \u003Ccode>LISTEN_HOST\u003C\u002Fcode> to change the base domain; \u003Ccode>web.\u003C\u002Fcode> and \u003Ccode>admin.\u003C\u002Fcode> subdomains are used automatically.\u003C\u002Fsup>\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003Ci>For more info, see our \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#ui-usage\">Usage: Web UI\u003C\u002Fa> wiki. ➡️\u003C\u002Fi>\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003Cb>Optional: Change permissions to allow non-logged-in users\u003C\u002Fb>\n\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">\narchivebox config --set PUBLIC_ADD_VIEW=True   # allow guests to submit URLs \narchivebox config --set PUBLIC_SNAPSHOTS=True  # allow guests to see snapshot content\narchivebox config --set PUBLIC_INDEX=True      # allow guests to see list of all snapshots\n# or\ndocker compose run archivebox config --set ...\n\n# restart the server to apply any config changes\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003C\u002Fdetails>\n\n\u003Cbr\u002F>\n\u003Cbr\u002F>\n\n> [!TIP]\n> Whether in Docker or not, ArchiveBox commands work the same way, and can be used to access the same data on-disk.\n> For example, you could run the Web UI in Docker Compose, and run one-off commands with `pip`-installed ArchiveBox.\n\n\u003Cdetails>\n\u003Csummary>\u003Ci>Expand to show comparison...\u003C\u002Fi>\u003C\u002Fsummary>\u003Cbr\u002F>\n\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">\narchivebox add --depth=1 'https:\u002F\u002Fexample.com'                     # add a URL with pip-installed 
archivebox on the host\ndocker compose run archivebox add --depth=1 'https:\u002F\u002Fexample.com'                       # or w\u002F Docker Compose\ndocker run -it -v $PWD:\u002Fdata archivebox\u002Farchivebox add --depth=1 'https:\u002F\u002Fexample.com'  # or w\u002F Docker, all equivalent\n\u003C\u002Fcode>\u003C\u002Fpre>\n\n\u003Ci>For more info, see our \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FDocker\">Docker\u003C\u002Fa> wiki. ➡️\u003C\u002Fi>\n\n\u003C\u002Fdetails>\n\n\n\u003Cbr\u002F>\n\u003Cdiv align=\"center\" style=\"text-align: center\">\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F65f82532-18dd-49c5-86f1-02b1f3100e1e\" width=\"49%\" alt=\"grass\"\u002F>\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F65f82532-18dd-49c5-86f1-02b1f3100e1e\" width=\"49%\" alt=\"grass\"\u002F>\n\u003C\u002Fdiv>\n\u003Cbr\u002F>\n\n\u003Cdiv align=\"center\" style=\"text-align: center\">\n\u003Csub>. . . . . . . . . . . . . . . . . . . . . . . . . . . 
.\u003C\u002Fsub>\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003Ca href=\"https:\u002F\u002Fdemo.archivebox.io\">DEMO: \u003Ccode>https:\u002F\u002Fdemo.archivebox.io\u003C\u002Fcode>\u003C\u002Fa>\u003Cbr\u002F>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage\">Usage\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FConfiguration\">Configuration\u003C\u002Fa> | \u003Ca href=\"#Caveats\">Caveats\u003C\u002Fa>\n\u003Cbr\u002F>\n\u003C\u002Fdiv>\n\n\u003Cbr\u002F>\n\n---\n\n\u003Cdiv align=\"center\" style=\"text-align: center\">\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002Fac1f897a-8baa-4f8b-8ee8-7443611f258b\" width=\"96%\" alt=\"lego\"\u002F>\n\u003C\u002Fdiv>\n\n\u003Cbr\u002F>\n\n# Overview\n\n\u003Ca name=\"input-formats\">\u003C\u002Fa>\n\n##  Input Formats: How to pass URLs into ArchiveBox for saving\n\n\n- \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002Fff20d251-5347-4b85-ae9b-83037d0ac01e\" height=\"28px\"\u002F> \u003Cb>From the official \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Farchivebox-extension\">ArchiveBox Browser Extension\u003C\u002Fa>\u003C\u002Fb>  \n  \u003Ci>Provides realtime archiving of browsing history or selected pages from Chrome\u002FChromium\u002FFirefox browsers.\u003C\u002Fi>\n\n- \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F64078483-21d7-4eb1-aa6e-9ad55afe45b8\" height=\"22px\"\u002F> From manual imports of URLs from RSS, JSON, CSV, TXT, SQL, HTML, Markdown, etc. 
files  \n  \u003Ci>ArchiveBox supports ingesting URLs in [any text-based format](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#Import-a-list-of-URLs-from-a-text-file).\u003C\u002Fi>\n\n- \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F32b494e6-4de1-4984-8d88-dc02f18e5c34\" height=\"22px\"\u002F> From manually exported [browser history](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FQuickstart#2-get-your-list-of-urls-to-archive) or [browser bookmarks](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FQuickstart#2-get-your-list-of-urls-to-archive) (in Netscape format)  \n  \u003Ci>Instructions: \u003Ca href=\"https:\u002F\u002Fsupport.google.com\u002Fchrome\u002Fanswer\u002F96816?hl=en\">Chrome\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fsupport.mozilla.org\u002Fen-US\u002Fkb\u002Fexport-firefox-bookmarks-to-backup-or-transfer\">Firefox\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F24ad068e-0fa6-41f4-a7ff-4c26fc91f71a\">Safari\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fsupport.microsoft.com\u002Fen-us\u002Fhelp\u002F211089\u002Fhow-to-import-and-export-the-internet-explorer-favorites-folder-to-a-32-bit-version-of-windows\">IE\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fhelp.opera.com\u002Fen\u002Flatest\u002Ffeatures\u002F#bookmarks:~:text=Click%20the%20import\u002F-,export%20button,-on%20the%20bottom\">Opera\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FQuickstart#2-get-your-list-of-urls-to-archive\">and more...\u003C\u002Fa>\u003C\u002Fi>\n\n- \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F4f7bd318-265c-4235-ad25-38be89946b12\" height=\"22px\"\u002F> From URLs visited through a [MITM 
Proxy](https:\u002F\u002Fmitmproxy.org\u002F) with [`archivebox-proxy`](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Farchivebox-proxy)  \n  \u003Ci>Provides [realtime archiving](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F577) of all traffic from any device going through the proxy.\u003C\u002Fi>\n\n- \u003Cimg src=\"https:\u002F\u002Fgetpocket.com\u002Ffavicon.ico\" height=\"22px\"\u002F> From bookmarking services or social media (e.g. Twitter bookmarks, Reddit saved posts, etc.)  \n  \u003Ci>Instructions: \u003Ca href=\"https:\u002F\u002Fgetpocket.com\u002Fexport\">Pocket\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fpinboard.in\u002Fexport\u002F\">Pinboard\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fwww.instapaper.com\u002Fuser\">Instapaper\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fshaarli.readthedocs.io\u002Fen\u002Fmaster\u002FUsage\u002F#importexport\">Shaarli\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fwww.groovypost.com\u002Fhowto\u002Fhowto\u002Fexport-delicious-bookmarks-xml\u002F\">Delicious\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fcsu\u002Fexport-saved-reddit\">Reddit Saved\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fdoc.wallabag.org\u002Fen\u002Fuser\u002Fimport\u002Fwallabagv2.html\">Wallabag\u003C\u002Fa>, \u003Ca href=\"http:\u002F\u002Fhelp.unmark.it\u002Fimport-export\">Unmark.it\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fwww.addictivetips.com\u002Fweb\u002Fonetab-save-close-all-chrome-tabs-to-restore-export-or-import\u002F\">OneTab\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F648\">Firefox Sync\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FQuickstart#2-get-your-list-of-urls-to-archive\">and more...\u003C\u002Fa>\u003C\u002Fi>\n\n\n\u003Cimg 
src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002Fe1e5bd78-b0b6-45dc-914c-e1046fee4bc4\" width=\"330px\" align=\"right\" style=\"float: right\"\u002F>\n\n\n```bash\n# archivebox add --help\narchivebox add 'https:\u002F\u002Fexample.com\u002Fsome\u002Fpage'\narchivebox add --parser=generic_rss \u003C ~\u002FDownloads\u002Fsome_feed.xml\narchivebox add --depth=1 'https:\u002F\u002Fnews.ycombinator.com#2020-12-12'\necho 'http:\u002F\u002Fexample.com' | archivebox add\necho 'any text with \u003Ca href=\"https:\u002F\u002Fexample.com\">urls\u003C\u002Fa> in it' | archivebox add\n\n# if using Docker, add -i when piping stdin:\n# echo 'https:\u002F\u002Fexample.com' | docker run -v $PWD:\u002Fdata -i archivebox\u002Farchivebox add\n# if using Docker Compose, add -T when piping stdin \u002F stdout:\n# echo 'https:\u002F\u002Fexample.com' | docker compose run -T archivebox add\n```\n\nSee the [Usage: CLI](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#CLI-Usage) page for documentation and examples.\n\nIt also includes a built-in scheduled import feature with `archivebox schedule`, handled by the same orchestrator that powers `archivebox server`, so you can pull in URLs from RSS feeds and websites regularly without a separate cron container.\n\n\u003Cbr\u002F>\n\n\n\u003Ca name=\"output-formats\">\u003C\u002Fa>\n\n## Output Formats: What ArchiveBox saves for each URL\n\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002Face0954a-ddac-4520-9d18-1c77b1ec50b2\" width=\"330px\" align=\"right\" style=\"float: right\"\u002F>\n\n\nFor each web page added, ArchiveBox creates a Snapshot folder and preserves its content as ordinary files inside the folder (e.g. 
HTML, PDF, PNG, JSON, etc.).\n\nIt uses all available methods out-of-the-box, but you can disable extractors and fine-tune the [configuration](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FConfiguration) as-needed.\n\n\u003Cbr\u002F>\n\u003Cdetails>\n\u003Csummary>\u003Ci>Expand to see the full list of ways it saves each page...\u003C\u002Fi>\u003C\u002Fsummary>\n\n\n\u003Ccode>data\u002Farchive\u002F{Snapshot.id}\u002F\u003C\u002Fcode>\u003Cbr\u002F>\n\u003Cul>\n\u003Cli>\u003Cstrong>Index:\u003C\u002Fstrong> \u003Ccode>index.html\u003C\u002Fcode> &amp; \u003Ccode>index.json\u003C\u002Fcode> HTML and JSON index files containing metadata and details\u003C\u002Fli>\n\u003Cli>\u003Cstrong>Title\u003C\u002Fstrong>, \u003Cstrong>Favicon\u003C\u002Fstrong>, \u003Cstrong>Headers\u003C\u002Fstrong> Response headers, site favicon, and parsed site title\u003C\u002Fli>\n\u003Cli>\u003Cstrong>SingleFile:\u003C\u002Fstrong> \u003Ccode>singlefile.html\u003C\u002Fcode> HTML snapshot rendered with headless Chrome using SingleFile\u003C\u002Fli>\n\u003Cli>\u003Cstrong>Wget Clone:\u003C\u002Fstrong> \u003Ccode>example.com\u002Fpage-name.html\u003C\u002Fcode> wget clone of the site with  \u003Ccode>warc\u002FTIMESTAMP.gz\u003C\u002Fcode>\u003C\u002Fli>\n\u003Cli>Chrome Headless \u003Cul>\n\u003Cli>\u003Cstrong>PDF:\u003C\u002Fstrong> \u003Ccode>output.pdf\u003C\u002Fcode> Printed PDF of site using headless chrome\u003C\u002Fli>\n\u003Cli>\u003Cstrong>Screenshot:\u003C\u002Fstrong> \u003Ccode>screenshot.png\u003C\u002Fcode> 1440x900 screenshot of site using headless chrome\u003C\u002Fli>\n\u003Cli>\u003Cstrong>DOM Dump:\u003C\u002Fstrong> \u003Ccode>output.html\u003C\u002Fcode> DOM Dump of the HTML after rendering using headless chrome\u003C\u002Fli>\n\u003C\u002Ful>\u003C\u002Fli>\n\u003Cli>\u003Cstrong>Article Text:\u003C\u002Fstrong> \u003Ccode>article.html\u002Fjson\u003C\u002Fcode> Article text extraction using Readability &amp; 
Mercury\u003C\u002Fli>\n\u003Cli>\u003Cstrong>Archive.org Permalink:\u003C\u002Fstrong> \u003Ccode>archive.org.txt\u003C\u002Fcode> A link to the saved site on archive.org\u003C\u002Fli>\n\u003Cli>\u003Cstrong>Audio &amp; Video:\u003C\u002Fstrong> \u003Ccode>media\u002F\u003C\u002Fcode> all audio\u002Fvideo files + playlists, including subtitles &amp; metadata w\u002F \u003Ccode>yt-dlp\u003C\u002Fcode>\u003C\u002Fli>\n\u003Cli>\u003Cstrong>Source Code:\u003C\u002Fstrong> \u003Ccode>git\u002F\u003C\u002Fcode> clone of any repository found on GitHub, Bitbucket, or GitLab links\u003C\u002Fli>\n\u003Cli>\u003Cem>More coming soon! See the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FRoadmap\">Roadmap\u003C\u002Fa>...\u003C\u002Fem>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fdetails>\n\u003Cbr\u002F>\n\n## Configuration\n\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002Fea672e6b-4df5-49d8-b550-7f450951fd27\" width=\"330px\" align=\"right\" style=\"float: right\"\u002F>\n\nArchiveBox can be configured via environment variables, by using the `archivebox config` CLI, or by editing `.\u002FArchiveBox.conf`.\n\u003Cbr\u002F>\n\u003Cdetails>\n\u003Csummary>\u003Ci>Expand to see examples...\u003C\u002Fi>\u003C\u002Fsummary>\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">archivebox config                               # view the entire config\narchivebox config --get CHROME_BINARY           # view a specific value\n\u003Cbr\u002F>\narchivebox config --set CHROME_BINARY=chromium  # persist a config using CLI\n# OR\necho CHROME_BINARY=chromium >> ArchiveBox.conf  # persist a config using file\n# OR\nenv CHROME_BINARY=chromium archivebox ...       
# run with a one-off config\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Csub>These methods also work the same way when run inside Docker, see the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FDocker#configuration\">Docker Configuration\u003C\u002Fa> wiki page for details.\u003C\u002Fsub>\n\u003C\u002Fdetails>\u003Cbr\u002F>\n\nThe configuration is documented here: **[Configuration Wiki](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FConfiguration)**, and loaded from: [`archivebox\u002Fconfig\u002F`](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fblob\u002Fdev\u002Farchivebox\u002Fconfig\u002F).\n\n\u003Ca name=\"most-common-options-to-tweak\">\u003C\u002Fa>\n\u003Cdetails>\n\u003Csummary>\u003Ci>Expand to see the most common options to tweak...\u003C\u002Fi>\u003C\u002Fsummary>\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">\n# e.g. archivebox config --set TIMEOUT=120\n# or   docker compose run archivebox config --set TIMEOUT=120\n\u003Cbr\u002F>\nTIMEOUT=240                # default: 60    add more seconds on slower networks\nCHECK_SSL_VALIDITY=False   # default: True  False = allow saving URLs w\u002F bad SSL\n\u003Cbr\u002F>\nPUBLIC_INDEX=True          # default: True  whether anon users can view index\nPUBLIC_SNAPSHOTS=True      # default: True  whether anon users can view pages\nPUBLIC_ADD_VIEW=False      # default: False whether anon users can add new URLs\n\u003Cbr\u002F>\nUSER_AGENT=\"Mozilla\u002F5.0 ...\"  # change this to get around bot blocking\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003C\u002Fdetails>\n\u003Cbr\u002F>\n\n## Dependencies\n\nTo achieve high-fidelity archives in as many situations as possible, ArchiveBox depends on a variety of 3rd-party libraries and tools that specialize in extracting different types of content.\n\n> Under-the-hood, ArchiveBox uses [Django](https:\u002F\u002Fwww.djangoproject.com\u002Fstart\u002Foverview\u002F) 
to power its [Web UI](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#ui-usage), [Django Ninja](https:\u002F\u002Fdjango-ninja.dev\u002F) for the REST API, and [SQLite](https:\u002F\u002Fwww.sqlite.org\u002Flocrsf.html) + the filesystem to provide [fast & durable metadata storage](https:\u002F\u002Fwww.sqlite.org\u002Flocrsf.html) w\u002F [deterministic upgrades](https:\u002F\u002Fstackoverflow.com\u002Fa\u002F39976321\u002F2156113).\n\nArchiveBox bundles industry-standard tools like [Google Chrome](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FChromium-Install), [`wget`, `yt-dlp`, `readability`, etc.](#dependencies) internally, and its operation can be [tuned, secured, and extended](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FConfiguration) as-needed for many different applications.\n\n\u003Cbr\u002F>\n\u003Cdetails>\n\u003Csummary>\u003Ci>Expand to learn more about ArchiveBox's internals & dependencies...\u003C\u002Fi>\u003C\u002Fsummary>\u003Cbr\u002F>\n\n\u003Cblockquote>\n\u003Cp>\u003Cem>TIP: For better security while running ArchiveBox, and to avoid polluting your host system with a bunch of sub-dependencies that you need to keep up-to-date, \u003Cstrong>it is strongly recommended to use the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FDocker\">⭐️ official Docker image\u003C\u002Fa>\u003C\u002Fstrong> which provides everything in an easy container with simple one-liner upgrades.\u003C\u002Fem>\u003C\u002Fp>\n\u003C\u002Fblockquote>\n\n\u003Cul>\n\u003Cli>Language: Python \u003Ccode>&gt;=3.13\u003C\u002Fcode>\u003C\u002Fli>\n\u003Cli>Backend: \u003Ca href=\"https:\u002F\u002Fwww.djangoproject.com\u002F\">Django\u003C\u002Fa> + \u003Ca href=\"https:\u002F\u002Fdjango-ninja.dev\u002F\">Django-Ninja\u003C\u002Fa> for REST API\u003C\u002Fli>\n\u003Cli>Frontend: \u003Ca 
href=\"https:\u002F\u002Fdocs.djangoproject.com\u002Fen\u002F6.0\u002Fref\u002Fcontrib\u002Fadmin\u002F\">Django Admin\u003C\u002Fa> + Vanilla HTML, CSS, JS\u003C\u002Fli>\n\u003Cli>Web Server: \u003Ca href=\"https:\u002F\u002Fwww.djangoproject.com\u002F\">Django\u003C\u002Fa> + \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fdjango\u002Fdaphne\u002F\">\u003Ccode>daphne\u003C\u002Fcode>\u003C\u002Fa> (ASGI)\u003C\u002Fli>\n\u003Cli>Database: \u003Ca href=\"https:\u002F\u002Fdocs.djangoproject.com\u002Fen\u002F6.0\u002Fref\u002Fdatabases\u002F#sqlite-notes\">Django ORM\u003C\u002Fa> saving to \u003Ca href=\"https:\u002F\u002Fwww.sqlite.org\u002Fmostdeployed.html\">SQLite3\u003C\u002Fa> \u003Ccode>.\u002Fdata\u002Findex.sqlite3\u003C\u002Fcode>\u003C\u002Fli>\n\u003Cli>Job Queue: Custom orchestrator using \u003Ccode>supervisord\u003C\u002Fcode> for worker management\u003C\u002Fli>\n\u003Cli>Build\u002Ftest\u002Flint: \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv\">\u003Ccode>uv\u003C\u002Fcode>\u003C\u002Fa> \u002F \u003Ccode>pyright\u003C\u002Fcode>+\u003Ccode>ty\u003C\u002Fcode>+\u003Ccode>pytest\u003C\u002Fcode> \u002F \u003Ccode>ruff\u003C\u002Fcode>\u003C\u002Fli>\n\u003Cli>Subdependencies: \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002Fabxpkg\">\u003Ccode>abxpkg\u003C\u002Fcode>\u003C\u002Fa> installs apt\u002Fbrew\u002Fpip\u002Fnpm pkgs at runtime (e.g. 
\u003Ccode>yt-dlp\u003C\u002Fcode>, \u003Ccode>singlefile\u003C\u002Fcode>, \u003Ccode>readability\u003C\u002Fcode>, \u003Ccode>git\u003C\u002Fcode>)\u003C\u002Fli>\n\u003C\u002Ful>\n\n\nThese optional subdependencies used for archiving sites include:\n\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fassets\u002F511499\u002F62a02155-05d7-4f3e-8de5-75a50a145c4f\" alt=\"archivebox --version CLI output screenshot showing dependencies installed\" width=\"330px\" align=\"right\" style=\"max-width: 100%;\">\n\n\u003Cul>\n\u003Cli>\u003Ccode>chromium\u003C\u002Fcode> \u002F \u003Ccode>chrome\u003C\u002Fcode> (for screenshots, PDF, DOM HTML, and headless JS scripts)\u003C\u002Fli>\n\u003Cli>\u003Ccode>node\u003C\u002Fcode> &amp; \u003Ccode>npm\u003C\u002Fcode> (for readability, mercury, and singlefile)\u003C\u002Fli>\n\u003Cli>\u003Ccode>wget\u003C\u002Fcode> (for plain HTML, static files, and WARC saving)\u003C\u002Fli>\n\u003Cli>\u003Ccode>curl\u003C\u002Fcode> (for fetching headers, favicon, and posting to Archive.org)\u003C\u002Fli>\n\u003Cli>\u003Ccode>yt-dlp\u003C\u002Fcode> or \u003Ccode>youtube-dl\u003C\u002Fcode> (for audio, video, and subtitles)\u003C\u002Fli>\n\u003Cli>\u003Ccode>git\u003C\u002Fcode> (for cloning git repos)\u003C\u002Fli>\n\u003Cli>\u003Ccode>singlefile\u003C\u002Fcode> (for saving into a self-contained html file)\u003C\u002Fli>\n\u003Cli>\u003Ccode>postlight\u002Fparser\u003C\u002Fcode> (for discussion threads, forums, and articles)\u003C\u002Fli>\n\u003Cli>\u003Ccode>readability\u003C\u002Fcode> (for articles and long text content)\u003C\u002Fli>\n\u003Cli>and more as we grow...\u003C\u002Fli>\n\u003C\u002Ful>\n\nYou don't need to install every dependency to use ArchiveBox. 
ArchiveBox will automatically disable extractors that rely on dependencies that aren't installed, based on what is configured and available in your \u003Ccode>$PATH\u003C\u002Fcode>.\n  \nIf not using Docker, make sure to keep the dependencies up-to-date yourself and check that ArchiveBox isn't reporting any incompatibility with the versions you install.\n\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\"># install python3 and archivebox with your system package manager\n# apt\u002Fbrew\u002Fpip\u002Fetc install ... (see Quickstart instructions above)\n\u003Cbr\u002F>\nwhich -a archivebox    # see where you have installed archivebox\narchivebox install     # auto install all the extractors and extras\narchivebox --version   # see info and check validity of installed dependencies\n\u003C\u002Fcode>\u003C\u002Fpre>\n  \nInstalling directly on \u003Cstrong>Windows without Docker or WSL\u002FWSL2\u002FCygwin is not officially supported\u003C\u002Fstrong> (I cannot respond to Windows support tickets), but some advanced users have reported getting it working.\n\n\u003Ch4>Learn More\u003C\u002Fh4>\n\u003Cul>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FInstall#dependencies\">Wiki: Install (Dependencies)\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FChromium-Install\">Wiki: Chromium Install\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUpgrading-or-Merging-Archives\">Wiki: Upgrading or Merging Archives\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FTroubleshooting#installing\">Wiki: Troubleshooting (Installing)\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\n\u003C\u002Fdetails>\n\u003Cbr\u002F>\n\n\n## Archive Layout\n\nAll of ArchiveBox's state (SQLite 
DB, content, config, logs, etc.) is stored in a single folder per collection.\n\n\u003Cbr\u002F>\n\u003Cdetails>\n\u003Csummary>\u003Ci>Expand to learn more about the layout of ArchiveBox's data on-disk...\u003C\u002Fi>\u003C\u002Fsummary>\u003Cbr\u002F>\n\nData folders can be created anywhere (`~\u002Farchivebox\u002Fdata` or `$PWD\u002Fdata` as seen in our examples), and you can create as many data folders as you want to hold different collections.\nAll \u003Ccode>archivebox\u003C\u002Fcode> CLI commands are designed to be run from inside an ArchiveBox data folder, starting with \u003Ccode>archivebox init\u003C\u002Fcode> to initialize a new collection inside an empty directory.\n\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">mkdir -p ~\u002Farchivebox\u002Fdata && cd ~\u002Farchivebox\u002Fdata   # just an example, can be anywhere\narchivebox init\u003C\u002Fcode>\u003C\u002Fpre>\n\nThe on-disk layout is optimized to be easy to browse by hand and durable long-term. 
The main index is a standard \u003Ccode>index.sqlite3\u003C\u002Fcode> database in the root of the data folder (it can also be \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FPublishing-Your-Archive#2-export-and-host-it-as-static-html\">exported as static JSON\u002FHTML\u003C\u002Fa>), and the archive snapshots are organized by date-added timestamp in the \u003Ccode>data\u002Farchive\u002F\u003C\u002Fcode> subfolder.\n\n\u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F511499\u002F117453293-c7b91600-af12-11eb-8a3f-aa48b0f9da3c.png\" width=\"400px\" align=\"right\" style=\"float: right\"\u002F>\n\n\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\">data\u002F\n    index.sqlite3\n    ArchiveBox.conf\n    archive\u002F\n        ...\n        1617687755\u002F\n            index.html\n            index.json\n            screenshot.png\n            media\u002Fsome_video.mp4\n            warc\u002F1617687755.warc.gz\n            git\u002Fsomerepo.git\n            ...\n\u003C\u002Fcode>\u003C\u002Fpre>\n\nEach snapshot subfolder \u003Ccode>data\u002Farchive\u002FTIMESTAMP\u002F\u003C\u002Fcode> includes a static \u003Ccode>index.json\u003C\u002Fcode> and \u003Ccode>index.html\u003C\u002Fcode> describing its contents, and the snapshot extractor outputs are plain files within the folder.\n\n\u003Ch4>Learn More\u003C\u002Fh4>\n\u003Cul>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FSetting-Up-Storage\">Wiki: Setting Up Storage (SMB, NFS, S3, B2, Google Drive, etc.)\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#Disk-Layout\">Wiki: Usage (Disk Layout)\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUsage#large-archives\">Wiki: Usage (Large 
Archives)\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FSecurity-Overview#output-folder\">Wiki: Security Overview (Output Folder)\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FPublishing-Your-Archive\">Wiki: Publishing Your Archive\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FUpgrading-or-Merging-Archives\">Wiki: Upgrading or Merging Archives\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\n\u003C\u002Fdetails>\n\u003Cbr\u002F>\n\n\n## Static Archive Exporting\n\nYou can export your index as static HTML using `archivebox list` (so you can view it without an ArchiveBox server).\n\n\u003Cbr\u002F>\n\u003Cdetails>\n\u003Csummary>\u003Ci>Expand to learn how to export your ArchiveBox collection...\u003C\u002Fi>\u003C\u002Fsummary>\u003Cbr\u002F>\n\n\u003Cblockquote>\n\u003Cp>\u003Cem>NOTE: These exports are not paginated, so exporting many URLs or the entire archive at once may be slow. 
Use the filtering CLI flags on the \u003Ccode>archivebox list\u003C\u002Fcode> command to export specific Snapshots or ranges.\u003C\u002Fem>\u003C\u002Fp>\n\u003C\u002Fblockquote>\n\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\"># archivebox list --help\narchivebox list --html --with-headers > index.html     # export to static html table\narchivebox list --json --with-headers > index.json     # export to json blob\narchivebox list --csv=timestamp,url,title > index.csv  # export to csv spreadsheet\n\n# (if using Docker Compose, add the -T flag when piping)\n# docker compose run -T archivebox list --html 'https:\u002F\u002Fexample.com' > index.html\n\u003C\u002Fcode>\u003C\u002Fpre>\n\nThe paths in the static exports are relative, so make sure to keep them next to your `.\u002Farchive` folder when backing them up or viewing them.\n\n\u003Ch4>Learn More\u003C\u002Fh4>\n\n\u003Cul>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FPublishing-Your-Archive#2-export-and-host-it-as-static-html\">Wiki: Publishing Your Archive (Exporting as Static HTML)\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FSecurity-Overview#publishing\">Wiki: Security Overview (Publishing)\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FConfiguration#public_index--public_snapshots--public_add_view\">Wiki: Configuration (\u003Ccode>PUBLIC_INDEX\u003C\u002Fcode>, \u003Ccode>PUBLIC_SNAPSHOTS\u003C\u002Fcode>, \u003Ccode>PUBLIC_ADD_VIEW\u003C\u002Fcode>)\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\n\u003C\u002Fdetails>\n\u003Cbr\u002F>\n\n\n\u003Cdiv align=\"center\" style=\"text-align: center\">\n\u003Cimg src=\"https:\u002F\u002Fdocs.monadical.com\u002Fuploads\u002Fupload_b6900afc422ae699bfefa2dcda3306f3.png\" width=\"100%\" alt=\"security 
graphic\"\u002F>\n\u003C\u002Fdiv>\n\n\n## Caveats\n\n### Archiving Private Content\n\n\u003Ca id=\"archiving-private-urls\">\u003C\u002Fa>\n\nIf you're importing pages with private content or URLs containing secret tokens you don't want public (e.g. Google Docs, paywalled content, unlisted videos, etc.), **you may want to disable some of the extractor methods to avoid leaking that content to 3rd party APIs or the public**.\n\n\u003Cbr\u002F>\n\u003Cdetails>\n\u003Csummary>\u003Ci>Expand to learn about privacy, permissions, and user accounts...\u003C\u002Fi>\u003C\u002Fsummary>\n\n\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\"># don't save private content to ArchiveBox, e.g.:\narchivebox add 'https:\u002F\u002Fdocs.google.com\u002Fdocument\u002Fd\u002F12345somePrivateDocument'\narchivebox add 'https:\u002F\u002Fvimeo.com\u002FsomePrivateVideo'\n\n# restrict the main index, Snapshot content, and Add Page to authenticated users as-needed:\narchivebox config --set PUBLIC_INDEX=False\narchivebox config --set PUBLIC_SNAPSHOTS=False\narchivebox config --set PUBLIC_ADD_VIEW=False\narchivebox manage createsuperuser\n\u003C\u002Fcode>\u003C\u002Fpre>\n\n\u003Cblockquote>\n\u003Cp>\u003Cem>CAUTION: Assume anyone \u003Cem>viewing\u003C\u002Fem> your archives will be able to see any cookies, session tokens, or private URLs passed to ArchiveBox during archiving.\u003C\u002Fem>\n\u003Cem>Make sure to secure your ArchiveBox data and don't share snapshots with others without stripping out sensitive headers and content first.\u003C\u002Fem>\u003C\u002Fp>\n\u003C\u002Fblockquote>\n\n\u003Ch4>Learn More\u003C\u002Fh4>\n\n\u003Cul>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FPublishing-Your-Archive\">Wiki: Publishing Your Archive\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FSecurity-Overview\">Wiki: Security 
Overview\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FChromium-Install#setting-up-a-chromium-user-profile\">Wiki: Chromium Install (Setting Up a User Profile)\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FConfiguration#chrome_user_data_dir\">Wiki: Configuration (\u003Ccode>CHROME_USER_DATA_DIR\u003C\u002Fcode>)\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FConfiguration#cookies_file\">Wiki: Configuration (\u003Ccode>COOKIES_FILE\u003C\u002Fcode>)\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\n\u003C\u002Fdetails>\n\u003Cbr\u002F>\n\n\n### Security Risks of Viewing Archived JS\n\nBe aware that malicious archived JS can access the contents of other pages in your archive when viewed. Because the Web UI serves all viewed snapshots from a single domain, they share a request context and **typical CSRF\u002FCORS\u002FXSS\u002FCSP protections do not work to prevent cross-site request attacks**. 
See the [Security Overview](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FSecurity-Overview#stealth-mode) page and [Issue #239](https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F239) for more details.\n\n\n\u003Cbr\u002F>\n\u003Cdetails>\n\u003Csummary>\u003Ci>Expand to see risks and mitigations...\u003C\u002Fi>\u003C\u002Fsummary>\n\n\n\u003Cpre lang=\"bash\">\u003Ccode style=\"white-space: pre-line\"># visiting an archived page with malicious JS:\nhttps:\u002F\u002F127.0.0.1:8000\u002Farchive\u002F1602401954\u002Fexample.com\u002Findex.html\n\n# example.com\u002Findex.js can now make a request to read everything from:\nhttps:\u002F\u002F127.0.0.1:8000\u002Findex.html\nhttps:\u002F\u002F127.0.0.1:8000\u002Farchive\u002F*\n# then example.com\u002Findex.js can send it off to some evil server\n\u003C\u002Fcode>\u003C\u002Fpre>\n\n\u003Cblockquote>\n\u003Cp>\u003Cem>NOTE: Only the \u003Ccode>wget\u003C\u002Fcode> &amp; \u003Ccode>dom\u003C\u002Fcode> extractor methods execute archived JS when viewing snapshots; all other archive methods produce static output that does not execute JS on viewing.\u003C\u002Fem>\u003Cbr\u002F>\n\u003Cem>If you are worried about these issues, you should disable these extractors using:\u003Cbr\u002F> \u003Ccode>archivebox config --set SAVE_WGET=False SAVE_DOM=False\u003C\u002Fcode>.\u003C\u002Fem>\u003C\u002Fp>\n\u003C\u002Fblockquote>\n\n\u003Ch4>Learn More\u003C\u002Fh4>\n\u003Cul>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FSecurity-Overview\">Wiki: Security Overview\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fissues\u002F239\">ArchiveBox GitHub Issue: #239\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fsecurity\u002Fadvisories\u002FGHSA-cr45-98w9-gwqx\">Security Advisory: 
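\u003Ccode>CVE-2023-45815\u003C\u002Fcode>\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FArchiveBox\u002FArchiveBox\u002Fwiki\u002FSecurity-Overview#publishing\">Wiki: Security Overview (Publishing)\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002F","ArchiveBox is an open-source, self-hosted web archiving application that saves web page content from sources such as URLs, browser history, and bookmarks. It can save pages in multiple formats, including HTML, PDF, images, and video, and aims to keep those files readable for decades to come. The project is written in Python and provides a CLI, a REST API, and webhooks for integration with other services. It is suited to individuals or organizations that need to back up and archive important web content, for example research material, legal evidence, or photos and videos from social media.",2,"2026-06-11 02:48:42","top_language"]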