[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-7707":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":16,"starSnapshotCount":16,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},7707,"wayback-machine-downloader","hartator\u002Fwayback-machine-downloader","hartator","Download an entire website from the Wayback Machine.","",null,"Ruby",5896,802,127,183,0,4,24,69.11,"Other",false,"master",true,[],"2026-06-12 04:00:35","# Wayback Machine Downloader\n\n[![Gem Version](https:\u002F\u002Fbadge.fury.io\u002Frb\u002Fwayback_machine_downloader.svg)](https:\u002F\u002Frubygems.org\u002Fgems\u002Fwayback_machine_downloader\u002F)\n[![Build Status](https:\u002F\u002Ftravis-ci.org\u002Fhartator\u002Fwayback-machine-downloader.svg?branch=master)](https:\u002F\u002Ftravis-ci.org\u002Fhartator\u002Fwayback-machine-downloader)\n\nDownload an entire website from the Internet Archive Wayback Machine.\n\n## Installation\n\nYou need to install Ruby on your system (>= 1.9.2) - if you don't already have it.\nThen run:\n\n    gem install wayback_machine_downloader\n\n**Tip:** If you run into permission errors, you might have to add `sudo` in front of this command.\n\n## Basic Usage\n\nRun wayback_machine_downloader with the base url of the website you want to retrieve as a parameter (e.g., http:\u002F\u002Fexample.com):\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com\n\n## How it works\n\nIt will download the last version of 
every file present on Wayback Machine to `.\u002Fwebsites\u002Fexample.com\u002F`. It will also re-create a directory structure and auto-create `index.html` pages to work seamlessly with Apache and Nginx. All files downloaded are the original ones and not Wayback Machine rewritten versions. This way, the URL and link structure is the same as before.\n\n## Advanced Usage\n\n\tUsage: wayback_machine_downloader http:\u002F\u002Fexample.com\n\n\tDownload an entire website from the Wayback Machine.\n\n\tOptional options:\n\t    -d, --directory PATH             Directory to save the downloaded files into\n\t\t\t\t\t     Default is .\u002Fwebsites\u002F plus the domain name\n\t    -s, --all-timestamps             Download all snapshots\u002Ftimestamps for a given website\n\t    -f, --from TIMESTAMP             Only files on or after timestamp supplied (ie. 20060716231334)\n\t    -t, --to TIMESTAMP               Only files on or before timestamp supplied (ie. 20100916231334)\n\t    -e, --exact-url                  Download only the url provided and not the full site\n\t    -o, --only ONLY_FILTER           Restrict downloading to urls that match this filter\n\t\t\t\t\t     (use \u002F\u002F notation for the filter to be treated as a regex)\n\t    -x, --exclude EXCLUDE_FILTER     Skip downloading of urls that match this filter\n\t\t\t\t\t     (use \u002F\u002F notation for the filter to be treated as a regex)\n\t    -a, --all                        Expand downloading to error files (40x and 50x) and redirections (30x)\n\t    -c, --concurrency NUMBER         Number of multiple files to download at a time\n\t\t\t\t\t     Default is one file at a time (ie. 
20)\n\t    -p, --maximum-snapshot NUMBER    Maximum snapshot pages to consider (Default is 100)\n\t\t\t\t\t     Count an average of 150,000 snapshots per page\n\t    -l, --list                       Only list file urls in a JSON format with the archived timestamps, won't download anything\n\t    \n## Specify directory to save files to\n\n    -d, --directory PATH\n\nOptional. By default, Wayback Machine Downloader will download files to `.\u002Fwebsites\u002F` followed by the domain name of the website. You may want to save files in a specific directory using this option.\n\nExample:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --directory downloaded-backup\u002F\n    \n## All Timestamps\n\n    -s, --all-timestamps \n\nOptional. This option will download all timestamps\u002Fsnapshots for a given website. It uses the timestamp of each snapshot as the directory name.\n\nExample:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --all-timestamps \n    \n    Will download:\n    \twebsites\u002Fexample.com\u002F20060715085250\u002Findex.html\n    \twebsites\u002Fexample.com\u002F20051120005053\u002Findex.html\n    \twebsites\u002Fexample.com\u002F20060111095815\u002Fimg\u002Flogo.png\n    \t...\n\n## From Timestamp\n\n    -f, --from TIMESTAMP\n\nOptional. You may want to supply a from timestamp to lock your backup to a specific version of the website. Timestamps can be found inside the URLs of the regular Wayback Machine website (e.g., https:\u002F\u002Fweb.archive.org\u002Fweb\u002F20060716231334\u002Fhttp:\u002F\u002Fexample.com). You can also use a year (2006), a year and month (200607), etc. It can be used in combination with To Timestamp.\nWayback Machine Downloader will then fetch only file versions on or after the timestamp specified.\n\nExample:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --from 20060716231334\n\n## To Timestamp\n\n    -t, --to TIMESTAMP\n\nOptional. 
You may want to supply a to timestamp to lock your backup to a specific version of the website. Timestamps can be found inside the URLs of the regular Wayback Machine website (e.g., https:\u002F\u002Fweb.archive.org\u002Fweb\u002F20100916231334\u002Fhttp:\u002F\u002Fexample.com). You can also use a year (2010), a year and month (201009), etc. It can be used in combination with From Timestamp.\nWayback Machine Downloader will then fetch only file versions on or before the timestamp specified.\n\nExample:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --to 20100916231334\n    \n## Exact Url\n\n\t-e, --exact-url \n\nOptional. If you want to retrieve only the file matching exactly the url provided, you can use this flag. It will avoid downloading anything else.\n\nFor example, if you want to download only the HTML homepage file of example.com:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --exact-url \n\n\n## Only URL Filter\n\n     -o, --only ONLY_FILTER\n\nOptional. You may want to retrieve files which are of a certain type (e.g., .pdf, .jpg, .wrd...) or are in a specific directory. To do so, you can supply the `--only` flag with a string or a regex (using the '\u002Fregex\u002F' notation) to limit which files Wayback Machine Downloader will download.\n\nFor example, if you only want to download files inside a specific `my_directory`:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --only my_directory\n\nOr if you want to download every image and nothing else:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --only \"\u002F\\.(gif|jpg|jpeg)$\u002Fi\"\n\n## Exclude URL Filter\n\n     -x, --exclude EXCLUDE_FILTER\n\nOptional. You may want to retrieve files which aren't of a certain type (e.g., .pdf, .jpg, .wrd...) or aren't in a specific directory. 
To do so, you can supply the `--exclude` flag with a string or a regex (using the '\u002Fregex\u002F' notation) to limit which files Wayback Machine Downloader will download.\n\nFor example, if you want to avoid downloading files inside `my_directory`:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --exclude my_directory\n\nOr if you want to download everything except images:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --exclude \"\u002F\\.(gif|jpg|jpeg)$\u002Fi\"\n\n## Expand downloading to all file types\n\n     -a, --all\n\nOptional. By default, Wayback Machine Downloader limits itself to files that responded with a 200 OK code. If you also need error files (40x and 50x codes) or redirection files (30x codes), you can use the `--all` or `-a` flag and Wayback Machine Downloader will download them in addition to the 200 OK files. It will also keep empty files that are removed by default.\n\nExample:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --all\n\n## Only list files without downloading\n\n     -l, --list\n\nIt will just display the files to be downloaded with their snapshot timestamps and urls. The output format is JSON. It won't download anything. It's useful for debugging or to connect to another application.\n\nExample:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --list\n\n## Maximum number of snapshot pages to consider\n\n    -p, --snapshot-pages NUMBER    \n\nOptional. Specify the maximum number of snapshot pages to consider. Count an average of 150,000 snapshots per page. 100 is the default maximum number of snapshot pages and should be sufficient for most websites. Use a bigger number if you want to download a very large website.\n\nExample:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --snapshot-pages 300    \n\n## Download multiple files at a time\n\n    -c, --concurrency NUMBER  \n\nOptional. 
Specify the number of files you want to download at the same time. This can speed up the download of a website significantly. Default is to download one file at a time.\n\nExample:\n\n    wayback_machine_downloader http:\u002F\u002Fexample.com --concurrency 20\n\n## Using the Docker image\n\nAs an alternative way to install, we have a Docker image! Retrieve the wayback-machine-downloader Docker image this way:\n\n    docker pull hartator\u002Fwayback-machine-downloader\n\nThen, you should be able to use the Docker image to download websites. For example:\n\n    docker run --rm -it -v $PWD\u002Fwebsites:\u002Fwebsites hartator\u002Fwayback-machine-downloader http:\u002F\u002Fexample.com\n\n## Contributing\n\nContributions are welcome! Just submit a pull request via GitHub.\n\nTo run the tests:\n\n    bundle install\n    bundle exec rake test\n","Wayback Machine Downloader is a tool for downloading an entire website from the Internet Archive's Wayback Machine. Its core features include downloading all of a website's files, re-creating the directory structure, and auto-generating `index.html` pages to work seamlessly with Apache and Nginx. The tool supports a range of advanced options, such as specifying a timestamp range, filtering for specific URLs, and excluding certain file types, to meet different needs. It is suited to backing up or restoring website content that has been deleted or is unavailable, and is especially valuable in areas such as historical research, data analysis, and website maintenance.",2,"2026-06-11 03:13:54","top_language"]