[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2703":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":14,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":16,"starSnapshotCount":16,"syncStatus":30,"lastSyncTime":31,"discoverSource":32},2703,"pyspider","binux\u002Fpyspider","binux","A Powerful Spider(Web Crawler) System in Python.","http:\u002F\u002Fdocs.pyspider.org\u002F",null,"Python",16811,3630,1,274,0,3,10,72.5,"Apache License 2.0",true,false,"master",[25,26],"crawler","python","2026-06-12 04:00:15","pyspider [![Build Status]][Travis CI] [![Coverage Status]][Coverage]\n========\n\nA Powerful Spider(Web Crawler) System in Python.\n\n- Write script in Python\n- Powerful WebUI with script editor, task monitor, project manager and result viewer\n- [MySQL](https:\u002F\u002Fwww.mysql.com\u002F), [MongoDB](https:\u002F\u002Fwww.mongodb.org\u002F), [Redis](http:\u002F\u002Fredis.io\u002F), [SQLite](https:\u002F\u002Fwww.sqlite.org\u002F), [Elasticsearch](https:\u002F\u002Fwww.elastic.co\u002Fproducts\u002Felasticsearch); [PostgreSQL](http:\u002F\u002Fwww.postgresql.org\u002F) with [SQLAlchemy](http:\u002F\u002Fwww.sqlalchemy.org\u002F) as database backend\n- [RabbitMQ](http:\u002F\u002Fwww.rabbitmq.com\u002F), [Redis](http:\u002F\u002Fredis.io\u002F) and [Kombu](http:\u002F\u002Fkombu.readthedocs.org\u002F) as message queue\n- Task priority, retry, periodical, recrawl by age, etc...\n- Distributed architecture, Crawl Javascript pages, Python 2.{6,7}, 3.{3,4,5,6} support, 
etc...\n\nTutorial: [http:\u002F\u002Fdocs.pyspider.org\u002Fen\u002Flatest\u002Ftutorial\u002F](http:\u002F\u002Fdocs.pyspider.org\u002Fen\u002Flatest\u002Ftutorial\u002F)  \nDocumentation: [http:\u002F\u002Fdocs.pyspider.org\u002F](http:\u002F\u002Fdocs.pyspider.org\u002F)  \nRelease notes: [https:\u002F\u002Fgithub.com\u002Fbinux\u002Fpyspider\u002Freleases](https:\u002F\u002Fgithub.com\u002Fbinux\u002Fpyspider\u002Freleases)  \n\nSample Code \n-----------\n\n```python\nfrom pyspider.libs.base_handler import *\n\n\nclass Handler(BaseHandler):\n    crawl_config = {\n    }\n\n    @every(minutes=24 * 60)\n    def on_start(self):\n        self.crawl('http:\u002F\u002Fscrapy.org\u002F', callback=self.index_page)\n\n    @config(age=10 * 24 * 60 * 60)\n    def index_page(self, response):\n        for each in response.doc('a[href^=\"http\"]').items():\n            self.crawl(each.attr.href, callback=self.detail_page)\n\n    def detail_page(self, response):\n        return {\n            \"url\": response.url,\n            \"title\": response.doc('title').text(),\n        }\n```\n\n\nInstallation\n------------\n\n* `pip install pyspider`\n* run command `pyspider`, visit [http:\u002F\u002Flocalhost:5000\u002F](http:\u002F\u002Flocalhost:5000\u002F)\n\n**WARNING:** WebUI is open to the public by default, it can be used to execute any command which may harm your system. 
Please use it in an internal network or [enable `need-auth` for webui](http:\u002F\u002Fdocs.pyspider.org\u002Fen\u002Flatest\u002FCommand-Line\u002F#-config).\n\nQuickstart: [http:\u002F\u002Fdocs.pyspider.org\u002Fen\u002Flatest\u002FQuickstart\u002F](http:\u002F\u002Fdocs.pyspider.org\u002Fen\u002Flatest\u002FQuickstart\u002F)\n\nContribute\n----------\n\n* Use It\n* Open [Issue], send PR\n* [User Group]\n* [Chinese Q&A](http:\u002F\u002Fsegmentfault.com\u002Ft\u002Fpyspider)\n\n\nTODO\n----\n\n### v0.4.0\n\n- [ ] a visual scraping interface like [portia](https:\u002F\u002Fgithub.com\u002Fscrapinghub\u002Fportia)\n\n\nLicense\n-------\nLicensed under the Apache License, Version 2.0\n\n\n[Build Status]:         https:\u002F\u002Fimg.shields.io\u002Ftravis\u002Fbinux\u002Fpyspider\u002Fmaster.svg?style=flat\n[Travis CI]:            https:\u002F\u002Ftravis-ci.org\u002Fbinux\u002Fpyspider\n[Coverage Status]:      https:\u002F\u002Fimg.shields.io\u002Fcoveralls\u002Fbinux\u002Fpyspider.svg?branch=master&style=flat\n[Coverage]:             https:\u002F\u002Fcoveralls.io\u002Fr\u002Fbinux\u002Fpyspider\n[Try]:                  https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftry-pyspider-blue.svg?style=flat\n[Issue]:                https:\u002F\u002Fgithub.com\u002Fbinux\u002Fpyspider\u002Fissues\n[User Group]:           https:\u002F\u002Fgroups.google.com\u002Fgroup\u002Fpyspider-users\n","pyspider is a powerful Python-based web crawler system. It supports writing crawler scripts in Python and provides a feature-rich web interface that includes a script editor, task monitor, project manager, and result viewer. The system supports multiple database backends such as MySQL, MongoDB, Redis, and SQLite, as well as the message queues RabbitMQ and Redis. In addition, pyspider offers task priorities, retries, periodic crawling, and age-based recrawling, and supports a distributed architecture and crawling of JavaScript pages. It suits scenarios that need efficient, reliable data collection from the internet, such as market analysis, content aggregation, or academic research.",2,"2026-06-11 02:50:59","top_language"]