[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-9941":3},{"id":4,"name":5,"fullName":6,"owner":5,"repo":5,"description":7,"homepage":8,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":15,"stars30d":16,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":17,"rankGlobal":9,"rankLanguage":9,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":19,"hasPages":19,"topics":21,"createdAt":9,"pushedAt":9,"updatedAt":31,"readmeContent":32,"aiSummary":33,"trendingCount":15,"starSnapshotCount":15,"syncStatus":34,"lastSyncTime":35,"discoverSource":36},9941,"pachyderm","pachyderm\u002Fpachyderm","Data-Centric Pipelines and Data Versioning","https:\u002F\u002Fwww.pachyderm.com\u002F",null,"Go",6293,575,150,711,0,3,39.28,"Apache License 2.0",false,"master",[22,23,24,25,26,27,28,29,30,5],"analytics","big-data","containers","data-analysis","data-science","distributed-systems","docker","go","kubernetes","2026-06-12 02:02:14","\u003Cp align=\"center\">\n\t\u003Cimg src='.\u002FPachyderm_Icon-01.svg' height='225' title='Pachyderm'>\n\u003C\u002Fp>\n\n[![GitHub release](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frelease\u002Fpachyderm\u002Fpachyderm.svg?style=flat-square)](https:\u002F\u002Fgithub.com\u002Fpachyderm\u002Fpachyderm\u002Freleases)\n[![GitHub license](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-Pachyderm-blue)](https:\u002F\u002Fgithub.com\u002Fpachyderm\u002Fpachyderm\u002Fblob\u002Fmaster\u002FLICENSE)\n[![GoDoc](https:\u002F\u002Fgodoc.org\u002Fgithub.com\u002Fpachyderm\u002Fpachyderm?status.svg)](https:\u002F\u002Fpkg.go.dev\u002Fgithub.com\u002Fpachyderm\u002Fpachyderm\u002Fv2\u002Fsrc\u002Fclient)\n[![Go Report 
Card](https:\u002F\u002Fgoreportcard.com\u002Fbadge\u002Fgithub.com\u002Fpachyderm\u002Fpachyderm)](https:\u002F\u002Fgoreportcard.com\u002Freport\u002Fgithub.com\u002Fpachyderm\u002Fpachyderm)\n[![Slack Status](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fslack-pachyderm-brightgreen.svg?logo=slack)](https:\u002F\u002Fwww.pachyderm.com\u002Fslack)\n[![CLA assistant](https:\u002F\u002Fcla-assistant.io\u002Freadme\u002Fbadge\u002Fpachyderm\u002Fpachyderm)](https:\u002F\u002Fcla-assistant.io\u002Fpachyderm\u002Fpachyderm)\n\n# Pachyderm – Automate data transformations with data versioning and lineage\n\n\nPachyderm is cost-effective at scale, enabling data engineering teams to automate complex pipelines with sophisticated data transformations across any type of data. Our unique approach provides parallelized processing of multi-stage, language-agnostic pipelines with data versioning and data lineage tracking. Pachyderm delivers the ultimate CI\u002FCD engine for data. \n\n## Features\n\n- Data-driven pipelines automatically trigger based on detecting data changes.\n- Immutable data lineage with data versioning of any data type. \n- Autoscaling and parallel processing built on Kubernetes for resource orchestration.\n- Uses standard object stores for data storage with automatic deduplication.  \n- Runs across all major cloud providers and on-premises installations.\n\n\n## Getting Started\nTo start deploying your end-to-end version-controlled data pipelines, run Pachyderm [locally](https:\u002F\u002Fdocs.pachyderm.com\u002Flatest\u002Fset-up\u002Flocal-deploy\u002F) or you can also [deploy on AWS\u002FGCE\u002FAzure](https:\u002F\u002Fdocs.pachyderm.com\u002Flatest\u002Fset-up\u002Fcloud-deploy) in about 5 minutes. 
\n\nYou can also refer to our complete [documentation](https:\u002F\u002Fdocs.pachyderm.com) to see tutorials, check out example projects, and learn about advanced features of Pachyderm.\n\nIf you'd like to see some examples and learn about core use cases for Pachyderm:\n- [Examples](https:\u002F\u002Fgithub.com\u002Fpachyderm\u002Fexamples)\n- [Use Cases](https:\u002F\u002Fwww.pachyderm.com\u002Fuse-cases\u002F)\n- [Case Studies](https:\u002F\u002Fwww.pachyderm.com\u002Fcase-studies\u002F)\n\n## Documentation\n\n[Official Documentation](https:\u002F\u002Fdocs.pachyderm.com\u002F)\n\n## Community\nKeep up to date and get Pachyderm support via:\n- [![Twitter](https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002Fpachyderminc?style=social)](https:\u002F\u002Ftwitter.com\u002Fpachyderminc) Follow us on Twitter.\n- [![Slack Status](https:\u002F\u002Fbadge.slack.pachyderm.io\u002Fbadge.svg)](https:\u002F\u002Fslack.pachyderm.io) Join our community [Slack Channel](https:\u002F\u002Fwww.pachyderm.com\u002Fslack) to get help from the Pachyderm team and other users.\n\n## Contributing\nTo get started, sign the [Contributor License Agreement](https:\u002F\u002Fcla-assistant.io\u002Fpachyderm\u002Fpachyderm).\n\nYou should also check out our [contributing guide](https:\u002F\u002Fdocs.pachyderm.com\u002Flatest\u002Fcontributing\u002Fsetup\u002F).\n\nSend us PRs, we would love to see what you do! You can also check our GH issues for things labeled \"help-wanted\" as a good place to start. We're sometimes bad about keeping that label up-to-date, so if you don't see any, just let us know.\n\n## Usage Metrics\n\nPachyderm automatically reports anonymized usage metrics. These metrics help us\nunderstand how people are using Pachyderm and make it better.  
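They can be\ndisabled by setting the environment variable `METRICS` to `false` in the pachd\ncontainer.","Pachyderm is an automation tool focused on data versioning and data-driven pipelines. It triggers pipelines by automatically detecting data changes, supports data versioning and data lineage tracking for many types of data, and uses Kubernetes for resource orchestration with autoscaling and parallel processing. In addition, Pachyderm runs on standard object stores, supports automatic data deduplication, and is compatible with all major cloud providers as well as on-premises deployments. The project is well suited to companies or teams that need to build efficient, traceable data processing workflows, and performs particularly well in scenarios such as big data analytics and machine learning model training.",2,"2026-06-11 03:25:31","top_topic"]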