[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-1178":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":32,"readmeContent":33,"aiSummary":34,"trendingCount":16,"starSnapshotCount":16,"syncStatus":35,"lastSyncTime":36,"discoverSource":37},1178,"data-engineer-handbook","DataExpert-io\u002Fdata-engineer-handbook","DataExpert-io","This is a repo with links to everything you'd ever want to learn about data engineering","",null,"Jupyter Notebook",41617,7868,475,25,0,22,89,356,97,117,false,"main",true,[26,27,28,29,30,31],"apachespark","awesome","bigdata","data","dataengineering","sql","2026-06-12 04:00:08","# The Data Engineering Handbook\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F8755\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Ftrendshift.io\u002Fapi\u002Fbadge\u002Frepositories\u002F8755\" alt=\"DataExpert-io%2Fdata-engineer-handbook | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\nThis repo has all the resources you need to become an amazing data engineer!\n\n## Getting started\n\nIf you are new to data engineering, start by following this [2024 breaking into data engineering roadmap](https:\u002F\u002Fblog.dataengineer.io\u002Fp\u002Fthe-2024-breaking-into-data-engineering)\n\nIf you are here for the [4-week free beginner boot camp](https:\u002F\u002Flearn.dataexpert.io\u002Fprogram\u002Fthe-absolute-beginner-data-engineering-boot-camp-starting-august-7th-6453\u002Fdetails) you can check 
out:\n- [introduction](beginner-bootcamp\u002Fintroduction.md)\n- [software needed](beginner-bootcamp\u002Fsoftware.md)\n\nIf you are here for the [6-week free intermediate boot camp](https:\u002F\u002Flearn.dataexpert.io\u002Fprogram\u002Ffree-community-boot-camp\u002Fdetails) you can check out\n- [introduction](intermediate-bootcamp\u002Fintroduction.md)\n- [software needed](intermediate-bootcamp\u002Fsoftware.md)\n\n\nFor more applied learning:\n- Check out the [projects](projects.md) section for more hands-on examples!\n- Check out the [interviews](interviews.md) section for more advice on how to pass data engineering interviews!\n- Check out the [books](books.md) section for a list of high quality data engineering books\n- Check out the [communities](communities.md) section for a list of high quality data engineering communities to join\n- Check out the [newsletter](newsletters.md) section to learn via email \n\n\n## Resources\n\n### Great [list of over 25 books](books.md)\n\nTop 3 must read books are:\n- [Fundamentals of Data Engineering](https:\u002F\u002Fwww.amazon.com\u002FFundamentals-Data-Engineering-Robust-Systems\u002Fdp\u002F1098108302\u002F)\n- [Designing Data-Intensive Applications](https:\u002F\u002Fwww.amazon.com\u002FDesigning-Data-Intensive-Applications-Reliable-Maintainable\u002Fdp\u002F1449373321\u002F)\n- [Designing Machine Learning Systems](https:\u002F\u002Fwww.amazon.com\u002FDesigning-Machine-Learning-Systems-Production-Ready\u002Fdp\u002F1098107969)\n\n### Great [list of over 10 communities to join](communities.md):\n\nTop must-join communities for DE:\n- [DataExpert.io Community Discord](https:\u002F\u002Fdiscord.gg\u002FJGumAXncAK)\n- [Data Talks Club Slack](https:\u002F\u002Fdatatalks.club\u002Fslack)\n- [Data Engineer Things Community](https:\u002F\u002Fwww.dataengineerthings.org\u002F)\n\nTop must-join communities for ML:\n- [AdalFlow Discord](https:\u002F\u002Fdiscord.com\u002Finvite\u002FezzszrRZvT)\n- [Chip Huyen MLOps 
Discord](https:\u002F\u002Fdiscord.gg\u002Fdzh728c5t3)\n\n### Companies:\n\n- Orchestration  \n  - [Mage](https:\u002F\u002Fwww.mage.ai)\n  - [Astronomer](https:\u002F\u002Fwww.astronomer.io)\n  - [Prefect](https:\u002F\u002Fwww.prefect.io)\n  - [Dagster](https:\u002F\u002Fwww.dagster.io)\n  - [Airflow](https:\u002F\u002Fairflow.apache.org\u002F)\n  - [Kestra](https:\u002F\u002Fkestra.io\u002F) \n  - [Shipyard](https:\u002F\u002Fwww.shipyardapp.com\u002F)\n  - [Hamilton](https:\u002F\u002Fgithub.com\u002Fdagworks-inc\u002Fhamilton)\n- Data Lake \u002F Cloud\n  - [Tabular](https:\u002F\u002Fwww.tabular.io)\n  - [Microsoft](https:\u002F\u002Fwww.microsoft.com)\n  - [Databricks](https:\u002F\u002Fwww.databricks.com\u002Fcompany\u002Fabout-us)\n  - [Onehouse](https:\u002F\u002Fwww.onehouse.ai)\n  - [Delta Lake](https:\u002F\u002Fdelta.io\u002F)\n  - [Ilum](https:\u002F\u002Filum.cloud\u002F)\n  - [DuckLake](https:\u002F\u002Fducklake.select\u002F)\n  - [Apache Iceberg](https:\u002F\u002Ficeberg.apache.org\u002F)\n  - [Apache Polaris](https:\u002F\u002Fpolaris.apache.org\u002F)\n  - [Lakekeeper](https:\u002F\u002Flakekeeper.io\u002F)\n- Data Warehouse\n  - [Snowflake](https:\u002F\u002Fwww.snowflake.com\u002Fen\u002F)\n  - [Firebolt](https:\u002F\u002Fwww.firebolt.io\u002F)\n  - [Databend](https:\u002F\u002Fwww.databend.com\u002F)\n- Data Quality\n  - [dbt](https:\u002F\u002Fwww.getdbt.com\u002F)\n  - [Metaplane](https:\u002F\u002Fwww.metaplane.dev\u002F)\n  - [Gable](https:\u002F\u002Fwww.gable.ai)\n  - [Great Expectations](https:\u002F\u002Fwww.greatexpectations.io)\n  - [Streamdal](https:\u002F\u002Fstreamdal.com)\n  - [Coalesce](https:\u002F\u002Fcoalesce.io\u002F)\n  - [Soda](https:\u002F\u002Fwww.soda.io\u002F)\n  - [DQOps](https:\u002F\u002Fdqops.com\u002F)\n  - [HEDDA.IO](https:\u002F\u002Fhedda.io)\n  - [Dingo](https:\u002F\u002Fgithub.com\u002FMigoXLab\u002Fdingo)\n- Education Companies\n  - [DataExpert.io](https:\u002F\u002Fwww.dataexpert.io)\n  - 
[LearnDataEngineering.com](https:\u002F\u002Fwww.learndataengineering.com)\n  - [AlgoExpert](https:\u002F\u002Fwww.algoexpert.io)\n  - [ByteByteGo](https:\u002F\u002Fwww.bytebytego.com)\n- Analytics \u002F Visualization\n  - [Preset](https:\u002F\u002Fwww.preset.io)\n  - [Starburst](https:\u002F\u002Fwww.starburst.io)\n  - [Metabase](https:\u002F\u002Fwww.metabase.com\u002F)\n  - [Looker Studio](https:\u002F\u002Flookerstudio.google.com\u002Foverview)\n  - [Tableau](https:\u002F\u002Fwww.tableau.com\u002F)\n  - [Power BI](https:\u002F\u002Fpowerbi.microsoft.com\u002F)\n  - [Hex](https:\u002F\u002Fhex.ai\u002F)\n  - [Apache Superset](https:\u002F\u002Fsuperset.apache.org\u002F)\n  - [Evidence](https:\u002F\u002Fevidence.dev)\n  - [Redash](https:\u002F\u002Fredash.io\u002F)\n  - [Lightdash](https:\u002F\u002Flightdash.com\u002F)\n- Data Integration\n  - [Cube](https:\u002F\u002Fcube.dev)\n  - [Fivetran](https:\u002F\u002Fwww.fivetran.com)\n  - [Airbyte](https:\u002F\u002Fairbyte.io)\n  - [dlt](https:\u002F\u002Fdlthub.com\u002F)\n  - [Sling](https:\u002F\u002Fslingdata.io\u002F)\n  - [Meltano](https:\u002F\u002Fmeltano.com\u002F)\n  - [Estuary](https:\u002F\u002Festuary.dev\u002F)\n  - [Arpe.io](https:\u002F\u002Farpe.io\u002F)\n- Semantic Layers\n  - [Cube](https:\u002F\u002Fcube.dev)\n  - [dbt Semantic Layer](https:\u002F\u002Fwww.getdbt.com\u002Fproduct\u002Fsemantic-layer) \n- Modern OLAP\n  - [Apache Druid](https:\u002F\u002Fdruid.apache.org\u002F)\n  - [ClickHouse](https:\u002F\u002Fclickhouse.com\u002F)\n  - [Apache Pinot](https:\u002F\u002Fpinot.apache.org\u002F)\n  - [Apache Kylin](https:\u002F\u002Fkylin.apache.org\u002F)\n  - [DuckDB](https:\u002F\u002Fduckdb.org\u002F)\n  - [QuestDB](https:\u002F\u002Fquestdb.io\u002F)\n  - [StarRocks](https:\u002F\u002Fwww.starrocks.io\u002F)\n- LLM application library\n  - [AdalFlow](https:\u002F\u002Fgithub.com\u002FSylphAI-Inc\u002FAdalFlow)\n  - 
[LangChain](https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain)\n  - [LlamaIndex](https:\u002F\u002Fgithub.com\u002Frun-llama\u002Fllama_index)\n- Real-Time Data\n  - [Aggregations.io](https:\u002F\u002Faggregations.io)\n  - [Responsive](https:\u002F\u002Fwww.responsive.dev\u002F)\n  - [RisingWave](https:\u002F\u002Frisingwave.com\u002F)\n  - [Striim](https:\u002F\u002Fwww.striim.com\u002F)\n- Data Lineage\n  - [OpenLineage](https:\u002F\u002Fopenlineage.io\u002F)\n\n\n### Data Engineering blogs of companies:\n\n- [Netflix](https:\u002F\u002Fnetflixtechblog.com\u002Ftagged\u002Fbig-data)\n- [Uber](https:\u002F\u002Fwww.uber.com\u002Fblog\u002Fhouston\u002Fdata\u002F?uclick_id=b2f43229-f3f4-4bae-bd5d-10a05db2f70c)\n- [Databricks](https:\u002F\u002Fwww.databricks.com\u002Fblog\u002Fcategory\u002Fengineering\u002Fdata-engineering)\n- [Airbnb](https:\u002F\u002Fmedium.com\u002Fairbnb-engineering\u002Fdata\u002Fhome)\n- [Amazon AWS Blog](https:\u002F\u002Faws.amazon.com\u002Fblogs\u002Fbig-data\u002F)\n- [Microsoft Data Architecture Blogs](https:\u002F\u002Ftechcommunity.microsoft.com\u002Ft5\u002Fdata-architecture-blog\u002Fbg-p\u002FDataArchitectureBlog)\n- [Microsoft Fabric Blog](https:\u002F\u002Fblog.fabric.microsoft.com\u002F)\n- [Oracle](https:\u002F\u002Fblogs.oracle.com\u002Fdatawarehousing\u002F)\n- [Meta](https:\u002F\u002Fengineering.fb.com\u002Fcategory\u002Fdata-infrastructure\u002F)\n- [Onehouse](https:\u002F\u002Fwww.onehouse.ai\u002Fblog)\n- [Estuary Blog](https:\u002F\u002Festuary.dev\u002Fblog\u002F)\n\n### Data Engineering Whitepapers:\n\n- [A Five-Layered Business Intelligence Architecture](https:\u002F\u002Fibimapublishing.com\u002Farticles\u002FCIBIMA\u002F2011\u002F695619\u002F695619.pdf)\n- [Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics](https:\u002F\u002Fwww.cidrdb.org\u002Fcidr2021\u002Fpapers\u002Fcidr2021_paper17.pdf)\n- [Big Data Quality: A Data Quality Profiling 
Model](https:\u002F\u002Flink.springer.com\u002Fchapter\u002F10.1007\u002F978-3-030-23381-5_5)\n- [The Data Lakehouse: Data Warehousing and More](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.08697)\n- [Spark: Cluster Computing with Working Sets](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.5555\u002F1863103.1863113)\n- [The Google File System](https:\u002F\u002Fresearch.google\u002Fpubs\u002Fthe-google-file-system\u002F)\n- [Building a Universal Data Lakehouse](https:\u002F\u002Fwww.onehouse.ai\u002Fwhitepaper\u002Fonehouse-universal-data-lakehouse-whitepaper)\n- [XTable in Action: Seamless Interoperability in Data Lakes](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.09621)\n- [MapReduce: Simplified Data Processing on Large Clusters](https:\u002F\u002Fresearch.google\u002Fpubs\u002Fmapreduce-simplified-data-processing-on-large-clusters\u002F)\n- [Tidy Data](https:\u002F\u002Fvita.had.co.nz\u002Fpapers\u002Ftidy-data.pdf)\n- [Data Engineering Whitepapers](https:\u002F\u002Fwww.ssp.sh\u002Fbrain\u002Fdata-engineering-whitepapers\u002F)\n\n### Social Media Accounts\n\nHere's the mostly comprehensive list of data engineering creators: \n**(You have to have at least 5k followers somewhere to be added!)**\n\n\n#### YouTube \n| Name                        | YouTube Channel                                                                                         | Follower Count |\n|----------------------------|---------------------------------------------------------------------------------------------------------|---------------:|\n| ByteByteGo                 | [ByteByteGo](https:\u002F\u002Fwww.youtube.com\u002Fc\u002FByteByteGo)                                             | 1,000,000+     |\n| Data with Baraa            | [Data with Baraa](https:\u002F\u002Fwww.youtube.com\u002F@DataWithBaraa)                                       | 195,000+     |\n| Zach Wilson                | [Data with Zach](https:\u002F\u002Fwww.youtube.com\u002F@eczachly_)                      
                    | 150,000+       |\n| Shashank Mishra            | [E-learning Bridge](https:\u002F\u002Fwww.youtube.com\u002F@shashank_mishra)                                   | 100,000+       |\n| Seattle Data Guy           | [Seattle Data Guy](https:\u002F\u002Fwww.youtube.com\u002Fc\u002FSeattleDataGuy)                                  | 100,000+       |\n| TrendyTech                 | [TrendyTech](https:\u002F\u002Fwww.youtube.com\u002Fc\u002FTrendytechInsights)                                   | 100,000+       |\n| Darshil Parmar             | [Darshil Parmar](https:\u002F\u002Fwww.youtube.com\u002F@DarshilParmar)                                       | 100,000+       |\n| Andreas Kretz              | [Andreas Kretz](https:\u002F\u002Fwww.youtube.com\u002Fc\u002Fandreaskayy)                                          | 100,000+       |\n| The Ravit Show             | [The Ravit Show](https:\u002F\u002Fyoutube.com\u002F@theravitshow)                                           | 100,000+       |\n| Guy in a Cube              | [Guy in a Cube](https:\u002F\u002Fwww.youtube.com\u002F@GuyInACube)                                            | 100,000+       |\n| Adam Marczak               | [Adam Marczak](https:\u002F\u002Fwww.youtube.com\u002F@AdamMarczakYT)                                         | 100,000+       |\n| nullQueries                | [nullQueries](https:\u002F\u002Fwww.youtube.com\u002F@nullQueries)                                             | 100,000+       |\n| TECHTFQ by Thoufiq         | [TECHTFQ by Thoufiq](https:\u002F\u002Fwww.youtube.com\u002F@techTFQ)                                         | 100,000+       |\n| SQLBI                       | [SQLBI](https:\u002F\u002Fwww.youtube.com\u002F@SQLBI)                                                     | 100,000+       |\n| Alex Freberg               | [Alex The Analyst](https:\u002F\u002Fwww.youtube.com\u002F@AlexTheAnalyst)                                     | 100,000+       |\n| Ankur 
Ranjan               | [Big Data Show](https:\u002F\u002Fwww.youtube.com\u002F@TheBigDataShow)                                        | 100,000+       |\n| Prashanth Kumar Pandey     | [ScholarNest](https:\u002F\u002Fwww.youtube.com\u002F@ScholarNest)                                              | 77,000+        |\n| ITVersity                  | [ITVersity](https:\u002F\u002Fwww.youtube.com\u002F@itversity)                                                  | 67,000+        |\n| Soumil Shah                | [Soumil Shah](https:\u002F\u002Fwww.youtube.com\u002F@SoumilShah)                                               | 50,000         |\n| Ansh Lamba                 | [Ansh Lamba](https:\u002F\u002Fwww.youtube.com\u002F@AnshLambaJSR)                                              | 18,000+        |\n| Azure Lib                  | [Azure Lib](https:\u002F\u002Fwww.youtube.com\u002F@azurelib-academy)                                        | 10,000+        |\n| Advancing Analytics        | [Advancing Analytics](https:\u002F\u002Fwww.youtube.com\u002F@AdvancingAnalytics)                               | 10,000+        |\n| Kahan Data Solutions       | [Kahan Data Solutions](https:\u002F\u002Fwww.youtube.com\u002F@KahanDataSolutions)                               | 10,000+        |\n| Ankit Bansal               | [Ankit Bansal](https:\u002F\u002Fyoutube.com\u002F@ankitbansal6)                                               | 10,000+        |\n| Mr. K Talks Tech           | [Mr. 
K Talks Tech](https:\u002F\u002Fwww.youtube.com\u002Fchannel\u002FUCzdOan4AmF65PmLLks8Lmww)                      | 10,000+        |\n| Samuel Focht               | [Python Basics](https:\u002F\u002Fwww.youtube.com\u002F@PythonBasics)                                           | 10,000+        |\n| Mehdi Ouazza              | [Mehdio DataTV](https:\u002F\u002Fwww.youtube.com\u002F@mehdio)                                                    | 3,000+         |\n| Alex Merced                | [Alex Merced Data](https:\u002F\u002Fwww.youtube.com\u002F@alexmerceddata_)                                            | N\u002FA           |\n| John Kutay                 | [John Kutay](https:\u002F\u002Fwww.youtube.com\u002F@striiminc) | N\u002FA           |\n| Emil Kaminski              | [Databricks For Professionals](https:\u002F\u002Fwww.youtube.com\u002F@DatabricksPro)                           | 5,000+          |\n\n\n#### LinkedIn\n\n| Name                      | LinkedIn Profile                                                                                         | Follower Count |\n|--------------------------|----------------------------------------------------------------------------------------------------------|---------------:|\n| Zach Wilson              | [Zach Wilson](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Feczachly)                                                     | 400,000+       |\n| Chip Huyen               | [Chip Huyen](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fchiphuyen\u002F)                                    | 250,000+       |\n| Shashank Mishra          | [Shashank Mishra](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fshashank219\u002F)                                     | 100,000+       |\n| Seattle Data Guy         | [Ben Rogojan](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fbenjaminrogojan)                                        | 100,000+       |\n| TrendyTech               | [Sumit 
Mittal](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fbigdatabysumit\u002F)                                   | 100,000+       |\n| Darshil Parmar           | [Darshil Parmar](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fdarshil-parmar\u002F)                                   | 100,000+       |\n| Andreas Kretz            | [Andreas Kretz](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fandreas-kretz)                                     | 100,000+       |\n| ByteByteGo (Alex Xu)     | [Alex Xu](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Falexxubyte)                                             | 100,000+       |\n| Azure Lib (Deepak Goyal) | [Deepak Goyal](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fdeepak-goyal-93805a17\u002F)                              | 100,000+       |\n| Alex Freberg             | [Alex Freberg](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Falex-freberg\u002F)                                     | 100,000+       |\n| SQLBI (Marco Russo)      | [Marco Russo](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fsqlbi)                                                  | 50,000+        |\n| Ankit Bansal             | [Ankit Bansal](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fankitbansal6\u002F)                                        | 50,000+        |\n| Marc Lamberti            | [Marc Lamberti](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fmarclamberti)                                       | 50,000+        |\n| Ankur Ranjan             | [Ankur Ranjan](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fthebigdatashow\u002F)                                       | 48,000+        |\n| ITVersity (Durga Gadiraju)| [Durga Gadiraju](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fdurga0gadiraju\u002F)                                   | 48,000+        |\n| Prashanth Kumar Pandey   | [Prashanth Kumar Pandey](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fprashant-kumar-pandey\u002F)                       | 37,000+        |\n| Alex Merced 
             | [Alex Merced](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Falexmerced)                                           | 30,000+        |\n| Ijaz Ali                 | [Ijaz Ali](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fijaz-ali-6aaa87122\u002F)                                       | 24,000+        |\n| Mehdi Ouazza             | [Mehdi Ouazza](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fmehd-io\u002F)                                        | 20,000+        |\n| Ananth Packkildurai      | [Ananth Packkildurai](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fananthdurai\u002F)                                    | 18,000+        |\n| Ansh Lamba               | [Ansh Lamba](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fansh-lamba-793681184\u002F)                                    | 13,000+        |\n| Manojkumar Vadivel       | [Manojkumar Vadivel](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fmanojvsj\u002F)                                        | 12,000+        |\n| Advancing Analytics      | [Simon Whiteley](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fsimon-whiteley-uk\u002F)                                  | 10,000+        |\n| Li Yin                   | [Li Yin](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fli-yin-ai\u002F)                                                  | 10,000+        |\n| Jaco van Gelder          | [Jaco van Gelder](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fjwvangelder\u002F)                                       | 10,000+        |\n| Joseph Machado           | [Joseph Machado](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fjosephmachado1991\u002F)                                  | 10,000+        |\n| Eric Roby                | [Eric Roby](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fcodingwithroby\u002F)                                           | 10,000+        |\n| Simon Späti              | [Simon Späti](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fsspaeti\u002F)                          
                  | 10,000+        |\n| Constantin Lungu         | [Constantin Lungu](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fconstantin-lungu-668b8756)                         | 10,000+        |\n| Lakshmi Sontenam         | [Lakshmi Sontenam](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fshivaga9esh)                                      | 9,500+         |\n| Dani Pálma               | [Daniel Pálma](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fdanthelion\u002F)                                          | 9,000+         |\n| Soumil Shah              | [Soumil Shah](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fshah-soumil\u002F)                                          | 8,000+         |\n| Arnaud Milleker          | [Arnaud Milleker](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Farnaudmilleker\u002F)                                    | 7,000+         |\n| Dimitri Visnadi          | [Dimitri Visnadi](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fvisnadi\u002F)                                    | 7,000+         |\n| Lenny                    | [Lenny A](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Flennyardiles\u002F)                                         | 6,000+         |\n| Dipankar Mazumdar        | [Dipankar Mazumdar](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fdipankar-mazumdar\u002F)                                 | 5,000+         |\n| Daniel Ciocirlan         | [Daniel Ciocirlan](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fdanielciocirlan)                                    | 5,000+         |\n| Hugo Lu                  | [Hugo Lu](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fhugo-lu-confirmed\u002F)                                           | 5,000+         |\n| Tobias Macey             | [Tobias Macey](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Ftmacey)                                                 | 5,000+         |\n| Marcos Ortiz             | [Marcos 
Ortiz](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fmlortiz)                                             | 5,000+         |\n| Julien Hurault           | [Julien Hurault](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fjulienhuraultanalytics\u002F)                               | 5,000+         |\n| John Kutay               | [John Kutay](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fjohnkutay\u002F)                                               | 5,000+         |\n| Hassaan Akbar            | [Hassaan Akbar](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fehassaan)                                              | 5,000+         |\n| Subhankar                | [Subhankar](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fsubhankarumass\u002F)                                            | 5,000+         |\n| Nitin                    | [Nitin](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Ftomernitin29\u002F)                                                        | N\u002FA           |\n| Hassaan                    | [Hassaan](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fshassaan\u002F)                                                        | 5,000+          |\n| Javier de la Torre             | [Javier](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fjavier-de-la-torre-medina)                                                        | 5,000+          |\n\n\n#### X\u002FTwitter\n\n| Name              | X\u002FTwitter Profile                                                 | Follower Count |\n|-------------------|------------------------------------------------------------------|---------------:|\n| ByteByteGo        | [alexxubyte](https:\u002F\u002Ftwitter.com\u002Falexxubyte\u002F)            | 100,000+       |\n| Dan Kornas        | [@dankornas](https:\u002F\u002Fwww.twitter.com\u002Fdankornas)           | 66,000+        |\n| Zach Wilson       | [EcZachly](https:\u002F\u002Fwww.twitter.com\u002FEcZachly)          | 30,000+        |\n| Seattle Data Guy  | 
[SeattleDataGuy](https:\u002F\u002Fwww.twitter.com\u002FSeattleDataGuy)   | 10,000+        |\n| SQLBI             | [marcorus](https:\u002F\u002Fx.com\u002Fmarcorus)                       | 10,000+        |\n| Joseph Machado    | [startdataeng](https:\u002F\u002Ftwitter.com\u002Fstartdataeng)         | 5,000+         |\n| Alex Merced       | [@amdatalakehouse](https:\u002F\u002Fwww.twitter.com\u002Famdatalakehouse)      | N\u002FA           |\n| John Kutay        | [@JohnKutay](https:\u002F\u002Fx.com\u002FJohnKutay)                            | N\u002FA           |\n| Mehdi Ouazza      | [mehd_io](https:\u002F\u002Fx.com\u002Fmehd_io)                                 | N\u002FA           |\n\n\n#### Instagram\n\n| Name           | Instagram Profile                                                                   | Follower Count |\n|----------------|--------------------------------------------------------------------------------------|---------------:|\n| Sundas Khalid  | [sundaskhalidd](https:\u002F\u002Fwww.instagram.com\u002Fsundaskhalidd)                              | 300,000+       |\n| Zach Wilson    | [eczachly](https:\u002F\u002Fwww.instagram.com\u002Feczachly)                             | 150,000+       |\n| Andreas Kretz  | [learndataengineering](https:\u002F\u002Fwww.instagram.com\u002Flearndataengineering)          | 5,000+         |\n| Alex Merced    | [@alexmercedcoder](https:\u002F\u002Fwww.instagram.com\u002Falexmercedcoder)                       | N\u002FA           |\n\n#### TikTok\n\n| Name            | TikTok Profile                                                                   | Follower Count |\n|-----------------|----------------------------------------------------------------------------------|---------------:|\n| Zach Wilson     | [@eczachly](https:\u002F\u002Fwww.tiktok.com\u002F@eczachly)                            | 70,000+        |\n| Alex Freberg    | [@alex_the_analyst](https:\u002F\u002Fwww.tiktok.com\u002F@alex_the_analyst)   
          | 10,000+        |\n| Mehdi Ouazza    | [@mehdio_datatv](https:\u002F\u002Fwww.tiktok.com\u002F@mehdio_datatv)                          | N\u002FA           |\n\n\n### Great Podcasts\n\n- [The Data Engineering Show](https:\u002F\u002Fwww.dataengineeringshow.com\u002F)\n- [Data Engineering Podcast](https:\u002F\u002Fwww.dataengineeringpodcast.com\u002F)\n- [DataTopics](https:\u002F\u002Fwww.datatopics.io\u002F)\n- [The Engineering Side of Data](https:\u002F\u002Fpodcasts.apple.com\u002Fus\u002Fpodcast\u002Fthe-engineering-side-of-data\u002Fid1566999533)\n- [DataAware](https:\u002F\u002Fwww.ascend.io\u002Fdataaware-podcast\u002F)\n- [The Data Coffee Break Podcast](https:\u002F\u002Fwww.deezer.com\u002Fus\u002Fshow\u002F5293247)\n- [The Data Stack Show](https:\u002F\u002Fdatastackshow.com\u002F)\n- [Intricity101 Data Sharks Podcast](https:\u002F\u002Fwww.intricity.com\u002Flearningcenter\u002Fpodcast)\n- [Drill to Detail with Mark Rittman](https:\u002F\u002Fwww.rittmananalytics.com\u002Fdrilltodetail\u002F)\n- [Analytics Power Hour](https:\u002F\u002Fanalyticshour.io\u002F)\n- [Catalog & Cocktails](https:\u002F\u002Flisten.casted.us\u002Fpublic\u002F127\u002FCatalog-%26-Cocktails-2fcf8728)\n- [Datatalks](https:\u002F\u002Fdatatalks.club\u002Fpodcast.html)\n- [Data Brew by Databricks](https:\u002F\u002Fwww.databricks.com\u002Fdiscover\u002Fdata-brew)\n- [The Data Cloud Podcast by Snowflake](https:\u002F\u002Frise-of-the-data-cloud.simplecast.com\u002F)\n- [What's New in Data](https:\u002F\u002Fwww.striim.com\u002Fpodcast\u002F)\n- [Open||Source||Data by Datastax](https:\u002F\u002Fwww.datastax.com\u002Fresources\u002Fpodcast\u002Fopen-source-data)\n- [Streaming Audio by Confluent](https:\u002F\u002Fdeveloper.confluent.io\u002Fpodcast\u002F)\n- [The Data Scientist Show](https:\u002F\u002Fpodcasts.apple.com\u002Fus\u002Fpodcast\u002Fthe-data-scientist-show\u002Fid1584430381)\n- [MLOps.community](https:\u002F\u002Fpodcast.mlops.community\u002F)\n- [Monday 
Morning Data Chat](https:\u002F\u002Fopen.spotify.com\u002Fshow\u002F3Km3lBNzJpc1nOTJUtbtMh)\n- [The Data Chief](https:\u002F\u002Fwww.thoughtspot.com\u002Fdata-chief\u002Fpodcast)\n- [The Joe Reis Show](https:\u002F\u002Fopen.spotify.com\u002Fshow\u002F3mcKitYGS4VMG2eHd2PfDN)\n- [Data Bytes](https:\u002F\u002Fopen.spotify.com\u002Fshow\u002F6VbjON5Ck9QYInBnmoqrDE)\n- [Super Data Science: ML & AI Podcast with Jon Krohn](https:\u002F\u002Fopen.spotify.com\u002Fshow\u002F1n8P7ZSgfVLVJ3GegxPat1)\n\n### Great [list of 20+ newsletters](newsletters.md)\n\nTop must follow newsletters for data engineering:\n- [DataEngineer.io Newsletter](https:\u002F\u002Fblog.dataengineer.io)\n- [Joe Reis](https:\u002F\u002Fjoereis.substack.com)\n- [Start Data Engineering](https:\u002F\u002Fwww.startdataengineering.com)\n- [Data Engineering Weekly](https:\u002F\u002Fwww.dataengineeringweekly.com)\n- [Data Engineer Things](https:\u002F\u002Fdataengineerthings.substack.com\u002F)\n\n### Glossaries:\n- [Data Engineering Vault](https:\u002F\u002Fwww.ssp.sh\u002Fbrain\u002Fdata-engineering\u002F)\n- [Airbyte Data Glossary](https:\u002F\u002Fglossary.airbyte.com\u002F)\n- [Data Engineering Wiki by Reddit](https:\u002F\u002Fdataengineering.wiki\u002FIndex)\n- [Secoda Glossary](https:\u002F\u002Fwww.secoda.co\u002Fglossary\u002F)\n- [Glossary Databricks](https:\u002F\u002Fwww.databricks.com\u002Fglossary)\n- [Airtable Glossary](https:\u002F\u002Fairtable.com\u002FshrGh8BqZbkfkbrfk\u002FtbluZ3ayLHC3CKsDb)\n- [Data Engineering Glossary by Dagster](https:\u002F\u002Fdagster.io\u002Fglossary)\n\n\n### Design Patterns\n\n- [Cumulative Table Design](https:\u002F\u002Fwww.github.com\u002FDataExpert-io\u002Fcumulative-table-design)\n- [Microbatch Deduplication](https:\u002F\u002Fwww.github.com\u002FEcZachly\u002Fmicrobatch-hourly-deduped-tutorial)\n- [The Little Book of Pipelines](https:\u002F\u002Fwww.github.com\u002FEcZachly\u002Flittle-book-of-pipelines)\n- [Data Developer 
Platform](https:\u002F\u002Fdatadeveloperplatform.org\u002Farchitecture\u002F)\n\n### Courses \u002F Academies\n\n- [DataExpert.io course](https:\u002F\u002Fwww.dataexpert.io) Use code **HANDBOOK10** for a discount!\n- [LearnDataEngineering.com](https:\u002F\u002Fwww.learndataengineering.com)\n- [Technical Freelancer Academy](https:\u002F\u002Fwww.technicalfreelanceracademy.com\u002F) Use code **zwtech** for a discount!\n- [IBM Data Engineering for Everyone](https:\u002F\u002Fwww.edx.org\u002Flearn\u002Fdata-engineering\u002Fibm-data-engineering-basics-for-everyone)\n- [Qwiklabs](https:\u002F\u002Fwww.qwiklabs.com\u002F)\n- [DataCamp](https:\u002F\u002Fwww.datacamp.com\u002F)\n- [Udemy Courses from Shruti Mantri](https:\u002F\u002Fwww.udemy.com\u002Fuser\u002Fshruti-mantri-5\u002F)\n- [Rock the JVM](https:\u002F\u002Frockthejvm.com\u002F) teaches Spark (in Scala), Flink and others\n- [Data Engineering Zoomcamp by DataTalksClub](https:\u002F\u002Fdatatalks.club\u002F)\n- [Efficient Data Processing in Spark](https:\u002F\u002Fjosephmachado.podia.com\u002Fefficient-data-processing-in-spark)\n- [Scaler](https:\u002F\u002Fwww.scaler.com\u002F)\n- [DataTeams - Data Engineer hiring platform](https:\u002F\u002Fwww.datateams.ai\u002F)\n- [Udemy Courses from Daniel Blanco](https:\u002F\u002Fdanielblanco.dev\u002Flinks)\n- [DeepLearning.AI Data Engineering Professional Certificate](https:\u002F\u002Fwww.coursera.org\u002Fprofessional-certificates\u002Fdata-engineering)\n\n### Certifications Courses\n\n- [Google Cloud Certified - Professional Data Engineer](https:\u002F\u002Fcloud.google.com\u002Fcertification\u002Fdata-engineer)\n- [Databricks - Certified Associate Developer for Apache Spark](https:\u002F\u002Fwww.databricks.com\u002Flearn\u002Fcertification\u002Fapache-spark-developer-associate)\n- [Databricks - Data Engineer Associate](https:\u002F\u002Fwww.databricks.com\u002Flearn\u002Fcertification\u002Fdata-engineer-associate)\n- [Databricks - Data Engineer 
Professional](https:\u002F\u002Fwww.databricks.com\u002Flearn\u002Fcertification\u002Fdata-engineer-professional)\n- [Microsoft DP-203: Data Engineering on Microsoft Azure](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fcredentials\u002Fcertifications\u002Fexams\u002Fdp-203\u002F?tab=tab-learning-paths)\n- [Microsoft DP-600: Fabric Analytics Engineer Associate](https:\u002F\u002Flearn.microsoft.com\u002Fcredentials\u002Fcertifications\u002Ffabric-analytics-engineer-associate\u002F)\n- [Microsoft DP-700: Fabric Data Engineer Associate](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fcredentials\u002Fcertifications\u002Ffabric-data-engineer-associate\u002F?practice-assessment-type=certification)\n- [AWS Certified Data Engineer - Associate](https:\u002F\u002Faws.amazon.com\u002Fcertification\u002Fcertified-data-engineer-associate\u002F)\n","DataExpert-io\u002Fdata-engineer-handbook is a project that gathers learning resources for the data engineering field. It provides a learning path from beginner to advanced, including free beginner and intermediate boot camps, hands-on projects, interview guides, and curated book recommendations. The project uses Jupyter Notebook as one of its main tools and covers big data processing technologies such as Apache Spark and SQL. It is well suited to technical people who want to learn data engineering systematically and improve their skills; both newcomers to the field and professionals looking to deepen their understanding can benefit.",2,"2026-06-11 02:42:08","top_all"]