[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-1050":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":25,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":16,"starSnapshotCount":16,"syncStatus":37,"lastSyncTime":38,"discoverSource":39},1050,"spark","apache\u002Fspark","apache","Apache Spark - A unified analytics engine for large-scale data processing","https:\u002F\u002Fspark.apache.org\u002F",null,"Scala",43441,29218,1987,49,0,1,48,193,14,102,"Apache License 2.0",false,"master",true,[27,28,29,30,31,32,5,33],"big-data","java","jdbc","python","r","scala","sql","2026-06-12 04:00:07","# Apache Spark\n\nSpark is a unified analytics engine for large-scale data processing. It provides\nhigh-level APIs in Scala, Java, Python, and R (Deprecated), and an optimized engine that\nsupports general computation graphs for data analysis. 
It also supports a\nrich set of higher-level tools including Spark SQL for SQL and DataFrames,\npandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing,\nand Structured Streaming for stream processing.\n\n- Official version: \u003Chttps:\u002F\u002Fspark.apache.org\u002F>\n- Development version: \u003Chttps:\u002F\u002Fapache.github.io\u002Fspark\u002F>\n\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache%202.0-blue.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FApache-2.0)\n[![Maven Central](https:\u002F\u002Fimg.shields.io\u002Fmaven-central\u002Fv\u002Forg.apache.spark\u002Fspark-core_2.13.svg?filter=!*preview*)](https:\u002F\u002Fsearch.maven.org\u002Fsearch?q=g:org.apache.spark)\n[![Java](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FJava-17+-orange.svg)](https:\u002F\u002Fadoptium.net\u002Ftemurin\u002Freleases\u002F?version=17)\n[![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_main.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_main.yml)\n[![PySpark Coverage](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fapache\u002Fspark\u002Fbranch\u002Fmaster\u002Fgraph\u002Fbadge.svg)](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fapache\u002Fspark)\n[![PyPI Downloads](https:\u002F\u002Fstatic.pepy.tech\u002Fpersonalized-badge\u002Fpyspark?period=month&units=international_system&left_color=black&right_color=orange&left_text=PyPI%20downloads)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpyspark\u002F)\n\n\n## Online Documentation\n\nYou can find the latest Spark documentation, including a programming\nguide, on the [project web page](https:\u002F\u002Fspark.apache.org\u002Fdocumentation.html).\nThis README file only contains basic setup instructions.\n\n## Build Pipeline Status\n\n| Branch     | Status                                                                     
                                                                                                                                     |\n|------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| master     | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Frelease.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Frelease.yml)                                               |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fpublish_snapshot.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fpublish_snapshot.yml)                             |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_infra_images_cache.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_infra_images_cache.yml)             |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_java21.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_java21.yml)                                     |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_java25.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_java25.yml)                                     |\n|            | [![GitHub Actions 
Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_non_ansi.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_non_ansi.yml)                                 |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_uds.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_uds.yml)                                           |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_rockdb_as_ui_backend.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_rockdb_as_ui_backend.yml)         |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_maven.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_maven.yml)                                       |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_maven_java21.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_maven_java21.yml)                         |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_maven_java25.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_maven_java25.yml)                         |\n|            | [![GitHub Actions 
Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_maven_java21_macos26.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_maven_java21_macos26.yml)         |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_maven_java21_arm.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_maven_java21_arm.yml)                 |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_coverage.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_coverage.yml)                                 |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.10.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.10.yml)                           |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.11.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.11.yml)                           |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.12_classic_only.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.12_classic_only.yml) |\n|            | [![GitHub Actions 
Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.12_arm.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.12_arm.yml)                   |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.12_macos26.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.12_macos26.yml)           |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.12_pandas_3.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.12_pandas_3.yml)         |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.13.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.13.yml)                           |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.14.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.14.yml)                           |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.14_nogil.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_3.14_nogil.yml)               |\n|            | [![GitHub Actions 
Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_minimum.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_minimum.yml)                     |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_ps_minimum.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_ps_minimum.yml)               |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_connect40.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_connect40.yml)                 |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_connect.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_python_connect.yml)                     |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_sparkr_window.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_sparkr_window.yml)                       |\n| branch-4.x | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x.yml)                                 |\n|            | [![GitHub Actions 
Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_java21.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_java21.yml)                   |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_non_ansi.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_non_ansi.yml)               |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_maven.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_maven.yml)                     |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_maven_java21.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_maven_java21.yml)       |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_python.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_python.yml)                   |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_python_pypy3.10.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_python_pypy3.10.yml) |\n|            | [![GitHub Actions 
Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_python_3.14.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch4x_python_3.14.yml)           |\n| branch-4.2 | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42.yml)                                 |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_java21.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_java21.yml)                   |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_non_ansi.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_non_ansi.yml)               |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_maven.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_maven.yml)                     |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_maven_java21.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_maven_java21.yml)       |\n|            | [![GitHub Actions 
Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_python.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_python.yml)                   |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_python_pypy3.10.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_python_pypy3.10.yml) |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_python_3.14.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch42_python_3.14.yml)           |\n| branch-4.1 | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41.yml)                                 |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_java21.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_java21.yml)                   |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_non_ansi.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_non_ansi.yml)               |\n|            | [![GitHub Actions 
Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_maven.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_maven.yml)                     |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_maven_java21.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_maven_java21.yml)       |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_python.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_python.yml)                   |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_python_pypy3.10.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch41_python_pypy3.10.yml) |\n| branch-4.0 | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40.yml)                                 |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_java21.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_java21.yml)                   |\n|            | [![GitHub Actions 
Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_non_ansi.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_non_ansi.yml)               |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_maven.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_maven.yml)                     |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_maven_java21.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_maven_java21.yml)       |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_python.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_python.yml)                   |\n|            | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_python_pypy3.10.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch40_python_pypy3.10.yml) |\n| branch-3.5 | [![GitHub Actions Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch35.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch35.yml)                                 |\n|            | [![GitHub Actions 
Build](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch35_python.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fapache\u002Fspark\u002Factions\u002Fworkflows\u002Fbuild_branch35_python.yml)                   |\n\n\n## Building Spark\n\nSpark is built using [Apache Maven](https:\u002F\u002Fmaven.apache.org\u002F).\nTo build Spark and its example programs, run:\n\n```bash\n.\u002Fbuild\u002Fmvn -DskipTests clean package\n```\n\n(You do not need to do this if you downloaded a pre-built package.)\n\nMore detailed documentation is available from the project site, at\n[\"Building Spark\"](https:\u002F\u002Fspark.apache.org\u002Fdocs\u002Flatest\u002Fbuilding-spark.html).\n\nFor general development tips, including info on developing Spark using an IDE, see [\"Useful Developer Tools\"](https:\u002F\u002Fspark.apache.org\u002Fdeveloper-tools.html).\n\n## Interactive Scala Shell\n\nThe easiest way to start using Spark is through the Scala shell:\n\n```bash\n.\u002Fbin\u002Fspark-shell\n```\n\nTry the following command, which should return 1,000,000,000:\n\n```scala\nscala> spark.range(1000 * 1000 * 1000).count()\n```\n\n## Interactive Python Shell\n\nAlternatively, if you prefer Python, you can use the Python shell:\n\n```bash\n.\u002Fbin\u002Fpyspark\n```\n\nAnd run the following command, which should also return 1,000,000,000:\n\n```python\n>>> spark.range(1000 * 1000 * 1000).count()\n```\n\n## Example Programs\n\nSpark also comes with several sample programs in the `examples` directory.\nTo run one of them, use `.\u002Fbin\u002Frun-example \u003Cclass> [params]`. For example:\n\n```bash\n.\u002Fbin\u002Frun-example SparkPi\n```\n\nwill run the Pi example locally.\n\nYou can set the MASTER environment variable when running examples to submit\nexamples to a cluster. 
This can be a spark:\u002F\u002F URL,\n\"yarn\" to run on YARN, \"local\" to run\nlocally with one thread, or \"local[N]\" to run locally with N threads. You\ncan also use an abbreviated class name if the class is in the `examples`\npackage. For instance:\n\n```bash\nMASTER=spark:\u002F\u002Fhost:7077 .\u002Fbin\u002Frun-example SparkPi\n```\n\nMany of the example programs print usage help if no params are given.\n\n## Running Tests\n\nTesting first requires [building Spark](#building-spark). Once Spark is built, tests\ncan be run using:\n\n```bash\n.\u002Fdev\u002Frun-tests\n```\n\nPlease see the guidance on how to\n[run tests for a module, or individual tests](https:\u002F\u002Fspark.apache.org\u002Fdeveloper-tools.html#individual-tests).\n\nThere is also a Kubernetes integration test; see `resource-managers\u002Fkubernetes\u002Fintegration-tests\u002FREADME.md`.\n\n## A Note About Hadoop Versions\n\nSpark uses the Hadoop core library to talk to HDFS and other Hadoop-supported\nstorage systems. 
Because the protocols have changed in different versions of\nHadoop, you must build Spark against the same version that your cluster runs.\n\nPlease refer to the build documentation at\n[\"Specifying the Hadoop Version and Enabling YARN\"](https:\u002F\u002Fspark.apache.org\u002Fdocs\u002Flatest\u002Fbuilding-spark.html#specifying-the-hadoop-version-and-enabling-yarn)\nfor detailed guidance on building for a particular distribution of Hadoop, including\nbuilding for particular Hive and Hive Thriftserver distributions.\n\n## Configuration\n\nPlease refer to the [Configuration Guide](https:\u002F\u002Fspark.apache.org\u002Fdocs\u002Flatest\u002Fconfiguration.html)\nin the online documentation for an overview of how to configure Spark.\n\n## Contributing\n\nPlease review the [Contribution to Spark guide](https:\u002F\u002Fspark.apache.org\u002Fcontributing.html)\nfor information on how to get started contributing to the project.\n","Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R (deprecated), along with an optimized engine that supports general computation graphs for data analysis. Spark also includes a set of higher-level tools: Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing. These features make Spark well suited to applications that need to process large volumes of data efficiently and run complex analyses, such as big-data analytics and machine learning model training and prediction.",2,"2026-06-11 02:41:18","top_all"]