[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-5493":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":14,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":23,"defaultBranch":24,"hasWiki":22,"hasPages":23,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":30,"readmeContent":31,"aiSummary":32,"trendingCount":16,"starSnapshotCount":16,"syncStatus":17,"lastSyncTime":33,"discoverSource":34},5493,"xsv","BurntSushi\u002Fxsv","BurntSushi","A fast CSV command line toolkit written in Rust.","",null,"Rust",10749,327,1,131,0,2,6,3,42.55,"The Unlicense",true,false,"master",[26,27,28,29],"cli","command-line","csv","rust","2026-06-12 02:01:11","# `xsv` is now unmaintained\n\nIn lieu of `xsv`, I'd recommend either\n[qsv](https:\u002F\u002Fgithub.com\u002Fdathere\u002Fqsv)\nor\n[xan](https:\u002F\u002Fgithub.com\u002Fmedialab\u002Fxan).\n\n-------------------------------------------------------------------------------\n\n\nxsv is a command line program for indexing, slicing, analyzing, splitting\nand joining CSV files. Commands should be simple, fast and composable:\n\n1. Simple tasks should be easy.\n2. Performance trade offs should be exposed in the CLI interface.\n3. 
Composition should not come at the expense of performance.\n\nThis README contains information on how to\n[install `xsv`](https:\u002F\u002Fgithub.com\u002FBurntSushi\u002Fxsv#installation), in addition to\na quick tour of several commands.\n\n[![Linux build status](https:\u002F\u002Fapi.travis-ci.org\u002FBurntSushi\u002Fxsv.svg)](https:\u002F\u002Ftravis-ci.org\u002FBurntSushi\u002Fxsv)\n[![Windows build status](https:\u002F\u002Fci.appveyor.com\u002Fapi\u002Fprojects\u002Fstatus\u002Fgithub\u002FBurntSushi\u002Fxsv?svg=true)](https:\u002F\u002Fci.appveyor.com\u002Fproject\u002FBurntSushi\u002Fxsv)\n[![](https:\u002F\u002Fmeritbadge.herokuapp.com\u002Fxsv)](https:\u002F\u002Fcrates.io\u002Fcrates\u002Fxsv)\n\nDual-licensed under MIT or the [UNLICENSE](https:\u002F\u002Funlicense.org).\n\n\n### Available commands\n\n* **cat** - Concatenate CSV files by row or by column.\n* **count** - Count the rows in a CSV file. (Instantaneous with an index.)\n* **fixlengths** - Force a CSV file to have same-length records by either\n  padding or truncating them.\n* **flatten** - A flattened view of CSV records. Useful for viewing one record\n  at a time. e.g., `xsv slice -i 5 data.csv | xsv flatten`.\n* **fmt** - Reformat CSV data with different delimiters, record terminators\n  or quoting rules. (Supports ASCII delimited data.)\n* **frequency** - Build frequency tables of each column in CSV data. (Uses\n  parallelism to go faster if an index is present.)\n* **headers** - Show the headers of CSV data. Or show the intersection of all\n  headers between many CSV files.\n* **index** - Create an index for a CSV file. This is very quick and provides\n  constant time indexing into the CSV file.\n* **input** - Read CSV data with exotic quoting\u002Fescaping rules.\n* **join** - Inner, outer and cross joins. 
Uses a simple hash index to make it\n  fast.\n* **partition** - Partition CSV data based on a column value.\n* **sample** - Randomly draw rows from CSV data using reservoir sampling (i.e.,\n  use memory proportional to the size of the sample).\n* **reverse** - Reverse order of rows in CSV data.\n* **search** - Run a regex over CSV data. Applies the regex to each field\n  individually and shows only matching rows.\n* **select** - Select or re-order columns from CSV data.\n* **slice** - Slice rows from any part of a CSV file. When an index is present,\n  this only has to parse the rows in the slice (instead of all rows leading up\n  to the start of the slice).\n* **sort** - Sort CSV data.\n* **split** - Split one CSV file into many CSV files of N chunks.\n* **stats** - Show basic types and statistics of each column in the CSV file.\n  (i.e., mean, standard deviation, median, range, etc.)\n* **table** - Show aligned output of any CSV data using\n  [elastic tabstops](https:\u002F\u002Fgithub.com\u002FBurntSushi\u002Ftabwriter).\n\n\n### A whirlwind tour\n\nLet's say you're playing with some of the data from the\n[Data Science Toolkit](https:\u002F\u002Fgithub.com\u002Fpetewarden\u002Fdstkdata), which contains\nseveral CSV files. Maybe you're interested in the population counts of each\ncity in the world. So grab the data and start examining it:\n\n```bash\n$ curl -LO https:\u002F\u002Fburntsushi.net\u002Fstuff\u002Fworldcitiespop.csv\n$ xsv headers worldcitiespop.csv\n1   Country\n2   City\n3   AccentCity\n4   Region\n5   Population\n6   Latitude\n7   Longitude\n```\n\nThe next thing you might want to do is get an overview of the kind of data that\nappears in each column. 
The `stats` command will do this for you:\n\n```bash\n$ xsv stats worldcitiespop.csv --everything | xsv table\nfield       type     min            max            min_length  max_length  mean          stddev         median     mode         cardinality\nCountry     Unicode  ad             zw             2           2                                                   cn           234\nCity        Unicode   bab el ahmar  Þykkvibaer     1           91                                                  san jose     2351892\nAccentCity  Unicode   Bâb el Ahmar  ïn Bou Chella  1           91                                                  San Antonio  2375760\nRegion      Unicode  00             Z9             0           2                                        13         04           397\nPopulation  Integer  7              31480498       0           8           47719.570634  302885.559204  10779                   28754\nLatitude    Float    -54.933333     82.483333      1           12          27.188166     21.952614      32.497222  51.15        1038349\nLongitude   Float    -179.983333    180            1           14          37.08886      63.22301       35.28      23.8         1167162\n```\n\nThe `xsv table` command takes any CSV data and formats it into aligned columns\nusing [elastic tabstops](https:\u002F\u002Fgithub.com\u002FBurntSushi\u002Ftabwriter). You'll\nnotice that it even gets alignment right with respect to Unicode characters.\n\nSo, this command takes about 12 seconds to run on my machine, but we can speed\nit up by creating an index and re-running the command:\n\n```bash\n$ xsv index worldcitiespop.csv\n$ xsv stats worldcitiespop.csv --everything | xsv table\n...\n```\n\nWhich cuts it down to about 8 seconds on my machine. 
(And creating the index\ntakes less than 2 seconds.)\n\nNotably, the same type of \"statistics\" command in another\n[CSV command line toolkit](https:\u002F\u002Fcsvkit.readthedocs.io\u002F)\ntakes about 2 minutes to produce similar statistics on the same data set.\n\nCreating an index gives us more than just faster statistics gathering. It also\nmakes slice operations extremely fast because *only the sliced portion* has to\nbe parsed. For example, let's say you wanted to grab the last 10 records:\n\n```bash\n$ xsv count worldcitiespop.csv\n3173958\n$ xsv slice worldcitiespop.csv -s 3173948 | xsv table\nCountry  City               AccentCity         Region  Population  Latitude     Longitude\nzw       zibalonkwe         Zibalonkwe         06                  -19.8333333  27.4666667\nzw       zibunkululu        Zibunkululu        06                  -19.6666667  27.6166667\nzw       ziga               Ziga               06                  -19.2166667  27.4833333\nzw       zikamanas village  Zikamanas Village  00                  -18.2166667  27.95\nzw       zimbabwe           Zimbabwe           07                  -20.2666667  30.9166667\nzw       zimre park         Zimre Park         04                  -17.8661111  31.2136111\nzw       ziyakamanas        Ziyakamanas        00                  -18.2166667  27.95\nzw       zizalisari         Zizalisari         04                  -17.7588889  31.0105556\nzw       zuzumba            Zuzumba            06                  -20.0333333  27.9333333\nzw       zvishavane         Zvishavane         07      79876       -20.3333333  30.0333333\n```\n\nThese commands are *instantaneous* because they run in time and memory\nproportional to the size of the slice (which means they will scale to\narbitrarily large CSV data).\n\nSwitching gears a little bit, you might not always want to see every column in\nthe CSV data. In this case, maybe we only care about the country, city and\npopulation. 
So let's take a look at 10 random rows:\n\n```bash\n$ xsv select Country,AccentCity,Population worldcitiespop.csv \\\n  | xsv sample 10 \\\n  | xsv table\nCountry  AccentCity       Population\ncn       Guankoushang\nza       Klipdrift\nma       Ouled Hammou\nfr       Les Gravues\nla       Ban Phadèng\nde       Lüdenscheid      80045\nqa       Umm ash Shubrum\nbd       Panditgoan\nus       Appleton\nua       Lukashenkivske\n```\n\nWhoops! It seems some cities don't have population counts. How pervasive is\nthat?\n\n```bash\n$ xsv frequency worldcitiespop.csv --limit 5\nfield,value,count\nCountry,cn,238985\nCountry,ru,215938\nCountry,id,176546\nCountry,us,141989\nCountry,ir,123872\nCity,san jose,328\nCity,san antonio,320\nCity,santa rosa,296\nCity,santa cruz,282\nCity,san juan,255\nAccentCity,San Antonio,317\nAccentCity,Santa Rosa,296\nAccentCity,Santa Cruz,281\nAccentCity,San Juan,254\nAccentCity,San Miguel,254\nRegion,04,159916\nRegion,02,142158\nRegion,07,126867\nRegion,03,122161\nRegion,05,118441\nPopulation,(NULL),3125978\nPopulation,2310,12\nPopulation,3097,11\nPopulation,983,11\nPopulation,2684,11\nLatitude,51.15,777\nLatitude,51.083333,772\nLatitude,50.933333,769\nLatitude,51.116667,769\nLatitude,51.133333,767\nLongitude,23.8,484\nLongitude,23.2,477\nLongitude,23.05,476\nLongitude,25.3,474\nLongitude,23.1,459\n```\n\n(The `xsv frequency` command builds a frequency table for each column in the\nCSV data. This one only took 5 seconds.)\n\nSo it seems that most cities do not have a population count associated with\nthem at all. 
No matter: we can adjust our previous command so that it only\nshows rows with a population count:\n\n```bash\n$ xsv search -s Population '[0-9]' worldcitiespop.csv \\\n  | xsv select Country,AccentCity,Population \\\n  | xsv sample 10 \\\n  | xsv table\nCountry  AccentCity       Population\nes       Barañáin         22264\nes       Puerto Real      36946\nat       Moosburg         4602\nhu       Hejobaba         1949\nru       Polyarnyye Zori  15092\ngr       Kandíla          1245\nis       Ólafsvík         992\nhu       Decs             4210\nbg       Sliven           94252\ngb       Leatherhead      43544\n```\n\nErk. Which country is `at`? No clue, but the Data Science Toolkit has a CSV\nfile called `countrynames.csv`. Let's grab it and do a join so we can see which\ncountries these are (this assumes the sampled rows above were saved to\n`sample.csv`):\n\n```bash\n$ curl -LO https:\u002F\u002Fgist.githubusercontent.com\u002Fanonymous\u002F063cb470e56e64e98cf1\u002Fraw\u002F98e2589b801f6ca3ff900b01a87fbb7452eb35c7\u002Fcountrynames.csv\n$ xsv headers countrynames.csv\n1   Abbrev\n2   Country\n$ xsv join --no-case Country sample.csv Abbrev countrynames.csv | xsv table\nCountry  AccentCity       Population  Abbrev  Country\nes       Barañáin         22264       ES      Spain\nes       Puerto Real      36946       ES      Spain\nat       Moosburg         4602        AT      Austria\nhu       Hejobaba         1949        HU      Hungary\nru       Polyarnyye Zori  15092       RU      Russian Federation | Russia\ngr       Kandíla          1245        GR      Greece\nis       Ólafsvík         992         IS      Iceland\nhu       Decs             4210        HU      Hungary\nbg       Sliven           94252       BG      Bulgaria\ngb       Leatherhead      43544       GB      Great Britain | UK | England | Scotland | Wales | Northern Ireland | United Kingdom\n```\n\nWhoops, now we have two columns called `Country` and an `Abbrev` column that we\nno longer need. 
This is easy to fix by re-ordering columns with the `xsv\nselect` command:\n\n```bash\n$ xsv join --no-case  Country sample.csv Abbrev countrynames.csv \\\n  | xsv select 'Country[1],AccentCity,Population' \\\n  | xsv table\nCountry                                                                              AccentCity       Population\nSpain                                                                                Barañáin         22264\nSpain                                                                                Puerto Real      36946\nAustria                                                                              Moosburg         4602\nHungary                                                                              Hejobaba         1949\nRussian Federation | Russia                                                          Polyarnyye Zori  15092\nGreece                                                                               Kandíla          1245\nIceland                                                                              Ólafsvík         992\nHungary                                                                              Decs             4210\nBulgaria                                                                             Sliven           94252\nGreat Britain | UK | England | Scotland | Wales | Northern Ireland | United Kingdom  Leatherhead      43544\n```\n\nPerhaps we can do this with the original CSV data? 
Indeed we can—because\njoins in `xsv` are fast.\n\n```bash\n$ xsv join --no-case Abbrev countrynames.csv Country worldcitiespop.csv \\\n  | xsv select '!Abbrev,Country[1]' \\\n  > worldcitiespop_countrynames.csv\n$ xsv sample 10 worldcitiespop_countrynames.csv | xsv table\nCountry                      City                   AccentCity             Region  Population  Latitude    Longitude\nSri Lanka                    miriswatte             Miriswatte             36                  7.2333333   79.9\nRomania                      livezile               Livezile               26      1985        44.512222   22.863333\nIndonesia                    tawainalu              Tawainalu              22                  -4.0225     121.9273\nRussian Federation | Russia  otar                   Otar                   45                  56.975278   48.305278\nFrance                       le breuil-bois robert  le Breuil-Bois Robert  A8                  48.945567   1.717026\nFrance                       lissac                 Lissac                 B1                  45.103094   1.464927\nAlbania                      lumalasi               Lumalasi               46                  40.6586111  20.7363889\nChina                        motzushih              Motzushih              11                  27.65       111.966667\nRussian Federation | Russia  svakino                Svakino                69                  55.60211    34.559785\nRomania                      tirgu pancesti         Tirgu Pancesti         38                  46.216667   27.1\n```\n\nThe `!Abbrev,Country[1]` syntax means, \"remove the `Abbrev` column and remove\nthe second occurrence of the `Country` column.\" Since we joined with\n`countrynames.csv` first, the first `Country` name (fully expanded) is now\nincluded in the CSV data.\n\nThis `xsv join` command takes about 7 seconds on my machine. The performance\ncomes from constructing a very simple hash index of one of the CSV data files\ngiven. 
The `join` command does an inner join by default, but it also supports left,\nright and full outer joins.\n\n\n### Installation\n\nBinaries for Windows, Linux and macOS are available [from GitHub](https:\u002F\u002Fgithub.com\u002FBurntSushi\u002Fxsv\u002Freleases\u002Flatest).\n\nIf you're a **macOS Homebrew** user, then you can install xsv\nfrom homebrew-core:\n\n```\n$ brew install xsv\n```\n\nIf you're a **macOS MacPorts** user, then you can install xsv\nfrom the [official ports](https:\u002F\u002Fwww.macports.org\u002Fports.php?by=name&substr=xsv):\n\n```\n$ sudo port install xsv\n```\n\nIf you're a **Nix\u002FNixOS** user, you can install xsv from nixpkgs:\n\n```\n$ nix-env -i xsv\n```\n\nAlternatively, you can compile from source by\n[installing Cargo](https:\u002F\u002Fcrates.io\u002Finstall)\n([Rust's](https:\u002F\u002Fwww.rust-lang.org\u002F) package manager)\nand installing `xsv` using Cargo:\n\n```bash\ncargo install xsv\n```\n\nCompiling from this repository works similarly:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FBurntSushi\u002Fxsv\ncd xsv\ncargo build --release\n```\n\nCompilation will probably take a few minutes depending on your machine. The\nbinary will end up in `.\u002Ftarget\u002Frelease\u002Fxsv`.\n\n\n### Benchmarks\n\nI've compiled some [very rough\nbenchmarks](https:\u002F\u002Fgithub.com\u002FBurntSushi\u002Fxsv\u002Fblob\u002Fmaster\u002FBENCHMARKS.md) of\nvarious `xsv` commands.\n\n\n### Motivation\n\nHere are several valid criticisms of this project:\n\n1. You shouldn't be working with CSV data because CSV is a terrible format.\n2. If your data is gigabytes in size, then CSV is the wrong storage type.\n3. Various SQL databases provide all of the operations available in `xsv` with\n   more sophisticated indexing support. And the performance is a zillion times\n   better.\n\nI'm sure there are more criticisms, but the impetus for this project was a 40GB\nCSV file that was handed to me. 
I was tasked with figuring out the shape of the\ndata inside of it and coming up with a way to integrate it into our existing\nsystem. It was then that I realized that every single CSV tool I knew about was\nwoefully inadequate. They were just too slow or didn't provide enough\nflexibility. (Another project I had consisted of a few dozen CSV files. They\nwere smaller than 40GB, but they were each supposed to represent the same kind\nof data. But they all had different and unintuitive column names. Useful\nCSV inspection tools were critical here, and they had to be reasonably fast.)\n\nThe key ingredients for helping me with my task were indexing, random sampling,\nsearching, slicing and selecting columns. All of these things made dealing with\n40GB of CSV data (or dozens of CSV files) a bit more manageable.\n\nGetting handed a large CSV file *once* was enough to launch me on this quest.\nFrom conversations I've had with others, CSV data files this large don't seem\nto be rare. Therefore, I believe there is room for a tool that has a\nhope of dealing with data that large.\n\n\n### Naming collision\n\nThis project is unrelated to another similar project with the same name:\nhttps:\u002F\u002Fmj.ucw.cz\u002Fsw\u002Fxsv\u002F\n","xsv is a fast CSV command line toolkit written in Rust for indexing, slicing, analyzing, splitting and joining CSV files. Its core features include statistics, sorting and filtering, and its commands are designed to be composed simply and efficiently on the command line. The tool aims to make simple tasks easy without sacrificing performance. It suits scenarios that require fast processing and analysis of large amounts of CSV data, such as data cleaning, preprocessing or routine data management. Although the project is no longer maintained, its design philosophy and technical implementation remain worth studying.","2026-06-11 03:03:39","top_language"]