[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-5577":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":18,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":21,"hasPages":21,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":30,"readmeContent":31,"aiSummary":32,"trendingCount":15,"starSnapshotCount":15,"syncStatus":33,"lastSyncTime":34,"discoverSource":35},5577,"toydb","erikgrinaker\u002Ftoydb","erikgrinaker","Distributed SQL database in Rust, written as an educational project","",null,"Rust",7244,626,92,0,3,27,1,39.39,"Apache License 2.0",false,"main",[24,25,26,27,28,29],"database","distributed","mvcc","raft","rust","sql","2026-06-12 02:01:12","# \u003Ca>\u003Cimg src=\".\u002Fdocs\u002Farchitecture\u002Fimages\u002Ftoydb.svg\" height=\"40\" valign=\"top\" \u002F>\u003C\u002Fa> toyDB\n\nDistributed SQL database in Rust, built from scratch as an educational project. Main features:\n\n* [Raft distributed consensus][raft] for linearizable state machine replication.\n\n* [ACID transactions][txn] with MVCC-based snapshot isolation.\n\n* [Pluggable storage engine][storage] with [BitCask][bitcask] and [in-memory][memory] backends.\n\n* [Iterator-based query engine][query] with [heuristic optimization][optimizer] and time-travel \n  support.\n\n* [SQL interface][sql] including joins, aggregates, and transactions.\n\ntoyDB is intended to be simple and understandable, and also functional and correct. Other aspects\nlike performance, scalability, and availability are non-goals -- these are major sources of\ncomplexity in production-grade databases, and obscure the basic underlying concepts. 
Shortcuts have\nbeen taken where possible.\n\nI originally wrote toyDB in 2020 to learn more about database internals. Since then, I've spent\nseveral years building real distributed SQL databases at\n[CockroachDB](https:\u002F\u002Fgithub.com\u002Fcockroachdb\u002Fcockroach) and\n[Neon](https:\u002F\u002Fgithub.com\u002Fneondatabase\u002Fneon). Based on this experience, I've rewritten toyDB as a\nsimple illustration of the architecture and concepts behind distributed SQL databases.\n\n[raft]: https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Fblob\u002Fmain\u002Fsrc\u002Fraft\u002Fmod.rs\n[txn]: https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Fblob\u002Fmain\u002Fsrc\u002Fstorage\u002Fmvcc.rs\n[storage]: https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Fblob\u002Fmain\u002Fsrc\u002Fstorage\u002Fengine.rs\n[bitcask]: https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Fblob\u002Fmain\u002Fsrc\u002Fstorage\u002Fbitcask.rs\n[memory]: https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Fblob\u002Fmain\u002Fsrc\u002Fstorage\u002Fmemory.rs\n[query]: https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Fblob\u002Fmain\u002Fsrc\u002Fsql\u002Fexecution\u002Fexecutor.rs\n[optimizer]: https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Fblob\u002Fmain\u002Fsrc\u002Fsql\u002Fplanner\u002Foptimizer.rs\n[sql]: https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Fblob\u002Fmain\u002Fsrc\u002Fsql\u002Fparser\u002Fparser.rs\n\n## Documentation\n\n* [Architecture guide](docs\u002Farchitecture\u002Findex.md): a guided tour of toyDB's code and architecture.\n\n* [SQL examples](docs\u002Fexamples.md): walkthrough of toyDB's SQL features.\n\n* [SQL reference](docs\u002Fsql.md): reference documentation for toyDB's SQL dialect.\n\n* [References](docs\u002Freferences.md): research materials used while building toyDB.\n\n## Usage\n\nWith a [Rust 
compiler](https:\u002F\u002Fwww.rust-lang.org\u002Ftools\u002Finstall) installed, a local five-node \ncluster can be built and started as:\n\n```\n$ .\u002Fcluster\u002Frun.sh\nStarting 5 nodes on ports 9601-9605 with data under cluster\u002F*\u002Fdata\u002F.\nTo connect to node 1, run: cargo run --release --bin toysql\n\ntoydb4 21:03:55 [INFO] Listening on [::1]:9604 (SQL) and [::1]:9704 (Raft)\ntoydb1 21:03:55 [INFO] Listening on [::1]:9601 (SQL) and [::1]:9701 (Raft)\ntoydb2 21:03:55 [INFO] Listening on [::1]:9602 (SQL) and [::1]:9702 (Raft)\ntoydb3 21:03:55 [INFO] Listening on [::1]:9603 (SQL) and [::1]:9703 (Raft)\ntoydb5 21:03:55 [INFO] Listening on [::1]:9605 (SQL) and [::1]:9705 (Raft)\ntoydb2 21:03:56 [INFO] Starting new election for term 1\n[...]\ntoydb2 21:03:56 [INFO] Won election for term 1, becoming leader\n```\n\nA command-line client can be built and used with node 1 on `localhost:9601`:\n\n```\n$ cargo run --release --bin toysql\nConnected to toyDB node n1. Enter !help for instructions.\ntoydb> CREATE TABLE movies (id INTEGER PRIMARY KEY, title VARCHAR NOT NULL);\ntoydb> INSERT INTO movies VALUES (1, 'Sicario'), (2, 'Stalker'), (3, 'Her');\ntoydb> SELECT * FROM movies;\n1, 'Sicario'\n2, 'Stalker'\n3, 'Her'\n```\n\ntoyDB supports most common SQL features, including joins, aggregates, and transactions. 
Below is an\n`EXPLAIN` query plan of a more complex query (fetches all movies from studios that have released any\nmovie with an IMDb rating of 8 or more):\n\n```\ntoydb> EXPLAIN SELECT m.title, g.name AS genre, s.name AS studio, m.rating\n  FROM movies m JOIN genres g ON m.genre_id = g.id,\n    studios s JOIN movies good ON good.studio_id = s.id AND good.rating >= 8\n  WHERE m.studio_id = s.id\n  GROUP BY m.title, g.name, s.name, m.rating, m.released\n  ORDER BY m.rating DESC, m.released ASC, m.title ASC;\n\nRemap: m.title, genre, studio, m.rating (dropped: m.released)\n└─ Order: m.rating desc, m.released asc, m.title asc\n   └─ Projection: m.title, g.name as genre, s.name as studio, m.rating, m.released\n      └─ Aggregate: m.title, g.name, s.name, m.rating, m.released\n         └─ HashJoin: inner on m.studio_id = s.id\n            ├─ HashJoin: inner on m.genre_id = g.id\n            │  ├─ Scan: movies as m\n            │  └─ Scan: genres as g\n            └─ HashJoin: inner on s.id = good.studio_id\n               ├─ Scan: studios as s\n               └─ Scan: movies as good (good.rating > 8 OR good.rating = 8)\n```\n\n## Architecture\n\ntoyDB's architecture is fairly typical for a distributed SQL database: a transactional\nkey\u002Fvalue store managed by a Raft cluster with a SQL query engine on top. See the\n[architecture guide](.\u002Fdocs\u002Farchitecture\u002Findex.md) for more details.\n\n[![toyDB architecture](.\u002Fdocs\u002Farchitecture\u002Fimages\u002Farchitecture.svg)](.\u002Fdocs\u002Farchitecture\u002Findex.md)\n\n## Tests\n\ntoyDB mainly uses [Goldenscripts](https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Fgoldenscript) for tests. These \nscript various scenarios, capture events and output, and later assert that the behavior remains the \nsame. 
See e.g.:\n\n* [Raft cluster tests](https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Ftree\u002Fmain\u002Fsrc\u002Fraft\u002Ftestscripts\u002Fnode)\n* [MVCC transaction tests](https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Ftree\u002Fmain\u002Fsrc\u002Fstorage\u002Ftestscripts\u002Fmvcc)\n* [SQL execution tests](https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Ftree\u002Fmain\u002Fsrc\u002Fsql\u002Ftestscripts)\n* [End-to-end tests](https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Ftree\u002Fmain\u002Ftests\u002Fscripts)\n\nRun tests with `cargo test`, or have a look at the latest \n[CI run](https:\u002F\u002Fgithub.com\u002Ferikgrinaker\u002Ftoydb\u002Factions\u002Fworkflows\u002Fci.yml).\n\n## Benchmarks\n\ntoyDB is not optimized for performance, but comes with a `workload` benchmark tool that can run \nvarious workloads against a toyDB cluster. For example:\n\n```sh\n# Start a 5-node toyDB cluster.\n$ .\u002Fcluster\u002Frun.sh\n[...]\n\n# Run a read-only benchmark via all 5 nodes.\n$ cargo run --release --bin workload read\nPreparing initial dataset... done (0.179s)\nSpawning 16 workers... done (0.006s)\nRunning workload read (rows=1000 size=64 batch=1)...\n\nTime   Progress     Txns      Rate       p50       p90       p99      pMax\n1.0s      13.1%    13085   13020\u002Fs     1.3ms     1.5ms     1.9ms     8.4ms\n2.0s      27.2%    27183   13524\u002Fs     1.3ms     1.5ms     1.8ms     8.4ms\n3.0s      41.3%    41301   13702\u002Fs     1.2ms     1.5ms     1.8ms     8.4ms\n4.0s      55.3%    55340   13769\u002Fs     1.2ms     1.5ms     1.8ms     8.4ms\n5.0s      70.0%    70015   13936\u002Fs     1.2ms     1.5ms     1.8ms     8.4ms\n6.0s      84.7%    84663   14047\u002Fs     1.2ms     1.4ms     1.8ms     8.4ms\n7.0s      99.6%    99571   14166\u002Fs     1.2ms     1.4ms     1.7ms     8.4ms\n7.1s     100.0%   100000   14163\u002Fs     1.2ms     1.4ms     1.7ms     8.4ms\n\nVerifying dataset... 
done (0.002s)\n```\n\nThe available workloads are:\n\n* `read`: single-row primary key lookups.\n* `write`: single-row inserts to sequential primary keys.\n* `bank`: bank transfers between various customers and accounts. To make things interesting, this\n  includes joins, secondary indexes, sorting, and conflicts.\n\nFor more information about workloads and parameters, run `cargo run --bin workload -- --help`.\n\nExample workload results are listed below. Write performance is atrocious, due to\n[fsync](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FSync_(Unix)) and a lack of write batching in the Raft layer.\nDisabling fsync, or using the in-memory engine, significantly improves write performance (at the\nexpense of durability).\n\n| Workload | BitCask     | BitCask w\u002Fo fsync | Memory      |\n|----------|-------------|-------------------|-------------|\n| `read`   | 14163 txn\u002Fs | 13941 txn\u002Fs       | 13949 txn\u002Fs |\n| `write`  | 35 txn\u002Fs    | 4719 txn\u002Fs        | 7781 txn\u002Fs  |\n| `bank`   | 21 txn\u002Fs    | 1120 txn\u002Fs        | 1346 txn\u002Fs  |\n\n## Debugging\n\n[VSCode](https:\u002F\u002Fcode.visualstudio.com) and the [CodeLLDB](https:\u002F\u002Fmarketplace.visualstudio.com\u002Fitems?itemName=vadimcn.vscode-lldb)\nextension can be used to debug toyDB, with the debug configuration under `.vscode\u002Flaunch.json`.\n\nUnder the \"Run and Debug\" tab, select e.g. \"Debug executable 'toydb'\" or \"Debug unit tests in\nlibrary 'toydb'\".\n\n## Credits\n\nThe toyDB logo is courtesy of [@jonasmerlin](https:\u002F\u002Fgithub.com\u002Fjonasmerlin).","toyDB is a distributed SQL database written in Rust, intended as an educational project. It implements linearizable state machine replication based on the Raft consensus algorithm, supports ACID transactions with MVCC snapshot isolation, and provides a pluggable storage engine (with BitCask and in-memory backends). The project also has an iterator-based query engine with heuristic optimization and time-travel support, as well as a complete SQL interface covering joins, aggregates, and transactions. Because it was designed to be simple and to make database internals easy to understand, it is not heavily optimized for performance, scalability, or availability, and is better suited as a reference for teaching or for individual study of database system architecture.",2,"2026-06-11 03:04:10","top_language"]