[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-3561":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":14,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":24,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":28,"readmeContent":29,"aiSummary":30,"trendingCount":16,"starSnapshotCount":16,"syncStatus":13,"lastSyncTime":31,"discoverSource":32},3561,"mori","shikokuchuo\u002Fmori","shikokuchuo","Shared Memory for R Objects","https:\u002F\u002Fshikokuchuo.net\u002Fmori\u002F",null,"C",136,2,3,1,0,4,21,9,51.53,"Other",false,"main",true,[26,27],"r","shared-memory","2026-06-12 04:00:18","\n\u003C!-- README.md is generated from README.Rmd. 
Please edit that file -->\n\n# mori\n\n\u003C!-- badges: start -->\n\n[![CRAN\nstatus](https:\u002F\u002Fwww.r-pkg.org\u002Fbadges\u002Fversion\u002Fmori)](https:\u002F\u002FCRAN.R-project.org\u002Fpackage=mori)\n[![R-CMD-check](https:\u002F\u002Fgithub.com\u002Fshikokuchuo\u002Fmori\u002Factions\u002Fworkflows\u002FR-CMD-check.yaml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fshikokuchuo\u002Fmori\u002Factions\u002Fworkflows\u002FR-CMD-check.yaml)\n[![Codecov test\ncoverage](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fshikokuchuo\u002Fmori\u002Fgraph\u002Fbadge.svg)](https:\u002F\u002Fapp.codecov.io\u002Fgh\u002Fshikokuchuo\u002Fmori)\n\u003C!-- badges: end -->\n\n      ________\n     \u002F\\ mori  \\\n    \u002F  \\       \\\n    \\  \u002F  森   \u002F\n     \\\u002F_______\u002F\n\nShared Memory for R Objects\n\n→ `share()` writes an R object into shared memory and returns a shared\nversion\n\n→ Compact ALTREP serialization — shared objects travel transparently\nthrough `serialize()` and `mirai()`\n\n→ Lazy access and automatic cleanup — read on demand; freed by R’s\ngarbage collector\n\n→ OS-level shared memory (POSIX \u002F Win32) — pure C, no external\ndependencies\n\n\u003Cbr \u002F>\n\n## Installation\n\n``` r\ninstall.packages(\"mori\")\n```\n\n## Why mori\n\n\u003Ca href=\"#why-mori\">\u003Cimg src=\"man\u002Ffigures\u002Fmori-diagram.svg\" alt=\"Diagram showing share() writing an object once into OS-backed shared memory, which is then memory-mapped by other processes using zero-copy ALTREP wrappers\" width=\"720\" \u002F>\u003C\u002Fa>\n\nParallel computing multiplies memory. 
When 8 workers each need the same\n200 MB dataset, that is 1.6 GB of serialization, transfer, and\ndeserialization — with 8 separate copies consuming RAM.\n\n`share()` writes the data into shared memory once and each worker maps\nthe same physical pages — turning per-worker copies into per-worker\nreferences.\n\n``` r\nlibrary(mori)\nlibrary(mirai)\nlibrary(lobstr)\n\ndaemons(8)\n\n# 200 MB data frame — 5 columns × 5M rows\ndf \u003C- as.data.frame(matrix(rnorm(25e6), ncol = 5))\nshared_df \u003C- share(df)\n```\n\nWithout mori, each worker holds the full data frame. With mori, each\nworker holds a small reference into the shared region:\n\n``` r\nmirai_map(1:8, \\(i, data) format(lobstr::obj_size(data)),\n          .args = list(data = df))[.flat] |> unique()\n#> [1] \"200.00 MB\"\n\nmirai_map(1:8, \\(i, data) format(lobstr::obj_size(data)),\n          .args = list(data = shared_df))[.flat] |> unique()\n#> [1] \"824 B\"\n```\n\nAvoiding 8 × 200 MB of serialize \u002F deserialize also translates into a\nsignificant runtime saving:\n\n``` r\nboot_mean \u003C- \\(i, data) colMeans(data[sample(nrow(data), replace = TRUE), ])\n\n# Without mori — each daemon deserializes a full copy\nmirai_map(1:8, boot_mean, .args = list(data = df))[] |> system.time()\n#>    user  system elapsed \n#>   0.709  12.272   8.631\n\n# With mori — each daemon maps the same shared memory\nmirai_map(1:8, boot_mean, .args = list(data = shared_df))[] |> system.time()\n#>    user  system elapsed \n#>   0.002   0.004   4.991\n\ndaemons(0)\n```\n\n## Usage\n\nWorkers must run on the same machine — mori shares physical RAM, not\nbytes over a network.\n\n### Sharing by name\n\n`shared_name()` returns the shared memory name of a shared object;\n`map_shared()` opens a region by that name — useful for handing a\nreference between processes without going through serialization:\n\n``` r\nx \u003C- share(rnorm(1e6))\n\nshared_name(x)\n#> [1] \"\u002Fmori_4d1b_1\"\n\n# Another process — here the same one — 
can map the region by name\ny \u003C- map_shared(shared_name(x))\nidentical(x[], y[])\n#> [1] TRUE\n```\n\n### Sharing through serialization\n\nThe ALTREP serialization hooks emit the same identifier on the wire, so\nthe serialized form is a few bytes regardless of the data size:\n\n``` r\nlength(serialize(x, NULL))\n#> [1] 124\n```\n\nThis is transparent to any R serialization pathway — `mirai`,\n`parallel`, `callr`, and base R `serialize()` all carry shared objects\nas references rather than copies.\n\nSub-elements of a shared list serialize as references too — each element\ntravels as a path into the parent shared region, not as the full data:\n\n``` r\ndaemons(3)\n\n# Share a list — all 3 vectors in a single shared region\nlst \u003C- share(list(a = rnorm(1e6), b = rnorm(1e6), c = rnorm(1e6)))\n\n# Each element arrives on the worker as a zero-copy reference\nmirai_map(lst, \\(v) format(lobstr::obj_size(v)))[.flat] |> unique()\n#> [1] \"904 B\"\n\ndaemons(0)\n```\n\n## How It Works\n\n### What gets shared\n\nAll atomic vector types and lists \u002F data frames are written directly\ninto shared memory, with attributes preserved end-to-end. Pairlists are\ncoerced to lists. `share()` returns ALTREP wrappers that point into the\nshared pages — no deserialization, no per-process memory allocation.\n\nAll other R objects (environments, closures, language objects) are\nreturned unchanged by `share()` — no shared memory region is created.\n\n### Lazy access\n\nA data frame lives in a single shared region; columns are read on\ndemand, so a worker that needs 3 of 100 columns only loads 3. Character\nstrings are accessed lazily per element.\n\n``` r\ndf \u003C- share(as.data.frame(matrix(rnorm(1e7), ncol = 100)))\nshared_name(df)        # one region for all 100 columns\n#> [1] \"\u002Fmori_4d1b_3\"\nshared_name(df[[50]])  # sub-path into the same region\n#> [1] \"\u002Fmori_4d1b_3[50]\"\n```\n\n### Lifetime\n\nShared memory is managed by R’s garbage collector. 
The shared memory\nregion stays alive as long as any shared object backed by it remains\nreferenced in R — the original returned by `share()`, or a column or\nsub-list extracted from it, in this or another process. When no\nreferences remain, the garbage collector frees the shared memory\nautomatically.\n\n**Important:** Always assign the result of `share()` to a variable. The\nshared memory is kept alive by the R object reference — if the result is\nused temporarily (not assigned), the garbage collector may free the\nshared memory before a consumer process has mapped it.\n\n### Copy-on-write\n\nShared data is mapped read-only, preventing corruption of the shared\nregion. Mutations are always local — R’s copy-on-write mechanism ensures\nother processes continue reading the original shared data:\n\n- **Structural changes** to a list or data frame (adding, removing, or\n  reordering elements) produce a regular R list. The shared region is\n  unaffected.\n- **Modifying values** within a shared vector (e.g., `X[1] \u003C- 0`)\n  materializes just that vector into a private copy. Other vectors in\n  the same shared region stay zero-copy.\n\n---\n\nPlease note that the mori project is released with a [Contributor Code\nof Conduct](https:\u002F\u002Fshikokuchuo.net\u002Fmori\u002FCODE_OF_CONDUCT.html). By\ncontributing to this project, you agree to abide by its terms.\n","mori is a shared memory library for R objects. Its `share()` function writes an R object into shared memory and returns a shared version, allowing data to be shared efficiently across multiple R processes. Core features include compact ALTREP serialization, on-demand reads with automatic cleanup by the garbage collector, and OS-level shared memory (POSIX and Win32). mori is written entirely in C and depends on no external libraries. The project is well suited to processing large datasets in parallel on a single machine, especially in multi-process settings, where it can significantly reduce memory usage and speed up data processing by avoiding the waste of repeatedly copying large objects.","2026-06-11 02:54:42","CREATED_QUERY"]