[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82979":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":15,"starSnapshotCount":15,"syncStatus":14,"lastSyncTime":37,"discoverSource":38},82979,"nbd-vram","c0deJedi\u002Fnbd-vram","c0deJedi","Use your NVIDIA GPU's VRAM as swap space on Linux. Built for laptops with soldered memory and no upgrade path. If you have an RTX card sitting there with 8GB of VRAM and you're getting swapped to SSD, this puts that VRAM to work","https:\u002F\u002Fwww.seanlobjoit.com",null,"Shell",468,12,2,0,3,86,163,27,3.34,"MIT License",false,"main",[25,26,27,28,29,30,31,32,33],"cuda","gpu","laptop","linux","memory","nbd","nvidia","swap","vram","2026-06-12 02:04:29","# nbd-vram\n\nUse your NVIDIA GPU's VRAM as swap space on Linux.\n\nBuilt for hybrid graphics laptops with soldered memory and no upgrade path. The display runs off the integrated AMD\u002FATI GPU. The NVIDIA card sits idle most of the time, its VRAM completely unused. This puts that VRAM to work as high-priority swap.\n\nTested on: AMD\u002FATI + RTX 3070 Laptop (GA104M, 16 GB RAM, 8 GB VRAM), driver 580.159.03, kernel 6.17, Pop!_OS. Allocated 7 GB for swap. End result including zram and SSD swap: ~46 GB total addressable memory, tripled from stock. 
Overflow order: RAM fills, then VRAM absorbs the spill (PCIe), then zram compresses the rest (CPU), then SSD only if everything else is exhausted.\n\n![demo](demo.gif)\n\n---\n\n## How it works\n\nA small daemon allocates VRAM via the CUDA driver API, then serves it as a block device using the NBD (Network Block Device) protocol over a Unix socket. The kernel's built-in `nbd` driver connects to it and exposes `\u002Fdev\u002FnbdX`. From there it's a normal swap device.\n\nData path: kernel swap subsystem - \u002Fdev\u002FnbdX - nbd kernel driver - Unix socket - nbd-vram daemon - cuMemcpyHtoD\u002FDtoH - GPU VRAM.\n\nNo kernel module to write or maintain. No NVIDIA kernel symbols. Survives kernel and driver updates without rebuilding anything.\n\n---\n\n## Why not the NVIDIA P2P API?\n\nThe \"obvious\" approach is `nvidia_p2p_get_pages_persistent`, which pins VRAM pages in BAR1 so the CPU can access them directly via `ioremap_wc`. Every existing project that tried this route hits the same wall: the NVIDIA driver returns `EINVAL` on consumer GeForce GPUs. Both the persistent and non-persistent variants, both flag values. It's gated at the RM level for Quadro\u002Fdatacenter SKUs only, regardless of driver version.\n\nThe other approach - directly `ioremap_wc` the BAR1 physical address without going through the P2P API - also doesn't work. The GPU's internal page tables only have ~16 MiB of BAR1 mapped (just the display framebuffer). Reads from the rest return zeros. `mkswap` appears to succeed, then `swapon` fails because the swap header isn't actually there.\n\nThe NBD approach sidesteps all of this. 
`cuMemcpyHtoD` and `cuMemcpyDtoH` work on any CUDA GPU without any special permissions.\n\n---\n\n## Requirements\n\n- NVIDIA GPU with CUDA support (any consumer RTX\u002FGTX card)\n- NVIDIA driver with `libcuda.so.1` (no CUDA toolkit needed)\n- Linux kernel 3.0+ (nbd module, built into most distros)\n- `nbd-client` package\n- `gcc`, `make`\n\n---\n\n## Install\n\n```sh\ngit clone https:\u002F\u002Fgithub.com\u002Fc0dejedi\u002Fnbd-vram\ncd nbd-vram\nsudo .\u002Finstall.sh\nsudo systemctl start vram-swap-nbd\n```\n\nVerify:\n\n```sh\nswapon --show\n# NAME       TYPE      SIZE USED PRIO\n# \u002Fdev\u002Fnbd0  partition   7G   0B 1500\n```\n\nThe service is enabled on install, so it comes up automatically on every boot.\n\n---\n\n## Configuration\n\nEdit `\u002Fetc\u002Fsystemd\u002Fsystem\u002Fvram-swap-nbd.service`:\n\n```ini\nEnvironment=VRAM_SETUP_SIZE_MB=7168    # how much VRAM to use\nEnvironment=VRAM_SWAP_PRIORITY=1500   # swap priority (higher = used first)\n```\n\nThe daemon tries the requested size first and backs off in 512 MiB steps if the GPU is short on memory - so it will grab as much as it can even if the display compositor is already loaded. `VRAM_SETUP_SIZE_MB` is the ceiling, not a hard requirement.\n\nAfter changing, run `sudo systemctl daemon-reload && sudo systemctl restart vram-swap-nbd`.\n\n---\n\n## Power management\n\nThe installer asks whether to enable power-aware management on first install. If enabled, the service automatically stops when you unplug from AC (or when battery drops below a threshold), and restarts when power is restored. Manual `systemctl stop` is always respected and won't be overridden.\n\nTo change settings after install, edit `\u002Fetc\u002Fnbd-vram.conf`. 
Changes take effect on the next poll (within 60 seconds) or immediately on the next AC plug\u002Funplug event.\n\n---\n\n## Smoke test (without installing)\n\n```sh\nsudo bash test-nbd.sh\n```\n\nAllocates VRAM, connects the NBD device, does a 1 MiB write\u002Freadback check, activates swap, then prints teardown instructions. `install.sh` handles teardown automatically if a test instance is running.\n\nTo stress the full partition after the smoke test passes:\n\n```sh\nsudo bash test-fill.sh\n```\n\nWrites the entire VRAM partition with zeros, verifies a sample read back, then auto-restores swap on exit.\n\n---\n\n## Performance\n\nTested on RTX 3070 Laptop (8 GB VRAM), kernel 6.17, Pop!_OS. Compared against NVMe cryptswap (dm-crypt, PCIe 4.0). All benchmarks run with O_DIRECT to bypass page cache.\n\nThree benchmarks are in `benchmarks\u002F`. Each runs NVMe first, then starts the VRAM service and runs the same test against the block device. State is restored on exit.\n\n```sh\nsudo bash benchmarks\u002Fbench-throughput.sh   # sequential read\u002Fwrite (dd, 2 GiB, O_DIRECT)\nsudo bash benchmarks\u002Fbench-iops.sh         # 4K random IOPS (fio, libaio, iodepth=32)\nsudo bash benchmarks\u002Fbench-latency.sh      # per-operation latency (ioping, 20 requests)\n```\n\n`fio` and `ioping` are installed automatically if missing.\n\n---\n\n### Sequential throughput (dd, 2 GiB)\n\n![bench-throughput](benchmarks\u002Fbench-throughput.gif)\n\n| Device | Write | Read |\n|--------|-------|------|\n| NVMe | 2.7 GB\u002Fs | 2.9 GB\u002Fs |\n| VRAM (nbd) | 1.1 GB\u002Fs | 2.3 GB\u002Fs |\n\nVRAM is slower for large sequential transfers. The bottleneck is the NBD + CUDA userspace round-trip - every block crosses a Unix socket and a `cuMemcpy` call, which adds overhead that NVMe's direct kernel block path doesn't pay. 
Sequential throughput is not the primary swap workload (the kernel swaps individual 4K pages, not 4 MiB streams) - see the IOPS and latency benchmarks below.\n\n---\n\n### 4K random IOPS (fio, libaio, iodepth=32)\n\n![bench-iops](benchmarks\u002Fbench-iops.gif)\n\n| Device | Read IOPS | Write IOPS | Avg latency |\n|--------|-----------|------------|-------------|\n| NVMe | 45.4k | 45.3k | 343 us |\n| VRAM (nbd) | 28.7k | 28.7k | 550 us |\n\nNVMe wins for sustained random I\u002FO. At iodepth=32, NVMe can have 32 requests genuinely in flight simultaneously; the NBD+CUDA path serialises them through the daemon, so the depth advantage is reduced. The VRAM daemon also adds CPU overhead that the NVMe path does not pay. For continuous high-throughput swap pressure, NVMe is faster.\n\nThe picture changes for sporadic access - see the latency benchmark below.\n\n---\n\n### Per-operation latency (ioping, 4K reads, 1 request\u002Fsec)\n\n![bench-latency](benchmarks\u002Fbench-latency.gif)\n\n| Device | Min | Avg | Max |\n|--------|-----|-----|-----|\n| NVMe | 120 us | 9.05 ms | 10.1 ms |\n| VRAM (nbd) | 134 us | 335 us | 490 us |\n\n**VRAM has 27x lower average latency.** The NVMe drive is physically capable of ~112 us (visible on the warmup request) but APST (Autonomous Power State Transitions) puts it to sleep between requests. At 1 request per second - the rate of sporadic swap access - it wakes cold almost every time and pays a ~9 ms penalty. VRAM has no power states and responds in 134-490 us consistently.\n\nThis is the scenario that matters most in practice. Memory pressure on a laptop is rarely a sustained GB\u002Fs flood - it is individual 4K page faults arriving seconds apart. Every one of those faults stalls waiting for the swap device to respond. At 9 ms per fault, NVMe swap is felt. 
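At 335 us, VRAM swap is not.\n\n---\n\n## Uninstall\n\n```sh\nsudo bash uninstall.sh\n```\n\n---\n\n## License\n\nMIT - Sean Lobjoit (c0dejedi)\n","The nbd-vram project lets users use an NVIDIA GPU's VRAM as swap space on Linux, and is particularly suited to laptops with soldered, non-upgradeable memory. Its core function is to allocate VRAM through the CUDA driver API and expose it to the system as a block device over the NBD protocol, enabling efficient swapping. Technically, it requires no kernel module to write or maintain, and it tolerates driver and kernel updates. The approach is a good fit for hybrid graphics laptops that have a discrete NVIDIA GPU but limited RAM, significantly increasing total usable memory without adding physical RAM.","2026-06-11 04:09:47","CREATED_QUERY"]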