[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81909":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":13,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":40,"readmeContent":41,"aiSummary":42,"trendingCount":16,"starSnapshotCount":16,"syncStatus":15,"lastSyncTime":43,"discoverSource":44},81909,"memforge","Paradoxdov\u002Fmemforge","Paradoxdov","UEFI memory diagnostic tool - 14 stress tests, per-chip stuck-bit mapping, MCA capture, marathon mode, plain-language verdict with DIMM warranty info","https:\u002F\u002Fgithub.com\u002FParadoxdov\u002Fmemforge",null,"C",38,3,32,2,0,1,6,1.81,"MIT License",false,"main",true,[25,26,27,28,29,30,31,32,33,34,35,36,37,38,39],"c","ddr3","ddr4","ddr5","diagnostics","dram","efi","gnu-efi","hardware","low-level","memory-controller","memory-testing","memtest","spd","uefi","2026-06-12 02:04:21","# MemForge2\n\n**Latest release: [v0.4.66](https:\u002F\u002Fgithub.com\u002FParadoxdov\u002Fmemforge\u002Freleases\u002Flatest)** — download `MemForge2.efi`, copy to `EFI\u002FBOOT\u002Floader.efi` on a FAT32 USB.\n\nUEFI memory diagnostic tool for shop \u002F repair use. 
Boots from USB before any\nOS loads, runs 14 stress and pattern tests in parallel on every CPU core,\ncaptures hardware ECC events via MCA registers, snapshots environmental\ncontext at every error, and writes a structured report to the USB stick for\noffline review.\n\n![Memory failure detected — DDR3 stuck-bit + damaged bank, pinpointed to Samsung DIMM1](docs\u002Fscreenshot-failure.jpg)\n\n*Real run on a refurbished HP EliteDesk 8300 build: 396 errors in 54 seconds,\npinpointed to **DIMM1**, with stuck-bit and damaged-bank patterns visible.\nSPD readout identified the part as Samsung M378B5773CH0-CK0 for warranty\nreplacement. (Per-chip mapping in this screenshot pre-dates a DDR3 SPD-offset\nfix — see commit history; current builds give the correct chip U-number for\nDDR3 modules.) The simple verdict screen (shown above) is the default after\nevery run; press **[D]** for the technical breakdown — full 14-test table,\nper-error address\u002FDIMM\u002FDRAM-coord records, MCA bank diff, SPD timings, and\nBW degradation trend.*\n\n## Project status\n\nThis is a side project I maintain alongside my main job (PC assembly \u002F\nrepair). Issues are addressed as my schedule allows, usually on weekends\n— weekdays leave very little time. Critical bugs (data corruption, system\nhangs) take priority over feature requests. 
Thanks for your patience.\n\n## What it actually does\n\n- 14 memory tests covering pattern faults, retention, row hammering, address\n  decoding, sustained thermal stress, L3-cache cell faults, and stride-based\n  bandwidth profiling.\n- Direct SPD EEPROM read via SMBus — pulls serial number, manufacturing\n  date, JEDEC manufacturer ID and **full primary timings** (CL \u002F tRCD \u002F\n  tRP \u002F tRAS \u002F tRFC \u002F tCK) from each DIMM (info SMBIOS Type 17 does not\n  expose).\n- MCA (Machine Check Architecture) snapshot before\u002Fafter the run — surfaces\n  ECC errors that pattern tests cannot see by design (silently corrected by\n  the iMC).\n- Heuristic DRAM coordinate decode (bank-group \u002F bank \u002F row \u002F column) with\n  stuck-row and stuck-bank cluster detection.\n- HWP + PL1\u002FPL2 lift on Intel, CPPC2 lift on AMD — actually pushes the CPU\n  to turbo P-state inside UEFI, which most memory testers do not do (their\n  tests run at base frequency).\n- y-cruncher-style mixed-port Thermal Soak: saturates FMA, shuffle and\n  integer ALU ports simultaneously instead of FMA alone.\n- **Per-error environmental snapshot** — every recorded error captures\n  temp\u002FCPU watts\u002Fthrottle\u002FVID at the exact moment, so the operator sees\n  \"this byte flipped at t+182 s, temp 87 °C, 1.235 V\" instead of only\n  the run-wide peak.\n- **Marathon mode** — run for 1-24 hours with multipass-iterator wrap so\n  every cycle covers fresh memory. Catches intermittent failures that only\n  surface after 2-4 h of sustained thermal load.\n- **Cold\u002Fwarm boot delta** — persists a 96-byte summary to UEFI NVRAM\n  after every run; on next boot logs deltas: `errors +3, temp +6 °C, BW\n  peak −8 %`. 
Lets a shop see across reboots whether the symptom\n  reproduces.\n- **Bandwidth degradation trend** — 1-min BW buckets, first-quartile vs\n  last-quartile compare flags >5 % drop as mild \u002F >15 % as severe.\n  Catches silent thermal throttling, IMC retry storms, marginal channels.\n\n## Tests\n\n| #  | Test              | Time   | What it catches |\n|----|-------------------|--------|-----------------|\n|  1 | AVX2 Sustained    | ~10 s  | VRM \u002F IMC sustained-load faults |\n|  2 | TRRespass         | ~60 s  | DDR4\u002F5 Row Hammer (Frigo et al., USENIX Sec 2020) |\n|  3 | Cache-Eviction    | ~10 s  | DRAM bus errors via CLFLUSH |\n|  4 | March-C-          | ~2 s   | 6-phase JEDEC-grade March (van de Goor 1997, 92% coverage) |\n|  5 | Thermal Soak      | 3 min  | Errors that only surface at thermal steady-state |\n|  6 | BW Soak           | 5 min  | Sustained DRAM bandwidth (memtest86+ analog) |\n|  7 | March-RAW         | ~3 s   | Read-after-write dynamic coupling faults |\n|  8 | Butterfly         | ~5 s   | Cell crosstalk via checkerboard + neighbour flip |\n|  9 | Address Pattern   | ~5 s   | Address-line faults (cell X reads value of Y) |\n| 10 | VRM Square-Wave   | ~10 s  | VRM transient response under 10 Hz load square |\n| 11 | Random Pattern    | ~6 s   | xorshift64 pseudo-random patterns × 4 |\n| 12 | Bit Fade Extended | ~8 min | DRAM retention (write → wait → verify, 2 patterns) |\n| 13 | **L3 Cache Stress** | ~10 s | L3 cell faults invisible to DRAM-only tests |\n| 14 | **Stride BW**     | ~12 s  | TLB \u002F set-associativity \u002F channel-interleave issues |\n\n## Strengths vs other tools\n\n- **Per-error environmental snapshot** — every error record includes the\n  ms-timestamp, peak temperature, CPU power and VID at the moment the\n  byte flipped. Lets you distinguish \"fails cold at 45 °C\" (stuck cell)\n  from \"fails only above 85 °C\" (IMC thermal margin) without re-running.\n- **TRRespass** — 8-sided Row Hammer with bank rotation. 
Bypasses TRR\u002FRFM\n  that the naive 2-aggressor Row Hammer in memtest86+ gets defeated by.\n  Reference: Frigo et al., *TRRespass: Exploiting the Many Sides of Target\n  Row Refresh*, USENIX Security 2020.\n- **March-C-** — formal industrial March algorithm. Most consumer testers\n  still use walking-1s\u002F0s from the 1980s.\n- **L3 Cache Stress** — resident workload tests L3 cells directly. Every\n  other tester CLFLUSHes between write and verify, so the L3 storage\n  layer never gets observed; marginal L3 cells flip bits we catch here\n  or trigger silent ECC fires in MCA bank 1.\n- **MCA capture** — reads `MCi_STATUS` MSRs before and after the test, diffs\n  to surface ECC corrections that pattern tests by design cannot see.\n- **Full SPD via SMBus** — direct ICH\u002FPCH SMBus EEPROM read on Intel,\n  FCH SMBus on AMD. Pulls serial number, manufacturing date, JEDEC\n  bank\u002Fcode, and the complete primary timing set\n  (CL-tRCD-tRP-tRAS-tRFC). Detects XMP\u002FEXPO vs JEDEC at a glance.\n- **DDR5 SPD5 Hub probe** — reads SPD5118 hub device-info MRs and the\n  SMBus-side PEC error counter. (On-die ECC counters per DDR5 die live in\n  MR48-51 inside the DRAM and require an iMC mailbox we don't implement\n  — but the SMBus PEC counter is the next-best signal.)\n- **HWP + PL1\u002FPL2 lift** (Intel) \u002F **CPPC2 lift** (AMD) — programs the\n  per-logical-CPU performance-state MSRs and lifts package power limits\n  so the CPU actually runs at turbo during the stress kernels. PassMark\n  explicitly admit on their own forum that their tool does not do this.\n- **Error localization** — 5 mechanisms:\n  1. DIMM name via SMBIOS Type 20 (physical address → DIMM slot)\n  2. Per-chip stuck-bit mapping (bit position → chip U-designator)\n  3. Stuck-bit detection via XOR-mask repetition\n  4. Stuck-row \u002F stuck-bank detection via DRAM-coord heuristic\n  5. 
1-GB histogram of error addresses\n\n  **Honest scope of (1)**: SMBIOS Type 20 maps each physical address to\n  the slot the firmware says owns it. On servers \u002F NUMA \u002F single-channel\n  \u002F non-interleaved consumer setups this gives the exact bad DIMM. On a\n  typical dual-channel desktop the iMC interleaves addresses between the\n  channel pair, so a single bad chip on ONE stick can appear as errors\n  distributed across both sticks. Since v0.4.21 the verdict distinguishes\n  the two cases by checking whether the BIOS-reported address ranges\n  overlap (real interleave) or are disjoint (block mode); on block-mapped\n  systems \"ZAMENIT' OBE\" is the honest verdict, on real interleave it's\n  \"ZAMENIT' ODNU iz pary\". Since v0.4.25, when verdict can't be sure\n  which of a pair is at fault, the program automatically re-tests each\n  DIMM in isolation (~5 min) and produces a definitive `REPLACE X`\n  answer — no manual swapping required on most consumer desktops.\n\n  **New in v0.4.61+ — hardware-timing address decode (Haswell DDR3)**: on\n  Intel client platforms where channel interleave defeats SMBIOS Type 20,\n  the program recovers the DRAM address-mapping functions directly from a\n  row-conflict timing side-channel (à la Pessl et al. \"DRAMA\"), confirms\n  them against the published Haswell map, and attributes each error to the\n  exact (channel, DIMM) = SPD slot. The verdict then names the faulty stick\n  by its SPD serial number — and follows it correctly even when the stick is\n  moved to a different slot. 
Validated against a known-bad module across\n  slot swaps on OptiPlex 9020\u002F7020.\n\n  **Coverage of exact (serial + slot) identification** — the address map is\n  matched from a self-validating table, so a row that doesn't fit the silicon\n  is skipped and the tool never mis-attributes:\n\n  | Platform | Exact bad-stick ID |\n  |----------|--------------------|\n  | Intel **DDR3, dual-channel** — Sandy Bridge \u002F Ivy Bridge \u002F Haswell (OptiPlex 3010\u002F7010\u002F9010\u002F3020\u002F7020\u002F9020 and kin) | ✓ recovered & validated |\n  | DDR4 \u002F DDR5 \u002F Skylake and newer \u002F AMD | ↩ safe fallback to SMBIOS Type-20 (old behaviour — honest, but no exact slot) |\n\n  Fallback is never wrong, just less precise; new platforms are added as one\n  table row once validated against a known-bad module on that hardware.\n\n  Full write-up of the method (timing primitive, the self-validating address-map\n  table, the decode, and the validation protocol): **[docs\u002FMETHOD.md](docs\u002FMETHOD.md)**.\n- **Auto-isolation** — when the post-test verdict finds errors spread\n  across multiple DIMM address ranges, the program automatically re-runs\n  the failing kernel against each affected stick in turn (constraining\n  the test buffer to that DIMM's SMBIOS Type 20 range). Final screen\n  reads e.g. \"DDR4-A2: 0 errors \u002F DDR4-B2: 8 errors → REPLACE DDR4-B2,\n  HIGH confidence (confirmed by isolation)\". No user input required;\n  takes ~5 min on top of the main run.\n- **Marathon mode** — `MarathonHours=N` (1-24) keeps cycling tests for N\n  hours. Multipass iterator wraps when RAM coverage cycle completes, so\n  every cycle covers fresh (region, offset) pairs. 
Catches the\n  intermittent-failure class that 30-min runs miss but real customers hit\n  once a week.\n- **Cold\u002Fwarm boot delta** — UEFI NVRAM persistent record of each run.\n  Surfaces regressions across reboots with explicit warnings (`⚠ 3 new\n  errors since last run`, `⚠ temp rose 6 °C — check airflow\u002Fpaste`).\n- **Bandwidth degradation trend** — 1-min BW buckets, first-vs-last\n  quartile compare. Catches silent thermal throttling and IMC retry\n  storms even when no pattern errors fire.\n- **Stride-sweep BW** — drop at exactly one stride pinpoints TLB issues,\n  cache set conflicts, or channel-interleave bugs.\n- **SMBus signal integrity probe** — 16 repeated SPD-byte reads per\n  slot; mismatches\u002FNAKs flagged as I²C SI warning.\n- **Per-DIMM isolation** — `TestOnlyDimm=N` restricts the test buffer to\n  one DIMM's physical address range (via SMBIOS Type 20 + UEFI\n  `AllocateAddress`). Lets you verify each stick separately without\n  removing the others. Perfect isolation on non-interleaved memory; on\n  dual-channel desktops you isolate per channel pair.\n- **DDR5-aware** — auto-tunes Bit Fade for on-die ECC, applies 2×\n  multiplier to Row Hammer activations to compensate for TRR\u002FRFM,\n  detects XMP\u002FEXPO over-JEDEC speeds.\n\n## Output\n\nAfter the test, on the USB next to `loader.efi`:\n\n- `memforge2.log` — full run log with timestamps, per-test results, SPD\n  per DIMM with full timings, MCA bank diffs, per-error records with\n  environmental snapshots, BW trend verdict, cold\u002Fwarm boot delta, stride\n  per-stride MB\u002Fs, SMBus signal integrity.\n- `report.json` — structured data including everything above. Each error\n  record carries `at: {t_ms, temp_c, pkg_w, throttle, vid_mv}` for\n  context-aware AI analysis. 
A top-level `peaks` block (added v0.4.28)\n  reports run-wide max temperature, peak package power, max frequency\n  reached, peak\u002Ftheoretical bandwidth, throttle event count, and which\n  mechanism actually lifted the CPU to turbo (HWP vs legacy PERF_CTL vs\n  AMD CPPC2) so an automated analyzer can verify the CPU was genuinely\n  loaded. Plain JSON, easy to feed into any downstream tool.\n\n## Building\n\nRequires MSYS2 + mingw-w64 GCC + gnu-efi headers\u002Flibrary.\n\n```bash\nmake\n```\n\nProduces `MemForge2.efi`. Copy onto a FAT-formatted USB at\n`EFI\u002FBOOT\u002Floader.efi`, along with `quantai.ini` at root. Wrap with the\nLinux Foundation's `PreLoader.efi` if you want to keep Secure Boot enabled\n(MOK enrollment via `HashTool.efi`).\n\nIf Windows Defender Smart App Control blocks gcc, run `WHITELIST_MSYS2.bat`\nonce as admin.\n\n## Configuration\n\n`quantai.ini` on the USB root holds runtime knobs:\n\n```ini\n[Run]\nPasses=0            ; 0 = auto from RAM size \u002F BufferMB\nMultiPass=1         ; rotate buffer across regions to cover all RAM\nMaxCores=0          ; 0 = use all enabled cores\nEnableAVX=1\n;BufferMB=1024      ; auto-scaled by RAM size if commented out\n;BitFadeSeconds=60  ; DDR4 default; DDR5 auto-bumped to 120\n;BitFadeEveryPass=0 ; only on pass 1 by default\n;TestOnlyDimm=0     ; 1..N = isolate to that DIMM slot\n;MarathonHours=0    ; 0 = off, 1..24 = run for N hours\n;WatchdogSeconds=120 ; auto-reboot if a core wedges mid-test; 0 = off\n\n[Meta]\nVersion=0.4.66\nLanguage=en         ; \"ru\" or \"en\"\n\n[Display]\n;EnableAA=0         ; 1 = AA via direct framebuffer (faster, less compatible)\n;ForceBlt=0         ; 1 = always use GOP Blt (slower, max compatibility)\n;Width=1920         ; manual GOP mode override\n;Height=1080\n;FontScale=0        ; 0=auto, 1=1×, 2=2× (for 4K displays)\n```\n\nThe `[AI]` section that may appear in your local `quantai.ini` is consumed\nby an external post-test analyzer (not part of this repo) and is 
ignored by\nthe tester itself.\n\n## How this was built\n\nThis project is a collaboration: code written by Claude (Anthropic LLM)\nunder direction from a PC-assembly tech who provides the domain expertise,\nuse cases, and real-hardware validation. The repository owner is not a\nsystems programmer; Claude handles C, UEFI APIs, MSRs, SMBus protocol.\nSee the article on Habr for the longer story.\n\n## License\n\nMIT — see [`LICENSE`](LICENSE). Permits commercial use, modification,\ndistribution, and private use; provided as-is, no warranty.\n\n## Acknowledgements\n\n- Frigo, Giuffrida, Bos, Razavi — TRRespass attack (USENIX Sec 2020)\n- Pessl, Gruss, Maurice, Schwarz, Mangard — DRAMA address-mapping recovery (USENIX Sec 2016); basis for the timing-based bad-stick attribution ([docs\u002FMETHOD.md](docs\u002FMETHOD.md))\n- A. J. van de Goor — March-C- algorithm (1997)\n- Linux Foundation — `PreLoader.efi` \u002F `HashTool.efi` Secure Boot shims\n- gnu-efi project — UEFI headers and library\n","MemForge2 is a UEFI memory diagnostic tool designed for PC repair and testing. It runs 14 stress tests covering pattern faults, data retention, row hammering, address decoding and more, and supports per-chip stuck-bit mapping and MCA error capture. The tool boots directly from USB before any operating system loads and runs its tests in parallel on all CPU cores to ensure full coverage. In addition, MemForge2 reads detailed information from the SPD EEPROM (such as serial number, manufacturing date, JEDEC manufacturer ID and the full primary timings) and captures an environmental snapshot at every error, recording key parameters such as temperature and power draw. Its marathon mode can run continuously for 1-24 hours, which suits scenarios that need long, high-load testing to expose intermittent failures. The project is written in C and released under the MIT license.","2026-06-11 04:07:10","CREATED_QUERY"]