[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-3562":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":44,"readmeContent":45,"aiSummary":46,"trendingCount":16,"starSnapshotCount":16,"syncStatus":47,"lastSyncTime":48,"discoverSource":49},3562,"facex","facex-engine\u002Ffacex","facex-engine","Full face stack that runs entirely in the browser. Detection, 576-point 3D mesh, recognition, anti-spoof, smile — all WebAssembly, zero server. Apache 2.0.","https:\u002F\u002Ffacex-engine.github.io\u002Ffacex\u002F",null,"C",276,38,3,5,0,8,16,174,24,78.77,"Apache License 2.0",false,"main",true,[27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43],"avx-512","avx2","biometrics","c99","computer-vision","cpu-inference","deep-learning","edge-ai","embedded-ai","face-embedding","face-recognition","face-verification","low-latency","onnx-runtime","production-ready","simd","zero-dependencies","2026-06-12 04:00:18","\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Flogo.jpg\" alt=\"FaceX\" width=\"480\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\u003Cem>Full face pipeline — detect, mesh, recognize, anti-spoof — in pure WebAssembly. Trained from scratch. 
No cloud, no Python, no server.\u003C\u002Fem>\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n\n[![License: Apache 2.0](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-Apache_2.0-blue.svg)](LICENSE)\n[![LFW](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLFW-99.07%25-success.svg)](#benchmarks)\n[![Latency](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flatency-3.0_ms-brightgreen.svg)](#benchmarks)\n[![Browser](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fruns-100%25_in_browser-orange.svg)](https:\u002F\u002Ffacex-engine.github.io\u002Ffacex\u002Fdemo\u002F)\n[![Encryption](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fweights-AES--256--GCM-purple.svg)](#weight-encryption)\n[![Deps](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdependencies-zero-green.svg)](#the-full-surveillance-stack--no-python-no-ffmpeg-no-gpu)\n\n\u003C\u002Fp>\n\n**Full face stack that runs entirely in the browser.** Detection, 98-point landmarks, dense 3D mesh, recognition, and passive anti-spoof — all WebAssembly, zero server, ~17 MB of encrypted weights.\n\n🎬 **[Live Demo →](https:\u002F\u002Ffacex-engine.github.io\u002Ffacex\u002Fdemo\u002F)** — open in a Chromium browser, press *Start camera*, try all modes.\n📚 **[Docs in Wiki →](https:\u002F\u002Fgithub.com\u002Ffacex-engine\u002Ffacex\u002Fwiki)** — Browser quickstart, training recipes, nn2 architecture, encrypted weights, comparison vs alternatives.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fpipeline.jpg\" alt=\"FaceX pipeline\" width=\"720\">\n\u003C\u002Fp>\n\n### Everything in the demo is trained by us\n\n| Component | Status | Size | Source |\n|---|---|---|---|\n| Face detector | ✅ **ours** | 401 KB | YuNet-style FCOS, WIDER FACE |\n| 98-point landmark | ✅ **ours** | 1.1 MB | WFLW |\n| 576-point 3D mesh | ✅ **ours** | 5.6 MB | MediaPipe distillation |\n| Recognition (4 sizes) | ✅ **ours** | 0.8–8.4 MB | MobileFaceNet + ArcFace on MS1M, LFW 95.62 → 99.07% (10-fold mean) |\n| Anti-spoof | Apache 2.0 
| 2 × 1.7 MB | MiniFASNet (MinivisionAI Silent-Face) |\n\nWeights ship as AES-256-GCM ciphertext and are decrypted in the browser via WebCrypto. This is **not DRM** — a determined attacker can dump the decrypted bytes from the WASM heap. What it does buy you: friction against casual scraping, per-customer key revocation for SaaS deployments, and an audit trail at the key-issuing endpoint. See the [wiki](https:\u002F\u002Fgithub.com\u002Ffacex-engine\u002Ffacex\u002Fwiki\u002FEncrypted-Weights) for the threat model and Express \u002F FastAPI integration recipes.\n\n---\n\n## The full surveillance stack — no Python, no FFmpeg, no GPU\n\nFaceX is one piece of a larger pure-C stack we built for IP-camera workloads. Every component is hand-written, zero-dependency, **flashable to firmware**:\n\n| Component | What it does | Size | Speed | Replaces |\n|---|---|---|---|---|\n| **NexusDecode** | H.264 + H.265 decoder, RTSP client | **184 KB** | 6,300 fps, **46× FFmpeg** | libav \u002F FFmpeg |\n| **NexusEncode** | H.265\u002FHEVC encoder | ~250 KB | x265-medium quality, 131 fps | x265 |\n| **NXV codec** | Surveillance-tuned video format | 121 KB | **3× smaller** than H.265, instant seek, change-map | H.265 + custom container |\n| **nn2** | YOLOv8 + MiniFASNet inference engine | 520 KB | 8.5 ms @ 320, **1.5–2× ONNX RT** | onnxruntime |\n| **FaceX (this repo)** | Detect + landmarks + embed + spoof | 148 KB native \u002F 17 MB WASM | 3 ms\u002Fface | dlib, FaceNet, InsightFace |\n\n**Pipeline numbers (one Intel i5 CPU):**\n- Decode 30 RTSP streams + run YOLO detection on each: **0.56 ms\u002Fframe** average → **70 IP cameras on one CPU core** with motion-gating + Kalman tracking.\n- Tiered storage: 70 cams × 90 days = **49 TB → 3.3 TB** (15× savings) with NXV + selective bitstream-only archiving.\n\n**Why it matters:**\n- **Flashable** — entire NVR stack fits in **\u003C2 MB** of binary, ARM\u002Fx86\u002FRISC-V, no shared libraries\n- **No FFmpeg** — no GPL contamination, no 
surface for codec CVEs, no 28 MB of libav `.so` files\n- **Embedded-ready** — runs on $30 SoCs (Allwinner, Rockchip, NXP i.MX), 25 cameras on 27% CPU\n- **Standalone** — every piece can be used alone or combined: decoder → motion gate → detector → tracker → recognizer → archive\n\n### Where it runs\n\nWe're not just \"x86 only\". The same code targets multiple device classes:\n\n| Target | Status | What's used |\n|---|---|---|\n| **Browser** (any modern Chromium\u002FFirefox\u002FSafari) | ✅ shipping | onnxruntime-web + AES-256-GCM weight decryption ([live demo](https:\u002F\u002Ffacex-engine.github.io\u002Ffacex\u002Fdemo\u002F)) |\n| **Linux \u002F macOS \u002F Windows x86-64** | ✅ shipping | AVX2 + AVX-512 + VNNI runtime dispatch |\n| **Apple Silicon (M1–M4)** | ✅ in PR #3 | NEON + Accelerate (AMX) + SME on M4+ + Core ML \u002F ANE bridge |\n| **ARM Linux \u002F Android (AArch64)** | ✅ in PR #3 | Hand-written NEON kernels for FP32 GEMM |\n| **NXP i.MX 8 \u002F 93 \u002F 95 NPU** | 🛠️ draft (#3) | Ethos-U65 \u002F VxDelegate \u002F XNNPACK |\n| **Espressif ESP32-P4** (RISC-V + PIE 128) | 🛠️ draft (#3) | ESP-IDF component + MIPI-CSI camera example |\n| **Firmware \u002F bare-metal MCU** | 🛠️ in progress | No `libc` deps in core; PReLU\u002FGEMM\u002FConv kernels fit in 64 KB SRAM |\n\nDecoder + encoder are pure C99 with x86 SIMD today; ARM\u002FNEON backports for NexusDecode are next.\n\n```c\n\u002F\u002F Native C: 3 ms per face\n#include \"facex.h\"\nFaceX* fx = facex_init(\"facex_xs.bin\", NULL);\nfloat emb[512];\nfacex_embed(fx, face_112x112, emb);\nfloat sim = facex_similarity(emb_a, emb_b);   \u002F\u002F >0.3 = same person\n```\n\n```bash\n# Or run the live browser demo locally\ngit clone https:\u002F\u002Fgithub.com\u002Ffacex-engine\u002Ffacex\ncd facex\u002Fwasm && python -m http.server 8000\n# open http:\u002F\u002F127.0.0.1:8000\u002Fdemo_mesh.html\n```\n\n---\n\n## What can you build with this?\n\n- **Identity verification (KYC)** — \"is this the same 
person?\" from selfie + ID photo, no cloud round-trip\n- **Face login** — unlock apps by face, works offline, no data leaves the device\n- **Access control** — doors, gates, turnstiles on edge hardware without GPU\n- **Proctoring** — verify exam takers are who they claim to be\n- **Smart cameras** — recognize known faces at 300+ faces\u002Fsec on a single CPU core\n- **Banking \u002F fintech onboarding** — passive liveness + face match in the browser, GDPR-friendly by construction\n- **In-store kiosks** — VIP\u002Floyalty recognition at the till, runs on a $30 SoC\n\n### Why FaceID with FaceX instead of cloud APIs\n\nYou're typically choosing between AWS Rekognition \u002F Azure Face \u002F Google Vision \u002F Paravision \u002F FaceTec ZoOm. Cost comparison for a 100 K-user app doing one face-match per session per day:\n\n| Provider | Price per 1k matches | Monthly cost (100 K MAU × 1\u002Fday) | Sends user faces to | Latency |\n|---|---:|---:|---|---:|\n| AWS Rekognition CompareFaces | $1.00 | **$3,000 \u002Fmo** | AWS us-east | 250–500 ms |\n| Azure Face API verify | $1.00–$1.50 | **$3,000–$4,500 \u002Fmo** | Azure region | 200–400 ms |\n| Google Vision FACE_DETECTION | $1.50 | **$4,500 \u002Fmo** | Google datacenter | 200–400 ms |\n| FaceTec ZoOm | per-seat licensed | **$10 K+ \u002Fyear** | Their SDK, mixed | 1–3 s (active) |\n| **FaceX in your app** | **$0** | **$0** | Nobody — stays in the user's browser | 20–30 ms |\n\nThe savings are nice. The bigger story is **compliance**: when frames never leave the device, you're outside GDPR Art. 9 (biometric) \u002F HIPAA \u002F Russia's 152-ФЗ \u002F KZ's data localization rules by construction. No DPIA, no DPA renegotiations, no \"where are the photos stored\" audit questions.\n\n### Where it's been deployed\n\nWe've shipped this stack into IP-camera NVRs, retail kiosks, and KYC flows for fintech clients. 
If you're evaluating it for production, the live demo is the fastest way to see what it can do — then [open an issue](https:\u002F\u002Fgithub.com\u002Ffacex-engine\u002Ffacex\u002Fissues\u002Fnew) or [email me](mailto:bauratynov@gmail.com) with your use case and I'll help you scope.\n\n## How it works\n\nFull pipeline, every step trained or written by us:\n\n1. **Detect** — own FCOS-style face detector (100K params, trained from\n   scratch on WIDER FACE; 401 KB ONNX).\n2. **Align** — 98-point WFLW landmark ConvNet (1.15M params; 1.1 MB ONNX).\n3. **3D mesh** — 576-point face mesh (5.6 MB ONNX), distilled from\n   MediaPipe FaceMesh with our 98 WFLW anchors driving the warp.\n4. **Recognize** — MobileFaceNet + ArcFace, four size variants\n   (`nano` 0.8 MB · `tiny` 1.8 MB · `standard` 3.9 MB · `xs` 8.4 MB),\n   LFW 95.6 → 99.07%.\n5. **Anti-spoof** — MiniFASNet ensemble (V2 @ 2.7 + V1SE @ 4.0),\n   MinivisionAI Apache 2.0. Also **ported to our nn2 engine — 2× faster\n   than ONNX Runtime** on the same CPU.\n\nTwo modes:\n- **Browser:** onnxruntime-web + AES-256-GCM encrypted weights, full\n  pipeline in ~25 ms\u002Fframe, **no server**.\n- **Native:** pure C, 3 ms per face, INT8 + AVX-512, beats ONNX Runtime\n  on the same hardware.\n\nTwo years of optimization: handwritten AVX2 \u002F AVX-512 \u002F NEON kernels,\nINT8 GEMM, cache-tuned layout, weight-encryption with WebCrypto handoff\nto onnxruntime — every millisecond and every kilobyte fought for.\n\n---\n\n## Benchmarks\n\nMeasured on Intel i5-11500 (6 cores, AVX-512 + VNNI):\n\n### Speed — recognition (our MobileFaceNet xs)\n\n![Speed comparison](docs\u002Fspeed_comparison.svg)\n\n| Engine | Median | Min | vs FaceX |\n|--------|-------:|----:|:--------:|\n| **FaceX (native nn2)** | **3.0 ms** | **2.87 ms** | -- |\n| ONNX Runtime 1.23 | 3.9 ms | 3.18 ms | 1.30× slower |\n| InsightFace (R34) | 17 ms | -- | 5.7× slower |\n| FaceNet (PyTorch) | 30 ms | -- | 10× slower |\n| dlib | 50+ ms | -- | 17× slower 
|\n\n### Speed — anti-spoof (MiniFASNet V2+V1SE ensemble)\n\nSame model, ported to our `nn2` C engine (Apache 2.0, source in [`nn2\u002F`](nn2\u002F)):\n\n| Engine | Single model | Ensemble | Speedup |\n|--------|-------------:|---------:|--------:|\n| **nn2** | **0.70 ms** | **1.43 ms** | -- |\n| ONNX Runtime 1.23 | 1.33 ms | 2.92 ms | **2.03× slower** |\n\nByte-identical predictions to PyTorch \u002F ONNX on the same input.\n\n### Accuracy — recognition (LFW verification)\n\nAll numbers are the **mean accuracy across 10-fold cross-validation**\n(InsightFace-style: tune the threshold on 9 training folds, evaluate\non the 1 held-out fold, repeat 10 times). The `±` column is the\nstandard deviation across folds. Input must be 112×112 ArcFace-aligned\nvia a 5-point similarity transform — running on un-aligned crops drops\naccuracy by ~25 points. The eval script is\n[training\u002Fscripts\u002Flfw_eval.py](training\u002Fscripts\u002Flfw_eval.py).\n\n| Variant | Params | LFW mean | ± std | ONNX size | Speed (CPU) |\n|---------|------:|---------:|------:|----------:|------------:|\n| nano | 0.20 M | 95.62% | 1.11% | 0.8 MB | 1.4 ms |\n| tiny | 0.45 M | 96.85% | 0.87% | 1.8 MB | 2.1 ms |\n| standard | 0.93 M | 98.25% | 0.68% | 3.9 MB | 2.6 ms |\n| **xs** | 2.07 M | **99.07%** | 0.40% | 8.4 MB | 3.0 ms |\n\n### Accuracy — face detection (WIDER FACE val)\n\nOur YuNet-style FCOS detector, 100 K params, trained from scratch:\n\n| Metric | Score |\n|--------|------:|\n| Best recall @ IoU 0.5 (all faces incl. 
tiny) | 27.5% |\n| Recall on faces ≥ 32 px | ~85% |\n| Recall on webcam-distance faces | ~95% |\n| ONNX size | 401 KB |\n| Latency on 320×320 input | \u003C 1 ms (WASM) |\n\n### Footprint\n\n![Footprint comparison](docs\u002Ffootprint.svg)\n\n| Metric | FaceX | ONNX Runtime |\n|--------|------:|-------------:|\n| Library size | **148 KB** | 28 MB |\n| Total deploy | **7 MB** | 157 MB |\n| Dependencies | **none** | Python + onnxruntime |\n| Cold start | **~100 ms** | ~350 ms |\n\n---\n\n## Quick start\n\n### C\n\n```c\n#include \u003Cstdio.h>\n#include \"facex.h\"\n\nint main(void) {\n    \u002F\u002F Load engine (one-time, ~100ms); facex_init returns NULL on error\n    FaceX* fx = facex_init(\"facex_xs.bin\", NULL);\n    if (!fx) return 1;\n\n    \u002F\u002F Two aligned 112x112 face crops (fill from your own images)\n    static float face_a[112 * 112 * 3];  \u002F\u002F RGB, HWC, [-1, 1]\n    static float face_b[112 * 112 * 3];\n    float emb_a[512], emb_b[512];\n\n    \u002F\u002F Compute embeddings (3ms per call)\n    facex_embed(fx, face_a, emb_a);\n    facex_embed(fx, face_b, emb_b);\n\n    \u002F\u002F Compare two faces\n    float sim = facex_similarity(emb_a, emb_b);\n    \u002F\u002F sim > 0.3 → same person\n    printf(\"similarity: %.3f\\n\", sim);\n\n    facex_free(fx);\n    return 0;\n}\n```\n\n```bash\ngcc -O3 -march=native -Iinclude -o myapp myapp.c -L. 
-lfacex -lm -lpthread\n```\n\n### Go\n\n```go\nimport \"github.com\u002Ffacex-engine\u002Ffacex\u002Fgo\u002Ffacex\"\n\nff, _ := facex.New(facex.Config{\n    Exe:     \".\u002Ffacex-cli\",\n    Weights: \".\u002Ffacex_xs.bin\",\n})\ndefer ff.Close()\n\nembedding, _ := ff.Embed(rgbImage)\nsim := facex.CosSim(embA, embB)\n```\n\n### CLI (any language via stdin\u002Fstdout)\n\n```bash\n# Pipe mode: reads 112x112x3 float32 HWC, writes 512 float32\n.\u002Ffacex-cli weights.bin --server \u003C faces.raw > embeddings.raw\n```\n\n### Browser (via onnxruntime-web + AES decryption)\n\n```html\n\u003Cscript src=\"https:\u002F\u002Fcdn.jsdelivr.net\u002Fnpm\u002Fonnxruntime-web@1.21.0\u002Fdist\u002Fort.min.js\">\u003C\u002Fscript>\n\u003Cscript>\n  \u002F\u002F Fetch encrypted weights, decrypt in WebCrypto, hand bytes to ORT.\n  const buf = new Uint8Array(await (await fetch('facex_xs.enc')).arrayBuffer());\n  const iv = buf.subarray(0, 12), data = buf.subarray(12);\n  const key = await crypto.subtle.importKey('raw', KEY_BYTES,\n                                              {name:'AES-GCM'}, false, ['decrypt']);\n  const onnx = new Uint8Array(await crypto.subtle.decrypt({name:'AES-GCM', iv}, key, data));\n  const sess = await ort.InferenceSession.create(onnx, { executionProviders: ['wasm'] });\n  \u002F\u002F Inference is 100% client-side. Frames never leave the device.\n\u003C\u002Fscript>\n```\n\nFull browser pipeline (detect + 576pt mesh + recognize + anti-spoof)\nis **live at https:\u002F\u002Ffacex-engine.github.io\u002Ffacex\u002Fdemo\u002F** — open it,\npress *Start camera*, try the picker.\n\n---\n\n## Build\n\n```bash\nmake            # builds libfacex.a + facex-cli\nmake example    # builds and runs example\nmake encrypt    # builds weight encryption tool\n```\n\nRequirements: GCC with AVX2 support. 
Nothing else.\n\n### Cross-compile for Linux (from WSL)\n\n```bash\ngcc -O3 -march=x86-64-v3 -mavx2 -mfma -static \\\n    -DFACEX_LIB -o libfacex.a src\u002F*.c -lm -lpthread\n```\n\n---\n\n## API\n\n```c\n\u002F\u002F Initialize engine. Returns NULL on error.\n\u002F\u002F license_key: NULL for plain weights, or key string for AES-256 encrypted.\nFaceX* facex_init(const char* weights_path, const char* license_key);\n\n\u002F\u002F Compute 512-dim face embedding from 112x112 RGB image.\n\u002F\u002F rgb_hwc: float32 array [112][112][3], values in [-1, 1].\n\u002F\u002F embedding: output buffer, 512 floats (L2-normalized).\nint facex_embed(FaceX* fx, const float* rgb_hwc, float embedding[512]);\n\n\u002F\u002F Cosine similarity between two embeddings. Range [-1, 1].\nfloat facex_similarity(const float emb1[512], const float emb2[512]);\n\n\u002F\u002F Free engine resources.\nvoid facex_free(FaceX* fx);\n\n\u002F\u002F Version string.\nconst char* facex_version(void);\n```\n\n---\n\n## Architecture (recognition, MobileFaceNet xs)\n\n```\nInput: 112×112 RGB float32 in [-1, 1]\n    ↓\n  Stem: Conv 3×3 s=2 → 64 ch, PReLU\n    ↓\n  DW Stem: DW 3×3 s=1 → 64 ch, PReLU\n    ↓\n  Stage 1: 5× Inverted-Residual (t=2, c=64, first s=2)\n    ↓\n  Stage 2: 1× Inverted-Residual (t=4, c=128, s=2)\n    ↓\n  Stage 3: 6× Inverted-Residual (t=2, c=128, s=1)\n    ↓\n  Stage 4: 1× Inverted-Residual (t=4, c=128, s=2)\n    ↓\n  Stage 5: 2× Inverted-Residual (t=2, c=128, s=1)\n    ↓\n  Conv 1×1 → 512 ch, PReLU\n    ↓\n  GDConv DW 7×7 s=1 (linear-GDC) → 512×1×1\n    ↓\n  1×1 conv → 512-d embedding, BN, L2-norm\n    ↓\nOutput: 512-dim unit embedding\n```\n\n**Engine internals:**\n\n- Pure C99 + SIMD intrinsics (AVX2, FMA, AVX-512, VNNI)\n- INT8 quantized GEMM with `vpmaddubsw` (AVX2) \u002F `vpdpbusd` (VNNI)\n- FP32 packed column-panel MatMul (NR = 8 AVX2, NR = 16 AVX-512)\n- Custom thread pool with work-stealing (WaitOnAddress \u002F futex)\n- Pre-packed weights at load time for cache-optimal 
access\n- BN folded into preceding Conv at export time\n- AES-256-GCM weight encryption with WebCrypto handoff in the browser,\n  AES-256-CTR with hardware binding for native deployments\n- Fully shared op library between recognition, anti-spoof (MiniFASNet),\n  and YOLOv8 detection (`nn2`)\n\n---\n\n## Weight encryption\n\nFor commercial deployment with IP protection:\n\n```bash\n# Encrypt weights (binds to target machine hardware)\n.\u002Ffacex-encrypt encrypt weights.bin weights.enc \"LICENSE-KEY\"\n\n# Load encrypted weights (C API call, not a shell command):\n#   FaceX* fx = facex_init(\"weights.enc\", \"LICENSE-KEY\");\n```\n\nWrong key or different machine → load fails. Original weights never\ntouch disk in plaintext on the target machine.\n\n---\n\n## Integration paths\n\n| Language | Method | Latency |\n|----------|--------|:-------:|\n| **C \u002F C++** | `libfacex.a` + `facex.h` | 3 ms (native) |\n| **Browser** | `facex.wasm` (48 KB) | 7 ms (WASM SIMD) |\n| **Go** | `go\u002Ffacex` subprocess | ~4 ms |\n| **Python** | subprocess \u002F ctypes | ~4 ms |\n| **Any** | `facex-cli --server` stdin\u002Fstdout | ~4 ms |\n\n---\n\n## Limitations\n\n- **Native build** — currently x86-64 (AVX2 \u002F AVX-512 \u002F VNNI). ARM NEON\n  paths exist in `nn2\u002Fsrc\u002Fgemm_neon.h`; a full ARM build script is on the\n  roadmap, with ESP32 \u002F RISC-V PIE 128 next.\n- **Browser pipeline** — uses `onnxruntime-web` with WebCrypto-decrypted\n  ONNX. The WebGPU backend is supported by ORT but not yet wired into the\n  demo; we expect it would cut inference time by another 3–5×.\n- **Anti-spoof** is the only component we did not train (MiniFASNet, Apache 2.0,\n  MinivisionAI). Training our own anti-spoof model needs a commercial\n  attack dataset, which we don't have.\n\n---\n\n## Models\n\nEvery recognition \u002F detection \u002F landmark model in this repo was trained\nfrom scratch by us. Anti-spoof is the only third-party piece.\n\n### Recognition (our MobileFaceNet variants)\n\nStandard MobileFaceNet (Chen et al. 
2018) topology, width-scaled\nto four sizes, ArcFace head with the numerically-stable\nangle-addition margin, trained on MS1M-RefineV2 with bf16 autocast.\n\n| Variant | Params | Width mult | Embedding dim | LFW |\n|---------|------:|-----------:|--------------:|----:|\n| nano | 0.20 M | 0.36 | 256 | 95.62% |\n| tiny | 0.45 M | 0.55 | 512 | 96.85% |\n| standard | 0.93 M | 0.90 | 512 | 98.25% |\n| xs | 2.07 M | 1.35 | 512 | 99.07% |\n\n### Face detector (ours)\n\nYuNet-inspired, but FCOS-style anchor-free. MobileNetV2-lite backbone,\n3 detection heads at strides 8 \u002F 16 \u002F 32, GIoU bbox loss + focal cls\nloss. 100 K params, 401 KB ONNX. Trained on WIDER FACE.\n\n### 98-point landmarks (ours, WFLW)\n\nMobileFaceNet-style backbone + dense head, 1.15 M params. Final NME\non WFLW val: 4.85% (test) \u002F 5.95% (large-pose subset).\n\n### 576-point 3D mesh (ours, MediaPipe distillation)\n\nSame architecture as the 98-point model, but with `Linear(256, 478*3)`\nhead — distilled from MediaPipe FaceMesh pseudo-labels with TPS-rendered\nsupervision over our WFLW frontalised crops. Error: xy 0.54 px, z 0.51\n(normalized) on held-out val. With 98 WFLW anchors driving the\nnon-rigid warp, the rendered mesh has **576 visible points** total.\n\n### Anti-spoof (MiniFASNet, Apache 2.0, MinivisionAI)\n\nWe don't train this — there's no commercial-friendly attack dataset\npublicly available. 
We port their two-model ensemble (V2 @ 2.7 +\nV1SE @ 4.0) into our nn2 inference engine and ship byte-identical\npredictions at **2× speed** vs ONNX Runtime.\n\n---\n\n## Repo layout\n\n```\ninclude\u002F                — public C API (facex.h, facex_mfn.h, ...)\nsrc\u002F                    — recognition engine + AES weight crypto\nnn2\u002F                    — pure-C YOLO + MiniFASNet inference engine\n                          (1.5–2× ONNX, Apache 2.0)\n   src\u002F                 — gemm, conv, ops, antispoof_ops, minifasnet\n   include\u002F             — public API headers\n   tools\u002F               — PyTorch → .bin converters\nwasm\u002F                   — browser demo (demo_mesh.html, encrypt tool)\n   tools\u002Fencrypt_models.py — AES-256-GCM encrypt all .onnx\ndocs\u002Fdemo\u002F              — GitHub Pages live demo + encrypted weights\ntraining\u002F               — all training pipelines, datasets, exporters\n   scripts\u002F             — MobileFaceNet recognition (nano\u002Ftiny\u002Fstandard\u002Fxs)\n   landmark\u002F            — 98-point WFLW\n   landmark3d\u002F          — 576-point MediaPipe distillation\n   face_detect\u002F         — own FCOS face detector trained on WIDER FACE\n   antispoof\u002F           — MiniFASNet integration\ngo\u002Ffacex\u002F               — Go binding (subprocess protocol)\npython\u002Ffacex\u002F           — Python binding (ctypes)\n```\n\n---\n\n## FAQ\n\n**Q: Is it really faster than ONNX Runtime?**\nA: Yes. Measured on the same CPU, same model, same input. FaceX median\n3.0 ms vs ONNX Runtime median 3.9 ms. The gap comes from handwritten\nSIMD kernels that avoid framework overhead.\n\n**Q: How does accuracy compare with ArcFace-R100?**\nA: Our `xs` (2 M params) reaches 99.07% on LFW vs ArcFace-R100's 99.80%. That gives up\nabout 0.7 points of accuracy for a 50× smaller model and 10× faster inference.\n\n**Q: Can I use this commercially?**\nA: Engine code is Apache 2.0. 
Our trained recognition, detection,\nlandmark, and 3D-mesh weights are **also Apache 2.0** — we own them.\nOnly the anti-spoof component (MiniFASNet) is upstream Apache 2.0.\n\n**Q: Does it do face detection?**\nA: Yes. We trained our own FCOS-style detector on WIDER FACE; it\nreplaces YuNet in the browser demo and runs in \u003C1 ms.\n\n**Q: Why ONNX in the browser instead of native WASM?**\nA: We went both ways. `nn2` ships a native C engine that is 1.5–2×\nfaster than ORT. For the browser, `onnxruntime-web` gives us WebGPU,\nSIMD-WASM, and a 3-line model swap without re-compiling. The encryption\nlayer (WebCrypto → ORT byte stream) sits between the network and ORT,\nso the model bytes never travel over the network as plaintext.\n\n---\n\n## Citation\n\n```bibtex\n@software{facex2026,\n  author  = {Atinov, Baurzhan},\n  title   = {FaceX: Fast CPU Face Embedding Library},\n  year    = {2026},\n  url     = {https:\u002F\u002Fgithub.com\u002Ffacex-engine\u002Ffacex}\n}\n```\n\n---\n\n## License\n\nEverything in this repo trained or written by us — code, recognition,\nlandmarks, 3D mesh, face detector — is [Apache License 2.0](LICENSE).\nFree for commercial use, attribution appreciated.\n\nThe only third-party component is MiniFASNet (anti-spoof), which is\nalso Apache 2.0 from [MinivisionAI Silent-Face-Anti-Spoofing](\nhttps:\u002F\u002Fgithub.com\u002Fminivision-ai\u002FSilent-Face-Anti-Spoofing).\n\nFor commercial licensing: [bauratynov@gmail.com](mailto:bauratynov@gmail.com)\n\n---\n\n\u003Cp align=\"center\">\n  Created by \u003Cstrong>Baurzhan Atinov\u003C\u002Fstrong> (Kazakhstan)\u003Cbr>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fbauratynov\">GitHub\u003C\u002Fa>\n\u003C\u002Fp>\n","FaceX is a full face-processing stack that runs entirely in the browser, covering detection, a 576-point 3D mesh, recognition, and anti-spoofing. It is built on WebAssembly and needs no server; all computation happens on the client. Its core features are face detection, high-precision facial landmark localization, dense 3D face model generation, and efficient face recognition and anti-spoofing. It is particularly suited to applications that need low latency and privacy protection without relying on cloud services, such as online identity verification and security monitoring. The whole system is compact, at only about 17 MB, and uses AES-256-GCM encrypted weight files for added security.",2,"2026-06-11 02:54:42","CREATED_QUERY"]