[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-76166":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":15,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":29,"readmeContent":30,"aiSummary":31,"trendingCount":16,"starSnapshotCount":16,"syncStatus":32,"lastSyncTime":33,"discoverSource":34},76166,"jepa","keon\u002Fjepa","keon","implementing minimal versions of joint-embedding predictive architecture (JEPA)","",null,"Python",179,17,98,1,0,12,74,8,3.77,false,"main",true,[5,25,26,27,28],"pytorch","representation-learning","self-supervised-learning","world-models","2026-06-12 02:03:40","# jepa\n\nMinimal, single-file PyTorch reimplementations of the JEPA family, with paired tutorials.\n\n| File | Method | Dataset | LOC | Tutorial |\n|---|---|---|---:|---|\n| [`ijepa.py`](.\u002Fijepa.py) | I-JEPA | CIFAR-10 | 165 | [`ijepa_tutorial.md`](.\u002Fijepa_tutorial.md) |\n| [`vjepa.py`](.\u002Fvjepa.py) | V-JEPA | Moving MNIST | 194 | [`vjepa_tutorial.md`](.\u002Fvjepa_tutorial.md) |\n| [`vjepa2.py`](.\u002Fvjepa2.py) | V-JEPA 2 + V-JEPA 2-AC | synthetic moving digits | 314 | [`vjepa2_tutorial.md`](.\u002Fvjepa2_tutorial.md) |\n| [`cjepa.py`](.\u002Fcjepa.py) | C-JEPA | 3-digit bouncing video | 162 | [`cjepa_tutorial.md`](.\u002Fcjepa_tutorial.md) |\n| [`leworldmodel.py`](.\u002Fleworldmodel.py) | LeWorldModel | synthetic moving digit | 223 | [`leworldmodel_tutorial.md`](.\u002Fleworldmodel_tutorial.md) |\n\nEach algorithm file is **standalone** — only depends on `torch` and `torchvision`, no shared utilities. 
The matching `\u003Calgo>_extras.py` adds visualization (mask grids, loss curves, PCA\u002FLDA\u002Ft-SNE evolution, linear probe).\n\nSee [`FAITHFULNESS.md`](.\u002FFAITHFULNESS.md) for the load-bearing details each minimal implementation preserves and the educational substitutions it makes.\n\n## Quick start\n\n```bash\ngit clone git@github.com:keon\u002Fjepa.git\ncd jepa\npython -m venv .venv && source .venv\u002Fbin\u002Factivate\npip install -r requirements.txt     # pinned versions, see below\n\npython ijepa.py                     # train I-JEPA only (no plots)\npython ijepa_extras.py              # train + write all visualizations + linear probe\n```\n\nRuns on CUDA, MPS, or CPU. CIFAR-10 \u002F MNIST datasets auto-download to `.\u002Fdata\u002F`.\n\n### Reproducibility\n\nThe repo pins exact versions in [`requirements.txt`](.\u002Frequirements.txt) and [`pyproject.toml`](.\u002Fpyproject.toml):\n\n```\npython >= 3.10  (tested on 3.13.5)\ntorch == 2.11.0\ntorchvision == 0.26.0\nmatplotlib == 3.10.9\nscikit-learn == 1.8.0   # used by ijepa_extras for t-SNE\nnumpy == 2.4.4\npillow == 12.2.0\n```\n\nInstall as a package instead of installing requirements directly:\n\n```bash\npip install -e .\n```\n\n## What's where\n\n```\n.\n├── ijepa.py \u002F ijepa_extras.py                       # I-JEPA on CIFAR-10\n├── vjepa.py \u002F vjepa_extras.py                       # V-JEPA on Moving MNIST\n├── vjepa2.py \u002F vjepa2_extras.py                     # V-JEPA 2 + V-JEPA 2-AC (synthetic)\n├── cjepa.py \u002F cjepa_extras.py                       # C-JEPA on 3-digit bouncing video\n├── leworldmodel.py \u002F leworldmodel_extras.py         # LeWorldModel (end-to-end JEPA, SIGReg)\n├── ijepa_tutorial.md                                # walk-throughs that match the code\n├── vjepa_tutorial.md\n├── vjepa2_tutorial.md\n├── cjepa_tutorial.md\n├── leworldmodel_tutorial.md\n├── FAITHFULNESS.md                       # preserved details + deliberate simplifications\n├── 
papers\u002F                              # source PDFs bundled with the repo\n├── samples\u002F                             # mask grids, loss curves, PCA\u002FLDA\u002Ft-SNE plots\n└── figs\u002F                                # paper figures referenced by tutorials\n```\n\n## The methods, in one paragraph each\n\n**I-JEPA** ([Assran et al. 2023](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.08243)) — predict embeddings of held-out image patches from embeddings of visible patches. EMA target encoder, multi-block masking, smooth-L1 loss. The canonical self-supervised JEPA.\n\n**V-JEPA** ([Bardes et al. 2024](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.08471)) — same recipe, but 3D tubelet patches over video. Two mask groups (short-range + long-range tubes), L1 loss, EMA 0.998 → 1.0.\n\n**V-JEPA 2** ([Assran et al. 2025](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.09985)) — two-phase: V-JEPA pretraining followed by **V-JEPA 2-AC**, an action-conditioned predictor trained on frozen-encoder latents with teacher forcing + rollout. The encoder is frozen in phase 2; no EMA.\n\n**C-JEPA** ([Nam et al. 2026](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.11389)) — object-level trajectory masking with an identity anchor at $t=0$. No EMA. Bidirectional transformer over flattened slot tokens. Built on top of a pretrained object-centric encoder in the paper; here we use a frozen oracle position-slot embedding as a documented educational stand-in.\n\n**LeWorldModel** ([Maes et al. 2026](https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.19312)) — end-to-end JEPA world model from pixels. No EMA, no stop-grad, no masking. The encoder and an action-conditioned AR predictor are jointly trained with two loss terms: next-embedding MSE plus a Sketch Isotropic Gaussian Regularizer (SIGReg) that prevents collapse by pushing the embedding marginals toward $\\mathcal{N}(0, 1)$.\n\n## Caveats\n\nThese are **educational** reimplementations:\n\n- ViT-tiny, not ViT-Huge. 
CIFAR-10 \u002F Moving MNIST \u002F synthetic videos, not ImageNet \u002F Kinetics.\n- I-JEPA reaches **~52.7% linear-probe accuracy** on CIFAR-10 after 100 epochs. The paper's numbers come from ViT-H\u002F14 on ImageNet for 300 epochs, a different planet of compute.\n- C-JEPA skips slot discovery (uses oracle positions). Real C-JEPA requires VideoSAUR\u002FSAVi-style object-centric pretraining on top of visual features.\n- V-JEPA 2-AC is a small block-causal, action\u002Fstate-conditioned latent predictor, not Meta's 300M-parameter robot-action model; it preserves the teacher-forcing + rollout training shape.\n- LeWorldModel includes the two-term objective and projection heads needed for SIGReg, but omits the paper's control\u002Fplanning layer.\n\nEach tutorial discloses the specific deviations from its source paper and keeps code snippets aligned with the minimized implementation.\n\n## License\n\nMIT.\n","This project implements minimal versions of the joint-embedding predictive architecture (JEPA). Written in Python with PyTorch, it provides standalone implementations of I-JEPA, V-JEPA, and their variants, along with detailed tutorials. Each algorithm file depends only on `torch` and `torchvision`; the additional visualization scripts generate analysis outputs such as mask grids, loss curves, and linear probes. It suits researchers and developers working on self-supervised or representation learning, and is especially useful for testing model performance on datasets such as CIFAR-10 and Moving MNIST or for exploring world models.",2,"2026-06-11 03:54:43","CREATED_QUERY"]