[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80092":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":15,"stars30d":17,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":16,"starSnapshotCount":16,"syncStatus":17,"lastSyncTime":26,"discoverSource":27},80092,"hl-imagenet","xisen-w\u002Fhl-imagenet","xisen-w","Experimenting Heuristic Learning with ImageNet","",null,"Python",60,5,58,1,0,2,2.33,false,"main",true,[],"2026-06-12 02:03:57","# HL-ImageNet: Heuristic-Learning Image Classification Without Neural Networks\n\n**Claude Code and Codex iteratively built a symbolic image classifier using classical computer vision. The main pipeline uses no neural networks, no gradient descent, and no backpropagation.**\n\nThis is an application of Jiayi Weng's [Heuristic Learning](https:\u002F\u002Ftrinkle23897.github.io\u002Flearning-beyond-gradients\u002F) framework to static image classification.\n\n---\n\n## Phase 2 (Current): 10-Class Real Image Classification\n\nA proper train\u002Fval\u002Ftest experiment with 10 real Tiny ImageNet classes. 
Train and validation use 2,000 images each; test uses 1,000 images.\n\n### Current Reproducible Results\n\n| System | Train top-1 | Val top-1 | Gap | Reading |\n|--------|------------:|----------:|----:|---------|\n| `base_rerank` | **55.4%** | **51.9%** | 3.5pp | Best generalizing symbolic core |\n| `full` verify rules | **84.0%** | **50.5%** | 33.5pp | High train accuracy, weak transfer |\n| archived historical endpoint | **100.0%** | not current ground truth | - | Reached in logs; exact code state is not currently reproducible |\n| small CNN baseline | 76.0% | 71.8% | 4.2pp | Learned-representation reference |\n\nThe strict current claim is:\n\n```text\nbase_rerank: 55.4% train \u002F 51.9% val\nfull verify: 84.0% train \u002F 50.5% val\n```\n\nThe historical 100% train endpoint matters because it shows that symbolic code can fit the training set very aggressively. It is not used as the current reproducible headline because the exact code state that produced it is not present at `HEAD`.\n\n### Interpretation\n\nPhase 2 does **not** show that symbolic vision solves ImageNet-10. It shows a more specific boundary:\n\n1. A symbolic HL system has enough capacity to fit real-image training data far beyond the initial baseline.\n2. The best generalizing symbolic core is much lower: roughly 52% validation accuracy.\n3. Verification rules can push train accuracy very high, but they expose a sharp memorization\u002Fgeneralization gap.\n4. The likely gap to CNNs is not raw fitting capacity. 
It is learned reusable representation plus regularized credit assignment.\n\n### 10 Classes\n\n| # | Class | wnid | Main confusions |\n|---|-------|------|-----------------|\n| 1 | golden retriever | n02099601 | banana, brown bear, mushroom |\n| 2 | mushroom | n07734744 | banana, brown bear, GR |\n| 3 | teapot | n04398044 | king penguin, banana, GR |\n| 4 | school bus | n04146614 | sports car |\n| 5 | banana | n07753592 | orange, school bus |\n| 6 | orange | n07747607 | banana |\n| 7 | brown bear | n02132136 | mushroom, GR, school bus |\n| 8 | king penguin | n02056570 | brown bear, sports car |\n| 9 | jellyfish | n01910747 | king penguin |\n| 10 | sports car | n04285008 | school bus, king penguin |\n\n### Data Split\n\n| Split | Images\u002Fclass | Total | Purpose |\n|-------|:---:|:---:|---------|\n| **Train** | 200 | 2,000 | HL loop tuning |\n| **Val** | 200 | 2,000 | Generalization reporting and audit |\n| **Test** | 100 | 1,000 | Touched once at the very end |\n| **External** | 50 | 500 | Official Tiny ImageNet val |\n\n### Phase 2 Architecture\n\n```\nimage (64x64 BGR)\n  -> scene graph builder (color masks, edges, texture maps, blobs)\n  -> 50+ low-level stats (hue ratios, edge density, gradients, LBP, spatial)\n  -> 10 class signatures (weighted sum of sigmoid activations + guards)\n  -> mean-centered histogram prototype blending\n  -> calibration and class repulsion\n  -> pairwise reranking (targeted discriminant pairs, gap-aware gating)\n  -> optional verify rules\n  -> prediction with proof trace\n```\n\n**Layer 1 — Class Signatures:** Each class has a signature — a weighted sum of sigmoid activations over image statistics, with guard gates:\n\n```python\npos = sum(weight_i * sigmoid(stat_i, threshold_i, steepness_i) for each positive signal)\nguards = [sigmoid(stat_j, threshold_j, negative_steepness) for each guard]\nscore = pos * min(guards)  # any guard can suppress the score\n```\n\nNo hard binary thresholds. 
Each sigmoid contributes 0-1, and the sum represents soft match strength.\n\n**Layer 2 — Histogram Prototype Blending:** 2D hue-saturation histograms are computed per class from training images. At inference, the image's histogram is compared to each class prototype. Mean-centered blending:\n\n```\nfinal = 0.88 * signature_score + 0.12 * (hist_score - class_mean * 0.3)\n```\n\n**Layer 3 — Pairwise Reranking with Gap-Aware Gating:** For the top-2\u002Ftop-3 candidates, specialized discriminant functions compute evidence. A swap happens only when evidence exceeds a gap-scaled threshold:\n\n```\nswap iff disc_margin > base_threshold + score_gap * gap_scale\n```\n\nTargeted pairwise discriminant functions use per-pair base thresholds and rank-dependent gap scaling.\n\n**Layer 4 — Verify Rules:** The `full` mode adds many narrow local\u002Frank\u002Ffinal verification rules. These rules improve train accuracy from 55.4% to 84.0%, but reduce validation from 51.9% to 50.5%, so they are treated as a diagnostic overfitting layer rather than the main generalizing system.\n\n### Pipeline Modes\n\n| Mode | What it includes | Role |\n|------|------------------|------|\n| `base` | signatures + histogram blend + calibration\u002Frepulsion | Core symbolic scorer |\n| `base_rerank` | `base` + pairwise reranking | Main generalizing symbolic result |\n| `full` | `base_rerank` + verify rules | Train-fitting diagnostic |\n\n### Phase 2 Accuracy Trajectory\n\n![Phase 2 Accuracy Trajectory](docs\u002Fphase2\u002Fplots\u002F01_accuracy_trajectory.png)\n\n### Phase 2 Experiment Logs\n\n- [`docs\u002Fphase2\u002Fblog.md`](docs\u002Fphase2\u002Fblog.md) — Phase 2 writeup and reflection\n- [`docs\u002Fphase2\u002Flessons.md`](docs\u002Fphase2\u002Flessons.md) — Lessons from the symbolic HL loop\n- [`docs\u002Fphase2\u002Funderstanding\u002F`](docs\u002Fphase2\u002Funderstanding\u002F) — Distilled analyses of pipeline behavior\n- [`logs\u002FREADME.md`](logs\u002FREADME.md) — Log lineage 
inventory and plotting rules\n- [`logs\u002Fphase2\u002F`](logs\u002Fphase2\u002F) — Phase 2 eval logs (JSON + markdown)\n\n---\n\n## Lessons Learned (Both Phases)\n\nThe full Phase 2 reflection is in **[`docs\u002Fphase2\u002Fblog.md`](docs\u002Fphase2\u002Fblog.md)** and **[`docs\u002Fphase2\u002Flessons.md`](docs\u002Fphase2\u002Flessons.md)**. Highlights:\n\n1. **Fitting is surprisingly doable** — symbolic verify rules can push train accuracy very high.\n2. **Generalization is the hard part** — the best validation number comes from the smaller `base_rerank` system, not the full verify system.\n3. **Pairwise reranking transfers better than narrow verification rules** — it targets reusable confusion structures instead of isolated failures.\n4. **Global\u002Fcoarse features hit a representation ceiling** — color coverage, edge density, texture stats, quadrant stats, and histogram prototypes do not substitute for learned local\u002Fpart features.\n5. **The codebase is the model** — thresholds, constants, prototypes, rule conditions, logs, tests, and update scripts together form the learned system.\n6. **HL needs regularization and credit assignment** — future progress should reward reusable visual operators, held-out rule selection, patch-level attribution, and object-centered perception.\n\n---\n\n## The HL Loop\n\n```\neval on train -> analyze confusion matrix -> hypothesize fix -> implement -> eval -> keep or revert -> repeat\n```\n\nEach iteration tests one hypothesis. Regressions are reverted. Claude Code and Codex maintain experiment logs, reasoning traces, plots, and feature distribution analyses throughout.\n\n---\n\n## Phase 1 (Completed): Exploratory Setup\n\n\u003Cdetails>\n\u003Csummary>Phase 1 used 4 real + 6 synthetic classes with a shared dev\u002Feval set. 
Click to expand.\u003C\u002Fsummary>\n\nPhase 1 demonstrated that the HL loop works, but had evaluation methodology issues (tuning and eval on the same images).\n\n### Phase 1 Results\n\n- Dev-set top-1 (all 10 classes): **86.1%** (tuned on same 230 images)\n- Held-out validation (4 hard classes): **54%** (216\u002F400)\n- Non-overlapping subset: **51.4%** (186\u002F362)\n- 248 iterations across 11 sessions (~20 hours)\n\n### Phase 1 Architecture\n\nPhase 1 used a completely different scoring system:\n\n```\nscore = required_avg * 0.6 + supporting_avg * 0.3 - excluding_avg * 0.2\n```\n\nEach class had required, supporting, and excluding feature lists. If any required feature didn't fire, the class scored zero. This was replaced entirely in Phase 2 with the sigmoid-based scoring system.\n\nPhase 1 also used a 22-function pairwise tiebreaker system (different from Phase 2's discriminant-based reranking).\n\n### Phase 1 Growth Trajectory\n\n```\nSession 1:   ~20%   baseline sensors + features\nSession 2:    35%   flat scorer (replaced broken hierarchy)\nSession 3:    44%   compound features + tiebreakers\nSession 4:    57%   tiebreaker expansion + school bus window pattern\nSession 5:    62%   spatial attention + synthetic class tiebreakers\nSession 6:    67%   eagle\u002Fbanana solved to 100%\nSession 7:    68%   plateau (DCT explored, failed)\nSession 8:    78%   banana cap + compound conjunctions\nSession 9:    80%   gradient\u002Fgreen conjunctions\nSession 10:   85%   alt required features + guard tightening\nSession 11:   86%   green+warm counter-signals (final)\n```\n\n### Phase 1 Ceiling\n\nThe remaining 32 errors (14%) came from the dog\u002Fmushroom\u002Fteapot triangle: at 64x64, all three are \"warm-colored smooth blobs.\"\n\n### Phase 1 Honesty Notes\n\n1. The 86.1% is dev-set accuracy (same images used for tuning).\n2. 6 of 10 classes used trivial synthetic images. The evaluation claim should be read as 4-class.\n3. 
The system stores histogram prototypes and ~50 tuned thresholds. Not \"zero learned parameters.\"\n4. What Phase 1 demonstrated: the HL loop works. Confusion-driven iteration, feature invention, and representation saturation are real phenomena.\n\nSee the [full blog post](docs\u002Fphase1\u002Fblog.md) for trajectory analysis and ceiling discussion.\n\n### Phase 1 Plots\n\n![Phase 1 Accuracy Trajectory](docs\u002Fphase1\u002Fplots\u002F01_accuracy_trajectory.png)\n\n![Phase 1 Hard Classes](docs\u002Fphase1\u002Fplots\u002F06_hard_classes.png)\n\n\u003C\u002Fdetails>\n\n---\n\n## Project Structure\n\n```\nhl-image-net\u002F\n├── hlinet\u002F\n│   ├── sensors\u002F           # Classical vision: edges, color, texture, segmentation, shape\n│   ├── scene\u002F             # Scene graph builder + spatial relations\n│   ├── features\u002F\n│   │   ├── primitives\u002F    # Color, shape features\n│   │   ├── textures\u002F      # Pattern detection\n│   │   ├── parts\u002F         # Structural parts\n│   │   ├── spatial\u002F       # Grid + layout predicates\n│   │   ├── compounds\u002F     # Phase 2 signatures, histogram prototypes\n│   │   └── concepts\u002F      # High-level concept detectors\n│   ├── classifier\u002F\n│   │   ├── predict.py     # Phase 2: signatures -> blend -> rerank -> predict\n│   │   ├── scorer.py      # Phase 1: flat scorer (legacy)\n│   │   ├── hierarchy.py   # Class hierarchy\n│   │   └── tiebreaker.py  # Phase 1: pairwise tiebreakers (legacy)\n│   ├── eval\u002F              # Dataset loader, metrics, evaluation runner\n│   └── registry.py        # Feature registry\n├── scripts\u002F\n│   ├── plot01_trajectory.py  # Generate the Phase 2 trajectory plot\n│   └── predict_image.py   # Classify a single image\n├── data\u002Fphase2\u002F           # Train\u002Fval\u002Ftest splits (not in repo)\n├── logs\u002F\n│   ├── README.md          # Log lineage inventory and plotting rules\n│   ├── log_inventory.csv  # Machine-readable audit inventory\n│   ├── 
phase1\u002F            # Cleaned Phase 1 eval logs\n│   ├── phase2\u002F            # Cleaned Phase 2 eval logs\n│   └── generalization\u002F    # Generalization checks and summaries\n└── docs\u002F\n    ├── phase1\u002F            # Exploratory setup, report, blog, plots\n    ├── phase2\u002F            # Main hand-built symbolic pipeline docs, understanding, reflections\n    ├── anycode\u002F           # Side experiment: unconstrained compiled classifiers\n    └── phase3\u002F            # Forward plan for local perception\n```\n\n## Quick Start\n\n```bash\npip install -e .\n\n# Run evaluation (defaults to val set)\npython -m hlinet.eval.runner\n\n# Run on train set\npython -m hlinet.eval.runner --data-dir data\u002Fphase2\u002Ftrain\n\n# Classify a single image\npython scripts\u002Fpredict_image.py path\u002Fto\u002Fimage.jpg\n```\n\n## Technical Details\n\n- **Language**: Python >=3.11\n- **Dependencies**: OpenCV, NumPy, SciPy, scikit-image, scikit-learn, NetworkX, Matplotlib\n- **Symbolic pipeline constraint**: no neural-network framework, no backpropagation, no learned embedding model\n- **Eval log inventory**: tracked in [`logs\u002FREADME.md`](logs\u002FREADME.md) and [`logs\u002Flog_inventory.csv`](logs\u002Flog_inventory.csv)\n- **Phase 1**: 250 archived eval records, exploratory setup\n- **Phase 2**: 976 archived eval records, real 10-class symbolic pipeline\n- **Coding agents**: Claude Code and Codex\n\n---\n\n## Citation\n\n```\nHeuristic Learning for Image Classification: Without Neural Networks.\nXisen Wang, May 2026.\n```\n\n## References\n\nWeng, J. (2026). *Learning Beyond Gradients*. https:\u002F\u002Ftrinkle23897.github.io\u002Flearning-beyond-gradients\u002F\n","This project is an image classification experiment based on Heuristic Learning that uses no neural networks, gradient descent, or backpropagation. Its core function is to build a symbolic image classifier with classical computer vision techniques and to test it on 10-class real-image classification using the Tiny ImageNet dataset. Technically, it applies the Heuristic Learning framework proposed by Jiayi Weng and reaches high training accuracy (up to 100%), but generalization on the validation set is comparatively low (about 52%). It is suited to researching and exploring the potential of non-deep-learning methods for image classification, especially in settings that require high transparency and interpretability.","2026-06-11 03:59:12","CREATED_QUERY"]