[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81043":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":15,"stars7d":15,"stars30d":15,"stars90d":16,"forks30d":16,"starsTrendScore":17,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":16,"starSnapshotCount":16,"syncStatus":13,"lastSyncTime":26,"discoverSource":27},81043,"leautoencoder","the-puzzler\u002Fleautoencoder","the-puzzler","Self-Teaching Autoencoder learning reconstructions through latent agreement, not pixel loss.","https:\u002F\u002Fthe-puzzler.github.io\u002Fshare\u002Fself-teaching-autoencoder.html",null,"Python",29,2,28,1,0,3,1.43,false,"main",true,[],"2026-06-12 02:04:10","# Self-Teaching Autoencoder\n\nA small experiment in training an autoencoder to teach itself.\n\nInstead of relying only on pixel reconstruction loss, the model also asks a second question: if a clean view and a masked view come from the same image, can their reconstructions be made consistent in the model's own latent space?\n\n**Blog post:** [self-teaching-autoencoder](https:\u002F\u002Fthe-puzzler.github.io\u002Fshare\u002Fself-teaching-autoencoder.html)\n\n![Our method vs baseline](our_method_vs_baseline_latent512_checkpoint100.png)\n\nThe image above compares the current method against a plain masked-autoencoder baseline at latent size `512` (96x compression).\n\n## Quick Note\n\nFor the best reconstruction results, don't use the masking objective, global pooling, or the symmetric setup. 
Simply use an autoencoder with step-frozen judge latents from crop-resized images.\n\n## Overview\n\nThis repo trains autoencoders on `CelebA`, center-cropped to `128x128`.\n\nThe main training scripts are:\n\n- `main.py`: the current self-teaching method\n- `main_regular.py`: the baseline masked-image reconstruction model\n\nThe model itself lives in `leae\u002Fautoencoder.py`.\n\n## Core Idea\n\nThe model sees two versions of the same image:\n\n- the original clean image\n- a masked version with a square region removed and filled with the image average\n\nBoth views are passed through the same autoencoder. A slowly refreshed copy of the encoder then acts as a judge. The autoencoder is trained so that its reconstructions become mutually consistent under that judge.\n\nThat is why this is a self-teaching autoencoder:\n\n- the student is the current autoencoder\n- the teacher is a target copy of the same encoder\n\n## Current Objective\n\nThe current repo is centered on the symmetric latent-consistency objective in `main.py`.\n\nFor an image `x`:\n\n```text\nz_clean = E(x)\nx_clean_hat = D(z_clean)\n\nz_masked = E(mask(x))\nx_masked_hat = D(z_masked)\n```\n\nThe target encoder `T` is then used to score the reconstructions:\n\n```text\nconsistency_loss =\n    MSE(T(x_clean_hat), T(x_masked_hat))\n\nclean_crop_loss =\n    MSE(T(crop(x)), T(crop(x_clean_hat)))\n\nmasked_crop_loss =\n    MSE(T(crop(x)), T(crop(x_masked_hat)))\n```\n\nThese are averaged into the main latent objective:\n\n```text\nmse_loss = average(consistency_loss, clean_crop_loss, masked_crop_loss)\n```\n\nThere is also a latent regularizer on both clean and masked codes:\n\n```text\nsigreg_loss =\n    0.5 * lambda * (SIGReg(z_clean) + SIGReg(z_masked))\n```\n\nFinal training loss:\n\n```text\nloss = mse_loss + sigreg_loss\n```\n\nThe important part is the symmetry:\n\n- both clean and masked branches go through the full autoencoder\n- both reconstructions are judged in the same latent space\n- both 
branches help define what a \"good\" reconstruction is\n\n## Baseline\n\nThe baseline in `main_regular.py` is intentionally simple:\n\n```text\nrecon = model(masked_image)\nloss = MSE(recon, image)\n```\n\nIt does not use:\n\n- a target encoder\n- latent consistency between branches\n- crop consistency losses\n- `SIGReg`\n\nSo it serves as the direct \"just reconstruct the image\" comparison.\n\n## Running\n\nuv sync\n\nuv run main.py\n\n## Repo Layout\n\n- `main.py`: current self-teaching training loop\n- `main_regular.py`: baseline training loop\n- `leae\u002Fautoencoder.py`: autoencoder architecture\n- `leae\u002Fmasking.py`: masking and crop helpers\n- `leae\u002Fprep_data.py`: dataset loading\n- `leae\u002Fsigreg.py`: latent regularizer\n\n## Why This Exists\n\nThis repo explores a simple question:\n\nCan an autoencoder learn stronger representations if it is trained to stay self-consistent under masking and reconstruction, instead of optimizing only direct pixel error?\n\nFor the longer writeup and results, see:\n\n**[https:\u002F\u002Fthe-puzzler.github.io\u002Fshare\u002Fself-teaching-autoencoder.html](https:\u002F\u002Fthe-puzzler.github.io\u002Fshare\u002Fself-teaching-autoencoder.html)**\n","the-puzzler\u002Fleautoencoder is an experimental project that trains an autoencoder through latent-space consistency rather than pixel loss. Its core feature is letting the model teach itself: it optimizes the model by comparing how consistent the latent-space representations of the original image and the masked image are, rather than relying solely on pixel-level reconstruction error. The project also introduces a SIGReg regularization term to further improve model performance. The technique is particularly suited to applications that require high-fidelity image reconstruction and are sensitive to fine detail in the data, such as facial recognition and medical image analysis. It is written in Python, and is easy to pick up and extend.","2026-06-11 04:03:17","CREATED_QUERY"]