[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80681":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":12,"contributorsCount":12,"subscribersCount":12,"size":12,"stars1d":12,"stars7d":14,"stars30d":15,"stars90d":12,"forks30d":12,"starsTrendScore":12,"compositeScore":12,"rankGlobal":9,"rankLanguage":9,"license":16,"archived":17,"fork":17,"defaultBranch":18,"hasWiki":17,"hasPages":17,"topics":19,"createdAt":9,"pushedAt":9,"updatedAt":20,"readmeContent":21,"aiSummary":22,"trendingCount":12,"starSnapshotCount":12,"syncStatus":23,"lastSyncTime":24,"discoverSource":25},80681,"clip-finetune-recipes","thombanal\u002Fclip-finetune-recipes","thombanal","Practical CLIP fine-tuning recipes — DDP training, LoRA, hard-negative mining, leakage checks.",null,"Python",221,0,6,67,174,"Other",false,"main",[],"2026-06-12 02:04:05","# CLIP Tuning Cookbook\n\n[![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fclip-recipes?color=blue)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fclip-recipes\u002F)\n[![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.10%2B-blue.svg)](https:\u002F\u002Fwww.python.org)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-Apache--2.0-green.svg)](LICENSE)\n[![PyTorch](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpytorch-2.1%2B-orange.svg)](https:\u002F\u002Fpytorch.org)\n[![Tests](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftests-passing-brightgreen.svg)](#)\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fchat-discord-7289da.svg)](#)\n\nEnd-to-end recipes for fine-tuning CLIP-style dual encoders on **custom**\nimage-text data, with an emphasis on practical hygiene: leakage checks,\ncontrastive batch construction that actually works at 4-GPU scale, and the\nLR\u002Fwarmup schedules that don't blow up the temperature parameter.\n\n> If 
you're new to CLIP fine-tuning, start with `recipes\u002Fquickstart.md`.\n> If you've burned a checkpoint to bad contrastive batches before, the\n> \"Batch hygiene\" section is what you want.\n\n## Features\n\n- **Streaming data pipeline** — webdataset shards + on-the-fly de-dup against eval sets\n- **Hard-negative mining** — text\u002Fimage side, optional FAISS index\n- **DDP training loop** — proper local-loss aggregation, no all-gather memory spikes\n- **LoRA \u002F full FT \u002F linear-probe** — three recipes, one config schema\n- **Eval hooks** — zero-shot ImageNet, retrieval (Flickr30k-CN \u002F COCO-CN)\n- **Sanity scripts** — temperature drift, gradient norms, embedding collapse\n\n## Install\n\n```bash\npip install clip-recipes\n# or, from source:\npip install -e .[train]\n```\n\nFor Chinese \u002F multilingual experiments:\n\n```bash\npip install clip-recipes[zh]\n```\n\n## Quick start\n\nSingle-node, single-GPU sanity run on the bundled tiny dataset:\n\n```bash\npython -m clip_recipes.train --config configs\u002Fquickstart.yaml\n```\n\nReal run, 4xA100:\n\n```bash\ntorchrun --nproc_per_node=4 -m clip_recipes.train \\\n    --config configs\u002Flaion_400m_lora.yaml \\\n    --output_dir runs\u002Flaion_lora_v3\n```\n\n## What's in here\n\n```\nclip_recipes\u002F\n  train.py            # main entry, DDP-aware\n  data\u002F\n    webdataset.py     # tar-shard loader\n    dedup.py          # SHA-256 leakage filter\n    augment.py\n  losses\u002F\n    contrastive.py    # InfoNCE w\u002F logit_scale\n    sigclip.py        # SigLIP-style binary CE\n  models\u002F\n    builder.py        # build_clip(...) factory\n    lora.py\n  eval\u002F\n    zeroshot.py\n    retrieval.py\n  schedules.py\nconfigs\u002F\n  quickstart.yaml\n  laion_400m_lora.yaml\n  laion_full_ft.yaml\n  zh_clip_continue.yaml\n```\n\n## Configs\n\nConfigs are plain YAML. The schema is in `clip_recipes\u002Fconfig.py`; everything\noverridable from CLI via `key=value`. 
Example:\n\n```bash\npython -m clip_recipes.train --config configs\u002Fquickstart.yaml \\\n    train.lr=3e-5 train.batch_size=512 model.lora_r=8\n```\n\n## Batch hygiene\n\nThe single biggest win we've seen for small-team CLIP fine-tuning is **not**\na better loss or a fancier optimizer; it's making sure the contrastive batch\ncontains genuinely diverse negatives. The pipeline does three things:\n\n1. **De-dup against eval sets at shard build time** (`tools\u002Fbuild_shards.py`)\n2. **Shuffle at the shard level *and* within shards** (`webdataset.py`)\n3. **Cross-GPU negatives via local-loss aggregation** (`losses\u002Fcontrastive.py`)\n\nDon't skip step 3 if you train on >1 GPU. Without it you're effectively doing\n4 independent 256-batch InfoNCE runs and calling it 1024.\n\n## Evaluation\n\n```bash\npython -m clip_recipes.eval.zeroshot \\\n    --checkpoint runs\u002Flaion_lora_v3\u002Ffinal.pt \\\n    --dataset imagenet1k --device cuda\n```\n\nFor Chinese retrieval:\n\n```bash\npython -m clip_recipes.eval.retrieval \\\n    --checkpoint runs\u002Fzh_continue\u002Ffinal.pt \\\n    --dataset flickr30k-cn\n```\n\n## Caveats \u002F known issues\n\n- The `sigclip` loss path is less battle-tested than `contrastive`; treat numbers as preliminary.\n- TODO: gradient checkpointing for ViT-L+ — currently OOMs on 40GB cards above batch 512.\n- The webdataset loader has a sharp edge around shard counts not divisible by world size; we drop the trailing shards. See `data\u002Fwebdataset.py:42`.\n\n## Citing\n\nIf this codebase helped your paper, a footnote pointing here is appreciated.\n\n## License\n\nApache-2.0. See `LICENSE`.\n\n\u003C!-- thanks @ lab members for the laion 400m subset shards -->\n","This project provides a set of practical recipes for fine-tuning CLIP-style dual encoders on custom image-text data, with an emphasis on practical details such as leakage checks, contrastive batch construction, and learning-rate\u002Fwarmup schedules. Its core features include distributed data-parallel training, LoRA fine-tuning, hard-negative mining, and a streaming data pipeline. It is particularly suited to scenarios that need to optimize joint image-text representations, such as cross-modal retrieval tasks or zero-shot classification experiments. By providing a range of detailed configuration options and scripts, it lets users easily adjust model parameters to their own needs and run experiments.",2,"2026-06-11 04:01:37","CREATED_QUERY"]