[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81007":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":13,"stars30d":16,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":17,"rankGlobal":10,"rankLanguage":10,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":19,"hasPages":21,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":15,"starSnapshotCount":15,"syncStatus":14,"lastSyncTime":26,"discoverSource":27},81007,"KubriCount","Verg-Avesta\u002FKubriCount","Verg-Avesta","Count Anything at Any Granularity","https:\u002F\u002Fverg-avesta.github.io\u002FKubriCount\u002F",null,"Python",36,4,2,0,5,44.6,"Apache License 2.0",false,"main",true,[],"2026-06-11 04:07:22","# KubriCount: Count Anything at Any Granularity\n\nOfficial code release and dataset-generation pipeline for **Count Anything at Any Granularity**.\n\n[🏡 Project Page](https:\u002F\u002Fverg-avesta.github.io\u002FKubriCount\u002F) | [📄 Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.10887) | [🤗 Dataset](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fliuchang666\u002FKubriCount)\n\nKubriCount is a large-scale synthetic benchmark for **multi-grained visual counting**. The project targets open-world counting settings where the intended counting granularity must be explicit: identity, attribute, category, instance type, or concept. 
This repository provides the code used to construct KubriCount: controllable 3D synthesis, mask-conditioned image editing, and VLM-based filtering for dense instance-level supervision with controlled distractors.\n\n## Highlights\n\n- **Multi-grained counting benchmark** with five explicit semantic levels.\n- **Fully automatic data scaling pipeline** built around 3D asset curation, Kubric-based prototype synthesis, consistent image editing, and automatic quality filtering.\n- **Dense annotations** including counts, center points, 2D\u002F3D boxes, masks, and metadata.\n- **Large-scale dataset** with 110,507 images, 157 categories, about 7.3M annotated objects, and up to 250 objects per image.\n- **Controlled generalization splits** covering seen categories, unseen assets, and unseen categories.\n\n## Dataset\n\nThe KubriCount dataset is available on Hugging Face:\n\nhttps:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fliuchang666\u002FKubriCount\n\nThe dataset can be used directly and does **not** require running the generation pipeline in this repository. The pipeline is provided for reproducibility and future dataset construction.\n\nAfter downloading or extracting the Hugging Face dataset, place it under:\n\n```text\nKubriCount\u002F\n```\n\nKubriCount contains five counting levels with train\u002Ftest splits designed for controlled generalization:\n\n- `train`: about 100K images from seen categories, excluding held-out TestA assets\n- `testA`: about 5K images with unseen assets within seen categories\n- `testB`: about 5K images with unseen categories\n\nFor Levels 2-5, each image can define two counting queries by swapping the target and distractor groups, yielding about 198K queries in total. The benchmark includes counts, center points, 2D\u002F3D boxes, masks, and metadata for multi-grained evaluation.\n\n## External Resources for Data Generation\n\nThe following resources are only needed if you want to reproduce or extend the data generation pipeline. 
They are not required for using the released Hugging Face dataset.\n\nLarge generated resources are intentionally not tracked by git. This repository keeps placeholder directories:\n\n```text\nassets\u002F         # 3D assets, HDRIs, and asset manifests; download link coming soon\ndocker-image\u002F   # Prebuilt Docker image archives; download link coming soon\n```\n\n## Generation Pipeline Overview\n\nKubriCount is generated in four stages:\n\n1. **3D asset curation**: build a categorized object asset bank from labeled 3D datasets and controllable 3D generation.\n2. **Prototype synthesis**: use Kubric, PyBullet, and Blender to render controllable multi-object scenes with exact instance-level metadata.\n3. **Consistent image editing**: improve visual realism while preserving object topology and annotations.\n4. **Automatic data filtering**: use a VLM inspector to reject samples with layout drift, count changes, identity corruption, background hallucination, or severe artifacts.\n\n## Granularity Levels\n\nKubriCount defines five counting levels. Each level specifies a target set and, when applicable, a controlled distractor set that differs by one semantic factor in the hierarchy.\n\n| Level | Granularity | Description |\n| --- | --- | --- |\n| L1 | Identity-level | Count all instances of a single object type. |\n| L2 | Attribute-level | Count objects distinguished by size or color. |\n| L3 | Category-level | Count one category while excluding another category. |\n| L4 | Instance-level | Count one instance type within the same category. |\n| L5 | Concept-level | Count a category\u002Fconcept with multiple instance types and semantically plausible distractors. 
|\n\n## Repository Layout\n\n```text\n.\n├── kubric\u002F              # Core Kubric-based rendering and simulation package\n├── docker\u002F              # Dockerfiles for building runtime environments\n├── evaluation\u002F          # Evaluation scripts for MLLMs and expert counting models\n├── assets\u002F              # Placeholder for external 3D assets and manifests\n├── docker-image\u002F        # Placeholder for external prebuilt Docker image archives\n├── KubriCount\u002F          # Placeholder for the Hugging Face dataset\n├── scripts_urdf\u002F        # Trellis asset preprocessing utilities\n├── shapenet2kubric\u002F     # ShapeNet-to-Kubric conversion utilities\n├── config_dense.json    # Dense scene generation configuration\n├── config_gpt.json      # Default scene generation configuration\n├── render_level.py      # Main multi-grained scene generation script\n├── run.sh               # CPU rendering entry point\n└── run_gpu.sh           # GPU rendering entry point\n```\n\n## Quick Start\n\n### Use the Released Dataset\n\nDownload KubriCount from Hugging Face and extract it into `KubriCount\u002F`. No rendering assets, Docker images, or API credentials are needed for dataset-only use.\n\n```text\nKubriCount\u002F\n├── train\u002F\n├── testA\u002F\n└── testB\u002F\n```\n\n### Reproduce or Extend the Generation Pipeline\n\nThe full generation pipeline requires external assets, a Kubric-compatible Docker environment, and API access for the image editing \u002F VLM filtering stages. 
Prebuilt Docker image archives and asset bundles will be linked here once released.\n\nExample CPU rendering command:\n\n```bash\nbash run.sh 1 1 train random config_gpt.json\n```\n\nExample GPU rendering command:\n\n```bash\nbash run_gpu.sh 1 all 1 train random config_gpt.json\n```\n\nGenerated scenes are written under `KubriCount\u002F`.\n\nPost-processing and filtering scripts:\n\n```bash\n# Initial mask-conditioned image editing.\npython banana_edit_level.py --root_path KubriCount\u002Ftrain --workers 20 --overwrite\n\n# Iterative re-editing for samples that need another editing pass.\npython banana_edit_redo.py --root_path KubriCount\u002Ftrain --workers 20 --retry_times 3\n\n# Initial VLM-based PASS\u002FFAIL filtering.\npython gemini_filter.py --root_path KubriCount\u002Ftrain --workers 20 --flush_every 1000\n\n# Iterative re-checking after re-editing.\npython gemini_filter_redo.py --root_path KubriCount\u002Ftrain --workers 20 --flush_every 1000\n```\n\n### Evaluate MLLMs\n\nMLLM evaluation scripts are available under `evaluation\u002Fmllm\u002F`. They support API-based models and local Hugging Face vision-language models:\n\n```bash\npython evaluation\u002Fmllm\u002Feval_api_models.py --help\npython evaluation\u002Fmllm\u002Feval_open_models.py --help\n```\n\nSee `evaluation\u002Fmllm\u002FREADME.md` for setup and example commands.\n\n### Evaluate Counting Expert Models\n\nKubriCount inference adapters for FamNet, LOCA, CounTR, DAVE, GeCo, Rex-Omni, CountGD++, and CountGD are available under `evaluation\u002Fcounting_expert_models\u002F`. 
These adapters keep the original model imports but do not vendor third-party model code or checkpoints.\n\n```bash\npython evaluation\u002Fcounting_expert_models\u002Ffamnet\u002Finference_kub_famnet_batch.py --help\npython evaluation\u002Fcounting_expert_models\u002Floca\u002Finference_kub_loca_batch.py --help\npython evaluation\u002Fcounting_expert_models\u002Fcountr\u002Finference_kub_countr_batch.py --help\npython evaluation\u002Fcounting_expert_models\u002Fdave\u002Finference_kub_dave_batch.py --help\npython evaluation\u002Fcounting_expert_models\u002Fgeco\u002Finference_kub_geco_batch.py --help\npython evaluation\u002Fcounting_expert_models\u002Frex_omni\u002Finference_kub_rex_omni.py --help\npython evaluation\u002Fcounting_expert_models\u002Fcountgdpp\u002Finference_kub_countgdpp_batch.py --help\npython evaluation\u002Fcounting_expert_models\u002Fcountgd\u002Finference_kub_countgd_batch.py --help\n```\n\nSee `evaluation\u002Fcounting_expert_models\u002FREADME.md` for setup notes and example commands.\n\n## Citation\n\nIf you find this project useful, please cite:\n\n```bibtex\n@article{liu2026count,\n  title={Count Anything at Any Granularity},\n  author={Liu, Chang and Wu, Haoning and Xie, Weidi},\n  journal={arXiv preprint arXiv:2605.10887},\n  year={2026}\n}\n```\n\n## Acknowledgements\n\nThis project builds on the excellent [Kubric](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Fkubric) data generation framework. We thank the Kubric authors and contributors for making their rendering and simulation infrastructure publicly available.\n\n## License\n\nThis repository includes code derived from Kubric and is released under the Apache License 2.0. 
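See [LICENSE](LICENSE).\n","KubriCount is a large-scale synthetic benchmark project for multi-grained visual counting. It supports counting objects in open-world settings at five explicit semantic levels (identity, attribute, category, instance type, and concept) and provides a fully automatic data generation pipeline, including controllable 3D synthesis, mask-conditioned image editing, and VLM-based filtering, to achieve dense instance-level supervision with controlled distractors. The project suits applications that need high-precision, multi-level counting, such as intelligent surveillance and object detection and counting in autonomous driving. KubriCount also provides a large-scale dataset of about 110,500 images across 157 categories, with detailed annotations for every image, so researchers can use it directly or extend it as needed.","2026-06-11 04:03:10","CREATED_QUERY"]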