[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-83082":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":16,"starSnapshotCount":16,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},83082,"ideogram4","ideogram-oss\u002Fideogram4","ideogram-oss","Ideogram 4: Open image model at the forefront of design","",null,"Python",1923,188,13,18,0,56,1113,379,19.83,"Apache License 2.0",false,"main",true,[],"2026-06-12 02:04:30","\u003Cp align=\"center\">\u003Ca href=\"https:\u002F\u002Fideogram.ai\u002F\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cpicture>\n  \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"assets\u002Fideogram_logo_darkmode.svg\">\n  \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"assets\u002Fideogram_logo.svg\">\n  \u003Cimg src=\"assets\u002Fideogram_logo.svg\" alt=\"Ideogram\" width=\"500\">\n\u003C\u002Fpicture>\u003C\u002Fa>\u003C\u002Fp>\n\n\u003Cp align=\"center\">\u003Cem>Ideogram 4: Open image model at the forefront of design\u003C\u002Fem>\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fideogram.ai\u002Fblog\u002Fideogram-4.0\u002F\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FBlog-Post-orange\" alt=\"Blog Post\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fideogram-oss\u002Fideogram4\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCode-GitHub-181717?logo=github\" alt=\"Code\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fideogram-ai\u002Fideogram-4\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FModel-HuggingFace-blue?logo=huggingface\" alt=\"Model\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fdeveloper.ideogram.ai\u002F\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAPI-developer.ideogram.ai-purple\" alt=\"API\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fideogram.ai\u002F\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOfficial%20Site-ideogram.ai-ff69b4\" alt=\"Official Site\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fsamples\u002Fcollage_landscape.jpg\" alt=\"A collage of Ideogram 4 samples spanning photorealism, illustration, typography, and poster design\">\n\u003C\u002Fp>\n\n\nIdeogram 4 is **[Ideogram](https:\u002F\u002Fideogram.ai)'s first open-weight text-to-image model**. It is a **state-of-the-art foundation model trained from scratch** — not a fine-tune of any existing model. It introduces a new structured JSON prompting interface, with best-in-class multilingual text rendering, deep language understanding, explicit bounding-box layout and color-palette controls, and native 2k resolution images. The easiest way to try the model is online at **[ideogram.ai](https:\u002F\u002Fideogram.ai\u002F)**.\n\nWe believe openness drives innovation, and we invite the research community to innovate with us on the forefront of visual intelligence.\n\n## Table of Contents\n\n1. [News](#news)\n2. [Model Zoo](#model-zoo)\n3. [Performance](#performance)\n4. [Quick Start](#quick-start)\n5. [Model Summary](#model-summary)\n6. 
[Prompting Guide](#prompting-guide)\n7. [Documentation](#documentation)\n8. [Citation](#citation)\n\n## News\n\n* **[2026-06-03]** **Ideogram 4 released!** Inference code and weights\n  are now public, and our [technical blog post](https:\u002F\u002Fideogram.ai\u002Fblog\u002Fideogram-4.0\u002F) is live. See the\n  [Quick Start](#quick-start) section to generate your first image, or try the\n  model online at [ideogram.ai](https:\u002F\u002Fideogram.ai\u002F).\n\n## Model Zoo\n\n| Model | Params | Weight Quantization | Supported Hardware | Diffusers Support | License |\n| :---  | :---:  | :---:        | :---:   | :---:   | :---:   |\n| **[Ideogram 4 (nf4)](https:\u002F\u002Fhuggingface.co\u002Fideogram-ai\u002Fideogram-4-nf4)** | 9.3B | nf4 | CUDA | Yes | [Ideogram 4 Non-Commercial](model_licenses\u002FLICENSE-IDEOGRAM-4-NON-COMMERCIAL) |\n| **[Ideogram 4 (fp8)](https:\u002F\u002Fhuggingface.co\u002Fideogram-ai\u002Fideogram-4-fp8)** | 9.3B | fp8 | All | No | [Ideogram 4 Non-Commercial](model_licenses\u002FLICENSE-IDEOGRAM-4-NON-COMMERCIAL) |\n\nWe plan to support more quantizations in the future.\n\n\n## Performance\n\nWe evaluate Ideogram 4 across third-party arenas and benchmarks, standard\nopen-source benchmarks, and our own internal human-preference benchmark. Across\nall of them, **Ideogram 4 is the best open-weight image model by far, and sits\nat the frontier of design.**\n\n### Design Arena\n\n[Design Arena](https:\u002F\u002Fwww.designarena.ai\u002F) is a third-party image Elo\nleaderboard focused specifically on design-oriented generation. 
On the overall\nboard, Ideogram 4 is the top-ranked open-weight model, trailing only proprietary\nGPT and Gemini models:\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbenchmarks\u002Fdesign_arena.png\" alt=\"Design Arena overall image Elo leaderboard with Ideogram 4.0 as the top open-weight model\">\n\u003C\u002Fp>\n\nFiltered to open-weight models only, Ideogram 4 leads by a commanding margin,\nwell ahead of the next-best open model:\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbenchmarks\u002Fdesign_arena2.png\" alt=\"Design Arena open-weight image Elo leaderboard, with Ideogram 4.0 well ahead of all other open models\">\n\u003C\u002Fp>\n\n### ContraLabs\n\n[ContraLabs](https:\u002F\u002Fcontralabs.com\u002Fresearch) ran a blind typography evaluation judged by\nten professional designers from Contra's top-earning talent. Ideogram 4 leads on\nfirst-place win rate, picked as the best of four models 47.9% of the time\noverall — well ahead of Gemini 3.1 Flash Image Preview (Nano Banana 2) at 30.0%,\nFLUX.2 [max] (15.5%), and Grok Imagine 1.0 (15.0%):\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbenchmarks\u002Fcontralabs_typography.png\" alt=\"ContraLabs typography first-place win rate, with Ideogram v4 leading\">\n\u003C\u002Fp>\n\nIt also wins on practical usability: asked \"Would you use this in real client\nwork?\", the same designers rated Ideogram 4 highest at 3.55 \u002F 5 — significantly\nabove Nano Banana 2 (2.84), Grok Imagine 1.0 (2.61), and FLUX.2 [max] (2.49):\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbenchmarks\u002Fcontralabs_typography2.png\" alt=\"ContraLabs 'would you use this in real client work?' 
rating, with Ideogram v4 leading\">\n\u003C\u002Fp>\n\n### LMArena\n\nOn [LMArena](https:\u002F\u002Flmarena.ai\u002F), a third-party text-to-image leaderboard that\nmeasures general-purpose text-to-image use cases, Ideogram is the top-ranked\nopen-weight lab and a top-5 image generation lab overall — beaten only by giant\ncompanies with vastly larger budgets and resources:\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbenchmarks\u002Flmarena_benchmark.png\" alt=\"LMArena text-to-image lab leaderboard with Ideogram\">\n\u003C\u002Fp>\n\n### Ideogram internal eval\n\nFor our internal human-preference benchmark, focused on graphic design and\nphotography, we had graphic designers deeply familiar with professional design\nwork do the rating blind. Bradley-Terry scores rank Ideogram 4 #2 overall —\nbehind only GPT Image 2 medium — and the top open-weight model:\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbenchmarks\u002Fideogram_benchmark.png\" alt=\"Ideogram internal design leaderboard with Ideogram 4.0\">\n\u003C\u002Fp>\n\n### Open-source benchmarks\n\nOn standard open-source benchmarks measuring core capabilities — layout control\n(7Bench), spatial reasoning and object fidelity (SpatialGenEval), text rendering\n(X-Omni OCR), and prompt alignment (Prism) — Ideogram 4 closes the gap to the\nleading closed-source models across every axis. 
On layout control (7Bench), it\nis significantly better than all closed-source models:\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbenchmarks\u002Fopensource.png\" alt=\"Five-axis capability radar comparing Ideogram 4.0 to leading closed-source models on layout control, spatial reasoning, object fidelity, prompt alignment, and text rendering\">\n\u003C\u002Fp>\n\nAt 9.3B parameters, Ideogram 4 delivers the best text rendering of any open-weight\nrelease we benchmarked — ahead of much larger models like Qwen-Image (20B),\nFLUX.2 [dev] (32B), and HunyuanImage 3.0 (80B MoE):\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbenchmarks\u002Fopensource2.png\" alt=\"Parameter-efficiency scatter plot showing Ideogram 4.0 at 9.3B parameters leading all other open-weight models on text rendering\">\n\u003C\u002Fp>\n\n\n## Quick Start\n\n### Install\n\n```bash\npip install .\n```\n\nIf you plan to modify the code, install in editable mode instead so changes\nunder `src\u002Fideogram4\u002F` take effect without reinstalling:\n\n```bash\npip install -e .\n```\n\n### Model access\n\nThe model weights are **gated** on Hugging Face, so you must accept the gate and\nauthenticate before the code can download them — otherwise the download fails\nwith a `404` \u002F `GatedRepoError`.\n\n1. Open the model page — [ideogram-ai\u002Fideogram-4-nf4](https:\u002F\u002Fhuggingface.co\u002Fideogram-ai\u002Fideogram-4-nf4)\n   (or [ideogram-ai\u002Fideogram-4-fp8](https:\u002F\u002Fhuggingface.co\u002Fideogram-ai\u002Fideogram-4-fp8)) — and click\n   **Agree and access repository** to accept the license gate.\n2. 
Create a Hugging Face access token at\n   [huggingface.co\u002Fsettings\u002Ftokens](https:\u002F\u002Fhuggingface.co\u002Fsettings\u002Ftokens) and log in so the\n   download is authenticated:\n\n   ```bash\n   hf auth login\n   ```\n\n   Alternatively, export the token directly: `export HF_TOKEN=\"hf_...\"`.\n\n### CLI\n\nThe plain `--prompt` is rewritten into the structured JSON caption the model\nexpects by a \"magic prompt\" LLM. By default this uses Ideogram's hosted\nmagic-prompt API, which is **free** and does the expansion server-side (no local\nmodel or system prompt needed). It reads `IDEOGRAM_API_KEY`; get a key at\nhttps:\u002F\u002Fdeveloper.ideogram.ai\u002F:\n\n```bash\npython run_inference.py \\\n  --prompt \"a ginger cat wearing a tiny wizard hat reading a spellbook\" \\\n  --output out.png \\\n  --quantization \"nf4\" \\\n  --magic-prompt-key \"$IDEOGRAM_API_KEY\"\n```\n\nYou can also run the expansion through your own LLM provider: one of our magic-prompt\nsystem prompts is **open source**. 
See the\n[Prompting Guide](docs\u002Fprompting.md#magic-prompt) for details.\n\nFor the highest-quality images, set `--height 2048 --width 2048` and\n`--sampler-preset V4_QUALITY_48`.\n\n#### Safety screening with Hive\n\nPrompt and output safety screening is performed via [Hive](https:\u002F\u002Fthehive.ai\u002F).\nSign up and create a Text Moderation key and a Visual Content Moderation key,\nthen export them as `HIVE_TEXT_MODERATION_KEY` and `HIVE_VISUAL_MODERATION_KEY`\n(or pass them via `--hive-text-key` \u002F `--hive-visual-key`).\n\n```bash\npython run_inference.py \\\n  --prompt \"an isometric illustration of a tiny city floating in the clouds\" \\\n  --output out.png \\\n  --quantization \"nf4\" \\\n  --magic-prompt-key \"$IDEOGRAM_API_KEY\" \\\n  --hive-text-key \"$HIVE_TEXT_MODERATION_KEY\" \\\n  --hive-visual-key \"$HIVE_VISUAL_MODERATION_KEY\"\n```\n\nFor sampler presets, parameter reference, and optimization tips, see\n[docs\u002Finference.md](docs\u002Finference.md).\n\n## Model Summary\n\nIdeogram 4 is a **foundation model trained entirely from scratch**, not a\nfine-tune or distillation of any existing checkpoint. It is a flow-matching\ntext-to-image model built on a **fully single-stream** Diffusion Transformer\n(DiT) architecture.\n\n**Architecture:**\n- **Fully single-stream DiT.** Text and image tokens are concatenated into one\n  unified sequence and processed through the same 34-layer transformer, with no\n  separate text or image branches. This enables deep cross-modal interaction at\n  every layer.\n- **Vision-language model as text encoder.** Instead of a text-only encoder\n  like CLIP or T5, Ideogram 4 uses\n  [Qwen3-VL-8B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-8B-Instruct),\n  a full vision-language model that provides far richer understanding of visual\n  concepts. 
Hidden states are extracted from **13 intermediate layers** and\n  concatenated, giving the model multi-scale semantic features ranging from\n  surface-level token information to deep compositional understanding.\n- **Dual-branch classifier-free guidance.** The conditional (positive) and\n  unconditional (negative) branches can be independently refined, enabling\n  separate control over prompt adherence and image quality.\n- **Flexible resolution.** Native support for any resolution from 256 to 2048\n  (multiples of 16), with aspect ratios up to 6:1. A single model handles\n  everything from square thumbnails to ultrawide banners, with the noise\n  schedule auto-adjusting per resolution.\n\n**Key Capabilities:**\n- **Extreme controllability.** Ideogram 4 is trained on structured JSON\n  captions, giving users unprecedented control over composition, style,\n  lighting, color palette, typography, and spatial layout, all from a single\n  prompt.\n- **State-of-the-art text rendering.** Ideogram 4 delivers best-in-class\n  in-image text generation (signage, logos, captions, watermarks, multi-line\n  text) with high fidelity directly from the prompt.\n- **Spatial layout control.** Bounding-box coordinates in the prompt allow\n  explicit placement of subjects, text elements, and background regions.\n- **Color palette conditioning.** Specify hex colors in the prompt to steer the\n  image's dominant color scheme.\n\nFor full architecture details, see\n[docs\u002Fmodel_architecture.md](docs\u002Fmodel_architecture.md). For a walkthrough of\nhow the pipeline components fit together, see\n[docs\u002Fpipeline.md](docs\u002Fpipeline.md).\n\n## Prompting Guide\n\nIdeogram 4 is trained exclusively on **structured JSON captions**. 
While\nplain-text prompts work, you will get the best results by providing a JSON\nobject that follows our caption schema.\n\n\nKey points:\n\n- **Use JSON prompts** for maximum controllability — the model was trained on\n  them and understands the structure natively.\n- **Color palette conditioning** — specify a `colour_palette` array of hex\n  colors in the style description to steer the image's color scheme.\n- **Aspect ratio flexibility** — Ideogram 4 supports a wide range of aspect\n  ratios (any multiple-of-16 resolution from 256 to 2048 on each side). This\n  is a key advantage for practical use: portraits, landscapes, banners,\n  phone wallpapers, social media formats, etc.\n- **Bounding-box layout** — specify `bbox` coordinates in the prompt to\n  explicitly place subjects, text elements, and background regions.\n- **Compositional control** — use `compositional_deconstruction` with bounding\n  boxes and per-element descriptions for precise spatial layout.\n\n\n**Why JSON-only training?** We train exclusively on JSON so that training\nand inference share a single, common prompt format. The training captions themselves are deliberately\n**extremely descriptive**: each JSON exhaustively describes everything in\nthe image to maximize training efficiency. The more\ntext-to-image relationships each caption pins down, the more grounded\nsupervision the model extracts from a single training pair, rather than\nhaving to infer those relationships across many sparsely-captioned samples.\n\n**Why JSON at inference time?** Because the model was trained on captions\nthat name every object explicitly, the most reliable way to get every\nrequested object rendered is to mirror that pattern. 
Plain-text prompts still work, but\nwon't perform as well since the model was only trained on structured JSON captions.\n\n**Don't want to write JSON by hand?** That's what *magic prompt* is for: it uses\nan LLM to expand a plain-text prompt into a full structured caption before\ngeneration, so you get JSON-quality results from a casual prompt. It runs by\ndefault in `run_inference.py` (see the [CLI](#cli) section).\n\nSee [docs\u002Fprompting.md](docs\u002Fprompting.md) for a full guide.\n\n## Documentation\n\n| Document | Description |\n| :------- | :---------- |\n| [docs\u002Fprompting.md](docs\u002Fprompting.md) | How to write JSON prompts, color palette conditioning, aspect ratios |\n| [docs\u002Finference.md](docs\u002Finference.md) | Sampler presets, parameter reference, resolutions, optimization tips |\n| [docs\u002Fmodel_architecture.md](docs\u002Fmodel_architecture.md) | Architecture diagram, DiT spec, component details |\n| [docs\u002Fpipeline.md](docs\u002Fpipeline.md) | Conceptual pipeline walkthrough — how all components fit together |\n| [docs\u002Fdevelopment.md](docs\u002Fdevelopment.md) | Dev setup, pre-commit hooks, contributing |\n| [docs\u002Fsafety.md](docs\u002Fsafety.md) | Pre-training, post-training, and inference-time safety mitigations; how to report violations |\n\n## Citation\n\nIf you find the provided code or models useful for your research, consider citing them as:\n\n\n```bibtex\n@misc{ideogram-4-2026,\n    author={Ideogram AI},\n    title={{Ideogram 4}},\n    year={2026},\n    howpublished={\\url{https:\u002F\u002Fideogram.ai\u002Fblog\u002Fideogram-4.0\u002F}},\n}\n```\n\n## We're Hiring!\n\nWe're looking for **Research Scientists** and **Research Engineers** to\nwork on next-generation generative models and the products built on top of\nthem. 
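Interested candidates, please apply at https:\u002F\u002Fjobs.ashbyhq.com\u002Fideogram\n","Ideogram 4 is a cutting-edge open-source image generation model built for design. It is trained from scratch rather than fine-tuned from any existing model, and offers state-of-the-art foundation-model features, including a structured JSON prompting interface, best-in-class multilingual text rendering, deep language understanding, and explicit layout and color controls, with support for 2k-resolution image output. The project is written in Python and released under the Apache License 2.0. It suits applications that need high-quality image generation, such as advertising design, illustration, and poster production, and is especially well suited to designers and developers seeking innovative visual effects.",2,"2026-06-11 04:10:04","CREATED_QUERY"]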