[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-77464":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":14,"forks30d":14,"starsTrendScore":18,"compositeScore":19,"rankGlobal":9,"rankLanguage":9,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":21,"hasPages":21,"topics":23,"createdAt":9,"pushedAt":9,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":14,"starSnapshotCount":14,"syncStatus":15,"lastSyncTime":27,"discoverSource":28},77464,"open-collider","CL-ML\u002Fopen-collider","CL-ML","A semantic collision engine for non-trivial LLM idea generation. Operationalizes Koestler's bisociation theory (1964): injects structurally distant knowledge domains into the prompt, forces collisions, surfaces non-trivial ideas. 
Empirical validation in CL-ML\u002Fopen-collider-research.",null,"Python",311,26,1,0,2,4,159,6,60.29,"MIT License",false,"main",[],"2026-06-12 04:01:21","\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbanners\u002Fhero.png\" alt=\"Open Collider, a semantic collision engine for non-trivial idea generation\" width=\"100%\">\n\u003C\u002Fp>\n\n> **Built by [Cédric Lion](https:\u002F\u002Ftwitter.com\u002Fcdriclion) · [@oparine_ai](https:\u002F\u002Ftwitter.com\u002Foparine_ai)**\n> Follow for AI creativity research updates · [Read the launch story](https:\u002F\u002Fcdriclion.substack.com\u002Fp\u002Fwhy-direct-prompting-pushes-llms)\n\n# Open Collider\n\n**Open Collider is a method to escape AI slop.** It's a semantic collision engine for non-trivial idea generation: instead of asking an LLM directly (where outputs converge to the same predictable region), it forces the model to reason through a counter-intuitive principle from a structurally distant domain *before* generating ideas, producing outputs that couldn't exist without the collision.\n\nIt's the first method shipped by [Oparine](https:\u002F\u002Foparine.ai), a research practice on the limits of AI creativity.\n\n---\n\n## Quick start\n\nOpen Collider runs inside [Claude Code](https:\u002F\u002Fclaude.ai\u002Fcode), in two modes. Requires **Python >=3.10**.\n\n### Skill mode (free, no API key)\n\nRequires a Claude Code Max subscription. Claude Code orchestrates everything as subagents.\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FCL-ML\u002Fopen-collider.git\ncd open-collider\npip install -e .\n```\n\n### API mode (fast, parallel, reliable)\n\nRequires an Anthropic API key. 
Python orchestrates LLM calls in parallel.\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FCL-ML\u002Fopen-collider.git\ncd open-collider\npip install -e \".[api]\"\ncp .env.example .env\n# Edit .env with your ANTHROPIC_API_KEY\n```\n\n### Then in Claude Code:\n\nRun these slash commands successively:\n\n```\n\u002Fcollider_setup     # create a project (brief, reference texts, scoring axes)\n\u002Fbrainstorm         # run iterations (domains → ideas → scoring → curation → feedback)\n```\n\nOn first `\u002Fbrainstorm`, you'll be asked to choose API or Skill mode. The choice is saved per project.\n\n|                | **API mode**                    | **Skill mode**                       |\n|----------------|---------------------------------|--------------------------------------|\n| Speed          | ~10 min\u002Fiteration (parallel)    | ~25 min\u002Fiteration (sequential)       |\n| Cost           | ~$2–3\u002Fiteration                 | Free (Max subscription covers it)    |\n| Reliability    | Rock-solid (Python orchestration) | Can be flaky (subagent coordination) |\n| Requirements   | Anthropic API key               | Claude Code Max subscription         |\n\n**What you'll see on a first run.** `\u002Fcollider_setup` produces a project folder with your brief, reference texts, and scoring axes. `\u002Fbrainstorm` then prints the domain bank as it generates, streams idea batches per collision, scores them on your axes, and presents curated ideas inline for love\u002Flike\u002Ftrash. A first iteration ends with a `REPORT.md` you can read or share, and a structured `iter_001\u002F` folder for inspection.\n\n---\n\n## The problem\n\nAsk an LLM for 50 ideas. Then ask again. Measure semantic similarity: 80%+ of outputs cluster in the same region. Different words, same substance. The model converges toward high-probability completions. 
We call this the **default-prompt basin**; researchers have started calling it the [**Artificial Hivemind**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.22954).\n\nAdding more in-domain context to your prompt doesn't fix it. It concentrates the response *deeper* into the same basin.\n\nThe way out is to inject material from somewhere the model wouldn't go on its own.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fdiagrams\u002F09_escape_mediocrity.png\" alt=\"Injecting distant points A and B: the collision zone escapes the high-density basin\" width=\"40%\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\u003Csub>\u003Ci>Visual extracted from the \u003Ca href=\"https:\u002F\u002Fcdriclion.substack.com\u002Fp\u002Fwhy-direct-prompting-pushes-llms\">foundational article\u003C\u002Fa>.\u003C\u002Fi>\u003C\u002Fsub>\u003C\u002Fp>\n\nThe intuition: a constellation of distant attractors stretches the prompt across low-density regions of idea space, where the model wouldn't go on its own. The collision zones, where these attractors' trajectories converge, are where non-trivial ideas can emerge.\n\n→ Full theory and rationale: **[The Open Collider foundational article](https:\u002F\u002Fcdriclion.substack.com\u002Fp\u002Fwhy-direct-prompting-pushes-llms)**.\n\n---\n\n## A concrete example\n\nSame brief, same model, same reference text. 
The only difference: Open Collider injects a **distant-domain collision** into the prompt.\n\n> **Brief:** *Structural redesigns of Spotify's Discover Weekly that break users out of their taste bubble.*\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fdiagrams\u002Fspotify-side-by-side.png\" alt=\"Default LLM vs Open Collider, 3 ideas each on the Spotify Discover Weekly brief\" width=\"100%\">\n\u003C\u002Fp>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Same comparison as a table (accessible \u002F indexable)\u003C\u002Fb>\u003C\u002Fsummary>\n\n|     | **Default LLM (B): direct prompting** | **Open Collider (A): distant-domain collisions** |\n|-----|-----------------------------------------|---------------------------------------------------|\n| 01  | **Decay Discovery.** Recommendations lose algorithmic weight exponentially after first exposure. Every recommended song gets a timestamp; its likelihood of re-recommendation halves each week, pushing the system to dig into uncharted territory because familiar options become algorithmically unavailable. | **Tail Fracture Protocol.** A Prince Rupert's drop resists a hammer blow to the head but shatters completely if the thin tail is touched. A user's taste has the same topology: their core genre commitments are nearly unbreakable from direct frontal approach, but the peripheral, rarely-played edges are catastrophically open. Surface tracks structurally similar to tail events, not the head. *↳ from glass physics \u002F fracture mechanics* |\n| 02  | **Anti-Clustering Engine.** Map the multidimensional space of all user preferences, then deliberately serve content from the antipodal regions. If a user listens to indie folk, the engine pulls from heavy metal, experimental jazz, K-pop. Musical comfort treated as a problem to solve, not a preference to indulge. 
| **Production Chain Triangulation.** Genre tags are listener-facing constructs and useless for cross-genre discovery (defined by the same taste clusters they produce). Triangulate via production chain data: engineer, studio, mastering. A mastering engineer who worked on a record the user loves has worked on records across twenty genres they've never touched. Craft lineage as the bridge, not sonic similarity. *↳ from supply-chain provenance* |\n| 03  | **Skip Inversion Algorithm.** Tracks users skip most frequently get promoted; skip behavior reframed as challenge, not poor quality. Songs with diverse skip patterns across taste profiles receive amplification. The mechanism distinguishes \"bad\" skips (immediate rejection) from \"challenging\" skips (unfamiliarity). | **Substrate Penetration Scheduler.** Koji mold infiltrates rice with enzymatic hyphae for days before any visible transformation. Applied to discovery: instead of recommending unfamiliar tracks, inject micro-doses of structural elements from distant genres (a tuning system, a rhythmic subdivision, a harmonic ratio) embedded inside tracks the user already streams. *↳ from fermentation biology* |\n\nThe B ideas are unobjectionable mechanisms: same neighborhood, tweaks to the recommendation function. The A ideas are pulled from glass physics, supply chains, and fermentation biology. Same brief; nowhere near the dense center.\n\u003C\u002Fdetails>\n\n---\n\n## How a session works\n\nOC runs as **brainstorm sessions** built from multiple **iterations**. Each iteration runs the four steps below and surfaces ~10-20 curated ideas for your ideation problem.\n\n![How a session works · Brief → Domains → Collide → Curate](assets\u002Fdiagrams\u002F10_pipeline.png)\n\nThe four steps:\n\n1. 
**Brief.** Describe your ideation problem, what kinds of ideas you're looking for, what makes a good idea, and give the model **raw reference material** (texts, examples, samples of voice) so it has rich context about the problem. General-purpose, any ideation problem welcome.\n2. **Domains.** An LLM generates structurally distant knowledge domains. Each one carries a counter-intuitive *active principle* (a mechanism) and a bridging question toward your problem.\n3. **Collide.** The mass-generation engine. Each (reference text × distant domain) pair gets its own isolated context window. Goal: produce **a massive volume of candidate ideas**, drawn from many reference materials and many structurally distant domains in parallel. ~20 ideas per collision, ~240 per strategy, three strategies per iteration.\n4. **Curate.** The essential filter. From the mass-generated pool, extract the **gems** that are both **relevant** to your brief AND **non-trivial** (true collisions, verifiable mechanisms, in your voice). Most of the mass is noise; that's expected. The whole point is to surface the few high-signal ideas worth your attention.\n\nThen **feedback**: you apply *love \u002F like \u002F trash* to curated ideas. Loved domains *deepen* into new specialties; loved mechanisms *refresh* into new disciplines; fresh domains keep exploring. Sessions typically exhaust after 3–5 iterations.\n\n---\n\n## Three domain evolution strategies\n\nAfter iteration 1 (which runs Fresh only), subsequent iterations run all three in parallel, weighted by your feedback:\n\n- **Fresh.** Random distant domains, excluding all previously used families. Pure exploration.\n- **Deepen.** New specialties within the families that produced loved ideas. Exploit productive territory.\n- **Refresh.** Extracts causal mechanisms from loved+liked ideas, finds new disciplines with the same structural patterns. 
Transfer what works.\n\n---\n\n## Empirical evidence\n\nA 12-project benchmark tests Open Collider on two questions: *do the outputs really move away from the default-prompt cloud?* (geometric distance), and *are the resulting ideas actually better, or just different (and possibly absurd)?* (blind LLM-judge preference).\n\n### Distance shift (semantic embeddings)\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fresults\u002Fviz_panel12_forest.png\" alt=\"Forest plot, 12-project panel, A vs B \u002F C vs B \u002F D vs B, with sign tests\" width=\"88%\">\n\u003C\u002Fp>\n\nOC's outputs (A) systematically sit further from the default-prompt cloud (B) than two falsifiers: instruction-only \"be original\" (C) and a length-matched deep brief with no cross-domain content (D).\n\n**A vs B passes 12\u002F12 projects (p = 0.0002).** Both falsifiers also produce a measurable shift, but what the falsifiers *don't* match is the **amplitude**: A's effect size is roughly **4–13× larger than C** (\"be original\" instruction) and **3–4× larger than D** (length-matched deep brief). Direct pairwise checks (BGE nn_in_B) confirm A is the strongest mover: A vs C passes 11\u002F12 and A vs D passes 11\u002F12 (both p ≤ .003). The geometric shift is real, embedding-family-independent, and not explained by either \"be-original\" instructions or longer briefs.\n\n### Quality check (blind LLM-judge)\n\nDistance alone is not enough: higher embedding distance could simply mean the ideas are absurd or irrelevant. 
So a second test: **three independent LLM judges** (Claude Opus 4.6 + GPT-4o + Gemini 2.5), **4,320 blind pairwise verdicts** on the top-10 curated ideas per 240-idea batch of each condition, scored on two axes: *which is more original?* and *which is the better idea overall to pursue?*\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fresults\u002Fviz_panel12_judge_heatmap.png\" alt=\"Per-project mean A_share across 3 judges, 6 contrasts × axes, panel of 12\" width=\"88%\">\n\u003C\u002Fp>\n\n**On `originality`**, A consistently wins against every baseline:\n\n| Contrast | A wins | mean A_share | p |\n|---|---|---|---|\n| A vs B (collisions vs baseline) | **10\u002F12** | **62%** | .019 |\n| A vs C (collisions vs \"be original\") | **10\u002F12** | **65%** | .019 |\n| A vs D (collisions vs longer brief) | **10\u002F12** | **63%** | .019 |\n\n**On `best_overall`** (which is the better idea to pursue?), A ties or beats every baseline directionally (A vs B 9\u002F12, mean 57%; A vs C 9\u002F12, mean 59%; A vs D 7\u002F12, mean 53%). 
The signal is weaker than originality, but never reverses: distant-domain collisions **don't sacrifice relevance** for novelty.\n\n→ Full long-form write-up, methodology, and one-click reproduction: **[The Open Collider foundational article](https:\u002F\u002Fcdriclion.substack.com\u002Fp\u002Fwhy-direct-prompting-pushes-llms)**.\n\n---\n\n## Project structure\n\n```\nprojects\u002Fmy_project\u002F\n├── brief_validated.json          # your problem definition\n├── input_bank.yaml               # reference texts index + forbidden topics\n├── project_config.yaml           # axis weights, strategy config, llm_backend\n├── prompts\u002F\n│   ├── idea_generation.md        # customizable generation prompt\n│   └── judge.md                  # scoring prompt (calibration examples)\n├── texts\u002F\n│   └── T01.txt, T02.txt …        # your reference texts\n└── brainstorms\u002F\n    └── brainstorm_001\u002F\n        ├── REPORT.md             # human-readable output (accumulates)\n        └── iter_001\u002F\n            ├── scored_ideas.json\n            ├── curated_ideas.json\n            ├── insights_without_collision.json\n            └── domains\u002F\n```\n\n---\n\n## How it works technically\n\nPython handles prompt building and response parsing. The LLM calls happen either via the Anthropic API (API mode) or via Claude Code subagents (skill mode).\n\n**API mode:**\n```\n\u002Fbrainstorm → Python orchestrator:\n  1. Generate domains       (sequential, Opus)\n  2. Generate ideas         (parallel, Sonnet, 4 concurrent)\n  3. Score ideas            (parallel batches, Sonnet, 3 concurrent)\n  4. Apply threshold + finalize\n  → Claude Code curates inline + displays + collects flags\n```\n\n**Skill mode:**\n```\n\u002Fbrainstorm → Claude Code orchestrates:\n  1. Spawn subagents for domain generation\n  2. Spawn subagents for idea generation (parallel)\n  3. Spawn subagents for scoring (parallel)\n  4. 
Finalize\n  → Curate inline + display + collect flags\n```\n\n---\n\n## The theory\n\nArthur Koestler's **bisociation** (1964): creativity comes from the collision of two incompatible cognitive frames. Not \"think outside the box\", but *structurally engineer the collision*. Compatible frames (marketing + music) produce recombination. Incompatible frames (magnetohydrodynamics + music) produce invention.\n\nOpen Collider is a methodical implementation of this principle, scaled by LLMs.\n\n→ Full conceptual framework (gravity wells in idea space, why distance matters, falsifiable claims): **[The Open Collider foundational article](https:\u002F\u002Fcdriclion.substack.com\u002Fp\u002Fwhy-direct-prompting-pushes-llms)**.\n\n---\n\n## About\n\nOpen Collider is a method developed at **[Oparine](https:\u002F\u002Foparine.ai)**, a research practice exploring the limits of LLM creativity. The engine, the methodology, and the 12-project benchmark are open source.\n\n## About me & Oparine\n\nI'm **Cédric Lion**. Quantitative economist and data scientist trained at PSL (Paris Sciences & Lettres). Former research analyst on Web3 economics at Animoca Brands. Full-time on AI creativity research since 2025.\n\nI'm convinced **human creativity is synthesizable by LLMs**: with the right architecture, language models can surface ideas that were never formulated in their training data. Open Collider is the first concrete bet in that direction. I build the systems, run the experiments, and publish what works.\n\nI founded **Oparine** to pursue this in two directions at once: fundamental research on what mechanically moves models out of their default outputs, and applied engagements with companies that want to build internal creativity tools (R&D ideation, strategic exploration, brand voice, product invention).\n\nIf that's a fit for your team: **hello@oparine.ai**.\n\n## License\n\nMIT. 
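See [LICENSE](LICENSE).\n","Open Collider is a semantic collision engine for non-trivial idea generation. Based on Koestler's bisociation theory (1964), it injects structurally distant knowledge domains into the prompt and forces collisions to spark novel ideas. The project is written in Python and supports two run modes: Skill mode, which is free but slower, and API mode, which is fast but paid. Skill mode requires a Claude Code Max subscription; API mode requires an Anthropic API key, which enables parallel processing. Open Collider is particularly suited to scenarios that call for breaking out of conventional thinking and finding innovative solutions, such as product design, scientific research, or artistic creation.","2026-06-11 03:55:30","CREATED_QUERY"]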