[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-5427":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":14,"forks30d":14,"starsTrendScore":18,"compositeScore":19,"rankGlobal":9,"rankLanguage":9,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":21,"hasPages":21,"topics":23,"createdAt":9,"pushedAt":9,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":14,"starSnapshotCount":14,"syncStatus":27,"lastSyncTime":28,"discoverSource":29},5427,"x-algorithm","xai-org\u002Fx-algorithm","xai-org","Algorithm powering the For You feed on X",null,"Rust",26139,4500,298,0,10,92,9665,67,120,"Apache License 2.0",false,"main",[],"2026-06-12 04:00:25","# X For You Feed Algorithm\n\nThis repository contains the core recommendation system powering the \"For You\" feed on X. 
It combines in-network content (from accounts you follow) with out-of-network content (discovered through ML-based retrieval) and ranks everything using a Grok-based transformer model.\n\n> **Note:** The transformer implementation is ported from the [Grok-1 open source release](https:\u002F\u002Fgithub.com\u002Fxai-org\u002Fgrok-1) by xAI, adapted for recommendation system use cases.\n\n## Table of Contents\n\n- [Overview](#overview)\n- [System Architecture](#system-architecture)\n- [Components](#components)\n  - [Home Mixer](#home-mixer)\n  - [Thunder](#thunder)\n  - [Phoenix](#phoenix)\n  - [Candidate Pipeline](#candidate-pipeline)\n- [How It Works](#how-it-works)\n  - [Pipeline Stages](#pipeline-stages)\n  - [Scoring and Ranking](#scoring-and-ranking)\n  - [Filtering](#filtering)\n- [Key Design Decisions](#key-design-decisions)\n- [License](#license)\n\n---\n\n## Overview\n\nThe For You feed algorithm retrieves, ranks, and filters posts from two sources:\n\n1. **In-Network (Thunder)**: Posts from accounts you follow\n2. **Out-of-Network (Phoenix Retrieval)**: Posts discovered from a global corpus\n\nBoth sources are combined and ranked together using **Phoenix**, a Grok-based transformer model that predicts engagement probabilities for each post. The final score is a weighted combination of these predicted engagements.\n\nWe have eliminated every single hand-engineered feature and most heuristics from the system. The Grok-based transformer does all the heavy lifting by understanding your engagement history (what you liked, replied to, shared, etc.) 
and using that to determine what content is relevant to you.\n\n---\n\n## System Architecture\n\n```\n┌─────────────────────────────────────────────────────────────────────────────────────────────┐\n│                                    FOR YOU FEED REQUEST                                     │\n└─────────────────────────────────────────────────────────────────────────────────────────────┘\n                                               │\n                                               ▼\n┌─────────────────────────────────────────────────────────────────────────────────────────────┐\n│                                         HOME MIXER                                          │\n│                                    (Orchestration Layer)                                    │\n├─────────────────────────────────────────────────────────────────────────────────────────────┤\n│                                                                                             │\n│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │\n│   │                                   QUERY HYDRATION                                   │   │\n│   │  ┌──────────────────────────┐    ┌──────────────────────────────────────────────┐   │   │\n│   │  │ User Action Sequence     │    │ User Features                                │   │   │\n│   │  │ (engagement history)     │    │ (following list, preferences, etc.)          
│   │   │\n│   │  └──────────────────────────┘    └──────────────────────────────────────────────┘   │   │\n│   └─────────────────────────────────────────────────────────────────────────────────────┘   │\n│                                              │                                              │\n│                                              ▼                                              │\n│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │\n│   │                                  CANDIDATE SOURCES                                  │   │\n│   │         ┌─────────────────────────────┐    ┌────────────────────────────────┐       │   │\n│   │         │        THUNDER              │    │     PHOENIX RETRIEVAL          │       │   │\n│   │         │    (In-Network Posts)       │    │   (Out-of-Network Posts)       │       │   │\n│   │         │                             │    │                                │       │   │\n│   │         │  Posts from accounts        │    │  ML-based similarity search    │       │   │\n│   │         │  you follow                 │    │  across global corpus          │       │   │\n│   │         └─────────────────────────────┘    └────────────────────────────────┘       │   │\n│   └─────────────────────────────────────────────────────────────────────────────────────┘   │\n│                                              │                                              │\n│                                              ▼                                              │\n│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │\n│   │                                      HYDRATION                                      │   │\n│   │  Fetch additional data: core post metadata, author info, media entities, etc.       
│   │\n│   └─────────────────────────────────────────────────────────────────────────────────────┘   │\n│                                              │                                              │\n│                                              ▼                                              │\n│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │\n│   │                                      FILTERING                                      │   │\n│   │  Remove: duplicates, old posts, self-posts, blocked authors, muted keywords, etc.   │   │\n│   └─────────────────────────────────────────────────────────────────────────────────────┘   │\n│                                              │                                              │\n│                                              ▼                                              │\n│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │\n│   │                                       SCORING                                       │   │\n│   │  ┌──────────────────────────┐                                                       │   │\n│   │  │  Phoenix Scorer          │    Grok-based Transformer predicts:                   │   │\n│   │  │  (ML Predictions)        │    P(like), P(reply), P(repost), P(click)...          
│   │\n│   │  └──────────────────────────┘                                                       │   │\n│   │               │                                                                     │   │\n│   │               ▼                                                                     │   │\n│   │  ┌──────────────────────────┐                                                       │   │\n│   │  │  Weighted Scorer         │    Weighted Score = Σ (weight × P(action))            │   │\n│   │  │  (Combine predictions)   │                                                       │   │\n│   │  └──────────────────────────┘                                                       │   │\n│   │               │                                                                     │   │\n│   │               ▼                                                                     │   │\n│   │  ┌──────────────────────────┐                                                       │   │\n│   │  │  Author Diversity        │    Attenuate repeated author scores                   │   │\n│   │  │  Scorer                  │    to ensure feed diversity                           │   │\n│   │  └──────────────────────────┘                                                       │   │\n│   └─────────────────────────────────────────────────────────────────────────────────────┘   │\n│                                              │                                              │\n│                                              ▼                                              │\n│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │\n│   │                                      SELECTION                                      │   │\n│   │                    Sort by final score, select top K candidates                     │   │\n│   └─────────────────────────────────────────────────────────────────────────────────────┘   │\n│                                              │     
                                         │\n│                                              ▼                                              │\n│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │\n│   │                              FILTERING (Post-Selection)                             │   │\n│   │                 Visibility filtering (deleted\u002Fspam\u002Fviolence\u002Fgore etc)               │   │\n│   └─────────────────────────────────────────────────────────────────────────────────────┘   │\n│                                                                                             │\n└─────────────────────────────────────────────────────────────────────────────────────────────┘\n                                               │\n                                               ▼\n┌─────────────────────────────────────────────────────────────────────────────────────────────┐\n│                                     RANKED FEED RESPONSE                                    │\n└─────────────────────────────────────────────────────────────────────────────────────────────┘\n```\n\n---\n\n## Components\n\n### Home Mixer\n\n**Location:** [`home-mixer\u002F`](home-mixer\u002F)\n\nThe orchestration layer that assembles the For You feed. 
It leverages the `CandidatePipeline` framework with the following stages:\n\n| Stage | Description |\n|-------|-------------|\n| Query Hydrators | Fetch user context (engagement history, following list) |\n| Sources | Retrieve candidates from Thunder and Phoenix |\n| Hydrators | Enrich candidates with additional data |\n| Filters | Remove ineligible candidates |\n| Scorers | Predict engagement and compute final scores |\n| Selector | Sort by score and select top K |\n| Post-Selection Filters | Final visibility and dedup checks |\n| Side Effects | Cache request info for future use |\n\nThe server exposes a gRPC endpoint (`ScoredPostsService`) that returns ranked posts for a given user.\n\n---\n\n### Thunder\n\n**Location:** [`thunder\u002F`](thunder\u002F)\n\nAn in-memory post store and realtime ingestion pipeline that tracks recent posts from all users. It:\n\n- Consumes post create\u002Fdelete events from Kafka\n- Maintains per-user stores for original posts, replies\u002Freposts, and video posts\n- Serves \"in-network\" post candidates from accounts the requesting user follows\n- Automatically trims posts older than the retention period\n\nThunder enables sub-millisecond lookups for in-network content without hitting an external database.\n\n---\n\n### Phoenix\n\n**Location:** [`phoenix\u002F`](phoenix\u002F)\n\nThe ML component with two main functions:\n\n#### 1. Retrieval (Two-Tower Model)\nFinds relevant out-of-network posts:\n- **User Tower**: Encodes user features and engagement history into an embedding\n- **Candidate Tower**: Encodes all posts into embeddings\n- **Similarity Search**: Retrieves top-K posts via dot product similarity\n\n#### 2. 
Ranking (Transformer with Candidate Isolation)\nPredicts engagement probabilities for each candidate:\n- Takes user context (engagement history) and candidate posts as input\n- Uses special attention masking so candidates cannot attend to each other\n- Outputs probabilities for each action type (like, reply, repost, click, etc.)\n\nSee [`phoenix\u002FREADME.md`](phoenix\u002FREADME.md) for detailed architecture documentation.\n\n---\n\n### Candidate Pipeline\n\n**Location:** [`candidate-pipeline\u002F`](candidate-pipeline\u002F)\n\nA reusable framework for building recommendation pipelines. Defines traits for:\n\n| Trait | Purpose |\n|-------|---------|\n| `Source` | Fetch candidates from a data source |\n| `Hydrator` | Enrich candidates with additional features |\n| `Filter` | Remove candidates that shouldn't be shown |\n| `Scorer` | Compute scores for ranking |\n| `Selector` | Sort and select top candidates |\n| `SideEffect` | Run async side effects (caching, logging) |\n\nThe framework runs sources and hydrators in parallel where possible, with configurable error handling and logging.\n\n---\n\n## How It Works\n\n### Pipeline Stages\n\n1. **Query Hydration**: Fetch the user's recent engagement history and metadata (e.g., following list)\n\n2. **Candidate Sourcing**: Retrieve candidates from:\n   - **Thunder**: Recent posts from followed accounts (in-network)\n   - **Phoenix Retrieval**: ML-discovered posts from the global corpus (out-of-network)\n\n3. **Candidate Hydration**: Enrich candidates with:\n   - Core post data (text, media, etc.)\n   - Author information (username, verification status)\n   - Video duration (for video posts)\n   - Subscription status\n\n4. **Pre-Scoring Filters**: Remove posts that are:\n   - Duplicates\n   - Too old\n   - From the viewer themselves\n   - From blocked\u002Fmuted accounts\n   - Containing muted keywords\n   - Previously seen or recently served\n   - Ineligible subscription content\n\n5. 
**Scoring**: Apply multiple scorers sequentially:\n   - **Phoenix Scorer**: Get ML predictions from the Phoenix transformer model\n   - **Weighted Scorer**: Combine predictions into a final relevance score\n   - **Author Diversity Scorer**: Attenuate repeated author scores for diversity\n   - **OON Scorer**: Adjust scores for out-of-network content\n\n6. **Selection**: Sort by score and select the top K candidates\n\n7. **Post-Selection Processing**: Final validation of post candidates to be served\n\n---\n\n### Scoring and Ranking\n\nThe Phoenix Grok-based transformer model predicts probabilities for multiple engagement types:\n\n```\nPredictions:\n├── P(favorite)\n├── P(reply)\n├── P(repost)\n├── P(quote)\n├── P(click)\n├── P(profile_click)\n├── P(video_view)\n├── P(photo_expand)\n├── P(share)\n├── P(dwell)\n├── P(follow_author)\n├── P(not_interested)\n├── P(block_author)\n├── P(mute_author)\n└── P(report)\n```\n\nThe **Weighted Scorer** combines these into a final score:\n\n```\nFinal Score = Σ (weight_i × P(action_i))\n```\n\nPositive actions (like, repost, share) have positive weights. 
Negative actions (block, mute, report) have negative weights, pushing down content the user would likely dislike.\n\n---\n\n### Filtering\n\nFilters run at two stages:\n\n**Pre-Scoring Filters:**\n| Filter | Purpose |\n|--------|---------|\n| `DropDuplicatesFilter` | Remove duplicate post IDs |\n| `CoreDataHydrationFilter` | Remove posts that failed to hydrate core metadata |\n| `AgeFilter` | Remove posts older than threshold |\n| `SelfpostFilter` | Remove user's own posts |\n| `RepostDeduplicationFilter` | Dedupe reposts of same content |\n| `IneligibleSubscriptionFilter` | Remove paywalled content user can't access |\n| `PreviouslySeenPostsFilter` | Remove posts user has already seen |\n| `PreviouslyServedPostsFilter` | Remove posts already served in session |\n| `MutedKeywordFilter` | Remove posts with user's muted keywords |\n| `AuthorSocialgraphFilter` | Remove posts from blocked\u002Fmuted authors |\n\n**Post-Selection Filters:**\n| Filter | Purpose |\n|--------|---------|\n| `VFFilter` | Remove posts that are deleted\u002Fspam\u002Fviolence\u002Fgore etc. |\n| `DedupConversationFilter` | Deduplicate multiple branches of the same conversation thread |\n\n---\n\n## Key Design Decisions\n\n### 1. No Hand-Engineered Features\nThe system relies entirely on the Grok-based transformer to learn relevance from user engagement sequences. No manual feature engineering for content relevance. This significantly reduces the complexity in our data pipelines and serving infrastructure.\n\n### 2. Candidate Isolation in Ranking\nDuring transformer inference, candidates cannot attend to each other—only to the user context. This ensures the score for a post doesn't depend on which other posts are in the batch, making scores consistent and cacheable.\n\n### 3. Hash-Based Embeddings\nBoth retrieval and ranking use multiple hash functions for embedding lookup.\n\n### 4. 
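Multi-Action Prediction\nRather than predicting a single \"relevance\" score, the model predicts probabilities for many actions.\n\n### 5. Composable Pipeline Architecture\nThe `candidate-pipeline` crate provides a flexible framework for building recommendation pipelines with:\n- Separation of pipeline execution and monitoring from business logic\n- Parallel execution of independent stages and graceful error handling\n- Easy addition of new sources, hydrations, filters, and scorers\n\n---\n\n## License\n\nThis project is licensed under the Apache License 2.0. See [LICENSE](LICENSE) for details.\n","This project is the core recommendation system that powers the \"For You\" feed on X. It combines content from accounts the user follows (in-network content) with content discovered through machine-learning retrieval (out-of-network content), and ranks all of it with a Grok-based transformer model. Its key technical features are a transformer implementation taken from the Grok-1 open source release and complete reliance on the machine-learning model to predict the user's engagement probabilities, which determine what content is most relevant to the user, in place of traditional hand-engineered features and most heuristics. It suits applications that need efficient, personalized content recommendation, such as social media platforms and news aggregators, and can significantly improve user experience and engagement.",2,"2026-06-11 03:03:15","top_language"]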