The Full Pipeline Explorer
Follow a tweet's complete journey from posting to your timeline through all 5 algorithmic stages. See exactly how scoring, filters, and penalties determine what you see.
What Is The Recommendation Pipeline?
Every day, Twitter processes approximately 1 billion tweets through a 5-stage pipeline. By the time a tweet reaches your feed, it has passed through candidate generation, feature hydration, machine learning scoring, multiple filters and penalties, and final mixing. Of the ~1,400 candidates fetched for your timeline, only about 4% survive to appear in your feed.
The Five Stages
Stage 1: Candidate Generation (~1B → ~1,400)
Fetch potential tweets from various sources
Stage 2: Feature Hydration (~1,400 tweets)
Attach ~6,000 features to each tweet
Stage 3: Heavy Ranker ML Scoring (~1,400 tweets)
Predict 15 engagement types, calculate weighted scores
Stage 4: Filters & Penalties (~1,400 → ~100-200)
Apply multipliers, diversity penalties, safety filters
Stage 5: Mixing & Serving (~100-200 → 50-100)
Insert ads, modules, deliver final timeline
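The stage counts above can be turned into survival rates with a few lines of arithmetic. The following is a rough sketch, not part of the pipeline itself: the counts are copied from the stage list, ranges are kept as (low, high) pairs, and the names `STAGES` and `survival_range` are made up for illustration.

```python
# Approximate per-request counts copied from the stage list above.
# Each entry is (stage name, (low, high)) for tweets leaving that stage.
STAGES = [
    ("Candidate Generation", (1_400, 1_400)),
    ("Feature Hydration", (1_400, 1_400)),
    ("Heavy Ranker", (1_400, 1_400)),
    ("Filters & Penalties", (100, 200)),
    ("Mixing & Serving", (50, 100)),
]


def survival_range(stage_index, candidates=1_400):
    """Return the (low, high) fraction of candidates left after a stage."""
    low, high = STAGES[stage_index][1]
    return low / candidates, high / candidates


if __name__ == "__main__":
    for i, (name, (low, high)) in enumerate(STAGES):
        lo_frac, hi_frac = survival_range(i)
        print(f"{name:<22} {low:>5}-{high:<5} {lo_frac:6.1%} to {hi_frac:6.1%}")
```

Measured against the ~1,400 candidates, the final stage keeps somewhere between a few percent and the high single digits, which is the "extreme filtering" the rest of this page refers to.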
Why This Matters
- Extreme filtering: About 96% of candidate tweets never reach your feed
- Multi-stage compounding: Advantages and penalties multiply across stages
- Invisible decisions: Most filtering happens before you ever see rankings
- Same tweet, different treatment: Identical content gets different scores for different users
Follow a Tweet Through The Pipeline
Configure the Tweet
Choose a tweet scenario to follow through the pipeline. Each scenario has realistic engagement probabilities and characteristics.
🔥 Viral Educational Thread
In-Network: High-quality thread from someone you follow in your main interest cluster
🌐 Out-of-Network Quality
Out-of-Network: Great content from someone you don't follow, different cluster
⚡ Controversial Take
In-Network: Hot take that drives replies, from followed author
📝 3rd Tweet from Same Author
In-Network: Good content but author already has 2 tweets in your feed
The Technical Details
Stage 1: Candidate Generation
Fetch ~1,400 candidate tweets from various sources based on your profile:
- In-Network: ~50% from people you follow (via Earlybird search)
- SimClusters: ~20% from your interest clusters
- Real Graph: ~15% from social connections
- UTEG: ~15% from engagement graph
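To make the source mix concrete, here is a small sketch that splits a candidate budget according to the approximate shares listed above. The shares come from the list; the function name `allocate_candidates`, the dictionary keys, and the rounding rule (any rounding remainder goes to the largest source) are assumptions for illustration, not how the real candidate pipeline budgets its sources.

```python
# Approximate share of candidates per source, as listed above.
SOURCE_SHARES = {
    "in_network": 0.50,   # people you follow (Earlybird search)
    "simclusters": 0.20,  # interest clusters
    "real_graph": 0.15,   # social connections
    "uteg": 0.15,         # engagement graph
}


def allocate_candidates(total=1_400, shares=SOURCE_SHARES):
    """Split `total` candidates across sources in proportion to `shares`.

    Assumption: counts are rounded per source, and any rounding remainder
    is assigned to the largest source so the counts always sum to `total`.
    """
    counts = {name: round(total * share) for name, share in shares.items()}
    remainder = total - sum(counts.values())
    largest = max(shares, key=shares.get)
    counts[largest] += remainder
    return counts


if __name__ == "__main__":
    for source, count in allocate_candidates().items():
        print(f"{source:<12} {count}")
```

With the default budget of 1,400, this gives 700 in-network candidates, 280 from SimClusters, and 210 each from Real Graph and UTEG.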
Stage 2: Feature Hydration
Attach ~6,000 features to each tweet:
- Author features (follower count, verified status, reputation score)
- Tweet features (media type, length, recency, topic)
- Engagement features (predicted probabilities for 15 engagement types)
- User-tweet features (cluster similarity, real graph connection)
Stage 3: Heavy Ranker (ML Scoring)
The MaskNet model predicts 15 engagement probabilities and combines them into a weighted score:
score = Σ (probability_i × weight_i)
Top weights:
- Reply with Author Engagement: 75.0
- Reply: 13.5
- Good Profile Click: 12.0
- Retweet: 1.0
- Favorite: 0.5
Negative weights:
- Negative Feedback: -74.0
- Report: -369.0
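The weighted sum above can be sketched in a few lines. The weights below are the seven listed in this section (the remaining engagement types are not listed here and are omitted); the dictionary keys, the function name `weighted_score`, and the example probabilities are hypothetical and chosen only to show the arithmetic.

```python
# Weights listed in this section; the other engagement types are omitted.
WEIGHTS = {
    "reply_engaged_by_author": 75.0,
    "reply": 13.5,
    "good_profile_click": 12.0,
    "retweet": 1.0,
    "favorite": 0.5,
    "negative_feedback": -74.0,
    "report": -369.0,
}


def weighted_score(probabilities, weights=WEIGHTS):
    """Compute sum(probability_i * weight_i).

    Engagement types missing from `probabilities` are treated as 0.0.
    """
    return sum(w * probabilities.get(name, 0.0) for name, w in weights.items())


# Hypothetical predictions for one tweet (not real model output).
example = {
    "favorite": 0.10,
    "retweet": 0.02,
    "reply": 0.01,
    "reply_engaged_by_author": 0.005,
    "good_profile_click": 0.01,
    "negative_feedback": 0.002,
    "report": 0.0001,
}

if __name__ == "__main__":
    print(f"score = {weighted_score(example):.4f}")
```

Note how the weights dominate the result: in this example a 0.5% chance of a reply that the author engages with contributes 0.375, more than seven times the 0.05 contributed by a 10% chance of a favorite.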
Stage 4: Filters & Penalties
Multiple filters reshape the ranking:
- Out-of-Network Penalty: 0.75x multiplier (25% penalty)
- Author Diversity: Exponential decay for multiple tweets from same author
- Cluster Scoring: Multiply by your cluster interest (this creates filter bubbles!)
- Feedback Fatigue: 80% penalty after "not interested" click
- Previously Seen: Remove tweets you've already seen
- Safety Filters: Remove NSFW, blocked users, muted keywords
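The way these adjustments compound can be illustrated with a simplified sketch. The 0.75x out-of-network multiplier and the 80% feedback-fatigue penalty come from the list above. Everything else is an assumption made for illustration: the `Candidate` fields, the order in which steps are applied, and in particular the author-diversity parameters (`AUTHOR_DECAY = 0.5`, `AUTHOR_FLOOR = 0.25`), since this page only says the decay is exponential. Scores are assumed to be positive.

```python
from dataclasses import dataclass

OUT_OF_NETWORK_MULTIPLIER = 0.75    # from the list above (25% penalty)
FEEDBACK_FATIGUE_MULTIPLIER = 0.20  # from the list above (80% penalty)
AUTHOR_DECAY = 0.5                  # ASSUMED decay factor, for illustration
AUTHOR_FLOOR = 0.25                 # ASSUMED lower bound, for illustration


@dataclass
class Candidate:
    tweet_id: str
    author: str
    score: float                    # weighted score from Stage 3
    in_network: bool = True
    cluster_interest: float = 1.0   # your interest in the tweet's cluster
    feedback_fatigue: bool = False  # you clicked "not interested" before
    seen: bool = False
    blocked_or_muted: bool = False


def author_diversity_multiplier(position):
    """Exponential decay: position 0 is the author's best-scoring tweet."""
    return (1 - AUTHOR_FLOOR) * AUTHOR_DECAY ** position + AUTHOR_FLOOR


def apply_filters_and_penalties(candidates):
    """Return (adjusted_score, tweet_id) pairs, best first."""
    # Hard filters: drop tweets outright.
    kept = [c for c in candidates if not (c.seen or c.blocked_or_muted)]

    # Per-tweet multipliers.
    rescored = []
    for c in kept:
        score = c.score * c.cluster_interest
        if not c.in_network:
            score *= OUT_OF_NETWORK_MULTIPLIER
        if c.feedback_fatigue:
            score *= FEEDBACK_FATIGUE_MULTIPLIER
        rescored.append((score, c))

    # Author diversity: walk from best to worst, decaying repeat authors.
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    tweets_per_author = {}
    final = []
    for score, c in rescored:
        position = tweets_per_author.get(c.author, 0)
        tweets_per_author[c.author] = position + 1
        final.append((score * author_diversity_multiplier(position), c.tweet_id))

    final.sort(reverse=True)
    return final


sample = [
    Candidate("a1", "alice", 1.0),
    Candidate("a2", "alice", 0.9),
    Candidate("a3", "alice", 0.8),
    Candidate("b1", "bob", 1.0, in_network=False),
    Candidate("c1", "carol", 2.0, seen=True),
]

if __name__ == "__main__":
    for score, tweet_id in apply_filters_and_penalties(sample):
        print(f"{tweet_id}: {score:.4f}")
```

Under these assumed parameters, the out-of-network tweet from bob ends up ranked above alice's second and third tweets even though all four started with similar scores, which mirrors the "3rd Tweet from Same Author" scenario above.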
Stage 5: Mixing & Serving
Insert ads, promoted tweets, and follow recommendations, then serve the final timeline.
Code References
Candidate generation: ForYouScoredTweetsCandidatePipelineConfig.scala
Heavy Ranker weights: HomeGlobalParams.scala:786-1028
Scoring computation: NaviModelScorer.scala:139-178
Out-of-network penalty: RescoringFactorProvider.scala:45-57
Author diversity: AuthorBasedListwiseRescoringProvider.scala:54
Cluster scoring: ApproximateCosineSimilarity.scala:84-94