The Invisible Filter

You and your friend see the same tweets, but in completely different orders. This is how clusters create filter bubbles.

What Are Cluster Filters?

Twitter doesn't show everyone the same feed in the same order. Even if you and your friend follow similar accounts and see identical tweets, those tweets will rank completely differently for each of you. This isn't random—it's driven by an invisible mechanism called cluster-based personalization.

The Core Mechanism

Twitter assigns every user and every tweet to invisible communities called "clusters" (there are ~145,000 of them). Think of clusters as interest groups: "AI/Tech enthusiasts," "Cooking fans," "Political junkies," etc. When scoring tweets for your feed, the algorithm multiplies each tweet's base quality score by your cluster interest.

The result: The exact same tweet with the exact same base quality will score completely differently for you vs your friend based on which clusters you belong to and how strongly.

Why This Matters

This mechanism creates different realities on the same platform.

The Shape of the Behavior

Cluster filtering happens through multiplication, not addition. If you're 60% interested in AI/Tech and 15% interested in Politics:

AI tweet (base quality: 0.85):
  Your score: 0.85 × 0.60 = 0.51

Politics tweet (same base quality: 0.85):
  Your score: 0.85 × 0.15 ≈ 0.128

Same quality, 4× score difference just from clusters!

Consequence: Your existing interests get amplified, minority interests get suppressed, and you drift toward increasingly concentrated feeds over time.

New to clusters? See the Cluster Explorer to understand where these communities come from and how they're based on who you follow.


Experience The Filter

See how cluster interests create completely different feeds for different people. Adjust YOUR cluster interests and compare against a friend's profile—the same 15 tweets will rank in completely different orders.

Configure Your Profiles

👤 You

Adjust your cluster interests to see how they affect your feed ranking. Your interests total 100%.

👥 Your Friend

Select a friend's profile to compare against yours. The example friend profile: AI/Tech 15%, Cooking 5%, Politics 80%.

The Technical Details

The Scoring Formula

Each tweet's final score is calculated as:

final_score = base_quality_score × your_cluster_interest

Where:
- base_quality_score = tweet's inherent quality (0.0 to 1.0)
- your_cluster_interest = your interest in the tweet's cluster (0.0 to 1.0)

Example:
Tweet belongs to AI/Tech cluster (cluster_id: 12345)
Base quality: 0.85
Your AI/Tech interest: 0.60
Your friend's AI/Tech interest: 0.15

Your score: 0.85 × 0.60 = 0.51
Friend's score: 0.85 × 0.15 ≈ 0.128

Same tweet, 4× score difference!
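
The formula is small enough to check directly. Here is a minimal Python sketch, assuming scores and interests are stored as fractions between 0.0 and 1.0. The function name `final_score` is made up for illustration and does not come from Twitter's code.

```python
def final_score(base_quality_score: float, cluster_interest: float) -> float:
    """Multiply a tweet's base quality by the viewer's interest in its cluster."""
    for value in (base_quality_score, cluster_interest):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores and interests must be between 0.0 and 1.0")
    return base_quality_score * cluster_interest


# Values from the example above (AI/Tech tweet, base quality 0.85).
your_score = final_score(0.85, 0.60)
friend_score = final_score(0.85, 0.15)

print(f"Your score:     {your_score:.4f}")
print(f"Friend's score: {friend_score:.4f}")
print(f"Ratio:          {your_score / friend_score:.1f}x")
```

Because the score is a plain product, the ratio between two viewers' scores for the same tweet depends only on their cluster interests, not on the tweet's base quality.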

How Clusters Affect the Full Pipeline

Cluster-based personalization doesn't just happen once—it happens at multiple stages, compounding the effect.

Stage 1: Candidate Generation

Before any engagement scoring happens, the algorithm fetches candidates based on YOUR clusters:

Your clusters: 60% AI, 25% Cooking, 15% Politics

Candidate fetching:
- Fetch 800 tweets from AI cluster
- Fetch 800 tweets from Cooking cluster
- Fetch 800 tweets from Politics cluster

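Initial bias: every candidate comes from a cluster you already belong to, so tweets from other communities never enter the pipeline, and the AI candidates carry your 60% weight into the next stage.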
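
The fetch step above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real candidate source: the in-memory index `TWEETS_BY_CLUSTER`, the extra "Sports" cluster, and the function `fetch_candidates` are all hypothetical, and only the 800-per-cluster limit comes from the example.

```python
from typing import Dict, List

# Hypothetical in-memory index: cluster name -> tweet ids, newest first.
TWEETS_BY_CLUSTER: Dict[str, List[str]] = {
    "AI/Tech": ["ai-1", "ai-2", "ai-3"],
    "Cooking": ["cook-1", "cook-2"],
    "Politics": ["pol-1", "pol-2", "pol-3"],
    "Sports": ["sport-1", "sport-2"],  # assumed cluster the user is not in
}


def fetch_candidates(interests: Dict[str, float],
                     per_cluster_limit: int = 800) -> List[str]:
    """Collect up to per_cluster_limit tweets from each cluster the user belongs to."""
    candidates: List[str] = []
    for cluster, interest in interests.items():
        if interest <= 0.0:
            continue  # no interest in the cluster, so nothing is fetched from it
        candidates.extend(TWEETS_BY_CLUSTER.get(cluster, [])[:per_cluster_limit])
    return candidates


you = {"AI/Tech": 0.60, "Cooking": 0.25, "Politics": 0.15}
candidates = fetch_candidates(you)
print(candidates)
```

In this sketch the Sports tweets never become candidates, because the profile has no Sports interest. Later scoring stages cannot rank a tweet that was never fetched.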

Stage 2: Cluster Scoring (What This Simulator Shows)

Each tweet gets multiplied by your cluster interest:

AI tweet (base quality: 0.85):
- Your score: 0.85 × 0.60 = 0.51
- Friend's score: 0.85 × 0.15 ≈ 0.128

Politics tweet (base quality: 0.85):
- Your score: 0.85 × 0.15 ≈ 0.128
- Friend's score: 0.85 × 0.80 = 0.68

Same quality tweet, but your friend's score for the Politics tweet is 5.3× yours!
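
To see how these scores turn into different feed orders, here is a small Python sketch that ranks the same tweets for both profiles. The tweet ids and the Cooking tweet's base quality are made-up sample data, and `rank` is a hypothetical helper. The interest values are the ones used in this walkthrough.

```python
from typing import Dict, List, Tuple

# (tweet id, cluster, base quality): made-up sample tweets.
TWEETS: List[Tuple[str, str, float]] = [
    ("ai-launch", "AI/Tech", 0.85),
    ("election-recap", "Politics", 0.85),
    ("pasta-recipe", "Cooking", 0.90),
]

YOU = {"AI/Tech": 0.60, "Cooking": 0.25, "Politics": 0.15}
FRIEND = {"AI/Tech": 0.15, "Cooking": 0.05, "Politics": 0.80}


def rank(tweets: List[Tuple[str, str, float]],
         interests: Dict[str, float]) -> List[str]:
    """Return tweet ids sorted by base quality x cluster interest, highest first."""
    scored = [(quality * interests.get(cluster, 0.0), tweet_id)
              for tweet_id, cluster, quality in tweets]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tweet_id for _, tweet_id in scored]


your_feed = rank(TWEETS, YOU)
friend_feed = rank(TWEETS, FRIEND)
print("You:   ", your_feed)
print("Friend:", friend_feed)
```

The Cooking tweet has the highest base quality of the three, yet it tops neither feed: the cluster interest dominates the ordering.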

Stage 3: Engagement Scoring

After cluster multiplication, engagement weights are applied. But cluster filtering already determined which tweets you see!

The Compound Effect

Cluster scoring happens at multiple stages, and the multiplicative effect compounds at each one. This is why 60/40 becomes 76/24 in the Journey Simulator.
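
A toy model makes the direction of this compounding concrete. The Python sketch below assumes that each stage multiplies a cluster's current share of the feed by the user's interest in that cluster and then renormalizes. That assumption is for illustration only: it is not the Journey Simulator's model and is not expected to reproduce its exact 76/24 figure. The name `apply_stage` is hypothetical.

```python
from typing import Dict


def apply_stage(shares: Dict[str, float],
                interests: Dict[str, float]) -> Dict[str, float]:
    """Weight each cluster's share by the user's interest, then renormalize to 1."""
    weighted = {cluster: share * interests[cluster]
                for cluster, share in shares.items()}
    total = sum(weighted.values())
    return {cluster: value / total for cluster, value in weighted.items()}


interests = {"AI/Tech": 0.60, "Politics": 0.40}
shares = dict(interests)  # the feed starts out mirroring the interests

history = [shares]
for _ in range(2):  # two multiplicative stages (assumed count)
    shares = apply_stage(shares, interests)
    history.append(shares)

for stage, snapshot in enumerate(history):
    print(stage, {c: round(s, 3) for c, s in snapshot.items()})
```

Each stage pushes the majority cluster's share further above its starting 60%, which is the qualitative behavior described above.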


Code References

Cluster assignment (InterestedIn): InterestedInFromKnownFor.scala:26-30

Multiplicative scoring: ApproximateCosineSimilarity.scala:84-94

Cluster count: ~145,000 clusters total (most users assigned to 10-20)

L2 normalization: SimClustersEmbedding.scala:59-72