The X Algorithm Just Threw Out the Rulebook: A Plain-English Guide to the May 2026 Update
May 15, 2026

Back in January I wrote about how the X algorithm really works. That post is mostly still right about behaviour signals, but a lot of the specific mechanics I described are now gone. Here's what changed in the May 15 update, and why this one is a bigger deal than the last.
TL;DR
A few things to know up front:
- The old weighted-formula approach is mostly retired. X explicitly says they've eliminated every hand-engineered feature and most heuristics. A Grok-based transformer now does the heavy lifting.
- The ranker predicts 15 specific actions you might take on a post (down from the 18 I mentioned in January). The set is documented in full in the open-source repo.
- A new content-understanding service called Grox runs alongside the main pipeline. It classifies posts for spam, safety, virality and topic.
- Phoenix, the ranking model, is now publicly downloadable. It's a transformer ported from Grok-1, with 256-dim embeddings.
- Retrieval uses a two-tower model: one tower encodes you, one tower encodes posts, and similarity decides what reaches the ranker.
- Each post is scored in isolation. The ranker can't see other candidates when scoring yours, which means your score doesn't depend on what else is in the batch.
The shift in one line: X used to be a system of rules with a model attached. It's now a model with a few rules attached.
Where We Left Off
In January, X open-sourced its For You algorithm for the first time. The picture that emerged was a system built on layers of hand-tuned logic. There was a diversity penalty that cut post #2's score to 70%, post #3's to 49%, and so on. There was a 7-day freshness cutoff. There were 18 action types with specific weights. Out-of-network posts took a fixed penalty. The model existed, but it sat inside a thick wrapper of human-engineered rules.
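As a quick worked example of that January-era diversity penalty: each successive post kept 70% of the previous multiplier, so the sequence is geometric. This is a minimal sketch of the arithmetic only; the function name and the 0-based position convention are my own, not taken from the repo.

```python
# Illustrative only: the 0.7 factor is the January figure quoted above,
# not a value from the May release.
DECAY = 0.7

def attenuated_score(base_score: float, position: int) -> float:
    """Score after the old diversity penalty, for the post at 0-based `position`."""
    return base_score * DECAY ** position

# Multipliers for posts #1 to #4 in a row.
multipliers = [round(DECAY ** k, 3) for k in range(4)]
print(multipliers)  # [1.0, 0.7, 0.49, 0.343]
```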
That wrapper is mostly gone now. The May 15 release replaces it with a single phrase in the README: "We have eliminated every single hand-engineered feature and most heuristics from the system".
That's a much bigger deal than any of the seven update points xAI lists below it.
What Actually Changed
1. The Heuristics Got Replaced by a Transformer
This is the headline. The old approach was a stack of rules and a ranker that combined them with learned weights. The new approach is mostly transformer, with a thin pipeline around it.
The transformer is called Phoenix. It's ported from Grok-1, xAI's open-source language model, and adapted for recommendations. In plain terms: instead of looking at your last few interactions and applying a fixed formula, it reads your engagement history as a sequence and learns patterns directly, the same way a language model reads a sequence of tokens.
For creators, this matters because the rules of thumb from January no longer apply cleanly. The 70 percent attenuation on your second post in a row, the specific weights on each action type, the fixed out-of-network penalty: those are now patterns the model learns, not hardcoded thresholds. They can shift between updates. They can be different for different users. You can't game them the way you could game a fixed rule.
The replacement is simpler to describe and harder to manipulate. The model has seen what engagement looks like across billions of examples, and it scores your posts based on that. Your job is to make content the model recognises as something people actually want.
2. The 15 Things the Algorithm Predicts
The ranker predicts probabilities for 15 different actions. Here's the complete list from the repo:
- P(favorite)
- P(reply)
- P(repost)
- P(quote)
- P(click)
- P(profile_click)
- P(video_view)
- P(photo_expand)
- P(share)
- P(dwell)
- P(follow_author)
- P(not_interested)
- P(block_author)
- P(mute_author)
- P(report)
The final score is a weighted sum of these probabilities. Positive actions (favorite, reply, repost, dwell, follow) carry positive weights. Negative actions (not_interested, block, mute, report) carry negative weights and push scores down.
A couple of things worth noting. The list shrank from the 18 I described in January. Some action types were consolidated, others removed. Profile clicks and photo expands are now treated as first-class signals, which they weren't before. Dwell time is still there, which confirms that yes, X is still tracking how long you stop on a post.
The takeaway is the same as before but cleaner: avoid the bottom four (not_interested, block_author, mute_author, report). A single report, block or mute does more damage than several favorites can offset.
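To make the weighted sum concrete, here is a minimal sketch. The weights below are hypothetical numbers I picked for illustration; this post does not give the real values, and only the signs (positive for engagement, negative for rejection signals) follow the description above.

```python
# Hypothetical weights, for illustration only. Only the signs are grounded
# in the description above; the magnitudes are made up.
WEIGHTS = {
    "favorite": 1.0,
    "reply": 2.0,
    "repost": 1.5,
    "dwell": 0.5,
    "follow_author": 3.0,
    "not_interested": -5.0,
    "block_author": -20.0,
    "mute_author": -20.0,
    "report": -40.0,
}

def final_score(probs: dict[str, float]) -> float:
    """Weighted sum of predicted action probabilities; unknown actions count as 0."""
    return sum(WEIGHTS.get(action, 0.0) * p for action, p in probs.items())

# The same post, with and without a small predicted chance of being reported.
liked = {"favorite": 0.30, "reply": 0.05, "dwell": 0.40}
liked_but_reported = {**liked, "report": 0.02}

print(final_score(liked))
print(final_score(liked_but_reported))
```

With these made-up weights, a 2% report probability is enough to flip a positive score negative, which is the shape of the trade-off described above.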
3. Candidate Isolation: Your Post Stands Alone
Here's a subtle but important design choice. When the transformer scores a post, it can only attend to the viewer's engagement history, not to the other posts in the batch. The attention mask is configured to enforce this.
Why this matters: your post's score doesn't depend on what else happens to be in the feed at that moment. The same post, scored for the same user, gets the same number whether it's competing against five great posts or fifty mediocre ones. Scores are consistent and cacheable.
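One way to picture the mask is the toy sketch below. It assumes a layout of history tokens followed by candidate tokens, and it uses single-head attention with identity projections. Both are my simplifications, not Phoenix's actual architecture. The point is only that a candidate's output cannot change when the rest of the batch changes.

```python
import math

def isolation_mask(n_history: int, n_candidates: int) -> list[list[bool]]:
    """mask[i][j] is True when position i may attend to position j.

    Assumed layout: history tokens first, then candidate tokens.
    """
    n = n_history + n_candidates
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j < n_history:
                mask[i][j] = True   # every position may read the history
            elif i == j:
                mask[i][j] = True   # a candidate may also read itself, nothing else
    return mask

def attend(tokens: list[list[float]], mask: list[list[bool]]) -> list[list[float]]:
    """Toy single-head attention with identity Q/K/V projections."""
    dim = len(tokens[0])
    out = []
    for i, q in enumerate(tokens):
        allowed = [j for j in range(len(tokens)) if mask[i][j]]
        logits = [sum(a * b for a, b in zip(q, tokens[j])) / math.sqrt(dim) for j in allowed]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(l - peak) for l in logits]
        total = sum(weights)
        out.append([
            sum(w / total * tokens[j][d] for w, j in zip(weights, allowed))
            for d in range(dim)
        ])
    return out

history = [[1.0, 0.0], [0.5, 0.5]]
my_post = [0.2, 0.9]

# Same post, scored alongside one other candidate and then alongside three.
small_batch = history + [my_post, [0.9, 0.1]]
large_batch = history + [my_post, [0.9, 0.1], [-1.0, 2.0], [3.0, 3.0]]

out_small = attend(small_batch, isolation_mask(2, 2))[2]
out_large = attend(large_batch, isolation_mask(2, 4))[2]
print(out_small == out_large)  # True
```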
For creators, this rules out a category of theories about timing and "feed slots". Your post isn't fighting other specific posts for a position. It's getting an absolute score based on the viewer and their engagement history. The selector then sorts everything by score and picks the top K.
You're not competing post-against-post. You're competing against your own viewers' interest level.
4. Two Towers for Discovery
Out-of-network content (posts from people you don't follow) gets retrieved through a two-tower model. One tower turns you into a vector. The other tower turns every post into a vector. The system finds the closest matches by dot-product similarity.
This is how the platform decides which strangers' posts even make it into your candidate pool in the first place. It's not magic. It's geometry. Your vector lives somewhere in a high-dimensional space, and the posts that show up are the ones whose vectors land nearby.
What this means practically: if your posting style produces a consistent vector signature, the model can route you to the right audience. If your topics are scattered, your vector is noisy, and you reach fewer of the right people. Consistency isn't a brand strategy here. It's a retrieval-quality strategy.
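The geometry can be sketched in a few lines. The 3-dimensional vectors and post names below are invented for illustration, and the exhaustive sort stands in for whatever nearest-neighbour search runs at scale, which this post does not describe.

```python
def dot(u: list[float], v: list[float]) -> float:
    """Dot-product similarity between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def retrieve(user_vec: list[float], post_vecs: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k posts whose vectors score highest against user_vec."""
    ranked = sorted(post_vecs, key=lambda pid: dot(user_vec, post_vecs[pid]), reverse=True)
    return ranked[:k]

# Toy embeddings; the axes loosely mean "software", "crypto", "cooking".
user = [0.9, 0.1, 0.0]
posts = {
    "rust_tips": [0.8, 0.0, 0.1],
    "token_launch": [0.1, 0.9, 0.0],
    "mixed_bag": [0.3, 0.3, 0.3],
    "sourdough": [0.0, 0.0, 1.0],
}

print(retrieve(user, posts, k=2))  # ['rust_tips', 'mixed_bag']
```

Note how the scattered "mixed_bag" vector scores well below the focused "rust_tips" one for this user, even though both touch software.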
5. Grox: A Separate Brain for Content Understanding
Grox is the other big addition. It's a service that sits alongside the main feed pipeline and produces content-level signals: spam classification, safety screening, post categorisation, banger detection (likely-to-go-viral content), and reply ranking.
A few things to be clear about. Grox is not a pre-filter that gates posts before they reach the ranker. It runs as its own task-execution engine, with its own embedders and classifiers, and its outputs feed into the main pipeline as hydrated features. The ranker sees a richer post when it scores: not just text and engagement counts, but also "this post looks like spam" or "this post is about software engineering" or "this post has banger potential".
For creators, Grox is the system that's reading your content. If it tags your post as spam, that label travels with the post into the ranker and tanks your score. If it categorises your post under a topic, that helps you reach people who care about that topic.
The classifiers cover:
- Spam detection. Repetitive content, mass-tagged posts, suspicious patterns.
- Safety screening. Two stages of policy checks (initial screen plus a more detailed pass).
- PTOS policy enforcement. Posts that violate platform terms get flagged.
- Banger detection. An early-stage classifier that predicts viral potential before broad distribution.
- Reply ranking. Orders replies under a post by predicted quality.
- Post categorisation. Assigns topics so posts can be matched to interested users.
This is a meaningful change. Until now, X mostly judged your post by what people did with it. Now there's a parallel system judging your post by what it is.
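A rough sketch of the "hydrated features, not a pre-filter" idea described above: content signals get attached to the candidate, and the candidate still reaches the ranker. Every name here (Candidate, hydrate, spam_score, banger_score) is hypothetical and chosen for illustration; none is taken from the repo.

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    post_id: str
    text: str
    features: dict[str, float] = field(default_factory=dict)

def hydrate(candidate: Candidate, content_signals: dict[str, dict[str, float]]) -> Candidate:
    """Attach content-level signals to a candidate; nothing is filtered out here."""
    signals = content_signals.get(candidate.post_id, {})
    return Candidate(candidate.post_id, candidate.text, {**candidate.features, **signals})

# Hypothetical classifier outputs keyed by post id.
signals = {"p1": {"spam_score": 0.92, "banger_score": 0.03}}

c = hydrate(Candidate("p1", "BUY NOW!!! link link link", {"favorite_count": 4.0}), signals)
print(c.features)
```

The spammy post is not dropped at this stage. It simply arrives at the ranker carrying a label that the ranker can weigh.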
6. The Phoenix Model Is Downloadable
A pre-trained mini Phoenix is now published as a 3 GB archive over Git LFS. The mini version has 256-dim embeddings, 4 attention heads, and 2 transformer layers. It's small enough to run on a workstation.
This is the first time the actual weights have been public. The January release was the code; this release is the trained model. You can clone the repo, download the artifacts, and run phoenix/run_pipeline.py end-to-end, with retrieval and ranking composed the same way they are in production.
For most creators this doesn't change daily behaviour. For researchers, journalists, and curious developers, it changes everything about what's verifiable. You can now test specific posts against the actual ranking model instead of guessing.
7. Hash-Based Embeddings, Richer Context, Ads in the Feed
A few smaller but worth-mentioning shifts:
- Hash-based embeddings. Both retrieval and ranking use multiple hash functions for embedding lookup. This is mainly a memory and speed optimisation, but it may also help keep embeddings stable across model updates, since the lookup is computed from the ID rather than from a rebuilt vocabulary.
- Richer query hydration. The system now pulls in your followed topics, joined starter packs, mutual follow graph, impression bloom filters, IP, demographics, and served history. The personal context window got significantly wider.
- Richer candidate hydration. Posts get enriched with engagement counts, brand safety signals, language code, media presence, quote post expansion, mutual follow Jaccard scores, subscription status, and video duration before they hit the ranker.
- Ads blending. A dedicated home-mixer/ads/ module handles ad placement, with brand-safety tracking that avoids placing ads next to sensitive content.
None of these individually is a story. Together they paint the picture: more context in, more signals out, less hardcoded logic in between.
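For the hash-based embeddings item in the list above, the general technique looks roughly like the sketch below. The table size, the use of SHA-256, and summing the selected rows are all my assumptions for illustration. The post only says that multiple hash functions are used for lookup, not how the results are combined.

```python
import hashlib

TABLE_SIZE = 8   # rows in the shared table (tiny, for illustration)
DIM = 4          # embedding width
NUM_HASHES = 2   # number of independent hash functions

# Deterministic stand-in for a learned embedding table.
TABLE = [[((r * 31 + d * 17) % 11) / 10.0 for d in range(DIM)] for r in range(TABLE_SIZE)]

def bucket(item_id: str, seed: int) -> int:
    """Map an id to a table row using a seeded hash."""
    digest = hashlib.sha256(f"{seed}:{item_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % TABLE_SIZE

def embed(item_id: str) -> list[float]:
    """Combine (here: sum) the rows selected by each hash function."""
    rows = [TABLE[bucket(item_id, seed)] for seed in range(NUM_HASHES)]
    return [sum(row[d] for row in rows) for d in range(DIM)]

print(embed("post:12345"))
```

The memory win is that the table has a fixed number of rows no matter how many ids exist. Using more than one hash makes it much less likely that two different ids end up with an identical embedding.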
What This Means If You're Trying to Grow
The January playbook needs a refresh. Here's what I'd update.
The specific numbers from January are stale. The 70 percent attenuation, the 7-day cutoff, the 18 actions: treat them as the kinds of things the model has learned, not as exact values you can plan around. The model can shift these between releases.
Consistency is now a retrieval issue, not just a branding one. Your vector in the two-tower model is what decides which strangers see you. If you post about software one day and crypto the next, your vector is muddier and your discovery suffers.
Grox is reading your content. Write so the classifiers can tell what you're doing. Posts that look spammy to a model (link-heavy, repetitive structure, low-effort threads) get flagged before any human sees them.
You're not competing against other posts. Because of candidate isolation, your score is absolute. So obsessing over what other accounts in your niche are posting doesn't help your individual post. Making your post better for your viewer is what helps.
Reports, blocks and mutes are still the killers. This hasn't changed. Three of the four negative-weight prediction heads can sink a post on their own.
The whole system can be tested. With the mini Phoenix public, the era of guessing at the algorithm is ending. Expect tools and analyses built on top of the real model in the next few months.
The Bigger Picture
The trajectory matters more than any single update. X used to be a behaviour-only system. Then they added a model and wrapped it in rules. Now they've stripped the rules and let the model do most of the work, with a content-understanding service feeding it richer features.
This is the same direction every major recommendation system has gone. The interesting difference here is the open-source commitment. The repo gets refreshed on a cadence, the weights are now published, and the production pipeline is in the box. You can verify the claims.
The next update is probably weeks away. I'll be back when it drops.
Bottom Line
Three things to take away:
- The old rule-based mental model is mostly obsolete. A Grok-based transformer does the work now, with a content-understanding service alongside it.
- Your post is scored on its own merits relative to a viewer, not against other posts in the feed.
- The model is public, the pipeline is public, and verification is no longer guesswork.
If you haven't read the original piece on how the X algorithm works, the behaviour-signal fundamentals there are still useful. Just take the specific numbers with a grain of salt now.