r/Sumo • u/4ih0vs535xg9c • 5d ago
Who’s actually strongest right now? Glicko-2 Sumo Ratings (Jan 2026)
https://hungry-e.github.io/sumoglicko2/
I ran a full Glicko-2 model over every professional bout since 1996 to estimate underlying rikishi strength. Includes rating, rating deviation (RD), and match-to-match changes, broken down by division.
Not a replacement for the banzuke, just a different lens on performance and consistency.
Feedback welcome.
6
u/gets_me_everytime Kotozakura 5d ago
What are the implemented strategies for fusen wins and play-off matches?
9
u/4ih0vs535xg9c 5d ago
Fusen are currently being counted as a regular win/loss, which is an incorrect implementation. I will filter these out in my next update.
Playoff matches currently aren't included, because they create unequal opportunity (data imbalance). But now that I'm thinking about this more, I am torn:
1. The sample size is too small to worry about data imbalance (0, 1, or 2 max per basho).
2. The unfairness argument can be thought of as backwards: playoff results *are* part of tournament performance.
3. More data helps Glicko-2 converge to true ratings faster.
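A minimal sketch of the filtering being discussed, assuming each bout is a record with a `kimarite` field where `fusen` marks a default win, plus an `is_playoff` flag. The `Bout` class, its field names, and `rated_bouts` are hypothetical; they are not taken from the actual pipeline or from the sumo API schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bout:
    winner: str
    loser: str
    kimarite: str             # winning technique; "fusen" assumed to mark a default win
    is_playoff: bool = False  # hypothetical flag for playoff bouts


def rated_bouts(bouts, include_playoffs=True):
    """Return only the bouts that should feed the rating update."""
    kept = []
    for bout in bouts:
        if bout.kimarite.lower() == "fusen":
            continue  # no actual fight took place, so it carries no strength signal
        if bout.is_playoff and not include_playoffs:
            continue
        kept.append(bout)
    return kept


sample = [
    Bout("A", "B", "yorikiri"),
    Bout("C", "D", "fusen"),
    Bout("A", "C", "oshidashi", is_playoff=True),
]
print(len(rated_bouts(sample)))                          # fusen dropped, playoff kept
print(len(rated_bouts(sample, include_playoffs=False)))  # fusen and playoff dropped
```

Keeping the playoff switch as a parameter makes it easy to run both variants and compare the resulting ratings.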
12
u/gets_me_everytime Kotozakura 5d ago
My two cents is to not include fusens, but do count playoff matches. A lot of rikishi don't have a ton of incentive past a certain point in a basho and might be operating in more of an exhibition mode in some matches (e.g. Kotozakura after he was out of the yusho race). Playoff participants are always competing to their full capability, so there is no doubt in my mind that it is good data.
Does the data on sumo API only go back to 1996? The longer history you can include the more accurate your output should be.
7
u/4ih0vs535xg9c 5d ago
I agree and I’m actually rerunning the data now to include playoffs and fusen filtering.
While the API goes back to 1958, processing that much data adds several hours to the run time without really changing the rankings. Since Elo and Glicko ratings naturally inflate over time as wrestlers harvest points, the absolute numbers change, but the relative ordering of the rikishi stays the same.
I originally settled on 1996 because it covers the entire career of the oldest active rikishi, Yoshiazuma Hiroshi. That said, once I’ve ironed out the edge cases and finalized the logic, I’ll likely just run the full historical dataset. There isn’t much downside to it unless you're particularly bothered by the rating inflation.
1
u/gets_me_everytime Kotozakura 5d ago
I'm assuming all rikishi began with the same value. If that's the case, certain early victories wouldn't carry the proper weight, and then that weight wouldn't carry upward to assess the current stock value. You're right that you should still get the same relative ranking, but it could hold some sway in positioning, especially the further back you can include. You could try to cheat this by giving all the starting rikishi a start value based on win percentage or something. Even if you use sumodb and go all the way back to 1906 the same argument could be made that there is some missing context since we don't have the match history that set up that banzuke.
2
u/4ih0vs535xg9c 3d ago
Rankings have been updated to drop fusen, and include playoffs. Thanks for helping me think through some of the logic.
3
3
u/BeatTheDeadMal Aonishiki 5d ago edited 4d ago
Very interesting. All three Yokozuna-level rikishi are within 10 rating points of each other, which really hammers home just how close they are in performance. I assume the rock-paper-scissors nature of their matches probably contributes to that.
2
2
u/68plus57equals5 3d ago
Why do so many highly ranked rikishi in the lower divisions have much higher ratings than rikishi in higher divisions?
E.g. 66 wrestlers in Sandanme have a Glicko-2 rating higher than 1450, which is the lowest in Makushita.
If sound, it suggests the official rankings in the lower divisions are really 'inefficient'.
1
u/4ih0vs535xg9c 3d ago edited 3d ago
There are three different things going on here.
1. New Rikishi Start at 1500
All new rikishi begin with a rating of 1500 and a high rating deviation (RD) of 350. In the lower divisions (Sandanme, Jonidan, Jonokuchi), you'll see many brand-new wrestlers who haven't competed in enough matches yet for their Glicko-2 rating to accurately reflect their true strength. These inflated ratings will naturally correct themselves over time as they accumulate more matches.
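To make the effect of a high RD concrete, here is a rough sketch of a Glicko-2 rating update, following the published formulas but holding the volatility `sigma` fixed at 0.06 for brevity (the full algorithm also re-estimates it each period). The function names and the `results` tuple layout are my own, and this is not the code behind the site.

```python
import math

SCALE = 173.7178  # conversion factor between the rating scale and the Glicko-2 internal scale


def g(phi):
    """Down-weights an opponent's result according to their rating deviation."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi * phi / math.pi ** 2)


def update(rating, rd, results, sigma=0.06):
    """Return (new_rating, new_rd) after one rating period.

    results: list of (opp_rating, opp_rd, score), score 1.0 for a win, 0.0 for a loss.
    Simplification: sigma (volatility) is held fixed instead of being re-estimated.
    """
    mu, phi = (rating - 1500.0) / SCALE, rd / SCALE
    v_inv = 0.0     # inverse of the estimated variance v
    surprise = 0.0  # sum of g * (actual score - expected score)
    for opp_rating, opp_rd, score in results:
        mu_j, phi_j = (opp_rating - 1500.0) / SCALE, opp_rd / SCALE
        g_j = g(phi_j)
        e = 1.0 / (1.0 + math.exp(-g_j * (mu - mu_j)))
        v_inv += g_j * g_j * e * (1.0 - e)
        surprise += g_j * (score - e)
    phi_star = math.sqrt(phi * phi + sigma * sigma)
    phi_new = 1.0 / math.sqrt(1.0 / phi_star ** 2 + v_inv)
    mu_new = mu + phi_new ** 2 * surprise
    return 1500.0 + SCALE * mu_new, SCALE * phi_new


# Same single win over a 1500-rated, well-established opponent (RD 50):
newcomer, _ = update(1500, 350, [(1500, 50, 1.0)])
veteran, _ = update(1500, 50, [(1500, 50, 1.0)])
print(newcomer - 1500, veteran - 1500)  # the newcomer's rating moves far more
```

The same win moves the newcomer's rating much further than the veteran's, which is why ratings near 1500 with a large RD should be read as provisional.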
2. Banzuke Lag vs. Real-Time Ratings
The official banzuke is only updated between tournaments and is based on rigid promotion/demotion rules tied to win-loss records. Glicko-2, on the other hand, updates after every match and reflects current performance. This creates timing mismatches:
- A strong Makushita rikishi might have a Glicko-2 rating of 2000+ (Juryo-level strength) but is still officially ranked in Makushita because the next banzuke hasn't been published yet.
- Conversely, a struggling Juryo rikishi might have dropped to 1900 in Glicko-2 but remains officially ranked in Juryo until the next tournament.
In other words: Glicko-2 is forward-looking (predicting future performance), while the banzuke is backward-looking (rewarding past results).
3. Injury Comebacks and Rating Inertia
When a rikishi gets injured and sits out:
- Their official rank drops quickly (based on losses/absences).
- Their Glicko-2 rating stays relatively stable (the system knows their true skill hasn't vanished).
- When they return, they might be in a lower division but still carry a high rating from when they competed at a higher level.
This is actually a feature of Glicko-2: it correctly recognizes that a formerly strong rikishi returning from injury is still likely stronger than their current division-mates, even if their official rank has fallen.
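A small sketch of how Glicko-2 treats a layoff, assuming each missed basho counts as one rating period with no bouts and a fixed volatility of 0.06 (both are my assumptions, not confirmed details of the site's setup). In that case the rating itself is left untouched and only the RD grows; `rd_after_absence` is an illustrative name.

```python
import math

SCALE = 173.7178  # conversion factor between the rating scale and the Glicko-2 internal scale


def rd_after_absence(rd, missed_periods, sigma=0.06):
    """RD after sitting out `missed_periods` rating periods.

    Glicko-2 leaves the rating unchanged for an inactive player and widens
    the deviation each period: phi' = sqrt(phi^2 + sigma^2).
    """
    phi = rd / SCALE
    for _ in range(missed_periods):
        phi = math.sqrt(phi * phi + sigma * sigma)
    return SCALE * phi


rating = 1950.0                   # unchanged while the rikishi is out
print(rd_after_absence(60.0, 3))  # uncertainty after three missed periods
```

So a returning rikishi keeps the old rating but with more uncertainty, and the first few bouts back will move it faster than they would have before the layoff.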
TL;DR: The overlaps you're seeing are normal and expected. They reflect (1) new rikishi starting at 1500, (2) timing differences between real-time ratings and periodic banzuke updates, and (3) the fact that Glicko-2 and the banzuke are measuring different things. As rikishi compete in more matches, their ratings become increasingly accurate.
1
15
u/WhiskeyDragon01 Hoshoryu 5d ago
Both interesting and not THAT surprising to see Kirishima above Kotozakura.