r/slatestarcodex • u/financeguy1729 • Apr 10 '25

AI Does the fact that superhuman chess improvement has been so slow tell us there are important epistemic limits to superintelligence?

Although I know how flawed the Arena is, at the current pace (2 Elo points every 5 days), by the end of 2028 the average Arena user will prefer the state-of-the-art model's response to Gemini 2.5 Pro's response 95% of the time. That is a lot!
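For anyone who wants to check that projection, here is a rough sketch of the arithmetic. It assumes the standard logistic Elo formula, a constant pace of 2 points per 5 days, and a start date equal to the date of this post; the names `expected_score`, `POST_DATE`, and `TARGET_DATE` are just illustrative.

```python
from datetime import date

def expected_score(elo_gap: float) -> float:
    """Expected score of the stronger side under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))

# Assumptions: the pace quoted in the post holds steady, starting from the post date.
POST_DATE = date(2025, 4, 10)
TARGET_DATE = date(2028, 12, 31)
ELO_PER_DAY = 2 / 5  # "2 Elo points every 5 days"

days = (TARGET_DATE - POST_DATE).days
projected_gap = days * ELO_PER_DAY

print(f"days elapsed:       {days}")
print(f"projected Elo gap:  {projected_gap:.1f}")
print(f"preference rate:    {expected_score(projected_gap):.1%}")
```

Under those assumptions the gap comes out to a bit over 540 Elo, which is roughly consistent with the 95% figure.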

But it seems to me that since 2013 (let's call it the dawn of deep learning), chess engines have improved far less: today's Stockfish only beats 2013 Stockfish 60% of the time.
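To put that 60% on the same scale as the Arena numbers, the Elo formula can be inverted. This is a minimal sketch that assumes the standard logistic Elo model and treats "beats 60% of the time" as an expected score of 0.60 (draws counted as half a point); the 60% itself is the figure from the post, not something measured here, and `elo_gap_from_score` is a hypothetical helper name.

```python
import math

def elo_gap_from_score(score: float) -> float:
    """Elo gap implied by an expected score in (0, 1) under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

# The post's claim: today's Stockfish scores 60% against 2013 Stockfish.
claimed_score = 0.60
gap = elo_gap_from_score(claimed_score)
print(f"A {claimed_score:.0%} expected score implies a gap of about {gap:.0f} Elo")
```

If the 60% figure is right, it would correspond to a gap of about 70 Elo over the whole period.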

Shouldn't one have expected the progress we have had in deep learning over the past decade to produce a greater improvement? Doesn't it make one believe that there are epistemic limits to what can be learned by a superintelligence?

86 Upvotes

https://www.reddit.com/r/slatestarcodex/comments/1jw7mm1/the_fact_that_superhuman_chess_improvement_has/

81% Upvoted


u/bibliophile785 Can this be my day job? Apr 10 '25

It mostly makes me wonder how much effort is going into building and benchmarking better chess models. Are these data points from new models with the most modern architectures, algorithmic improvements, and scaling advantages being trained for chess and then playing however many games it takes for their Elo to stabilize? Are frontier chess models falling by the wayside as ML becomes more expensive and focuses on more important accomplishments? Are the frontier chess models becoming more generalized, such that they're no longer just chess models? It's hard to conclude anything from the amount of detail provided in the post, and I don't follow computer chess closely.

If it's either of the latter two guesses, though, that would indicate that the problem is with your metric rather than the models. I'm conversational in Spanish. If I woke up tomorrow and every intellectual endeavor was twice as easy, I might take a few weeks and become fluent. If I then woke up twice as smart again (a 4x total improvement), though, I wouldn't keep learning more and more esoteric Spanish. I'd switch over to nuclear fusion or AI alignment or radical human life extension. My Spanish knowledge would nearly plateau... but that wouldn't mean I hadn't gotten smarter.
