6 Comments

I’ve seen a lot of coverage of this piece of research, but what I fail to understand is how, on the one hand, we’re supposed to believe task horizons are growing exponentially, and at the same time AI can’t beat Pokémon?


How long does it take a skilled human to beat Pokémon? At least on the tasks tested in the paper, the best AI had a time horizon of around an hour. The trend would predict that in 7 months this becomes 2 hours, in 7 more months 4 hours, and so on.
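To make the arithmetic concrete, here is a minimal sketch of that extrapolation in Python, assuming a ~1-hour horizon today and a ~7-month doubling time (the round numbers used above, not exact figures from the paper):

```python
# A rough sketch of the doubling-time extrapolation described above,
# assuming a ~1-hour time horizon today and a ~7-month doubling time
# (round numbers from this thread, not exact figures from the paper).

DOUBLING_TIME_MONTHS = 7
CURRENT_HORIZON_HOURS = 1.0

def projected_horizon_hours(months_from_now: float) -> float:
    """Time horizon implied by simply continuing the doubling trend."""
    return CURRENT_HORIZON_HOURS * 2 ** (months_from_now / DOUBLING_TIME_MONTHS)

for months in (0, 7, 14, 28, 56):
    print(f"{months:>2} months out: ~{projected_horizon_hours(months):.0f} hours")
# 0 -> 1, 7 -> 2, 14 -> 4, 28 -> 16, 56 -> 256 hours
# (256 hours is roughly a month and a half of 40-hour weeks)
```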

Of course, beating Pokémon seems like quite a different task from the ones tested in the paper, so your mileage may vary.


That's kinda my point: the tasks in the paper are only a subset of possible tasks, and on that basis sweeping generalizations are being made about the supposedly exponential curve we're on.


A 50% success rate is extremely modest, to put it mildly: in software engineering, cybersecurity, or research, a 50% chance of success on a task with no real-time oversight is practically unusable. At best, it points to "potential assistance," not autonomy.

The paper also engages in naive extrapolation, assuming that exponential improvement continues indefinitely. Historical technology trends mostly follow logistic (S-curve) progressions, and constraints like computational resources, training data quality, and architectural bottlenecks are not addressed.

Furthermore, "tasks that take a month" are ill-defined. Human tasks of that length often involve planning, communication, and multi-step dependencies that are not easily compressible into LLM inference.

In short, nothing grows exponentially indefinitely: https://theafh.substack.com/p/the-last-day-on-the-lake?r=42gt5
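To illustrate the exponential-versus-logistic point, here is a small Python sketch with made-up numbers (not fitted to the paper's data): an exponential trend and a logistic S-curve that start with roughly the same growth rate track each other closely at first and only diverge once the logistic nears its ceiling.

```python
import math

# Illustrative parameters only; nothing here is fitted to real data.
GROWTH_RATE = 0.1   # per month
CEILING = 100.0     # arbitrary cap for the logistic curve
START = 1.0         # both curves start from the same value

def exponential(t: float) -> float:
    return START * math.exp(GROWTH_RATE * t)

def logistic(t: float) -> float:
    # Standard logistic solution with the same starting value and
    # approximately the same initial growth rate as the exponential.
    return CEILING / (1 + (CEILING / START - 1) * math.exp(-GROWTH_RATE * t))

for t in (0, 12, 24, 48, 96):
    print(f"t={t:>3} months  exponential={exponential(t):9.1f}  logistic={logistic(t):5.1f}")
# The two curves are close for the first couple of years, then the
# exponential keeps climbing (~14,800 at t=96) while the logistic
# flattens out just below its ceiling of 100.
```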


Well, we also keep moving the AGI goalposts.


It is one thing to project from past numbers and another thing to project based on factors underlying those numbers.

Good, useful public data sources - the fuel of machine learning - have been nearly fully exploited. That issue is already visible in OpenAI's latest models, which don't seem to bring much new in terms of functionality.

The growth in time horizon has coincided with similar growth in total investment in AI. Will that investment also continue, gobbling up a notable percentage of GDP (and energy) at a time of increasing political instability? Will investors keep putting money into AI despite the clear and increasing risk that many of them lose their shirts (as has happened in every technological bubble in recent history)?

Some amazing things have been achieved, but one really has to pay attention to the fundamentals of the industry when forecasting, because the investment level is enormous and the capacity for sufficient monetization is an open question at this stage.
