Discussion about this post

Jurgen Gravestein

I’ve seen a lot of coverage of this piece of research, but what I fail to understand is how, on the one hand, we’re supposed to believe task horizons are growing exponentially, and at the same time AI can’t beat Pokémon?

Andreas F. Hoffmann

A 50% success rate is extremely modest, to put it mildly. In software engineering, cybersecurity, or research, a 50% chance of success on a task with no real-time oversight is practically unusable. At best, it points to "potential assistance," not autonomy. The paper also engages in naive extrapolation, assuming that exponential improvement continues indefinitely. Historical technology trends mostly follow logistic (S-curve) progressions. Constraints like computational resources, training data quality, and architectural bottlenecks are not addressed. Furthermore, "tasks that take a month" are ill-defined. Human tasks of this length often involve planning, communication, and multi-step dependencies that are not easily compressible into LLM inference. In short, nothing keeps growing exponentially forever: https://theafh.substack.com/p/the-last-day-on-the-lake?r=42gt5
