I think it's somewhat likely there's no speedup even with R&D automation, if that automation happens through scaling of LLMs with current methods, and there's no breakthrough that lets LLMs learn deep skills faster than they build next versions of models (using the current cookbook, updating deep skills primarily via RLVR). If they can't learn quickly, and don't invent a method for learning quickly, then being very fast at reasoning doesn't help, because they can only reason with the deep skills they have in the current version of the model. Those skills only update in the next version, which takes significant time to produce, even if the current version of the LLMs produces it autonomously and even if general learning is possible this way. This still means RSI and AGI in the central senses of the terms, but not a takeoff (or superintelligence).
Furthermore, whatever current speedup AI R&D might be experiencing will mostly go away once scaling of compute stops being as rapid, and this slowdown still happens after the slow-learning automation of R&D with LLMs. The low-hanging fruit enabled by more ambitious (compute-intensive) experiments will be picked, and less straightforward research will proceed at more or less the usual background pace (primarily modified by more people working on AI), because the slow-learning RSI process of automated LLM-building doesn't contribute to it in a crucial way, and so Amdahl's law reasserts the low speed of research progress in the longer term (if humans still learn novel ideas faster, in the course of solving the kinds of problems that take humanity an unpredictable number of years).
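As a rough illustration of the Amdahl's law point, here is a minimal sketch. The split between "routine" and "non-routine" research and the numbers used are hypothetical placeholders, not estimates from this comment.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when only part of the work is accelerated (Amdahl's law).

    accelerated_fraction: share of total research time that automation speeds up
                          (hypothetical; e.g. the routine part of R&D).
    factor: how much faster that share runs.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / factor)


# Hypothetical example: half of the work is routine and gets 100x faster.
overall = amdahl_speedup(0.5, 100.0)
# The limit as factor grows without bound is 1 / (1 - accelerated_fraction).
ceiling = 1.0 / (1.0 - 0.5)
print(f"overall speedup: {overall:.3f}x (ceiling {ceiling:.1f}x)")
```

Under these made-up numbers the overall speedup stays just under 2x however fast the routine part gets, which is the sense in which the unaccelerated non-routine part sets the pace.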
Takeoff is still possible in this scenario at any moment (upon invention of a method for learning deep skills quickly), but it doesn't happen as a result of some predictable AI-driven process of progress-grinding, other than via a slight extension in the current period of rapid compute scaling as a result of a larger TAM becoming immediately accessible if LLMs go into the slow-learning RSI. Another unknown is how smart the maximally scaled LLMs get (say, 100T active params, 2,500T total params, which is feasible in 2030 at the cost per token that GPT-4.5 had in 2025, though the compute for pretraining such models might only arrive a bit later). It doesn't seem like LLMs are likely to get very far beyond human genius level (if they are merely scaled using the current cookbook; it's plausible they don't even reach that level). And the hobbling of being unable to quickly learn deep ideas seems sufficiently crippling that they don't necessarily contribute to the speed of progress in non-routine research substantially (while the more routine kind of research soon runs out of the low-hanging fruit). So the AIs have a shot at solving fast learning for AIs, but not a prospect of predictable progress towards it (if it doesn't happen right away). And the higher annual probability of starting a takeoff mostly goes away after rapid scaling ends (though that probability is still substantial, I'd give 50% for a takeoff starting by 2032-2033, meaning an AGI that learns faster than human researchers and starts actually accelerating all the non-routine aspects of R&D progress a lot).
while 4x more compute at current margins is pretty close to as good as getting compute that's 4x serially faster
Why?
I haven't yet found a nice and clean way to model this in isolation
Not sure if this is what you want, but:
In fact the above is conservative in assuming one compute doubling yields just one labor doubling. You get that by running more copies. But you'll train smarter models
If one compute doubling yields two labor doublings (holding software constant) then, rerunning the above analysis:
How much faster would ai progress be compared to today?
Of course, compute may be growing more slowly
There is a gradual boost setting that smooths out the automation returns over a longer period, but I think this period is unrealistically long such that you don't see one-time speed-up effects
Would it help if we added another param controlling how many years the boost occurs over?
Sure, it would help some (though I'd want to think about whether the earlier trajectory is very plausible). To be clear, this isn't at all a blocker for me or something, I mostly wanted to make the higher level point in this post and ran into this issue. I could have just edited and run the code locally (which is pretty easy these days...).
This is a somewhat technical note.
By "software-only singularity", I mean that, after full automation of AI R&D, progress gets faster and faster due to smarter AIs driving increasingly fast rates of improvement in algorithms (overcoming diminishing returns), and that this lasts long enough to yield a large amount of progress (e.g. at least 4 years of progress in 1 year). The equivalent statement in jargon is: r is significantly greater than 1 (implying progress is getting faster and faster) and this remains the case for long enough to get large amounts of progress. For context, see "How quick and big would a software intelligence explosion be?"
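To make the role of r concrete, here is a toy sketch. It is not the AI Futures Model, and it is not claimed to be the model in the linked post. It assumes software level S = E^r, where E is cumulative research effort, and that after full automation the effort rate is proportional to S with compute held fixed. Under those assumptions the software growth rate scales as S^(1 - 1/r). The function name, step count, and cap are all illustrative choices.

```python
def software_doublings(r: float, years: float = 1.0,
                       initial_doublings_per_year: float = 1.0,
                       steps: int = 10_000, cap: float = 50.0) -> float:
    """Doublings of software level over `years` in a toy model.

    Assumed dynamics (illustrative only): S = E**r and dE/dt proportional to S,
    so d(log2 S)/dt = initial_rate * S**(1 - 1/r).
    r == 1 gives a constant rate (plain exponential growth),
    r > 1 gives a rising rate, r < 1 gives a falling rate.
    """
    if r <= 0.0:
        raise ValueError("r must be positive")
    log2_s = 0.0  # software level in doublings relative to the start
    dt = years / steps
    for _ in range(steps):
        # Growth rate (doublings per year) at the current software level.
        rate = initial_doublings_per_year * 2.0 ** (log2_s * (1.0 - 1.0 / r))
        log2_s += rate * dt  # forward Euler step
        if log2_s >= cap:  # stop before the hyperbolic blow-up overflows
            return cap
    return log2_s


for r in (0.5, 1.0, 2.0):
    print(f"r = {r}: {software_doublings(r):.2f} doublings in the first year")
```

With r = 1 the toy model yields one doubling per year indefinitely. With r above 1 each doubling arrives sooner than the last, and with r below 1 each arrives later. Whether r stays above 1 for long enough is exactly the open question.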
Even without a "software-only singularity", I think full automation of AI R&D probably greatly speeds up progress for two main reasons:
We can also analyze this by looking at an example trajectory in the AI Futures Model that barely misses a software-only singularity and seeing how fast progress is after full automation of AI R&D. This trajectory involves a little over 2 years of progress in the year after full automation of AI R&D (SAR). This corresponds to going from full automation of AI R&D (SAR) to Top-human-Expert-Dominating AI (TEDAI) [3] in a bit less than a year, which is a lot of progress. (Quantitatively, it involves going from a 24x AI R&D software acceleration to a 270x AI R&D software acceleration in a year.) I suspect the AI Futures Model modestly underestimates takeoff speeds and one-time acceleration effects due to effectively acting as though AI speed and quantity don't matter outside of coding automation. [4]
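For a sense of scale, the 24x to 270x figures can be converted into doublings. This sketch assumes the acceleration multiplier grows smoothly and exponentially over the year, which is a simplification the trajectory itself may not satisfy.

```python
import math

start_acceleration = 24.0   # AI R&D software acceleration at SAR (from the text)
end_acceleration = 270.0    # acceleration one year later (from the text)

ratio = end_acceleration / start_acceleration
doublings_in_year = math.log2(ratio)
# Assumes smooth exponential growth across the year (an assumption).
months_per_doubling = 12.0 / doublings_in_year

print(f"ratio: {ratio:.2f}x")
print(f"doublings in the year: {doublings_in_year:.2f}")
print(f"implied doubling time: {months_per_doubling:.1f} months")
```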
There are other (indirect) reasons AI progress might speed up around when AIs automate AI R&D:
One important caveat is that by the time AIs automate AI R&D, the rate of compute scaling may be substantially lower than it is today. Thus, the default/trend rate of AI progress may be lower, so the corresponding acceleration would be relative to a lower baseline. This is directly applicable for the "further compute has increased returns" argument and maybe has a modest effect on the size of the one-time speed up (the size of the one-time speed up is sensitive to how much returns from further labor effort have diminished at a given level of compute).
If I remember correctly, this model effectively acts as though you go from no automation acceleration directly to full automation, while in practice earlier AIs will substantially accelerate AI R&D, meaning that returns to effort will already have substantially diminished by the point you reach full automation. As in, full automation will be a large acceleration relative to a human-only baseline, but a relatively smaller acceleration relative to AIs that existed 6 months before full automation, so much of the low-hanging fruit will already be plucked. You can model this in an ad hoc way by reducing the initial speed-up parameter such that it corresponds to the speed-up over AIs that existed 8 months prior to full automation; with my parameter guesses, this yields around 2.5 years of progress in the first year. (There is a gradual boost setting that smooths out the automation returns over a longer period, but I think this period is unrealistically long such that you don't see one-time speed-up effects.) ↩︎
Historically, progress has been driven by both scaling up compute and scaling up labor. However, I expect scaling up labor has been a small fraction of the effect in recent years. Compute for algorithms and training has been scaled up by around 4x per year while company employee count has 3x'd each year. But employee count 3x'ing is way worse than making all employees operate 3x faster due to a diminishing labor pool, (mostly one-time) onboarding costs, and parallelization penalties (while 4x more compute at current margins is pretty close to as good as getting compute that's 4x serially faster). I think the discount from a diminishing labor pool and from onboarding makes the 3x increase in the number of employees roughly as good as a "free" 2x increase in employee count at equal quality. Then, the parallelization penalty further reduces this 2x increase to being as valuable as having existing employees operate ~1.3x faster. Thus, I expect the labor increase is much less important than a 4x increase in compute. So it's fair to model the large majority of recent progress as being driven by increases in compute, where the value mostly comes from being able to run more experiments. ↩︎
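The chain of discounts in this footnote can be written out as arithmetic. The footnote does not state a functional form for the parallelization penalty. The sketch below assumes a power law (serial-equivalent speed = parallel factor ** exponent) and backs out the exponent implied by the 2x to ~1.3x step.

```python
import math

headcount_growth = 3.0          # yearly employee count growth (from the text)
quality_adjusted_growth = 2.0   # after labor-pool and onboarding discounts (from the text)
serial_speed_equivalent = 1.3   # after the parallelization penalty (from the text)
compute_growth = 4.0            # yearly compute growth (from the text)

# Assumption: penalty is a power law, serial speed = parallel_factor ** exponent.
implied_exponent = math.log(serial_speed_equivalent) / math.log(quality_adjusted_growth)


def serial_equivalent(parallel_factor: float, exponent: float) -> float:
    """Serial speedup equivalent to scaling parallel labor by `parallel_factor`."""
    return parallel_factor ** exponent


print(f"implied parallelization exponent: {implied_exponent:.2f}")
print(f"labor: ~{serial_equivalent(quality_adjusted_growth, implied_exponent):.1f}x serial-equivalent "
      f"vs compute: ~{compute_growth:.0f}x")
```

Under the power-law assumption the implied exponent is a bit under 0.4, so each year's hiring is worth far less than each year's compute growth, matching the conclusion that compute drives most recent progress.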
TEDAI: AIs which strictly dominate top human experts in virtually all cognitive tasks (i.e., doable via remote work). ↩︎
This is in part because it doesn't model shifting to research directions that are more effective in the low-compute but plentiful-labor regime. ↩︎
Fully automated AI R&D makes moderate advantages more likely to be stable/predictable because now the labor part of AI R&D is likely commoditized and similar between companies (reducing variance). However, maintaining a lead ultimately requires maintaining a compute advantage (a large software lead can probably be converted into a compute advantage): if a trailing company had more compute and was able to hold on to a compute advantage (despite the potentially decisive advantages of the leading company), we should expect them to eventually catch up and overtake because labor is commoditized after full automation. I suspect it will be hard for significantly trailing companies to maintain a compute advantage if the leading company pulls far ahead on software due to speed ups from AI R&D. In the most extreme case, the leading company (or the AIs of the leading company) might literally take over the world, neutralizing prior compute advantages of trailing companies. ↩︎
Investors might be incentivized to pressure the trailing company to sell their compute to the leading company even if the leadership of the company isn't inclined to do this. Investors have limited power so this isn't clearly sufficient, but a deal could be designed to give the leadership of the trailing company additional power or possibly financial upside, so that they are incentivized to sell. Also, the leading company might just end up being extremely powerful, in the limit literally fully taking over the world. ↩︎