Seeing some confusion on whether AI could be strictly stronger than AI+humans: a simple argument is that, at least in principle, adding more cognition (e.g. a human) to a system should not make it strictly worse overall. But that seems true only in a very idealized case.
One issue is incorporating human input without losing overall performance, even in situations where the human's advice is much worse than the AI's in e.g. 99.9% of cases (and it may be hard to reliably identify the 0.1% of cases where it is better).
But more importantly, a good framing here may be the optimal allocation of labor costs between AIs and humans on a given task: e.g. given a budget of $1000 for a project, how would it best be split between human labor and AI?
This is still not a very well-formalized definition, as even artists and philosophers already use some weak AIs efficiently in parts of their work, and a boundary needs to be drawn somewhat artificially around the core of the project.
Although even in the AI period, with a well-aligned AI, the humans providing their preferences and feedback are a very valuable part of the system. It is not clear to me whether to count this as part of the cyborg period or the AI period.
The transitions in more complex, real-world domains may not be as sharp as in e.g. chess, and it would be useful to model and map the resource allocation ratio between AIs and humans in different domains over time. This is likely relatively tractable and would be informative for predicting how these transitions develop.
While the dynamics would differ between domains (not just the current stage but also the overall shape of the trajectory), I would expect some common patterns that would be interesting to explore and model.
A few examples of concrete questions that could be tractable today:
While in many areas the fraction of resources spent on (advanced) AIs is still relatively small, it is ramping up quickly, and even these areas may be informative to study (and to develop methodology and metrics for, and to create forecasts against which we can calibrate our models).
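As a loose illustration of the kind of measurement this would involve, here is a minimal sketch (the spend figures, the "AI share of spend" metric, and the logistic trend are all assumptions for illustration, not data or claims from this post) of tracking the share of a project budget going to AI versus humans in one domain and extrapolating when it might cross 50%:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical yearly spend (in $k) on AI tools vs. human labour in one domain.
# All numbers are made up for illustration.
years       = np.array([2019, 2020, 2021, 2022, 2023], dtype=float)
ai_spend    = np.array([   2,    5,   15,   60,  180], dtype=float)
human_spend = np.array([1000, 1000,  950,  900,  850], dtype=float)

# The metric: fraction of the total project budget allocated to AI each year.
ai_share = ai_spend / (ai_spend + human_spend)

def logistic(t, t0, k):
    """Logistic trend for the AI share of spend, saturating at 1.
    t0 is the year the share crosses 50%, k is the steepness."""
    return 1.0 / (1.0 + np.exp(-k * (t - t0)))

# Fit the trend and read off a (very rough) forecast of the crossover year.
(t0, k), _ = curve_fit(logistic, years, ai_share, p0=(2026.0, 0.5), maxfev=10_000)

for y, s in zip(years, ai_share):
    print(f"{int(y)}: AI share of spend = {s:.1%}")
print(f"Fitted 50% crossover year: {t0:.1f} (steepness k = {k:.2f})")
```

The point is not the specific numbers but that an "AI share of spend" gives one concrete, trackable metric per domain that forecasts could be calibrated against.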
It can be useful to zoom out and talk about very compressed concepts like ‘AI progress’ or ‘AI transition’ or ‘AGI timelines’. But from the perspective of most AI strategy questions, it’s useful to be more specific.
Looking at all of human history, it might make sense to think of ourselves as at the cusp of an AI transition, when AI systems overtake humans as the most powerful actors. But for practical and forward-looking purposes, it seems quite likely there will actually be multiple different AI transitions:
Stage [>> = more powerful than]:
Human period: Humans >> AIs
Cyborg period: Human+AI teams >> humans; Human+AI teams >> AIs
AI period: AIs >> humans (AIs ~ human+AI teams)
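A purely illustrative way to make this classification concrete (the comparison rule, the margin, and all scores below are assumptions, not something stated in the post) is to label a domain's period from rough relative performance estimates:

```python
def classify_period(humans: float, ais: float, teams: float, margin: float = 1.1) -> str:
    """Label a domain's period from rough relative performance scores.

    `margin` stands in for '>>': one score must beat another by this factor.
    The scores are assumed to be comparable on a single scale, which is a
    strong simplification.
    """
    if teams > margin * humans and teams > margin * ais:
        return "cyborg period"
    if humans > margin * ais and not teams > margin * humans:
        return "human period"
    if ais > margin * humans and not teams > margin * ais:
        return "AI period"
    return "ambiguous / in transition"

# Hypothetical scores, loosely inspired by chess today: engines dominate,
# and adding a human barely helps.
print(classify_period(humans=1.0, ais=3.0, teams=3.05))  # -> AI period
print(classify_period(humans=1.0, ais=0.9, teams=1.5))   # -> cyborg period
```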
Some domains might never enter an AI period. It’s also possible that in some domains the cyborg period will be very brief, or that there will be a jump straight to the AI period. But:
This means that for each domain, there are potentially two transitions: one from the human period into the cyborg period, and one from the cyborg period into the AI period.
Transitions in some domains will be particularly important
The cyborg period in any domain will correspond to:
Some domains where increased capabilities/automation/speed seem particularly strategically important are:
Some other domains which seem less centrally important but could end up mattering a lot are:
There are probably other strategically important domains we haven’t listed.
A common feature of the domains listed is that increased capabilities in those domains could lead to large increases in power for the systems that have those capabilities. It will sometimes be helpful to consider power in aggregate, so that we can make direct comparisons about how much of the power in a given domain is automated.
Clearly, capabilities in these domains interact. In our view, people coming from different backgrounds often perceive large increases in power in their domain of expertise as the decisive transition. For example, it is easy for someone coming from a research background to see how automated research abilities could impact other domains. But the reverse is also true: automated powers of persuasion, or automated cultural evolution, would have a strong impact on research, by making some directions of thinking unpopular, and influencing the allocation of attention and minds.
Note that it isn’t clear that the level of abstraction we’ve picked here is the right one. It’s possible that even more granularity would be more helpful, at least in some situations. For all of the domains we list, you could think of sub-domains, or of particular capabilities which might advance faster or slower than others.
The order of AI transitions in different domains will matter
The timing of transitions in different domains isn’t independent. But the world will look very different depending on which transitions happen first. A few vignettes:
Importantly, the fact that there are different possible orderings suggests that there are multiple possible winning strategies from the perspective of decreasing existential risk. For example:
Caption: in trajectories A and B, coordination is automated more quickly than AI research. In trajectory C, AI research is automated more quickly.
What does all of this imply? Tentatively:
‘Cyborg periods’ could be pivotal
Even if cyborg periods are brief, they may be pivotal:
This leads to a picture where there are overlapping but different cyborg periods in different domains. These periods will probably be:
Interventions
Leveraging the power of human+AI teams during cyborg periods seems like it might be critical for navigating transitions to very advanced AI.
This is likely to be non-trivial. For example, really making use of the different kinds of cognition in a system involving even a single AI system and a single human requires:
Doing this in a more complex set-up might involve a lot of substantive work. But we can probably prepare for this in advance, by practising working in human+AI teams in the sub-domains where automation is more advanced. (The recent post on Cyborgism is a good example of a push in that direction.)
This applies more broadly than just to AI alignment research, and it would be great to have people in other strategically important domains practising this too.
The ideas in this post are mostly from Jan, and from private discussions between Jan and a few other people. Rose did most of the writing. Clem and Nora gave substantive comments. The post was written as part of the work done at the ACS research group.