When discussing AI risks, talk about capabilities, not intelligence

Vika

46 When discussing AI risks, talk about capabilities, not intelligence

by Victoria Krakovna

11th Aug 2023

Linkpost from vkrakovna.wordpress.com

3 min read

7

46

Public discussions about catastrophic risks from general AI systems are often derailed by using the word “intelligence”. People often have different definitions of intelligence, or associate it with concepts like consciousness that are not relevant to AI risks, or dismiss the risks because intelligence is not well-defined. I would advocate for using the term “capabilities” or “competence” instead of “intelligence” when discussing catastrophic risks from AI, because this is what the concerns are really about. For example, instead of “superintelligence” we can refer to “super-competence” or “superhuman capabilities”.

When we talk about general AI systems posing catastrophic risks, the concern is about losing control of highly capable AI systems. Definitions of general AI that are commonly used by people working to address these risks are about general capabilities of the AI systems:

PASTA definition: “AI systems that can essentially automate all of the human activities needed to speed up scientific and technological advancement”.
Legg-Hutter definition: “An agent’s ability to achieve goals in a wide range of environments”.

We expect that AI systems that satisfy these definitions would have general capabilities including long-term planning, modeling the world, scientific research, manipulation, deception, etc. While these capabilities can be attained separately, we expect that their development is correlated, e.g. all of them likely increase with scale.

There are various issues with the word “intelligence” that make it less suitable than “capabilities” for discussing risks from general AI systems:

Anthropomorphism: people often specifically associate “intelligence” with being human, being conscious, being alive, or having human-like emotions (none of which are relevant to or a prerequisite for risks posed by general AI systems).
Associations with harmful beliefs and ideologies.
Moving goalposts: impressive achievements in AI are often dismissed as not indicating “true intelligence” or “real understanding” (e.g. see the “stochastic parrots” argument). Catastrophic risk concerns are based on what the AI system can do, not whether it has “real understanding” of language or the world.
Stronger associations with less risky capabilities: people are more likely to associate “intelligence” with being really good at math than being really good at politics, while the latter may be more representative of capabilities that make general AI systems pose a risk (e.g. manipulation and deception capabilities that could enable the system to overpower humans).
High level of abstraction: “intelligence” can take on the quality of a mythical ideal that can’t be met by an actual AI system, while “competence” is more conducive to being specific about the capability level in question.

It’s worth noting that I am not suggesting to always avoid the term “intelligence” when discussing advanced AI systems. Those who are trying to build advanced AI systems often want to capture different aspects of intelligence or endow the system with real understanding of the world, and it’s useful to investigate and discuss to what extent an AI system has (or could have) these properties. I am specifically advocating to avoid the term “intelligence” when discussing catastrophic risks, because AI systems can pose these risks without possessing real understanding or some particular aspects of intelligence.

The basic argument for catastrophic risk from general AI has two parts: 1) the world is on track to develop generally capable AI systems in the next few decades, and 2) generally capable AI systems are likely to outcompete or overpower humans. Both of these arguments are easier to discuss and operationalize by referring to capabilities rather than intelligence:

For #1, we can see a trend of increasingly general capabilities, e.g. from GPT-2 to GPT-4. Scaling laws for model performance as compute, data and model size increase suggest that this trend is likely to continue. Whether this trend reflects an increase in “intelligence” is an interesting question to investigate, but in the context of discussing risks, it can be a distraction from considering the implications of rapidly increasing capabilities of foundation models.
For #2, we can expect that more generally capable entities are likely to dominate over less generally capable ones. There are various historical examples of this, e.g. humans causing other species to go extinct. While there are various ways in which other animals may be more “intelligent” than humans, the deciding factor was that humans had more general capabilities like language and developing technology, which allowed them to control and shape the environment. The best threat models for catastrophic AI risk focus on how the general capabilities of advanced AI systems could allow them to overpower humans.

As the capabilities of AI systems continue to advance, it’s important to be able to clearly consider their implications and possible risks. “Intelligence” is an ambiguous term with unhelpful connotations that often seems to derail these discussions. Next time you find yourself in a conversation about risks from general AI where people are talking past each other, consider replacing the word “intelligent” with “capable” – in my experience, this can make the discussion more clear, specific and productive.

(Thanks to Janos Kramar for helpful feedback on this post.)

AI Safety Public Materials4AI Alignment Fieldbuilding2AI2

Frontpage

New Comment

3 comments, sorted by

top scoring

Click to highlight new comments since: Today at 12:03 AM

[-]tailcalled2y1620

I think discussions about capabilities raise the question "why create AI that is highly capable at deception etc.? seems like it would be safer not to".

The problem that occurs here is that some ways to create capabilities are quite open-ended, and risk accidentally creating capabilities for deception due to instrumental convergence. But at that point it feels like we are getting into the territory that is best thought of as "intelligence", rather than "capabilities".

Reply

[-]Victoria Krakovna2y33

I agree that a possible downside of talking about capabilities is that people might assume they are uncorrelated and we can choose not to create them. It does seem relatively easy to argue that deception capabilities arise as a side effect of building language models that are useful to humans and good at modeling the world, as we are already seeing with examples of deception / manipulation by Bing etc.

I think the people who think we can avoid building systems that are good at deception often don't buy the idea of instrumental convergence either (e.g. Yann LeCun), so I'm not sure that arguing for correlated capabilities in terms of intelligence would have an advantage.

Reply

[-]Rob Bensinger2y82

My own suggestion would be to use a variety of different phrasings here, including both "capabilities" and "intelligence", and also "cognitive ability", "general problem-solving ability", "ability to reason about the world", "planning and inference abilities", etc. Using different phrases encourages people to think about the substance behind the terminology -- e.g., they're more likely to notice their confusion if the stuff you're saying makes sense to them under one of the phrasings you're using, but doesn't make sense to them under another of the phrasings.

Phrases like "cognitive ability" are pretty important, I think, because they make it clearer why these different "capabilities" often go hand-in-hand. It also clarifies that the central problems are related to minds / intelligence / cognition / etc., not (for example) the strength of robotic arm, even though that too is a "capability".

Reply

Moderation Log