Review

This is a quick response to Evolution Provides No Evidence For the Sharp Left Turn, prompted by its winning first prize in the Open Philanthropy Worldviews contest. I think part of the post is sufficiently misleading about evolutionary history, and the OP first prize gives it enough visibility, that it makes sense to write a post-length response.

The central evolutionary-biology claim of the original post is this:

  • The animals of the generation learn throughout their lifetimes, collectively performing many billions of steps of learning.
  • The generation dies, and all of the accumulated products of within lifetime learning are lost.
  • Differential reproductive success slightly changes the balance of traits across the species.



The only way to transmit information from one generation to the next is through evolution changing genomic traits, because death wipes out the within lifetime learning of each generation.



However, this sharp left turn does not occur because the inner learning processes suddenly become much better / more foomy / more general in a handful of outer optimization steps. It happens because you devoted billions of times more optimization power to the inner learning processes, but then deleted each inner learner shortly thereafter. Once the inner learning processes become able to pass non-trivial amounts of knowledge along to their successors, you get what looks like a sharp left turn. But that sharp left turn only happens because the inner learners have found a kludgy workaround past the crippling flaw where they all get deleted shortly after initialization.

In my view, this interpretation of evolutionary history is something between "speculative" and "wrong".

Transmitting some of the data gathered during an animal's lifetime to the next generation by some other means is so obviously useful that it is highly convergent. Non-genetic communication channels to the next generation include epigenetics, parental teaching / imitation learning, vertical transmission of symbionts, parameters of the prenatal environment, hormonal and chemical signaling, bio-electric signals, and transmission of environmental resources or modifications created by previous generations, which can shape the conditions experienced by future generations (e.g. beaver dams).

Given that overcoming the genetic bottleneck is so highly convergent, it would be a bit surprising if there were a large free lunch on the table in exactly this direction, as Quintin assumes:

Evolution's sharp left turn happened because evolution spent compute in a shockingly inefficient manner for increasing capabilities, leaving vast amounts of free energy on the table for any self-improving process that could work around the evolutionary bottleneck. Once you condition on this specific failure mode of evolution, you can easily predict that humans would undergo a sharp left turn at the point where we could pass significant knowledge across generations. I don't think there's anything else to explain here, and no reason to suppose some general tendency towards extreme sharpness in inner capability gains.


It's probably worth going a bit into technical details here: evolution did manage to discover innovations like mirror neurons:

A mirror neuron is a neuron that fires both when an organism acts and when the organism observes the same action performed by another. Thus, the neuron "mirrors" the behavior of the other, as though the observer were itself acting. ... Further experiments confirmed that about 10% of neurons in the monkey inferior frontal and inferior parietal cortex have "mirror" properties and give similar responses to performed hand actions and observed actions.[1]

Clearly, mirror neurons are a type of innovation that allows high-throughput behavioural cloning / imitation learning. "10% of neurons in the monkey inferior frontal and inferior parietal cortex" is a massive amount of compute. Neurons imitating your parent's motor policy based on visual information about the parent's behaviour form a high-throughput channel. (I recommend doing a Fermi estimate of this channel capacity.)
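One way to set up the suggested Fermi estimate is sketched below. This is only scaffolding for the arithmetic: `bits_per_second`, `hours_per_day` and `years` are hypothetical placeholder inputs, not measured values, and the function name is made up for illustration. The only standard figure used is the rough size of the human genome (about 3.1 billion base pairs at 2 bits each), which serves as a generous upper bound on what the genetic channel carries.

```python
SECONDS_PER_HOUR = 3600
DAYS_PER_YEAR = 365

# Upper bound on raw genome information: ~3.1e9 base pairs, 2 bits per base pair.
GENOME_BITS = 3.1e9 * 2


def observed_bits(bits_per_second: float, hours_per_day: float, years: float) -> float:
    """Total bits a juvenile could absorb by watching others (Fermi estimate)."""
    return bits_per_second * hours_per_day * SECONDS_PER_HOUR * DAYS_PER_YEAR * years


if __name__ == "__main__":
    # Placeholder rates (assumptions), chosen only to show how the estimate scales.
    for rate in (1.0, 10.0, 100.0):
        total = observed_bits(bits_per_second=rate, hours_per_day=4, years=5)
        ratio = total / GENOME_BITS
        print(f"{rate:6.1f} bit/s -> {total:.2e} bits ({ratio:.3f} x genome upper bound)")
```

Plugging in your own guesses for the learned-behaviour bit rate and the length of the juvenile period gives a quick sense of how the imitation channel compares with the genetic one.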

The situation where you clearly have a system totally able to eat the free lunch on the table, and supposedly the lunch is still there, makes me suspicious.

At the same time: yes, clearly, nowadays, human culture is a lot of data, and humans learn more than monkeys.

Different stories

What are some evolutionarily plausible alternatives to Quintin's story?

Alternative stories would usually suggest that ancestral humans had access to channels to overcome the genetic bottleneck, and were using such channels to the extent it was marginally effective. Then some other major change happened, the marginal fitness advantage of learning more grew, and humans evolved to transmit more bits, so modern humans transmit more.

An example of such a major change could be the advent of culture. If you look at the past timeline from a replicator dynamics perspective, the next most interesting event after the beginning of life is cultural replicators running on human brains crossing R>1 and starting the second vast evolutionary search, cultural evolution.
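The significance of crossing R>1 can be illustrated with a toy geometric model. This is a sketch under strong assumptions: `r` stands for the average number of new brains each copy of a cultural replicator spreads to per round, it is treated as constant, and the function name is hypothetical.

```python
def expected_copies(r: float, rounds: int, initial: float = 1.0) -> float:
    """Expected number of copies after `rounds` rounds of transmission,
    if every copy produces `r` copies on average per round."""
    return initial * r ** rounds


if __name__ == "__main__":
    for r in (0.9, 1.0, 1.1):
        print(f"R = {r}: expected copies after 100 rounds = {expected_copies(r, 100):.4g}")
```

Below the threshold a replicator fades out in expectation no matter how useful it is; just above it, the same replicator spreads without bound, which is why the crossing looks like an event rather than a gradual change.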

How is the story "cultural evolution is the pivotal event" different? Roughly speaking, culture is a multi-brain parallel immortal evolutionary search computation. Running at higher speed and a layer of abstraction away from physical reality (compared to genes), it was able to discover many pools of advantage, like fire, versatile symbolic communication, or specialise-and-trade superagent organisation.

In this view, there is a type difference between 'culture' and 'increased channel capacity'. 

You can interpret this in multiple ways, but if you want to cast this as a story of a discontinuity, where biological evolution randomly stumbled upon starting a different powerful open-ended misaligned search, it makes sense. The fact that such search finds caches of fitness and negentropy seems not very surprising. [2]

Was the "increased capacity to transfer what's learned in a brain's lifetime to the next generation" at least the most important, or a notably large, direction to exploit? I'm not a specialist on human evolution, but it seems hard to say with confidence: note that 'fire' is also a big deal, as it allows you to spend way less on digestion; cheaper ability to coordinate is a big deal, as illustrated by ants; and symbolic communication is a big deal, as it is digital, robust, and an effective form of compression.

Unfortunately for attempts to figure out the precise marginal costs and fitness benefits for ancestral humans, my impression is that ~ten thousand generations of genetic evolution in a fitness landscape shaped by cultural evolution screens off a lot of evidence. In particular, from the fact that modern humans are outliers in some phenotypic characteristic, you cannot infer that it was the cause of the change to humans. For example, an argument like 'human kids have an unusual capacity to absorb significant knowledge across generations compared to chimps, ergo, the likely cause of human explosive development is ancestral humans having more of this capacity than other species' has very little weight. Modern wolves are also notably different from modern chihuahuas, but the correct causal story is not 'ancestral chihuahuas had an overhang of loyalty and harmlessness'.

Does this partially invalidate the argument toward implications for AI in the original post? In my view, yes. If, following Quintin, we translate the actual situation into quantities and narratives that drive AI progress rates:

- the "specific failure mode" of not transmitting what brains learn to the next generation is not there
- the marginal fitness advantage of transmitting more bits to the next generation brains is unclear, similarly to an unclear marginal advantage of e.g. spending more on LLMs curating data for the next gen LLM training
- because we don't really understand what happened, the metaphorical map to AI progress mostly maps this lack of understanding to lack of clear insights for AI
- it seems likely culture is somehow a big deal, but it is not clear how you would translate what happened to the AI domain; if such a thing can happen with AIs, if anything, it seems to push more toward the discontinuity side, as the cultural search relatively quickly uncovered multiple caches of negentropy
- (yes, obviously, given culture, it is important that you can transmit it to the next generation, but it seems quite possible that for transferring seed culture the channel capacity you have via mirror neurons is more than enough)
 

Not even approximately true

In case you still believe the original post is somehow approximately true, and the implications for AI progress somehow approximately hold, I think it's important to basically un-learn that update. Quoting the original post:

This last paragraph makes an extremely important claim that I want to ensure I convey fully:

- IF we understand the mechanism behind humanity's sharp left turn with respect to evolution

- AND that mechanism is inapplicable to AI development

- THEN, there's no reason to reference evolution at all when forecasting AI development rates, not as evidence for a sharp left turn, not as an "illustrative example" of some mechanism / intuition which might supposedly lead to a sharp left turn in AI development, not for anything.
 

The conjunctive IF is a crux, and because we don't understand what happened with culture well enough, the rest of the implication does not hold.

Consider a toy model counterfactual story: in a fantasy world, exactly repeating the 128 bits of the first cultural replicator gives the human ancestor the power to cast a spell and gain a +50% fitness advantage. Notice that this is a different story from "overcoming channel to offspring capacity": you may be in a situation where you have plenty of capacity but don't have the 128 bits, and this situation is much more prone to discontinuities.
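The contrast between the two stories can be made concrete with a small sketch. Everything here is illustrative: the function names, the `gain_per_bit` value and the seeded random "spell" are assumptions, and only the 128 bits and the +50% advantage come from the toy model above.

```python
import random

SPELL_BITS = 128


def capacity_fitness(bits_transmitted: int, gain_per_bit: float = 0.001) -> float:
    """'More channel capacity' story: every extra transmitted bit helps a little."""
    return 1.0 + gain_per_bit * bits_transmitted


def spell_fitness(transmitted: int, spell: int) -> float:
    """'Specific string' story: +50% fitness only for an exact 128-bit match."""
    return 1.5 if transmitted == spell else 1.0


if __name__ == "__main__":
    spell = random.Random(0).getrandbits(SPELL_BITS)  # hypothetical spell
    near_miss = spell ^ 1  # differs from the spell in exactly one bit
    print("capacity story, 127 vs 128 bits:", capacity_fitness(127), capacity_fitness(128))
    print("spell story, near miss vs exact:", spell_fitness(near_miss, spell), spell_fitness(spell, spell))
```

In the capacity story, fitness rises smoothly with every additional bit. In the spell story, a lineage with ample capacity gains nothing until the exact string appears, and then gains everything at once.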

Because it is not clear if reality was more like stumbling upon a specific string, or piece of code, or evolutionary ratchet, or something else, we don't know enough to rule out a metaphor suggesting discontinuities.  

Conclusion

Where I do agree with Quintin is scepticism toward some other stories attempting to draw some strong conclusion from human evolution, including strong conclusions about discontinuities.

I do think there is a reasonably good metaphor genetic evolution : brains ~ base optimiser : mesa-optimiser, but notice that evolution was able to keep brains mostly aligned for all other species except humans. The relation human brain : cultural evolution is very unlike base optimiser : mesa-optimiser.

(Note on AI)

While I mostly wanted to focus on the evolutionary part of the OP, I'm sceptical about the AI claims too. (Paraphrasing: While the current process of AI training is not perfectly efficient, I don't think it has comparably sized overhangs which can be exploited easily.)

In contrast, to me, it seems the current way AIs learn is very obviously inefficient compared to what's possible. For example, explain something new to GPT4, or make it derive something new. Open a new chat window, and probe whether it now knows it. Compare with a human.
 

  1. ^
  2. ^ This does not imply the genetic evolutionary search is a particularly bad optimiser - instead, the landscape is such that there are many sources of negentropy available.

3 comments

It's a well-known fact in anthropology that:

  1. During the ~500,000 years that Neanderthals were around, their stone-tool-making technology didn't advance at all: tools from half a million years apart are functionally identical. Clearly their capacity for cultural transmission of stone-tool-making skills was already at its limit the whole time.
  2. During the ~300,000 years that Homo sapiens has been around, our technology has advanced at an accelerating rate, with a rate of advance roughly proportional to planetary population, and planetary population increasing with technological advances, with the positive feedback giving super-exponential acceleration. Clearly our cultural transmission of technological skills has never saturated its capacity limit (and information technology such as writing, printing, and the Internet has obviously further increased that limit).
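The feedback loop in point 2 can be sketched as a standard hyperbolic-growth toy model. The reading used here is an assumption: the relative growth rate of technology is taken to be proportional to population N, and population is taken to track technology, which gives dN/dt = a * N**2 with closed-form solution N(t) = N0 / (1 - a * N0 * t). The constant `a`, the starting value and the function names are hypothetical, and units are arbitrary.

```python
def population(t: float, n0: float, a: float) -> float:
    """Closed-form solution of dN/dt = a * N**2 with N(0) = n0."""
    blow_up = 1.0 / (a * n0)  # the model diverges at this finite time
    if t >= blow_up:
        raise ValueError("model diverges at t = 1 / (a * n0)")
    return n0 / (1.0 - a * n0 * t)


def doubling_time(n: float, a: float) -> float:
    """Time needed to go from n to 2n under dN/dt = a * N**2."""
    return 1.0 / (2.0 * a * n)


if __name__ == "__main__":
    a, n = 1.0, 1.0  # arbitrary illustrative units
    for _ in range(5):
        print(f"N = {n:5.1f}: next doubling takes {doubling_time(n, a):.4f}")
        n *= 2
```

Under plain exponential growth the doubling time is constant. Here it halves with each doubling, which is what "super-exponential" means in this model. If instead the absolute rate of advance were proportional to population, the same loop would give ordinary exponential growth.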

So there's a clear and dramatic difference here, and it seems to date back to around the start of our species. Just what caused such a massive increase in our species' capacity to pass on useful information between generations is unclear. (Personally I suspect something in the syntactic generality of our language, perhaps loosely analogous to the phenomenon of Turing-completeness.) But Homo sapiens is not just another hominid, and the sapiens part isn't just puffery: we have a dramatic capability shift from any previous species in the bandwidth of our cultural information transmission. It's vastly larger than the information content of our genome, and still growing.

I really don't want to spend even more time arguing over my evolution post, so I'll just copy over our interactions from the previous times you criticized it, since that seems like context readers may appreciate.

In the comment sections of the original post:

Your comment

[very long, but mainly about your "many other animals also transmit information via non-genetic means" objection + some other mechanisms you think might have caused human takeoff]

My response

I don't think this objection matters for the argument I'm making. All the cross-generational information channels you highlight are at rough saturation, so they're not able to contribute to the cross-generational accumulation of capabilities-promoting information. Thus, the enormous disparity between the brain's within-lifetime learning versus evolution cannot lead to a multiple OOM faster accumulation of capabilities as compared to evolution.

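When non-genetic cross-generational channels are at saturation, the plot of capabilities-related info versus generation count looks like this:

[Figure: capabilities-related info versus generation count, with a "Genetic info" line and an "All info" line]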

with non-genetic information channels only giving the "All info" line a ~constant advantage over "Genetic info". Non-genetic channels might be faster than evolution, but because they're saturated, they only give each generation a fixed advantage over where they'd be with only genetic info. In contrast, once the cultural channel allows for an ever-increasing volume of transmitted information, then the vastly faster rate of within-lifetime learning can start contributing to the slope of the "All info" line, and not just its height.

Thus, humanity's sharp left turn.
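The argument in this reply can be sketched as a toy simulation. All quantities are hypothetical and in arbitrary units: `genetic_gain` is the per-generation genetic increment, `lifetime_learning` is what one generation learns, and `channel_cap` is the most non-genetic information that can be handed on (`None` meaning no cap). The function name and values are illustrative only.

```python
from typing import List, Optional, Tuple


def info_by_generation(
    generations: int,
    genetic_gain: float,
    lifetime_learning: float,
    channel_cap: Optional[float] = None,
) -> List[Tuple[float, float]]:
    """Return (genetic_info, all_info) for each generation."""
    genetic, inherited = 0.0, 0.0
    history = []
    for _ in range(generations):
        genetic += genetic_gain
        # Each generation adds its own learning to whatever it inherited...
        inherited += lifetime_learning
        # ...but can only pass on what fits through the non-genetic channel.
        if channel_cap is not None:
            inherited = min(inherited, channel_cap)
        history.append((genetic, genetic + inherited))
    return history


if __name__ == "__main__":
    saturated = info_by_generation(5, 1.0, 1000.0, channel_cap=500.0)
    unbounded = info_by_generation(5, 1.0, 1000.0)
    print("saturated:", saturated)
    print("unbounded:", unbounded)
```

With a cap, "All info" stays a constant distance above "Genetic info"; without one, within-lifetime learning feeds the slope. Note that the sketch takes the saturation premise as given, and that premise is the point disputed elsewhere in this thread.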

In Twitter comments on Open Philanthropy's announcement of prize winners:

Your tweet

But what's the central point, then? Evolution discovered how to avoid the genetic bottleneck myriad times; also discovered potentially unbounded ways how to transmit arbitrary number of bits, like learning-teaching behaviours; except humans, nothing foomed. So the updated story would be more like "some amount of non-genetic/cultural accumulation is clearly convergent and is common, but there is apparently some threshold crossed so far only by humans. Once you cross it you unlock a lot of free energy and the process grows explosively". (&the cause or size of threshold is unexplained)

(note: this was a reply and part of a slightly longer chain)

My response

Firstly, I disagree with your statement that other species have "potentially unbounded ways how to transmit arbitrary number of bits". Taken literally, of course there's no species on earth that can actually transmit an *unlimited* amount of cultural information between generations. However, humans are still a clear and massive outlier in the volume of cultural information we can transmit between generations, which is what allows for our continuously increasing capabilities across time.

Secondly, the main point of my article was not to determine why humans, in particular, are exceptional in this regard. The main point was to connect the rapid increase in human capabilities relative to previous evolution-driven progress rates with the greater optimization power of brains as compared to evolution. Being so much better at transmitting cultural information as compared to other species allowed humans to undergo a "data-driven singularity" relative to evolution. While our individual brains and learning processes might not have changed much between us and ancestral humans, the volume and quality of data available for training future generations did increase massively, since past generations were much better able to distill the results of their lifetime learning into higher-quality data.

This allows for a connection between the factors we've identified as important for creating powerful AI systems (data volume, data quality, and effectively applied compute), and the process underlying the human "sharp left turn". It reframes the mechanisms that drove human progress rates in terms of the quantities and narratives that drive AI progress rates, and allows us to more easily see what implications the latter has for the former.

In particular, this frame suggests that the human "sharp left turn" was driven by the exploitation of a one-time enormous resource inefficiency in the structure of the human, species-level optimization process. And while the current process of AI training is not perfectly efficient, I don't think it has comparably sized overhangs which can be exploited easily. If true, this would mean human evolutionary history provides little evidence for sudden increases in AI capabilities.

The above is also consistent with rapid civilizational progress depending on many additional factors: it relies on resource overhang being a *necessary* factor, but does not require it to be alone *sufficient* to accelerate human progress. There are doubtless many other factors that are relevant, such as a historical environment favorable to progress, a learning process that sufficiently pays attention to other members of one's species, not being a purely aquatic species, and so on. However, any full explanation of the acceleration in human progress of the form:
"sudden progress happens exactly when (resource overhang) AND (X) AND (Y) AND (NOT Z) AND (W OR P OR NOT R) AND..." 
is still going to have the above implications for AI progress rates.

There's also an entire second half to the article, which discusses what human "misalignment" to inclusive genetic fitness (doesn't) mean for alignment, as well as the prospects for alignment during two specific fast takeoff (but not sharp left turn) scenarios, but that seems secondary to this discussion.

I'll try to keep it short
 

All the cross-generational information channels you highlight are at rough saturation, so they're not able to contribute to the cross-generational accumulation of capabilities-promoting information.

This seems clearly contradicted by empirical evidence. Mirror neurons would likely be able to saturate what you assume is the brain's learning rate, so the more likely reason for not transferring more learned bits is that the marginal cost of doing so is higher than that of other sensible options. That is a different reason than "saturated, at capacity".
 

Firstly, I disagree with your statement that other species have "potentially unbounded ways how to transmit arbitrary number of bits". Taken literally, of course there's no species on earth that can actually transmit an *unlimited* amount of cultural information between generations

Sure. Taken literally, the statement is obviously false ... literally nothing can store an arbitrary number of bits because of the Bekenstein bound. More precisely, the claim is that existing non-human ways of transmitting learned bits to the next generation do not seem, in practice, to be constrained by limits on how many bits they can transmit, but by some other limits (e.g. you can transmit more bits than the animal has capacity to learn).
 

Secondly, the main point of my article was not to determine why humans, in particular, are exceptional in this regard. The main point was to connect the rapid increase in human capabilities relative to previous evolution-driven progress rates with the greater optimization power of brains as compared to evolution. Being so much better at transmitting cultural information as compared to other species allowed humans to undergo a "data-driven singularity" relative to evolution. While our individual brains and learning processes might not have changed much between us and ancestral humans, the volume and quality of data available for training future generations did increase massively, since past generations were much better able to distill the results of their lifetime learning into higher-quality data.
 


1. As explained in my post, there is no reason to assume ancestral humans were so much better at transmitting information as compared to other species.

2. The qualifier "cultural" in "better at transmitting cultural information" may (or may not) do a lot of work.

The crux is something like "what is the type signature of culture". Your original post roughly assumes "it's just more data". But this seems very unclear: in a comment above yours, jacob_cannell confidently claims I miss the forest and guesses that the critical innovation is "symbolic language". But, obviously, "symbolic language" is a very different type of innovation than "more data transmitted across generations".

Symbolic language likely
- allows any type of channel to be used more effectively
- in particular, allows more efficient horizontal synchronization, allowing parallel computations across many brains
- overall sounds more like a software upgrade

Consider plain old telephone network wires: these have surprisingly large intrinsic capacity, which isn't used that effectively by analog voice calls. Yes, when you plug in a modem on both sides you experience a "jump" in capacity - but this is much more like a "software update" and can be more sudden.
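The "intrinsic capacity" of such a line can be estimated with the Shannon-Hartley formula, C = B * log2(1 + S/N). The sketch below uses illustrative voice-band figures that are assumptions rather than measurements of any particular line: roughly 3100 Hz of usable bandwidth and a 30 dB signal-to-noise ratio.

```python
import math


def shannon_capacity(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon-Hartley capacity in bits per second for a noisy analog channel."""
    snr_linear = 10 ** (snr_db / 10)  # convert decibels to a power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)


if __name__ == "__main__":
    # Assumed voice-band parameters, for illustration only.
    capacity = shannon_capacity(bandwidth_hz=3100, snr_db=30)
    print(f"Capacity under these assumptions: {capacity / 1000:.1f} kbit/s")
```

The physical wire is the same before and after the modem is plugged in; what changes is the encoding, which is the sense in which the jump resembles a software update rather than a wider channel.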

Or a different example: empirically, it seems possible to teach various non-human apes sign language (their general-purpose predictive processing brains are general enough to learn this). I would classify this as a "software" or "algorithm" upgrade. If someone did this to a group of apes in the wild, it seems plausible that knowledge of language would stick and make them differentially more fit. But teaching apes symbolic language sounds different in principle from "it's just more data" or "it's higher quality data", and the implications for AI progress would be different.
 

it relies on resource overhang being a *necessary* factor,

My impression is that, compared to your original post, your model is drifting toward more and more general concepts, where it becomes more likely to be true, harder to refute, and less clear in what it implies for AI. What is the "resource" here? Does negentropy stored in wood count as "a resource overhang"?

I'm arguing specifically against a version where the "resource overhang" is caused by "exploitable resources you easily unlock by transmitting more bits learned by your brain vertically to your offspring's brain", because your map of humans to AI progress is based on a quite specific model of what the bottlenecks and overhangs are.

If the current version of the argument is "sudden progress happens exactly when (resource overhang) AND ..." with "generally any kind of resource", then yes, this sounds more likely, but it seems very unclear what this implies for AI.

(Yes I'm basically not discussing the second half of the article)