The Pando Problem: Rethinking AI Individuality

Jan_Kulveit

You should make this a top-level post so it gets visibility. I think it's important for people to know the caveats attached to your results and the limits on their implications in real-world dynamics.

What to include in a guest lecture on existential risks from AI?

Aryeh Englander3y10

Yes please!

Biology-Inspired AGI Timelines: The Trick That Never Works

Aryeh Englander3y110

I think part of what I was reacting to is a kind of half-formed argument that goes something like:

My prior credence is very low that all these really smart, carefully thought-through people are making the kinds of stupid or biased mistakes they are being accused of.
In fact, my prior for the above is sufficiently low that I suspect it's more likely that the author is the one making the mistake(s) here, at least in the sense of straw-manning his opponents.
But if that's the case then I shouldn't trust the other things he says as much, because it looks lik

Aryeh Englander3y310

Meta-comment:

I noticed that I found it very difficult to read through this post, even though I felt the content was important, because of the (deliberately) condescending style. I also noticed that I'm finding it difficult to take the ideas as seriously as I think I should, again due to the style. I did manage to read through it in the end, because I do think it's important, and I think I am mostly able to avoid letting the style influence my judgments. But I find it fascinating to watch my own reaction to the post, and I'm wondering if others have any (co...

Rob Bensinger3y80

When I try to mentally simulate negative reader-reactions to the dialogue, I usually get a complicated feeling that's some combination of:

Some amount of conflict aversion: Harsh language feels conflict-y, which is inherently unpleasant.
Empathy for, or identification with, the people or views Eliezer was criticizing. It feels bad to be criticized, and it feels doubly bad to be told 'you are making basic mistakes'.
Something status-regulation-y: My reader-model here finds the implied threat to the status hierarchy salient (whether or not Eliezer is just tryin

...

Kaj Sotala3y140

I had a pretty strong negative reaction to it. I got the feeling that the post derives much of its rhetorical force from setting up an intentionally stupid character who can be condescended to, and that this is used to sneak in a conclusion that would seem much weaker without that device.

Zvi3y300

Things I instinctively observed slash that my model believes that I got while reading that seem relevant, not attempting to justify them at this time:

There is a core thing that Eliezer is trying to communicate. It's not actually about timeline estimates, that's an output of the thing. Its core message length is short, but all attempts to find short ways of expressing it, so far, have failed.
Mostly so have very long attempts to communicate it and its prerequisites, which to some extent at least includes the Sequences. Partial success in some cases, full suc

...

Alex Turner3y270

I find it concerning that you felt the need to write "This is not at all a criticism of the way this post was written. I am simply curious about my own reaction to it" (and still got downvoted?).

For my part, I both believe that this post contains valuable content and good arguments, and that it was annoying / rude / bothersome in certain sections.

5Rafael Harth3y

1: To me, it made it more entertaining and thus easier to read. (No idea about non-anecdotal data, would also be interested.) 3: Also no data; I strongly suspect the metric is generally good because... actually I think it's just because the people I find worth listening to are overwhelmingly not condescending. This post seems highly unusual in several ways.

Rob Bensinger3y50

I've gotten one private message expressing more or less the same thing about this post, so I don't think this is a super unusual reaction.

Paths To High-Level Machine Intelligence

Aryeh Englander4y40

Thanks Daniel for that strong vote of confidence!

The full graph is in fact expandable / collapsible, and it does have the ability to display the relevant paragraphs when you hover over a node (although the descriptions are not all filled in yet). It also allows people to enter in their own numbers and spit out updated calculations, exactly as you described. We actually built a nice dashboard for that - we haven't shown it yet in this sequence because this sequence is mostly focused on phase 1 and that's for phase 2.

Analytica does have a web version, but it...

[AN #156]: The scaling hypothesis: a plan for building AGI

Aryeh Englander4y40

I'd like to hear more thoughts, from Rohin or anybody else, about how the scaling hypothesis might affect safety work.

3Rohin Shah4y

Wrote a separate comment here (in particular I think claims 1 and 4 are directly relevant to safety)

List of good AI safety project ideas?

Answer by Aryeh EnglanderJun 03, 202110

New post on the EA Forum: Some AI Governance Research Ideas

List of good AI safety project ideas?

Answer by Aryeh EnglanderMay 28, 202110

Just came across this: Research ideas to study humans with AI Safety in mind

[Event] Weekly Alignment Research Coffee Time (05/10)

Aryeh Englander4y20

Thanks Adam for setting this up! I have no idea if my experience is representative, but that was definitely one of the highest-quality discussion sessions I've had at events of this type.

1Adam Shimi4y

Glad the experience was good!

[Linkpost] Treacherous turns in the wild

Aryeh Englander4y20

I don't think this is quite an example of a treacherous turn, but this still looks relevant:

Lewis et al., Deal or No Deal? End-to-End Learning for Negotiation Dialogues (2017):

Analysing the performance of our agents, we find evidence of sophisticated negotiation strategies. For example, we find instances of the model feigning interest in a valueless issue, so that it can later ‘compromise’ by conceding it. Deceit is a complex skill that requires hypothesising the other agent’s beliefs, and is learnt relatively late in child development (Talwar and Lee, 200

...

0Mark Xu4y

This is a cool example, thanks!

Timeline of AI safety

Aryeh Englander4y20

That's later in the linked wiki page: https://timelines.issarice.com/wiki/Timeline_of_AI_safety#Full_timeline

Timeline of AI safety

Aryeh Englander4y20

Excellent, thanks! Now I just need a similar timeline for near-term safety engineering / assured autonomy as they relate to AI, and then a good part of a paper I'm working on will have just written itself.

The ethics of AI for the Routledge Encyclopedia of Philosophy

Aryeh Englander4y60

Also - particular papers that you think are important, especially if you think they might be harder to find in a quick literature search. I'm part of an AI Ethics team at work, and I would like to find out about these as well.

The ground of optimization

Aryeh Englander5y20

We could define the natural selection system as:

All configurations = all arrangements of matter on a planet (both arrangements that are living and those that are non-living)

Basin of attraction = all arrangements of matter on a planet that meet the definition of a living thing

Target configuration set = all arrangements of...

The ground of optimization

Aryeh Englander5y130

I shared this essay with a colleague where I work (Johns Hopkins University Applied Physics Lab). Here are her comments, which she asked me to share:

This essay proposes a very interesting definition of optimization as the manifestation of a particular behavior of a closed, physical system. I haven’t finished thinking this over, but I suspect it will be (as is suggested in the essay) a useful construct. The reasoning leading to the definition is clearly laid out (thank you!), with examples that are very useful in understanding the concept. The downsi...

2Aryeh Englander5y

This was actually part of a conversation I was having with this colleague regarding whether or not evolution can be viewed as an optimization process. Here are some follow-up comments to what she wrote above related to the evolution angle:

We could define the natural selection system as:

All configurations = all arrangements of matter on a planet (both arrangements that are living and those that are non-living)

Basin of attraction = all arrangements of matter on a planet that meet the definition of a living thing

Target configuration set = all arrangements of living things where the type and number of living things remains approximately stable.

I think that this system meets the definition of an optimizing system given in the Ground of Optimization essay. For example, predator and prey co-evolve to be about “equal” in survival ability. If a predator becomes so much better than its prey that it eats them all, the predator will die out along with its prey; the remaining animals will be in balance. I think this works for climate perturbations, etc. too.

HOWEVER, it should be clear that there are numerous ways in which this can happen – like the ball on a bumpy surface with a lot of convex “valleys” (local minima), there is not just one way that living things can be in balance. So, to say that “natural selection optimized for intelligence” is not quite right – it just fell into a “valley” where intelligence happened. FURTHER, it’s not clear that we have reached the local minimum! Humans may be that predator that is going to fall “prey” to its own success. If that happened (and any intelligent animals remain at all), I guess we could say that natural selection optimized for less-than-human intelligence!

Further, this definition of optimization has no connotation of “best” or even better – just equal to a defined set. The word “optimize” is loaded. And its use in connection with natural selection has led to a lot of trouble in terms of human races, and humans v. anim

AI ALIGNMENT FORUM

All of Aryeh Englander's Comments + Replies