All of lsusr's Comments + Replies

Lsusr10

Am I still eligible for the prize if I publish a public blog post at the same time I submit the Google Doc, or would you prefer I not publish a blog post until February 15th? Publishing the blog post immediately advances science better (because it can enable discussion), but waiting until after February 15th might be preferable to you for contest-related reasons.

2Mark Xu
We would prefer submissions be private until February 15th.
Lsusr50

I've been gingerly building my way up toward similar ideas but I haven't yet posted my thoughts on the subject. I appreciate you ripping the band-aid off.

There are two obvious ways an intelligence can be non-consequentialist.

  • It can be local. A local system (in the physics sense) is defined within a spacetime of itself. An example of a local system is special relativity.
  • It can be stateless. Stateless software is written in a functional programming paradigm.

If you define intelligence to be consequentialist then corrigibility becomes extremely d...

7Steve Byrnes
Thanks for the comment! I feel like I'm stuck in the middle…

  1. On one side of me sits Eliezer, suggesting that future powerful AGIs will make decisions exclusively to advance their explicit preferences over future states.
  2. On the other side of me sits, umm, you, and maybe Richard Ngo, and some of the "tool AI" and GPT-3-enthusiast people, declaring that future powerful AGIs will make decisions based on no explicit preference whatsoever over future states.
  3. Here I am in the middle, advocating that we make AGIs that do have preferences over future states, but also have other preferences.

I disagree with the 2nd camp for the same reason Eliezer does: I don't think those AIs are powerful enough. More specifically: We already have neat AIs like GPT-3 that can do lots of neat things. But we have a big problem: sooner or later, somebody is going to come along and build a dangerous accident-prone consequentialist AGI. We need an AI that's both safe, and powerful enough to solve that big problem. I usually operationalize that as "able to come up with good original creative ideas in alignment research, and/or able to invent powerful new technologies". I think that, for an AI to do those things, it needs to do explicit means-end reasoning, autonomously come up with new instrumental goals and pursue them, etc. etc. For example, see discussion of "RL-on-thoughts" here.

"Humans will eventually wind up in control" is purely about future states. "Humans will remain in control" is not. For example, consider a plan that involves disempowering humans and then later re-empowering them. That plan would pattern-match well to "humans will eventually wind up in control", but it would pattern-match poorly to "humans will remain in control".

Yes, this is a very important potential problem, see my discussion under "Objection 1".
Lsusr*00

Much of the dialogue about AI Safety I encounter in off-the-record conversations seems to me like it's not grounded in reality. I repeatedly hear (what I feel to be) a set of shaky arguments that both shut down conversation and are difficult to validate empirically.

The shaky argument is as follows:

  1. Machine learning is rapidly growing more powerful. If trends continue it will soon eclipse human performance.
  2. Machine learning equals artificial intelligence equals world optimizer.
  3. World optimizers can easily turn the universe into paperclips by accident.
  4. Ther...
2gjm
Thanks! (I would not have guessed correctly.)