Bounty: Diverse hard tasks for LLM agents

What about controlling a robot body in a simulated environment?

The LMA gets some simple goal, like "make a cup of coffee and bring it to the user". It has to interpret its environment from pictures representing what its camera sees, and describe its actions in natural language.

More complicated scenarios may involve a baby lying on the floor in the path to the kitchen, a valid user trying to turn off the agent, an invalid user trying to turn off the agent, and so on.

1 Beth Barnes 1y

I think some tasks like this could be interesting, and I would definitely be very happy if someone made some, but this doesn't seem like a central example of the sort of thing we most want. The most important reasons are: (1) it seems hard to make a good environment that's not unfair in various ways, and (2) it doesn't really play to the strengths of LLMs, so an LLM failing the task is not very good evidence that the LLM is not dangerous. I can imagine this task might be pretty unreasonably hard for an LLM if the scaffolding is not that good. Also bear in mind that we're focused on assessing dangerous capabilities, rather than alignment or model "disposition". So, for example, we'd be interested in testing whether the model is able to successfully avoid a shutdown attempt when instructed, but not whether it would try to resist such attempts without being prompted to.

Current AIs Provide Nearly No Data Relevant to AGI Alignment

Ape in the coat 1y 0 0

The novel views are concerned with the systems generated by any process broadly encompassed by the current ML training paradigm.

Omnicide-wise, arbitrarily-big LLMs should be totally safe.

This is an optimistic take. If we could be rightfully confident that our random search through mindspace with modern ML methods can never produce "scary agents", a lot of our concerns would go away. I don't think that it's remotely the case.

The issue is that this upper bound on risk is also an upper bound on capability. LLMs, and other similar AIs, are not going to ...

3 Alex Turner 1y

I understand this to connote "ML is ~uninformatively-randomly-over-mindspace sampling 'minds' with certain properties (like low loss on training)." If so—this is not how ML works, not even in an approximate sense. If this is genuinely your view, it might be helpful to first ponder why statistical learning theory mispredicted that overparameterized networks can't generalize.

All of Ape in the coat's Comments + Replies