All of jungofthewon's Comments + Replies

This was really interesting, thanks for running and sharing! Overall this was a positive update for me. 

Results are here

I think this just links to PhilPapers, not your survey results?

Sam Bowman
Thanks! Fixed link.

I also appreciated reading this.

This was really helpful and fun to read. I'm sure it was nontrivial to get to this level of articulation and clarity. Thanks for taking the time to package it for everyone else to benefit from. 

If anyone has questions for Ought specifically, we're happy to answer them as part of our AMA on Tuesday.

I think we could play an endless and uninteresting game of "find a real-world example for / against factorization."

To me, the more interesting discussion is around building better systems for updating on alignment research progress:

  1. What would it look like for this research community to effectively update on results and progress? 
  2. What can we borrow from other academic disciplines? E.g. what would "preregistration" look like? 
  3. What are the ways more structure and standardization would be limiting / taking us further from truth? 
  4. Wh
... (read more)
johnswentworth
The problem with not using existing real-world examples as a primary evidence source is that we have far more bits-of-evidence from the existing real world, at far lower cost, than from any other source. Any method which doesn't heavily leverage those bits necessarily makes progress at a pace orders of magnitude slower. Also, in order for factorization to be viable for aligning AI, we need the large majority of real-world cognitive problems to be factorizable. So if we can find an endless stream of real-world examples of cognitive problems which humans are bad at factoring, then this class of approaches is already dead in the water.

Access

Alignment-focused policymakers / policy researchers should also be in positions of influence. 

Knowledge

I'd add a bunch of human / social topics to your list, e.g.:

  • Policy 
  • Every relevant historical precedent
  • Crisis management / global logistical coordination / negotiation
  • Psychology / media / marketing
  • Forecasting 

Research methodology / Scientific “rationality,” Productivity, Tools

I'd be really excited to have people use Elicit with this motivation. (More context here and here.)

Re: competitive games of introducing new tools, we di... (read more)

This is exactly what Ought is doing as we build Elicit into a research assistant using language models / GPT-3. We're studying researchers' workflows and identifying ways to productize or automate parts of them. In that process, we have to figure out how to turn GPT-3, a generalist by default, into a specialist that is a useful thought partner for domains like AI policy. We have to learn how to take feedback from the researcher and convert it into better results within session, per person, per research task, across the entire product. Another spin on it: w... (read more)

I generally agree with this, but I think the alternative goal of "make forecasting easier" is just as good, might actually make aggregate forecasts more accurate in the long run, and may require things that seemingly undermine the virtue of precision.

More concretely, if an underdefined question makes it easier for people to share whatever beliefs they already have, and then facilitates rich conversation among those people, that's better than if a highly specific question prevents people from making a prediction at all. At least as much, if not more, of the value ... (read more)