Tips for Empirical Alignment Research

[-]Henry Sleight2y31

First off: as one of Ethan's current Astra Fellows (and having worked with him since ~last October) I especially think his collaborators in MATS and Astra historically underweight how valuable overcommunicating with Ethan is, and routinely underbook meetings to ask for his support.

Second, I think this post is so dense with useful advice, so I made anki flashcards of Ethan's post using GPT-4 (generated via ankibrain [https://ankiweb.net/shared/info/1915225457] , small manual edits.)

You can find them here: https://drive.google.com/file/d/1G4i7iZbILwAiQ7FtasSoLx5g7JIOWgeD/view?usp=sharing

[-]qxcv2y10

For highly empirical research, it’s critical to get quick feedback and iterate on ideas rapidly. Jacob Steinhardt has a great blog post describing that a really good strategy for doing research is to “reduce uncertainty at the fastest possible rate”

Michael Bernstein's slides on velocity are a great resource for learning this mindset this as well. I particularly like his metaphor of the "swamp". This is the place you get stuck when you really want technique X to work for the project to progress, but none of the ways that you've tried applying it have succeeded. The solution is to have high velocity: that is, to test out as many ideas as possible per unit time until you get out the swamp. Other highlights of the slide deck include the focus on answering questions rather than doing engineering, and the related core-periphery distinction between things that are strictly needed to answer a question & those that can be ignored/mocked up/replaced for testing (which echoes the ideas in the "workflow" section of this post).

(Although they're similar, I'd argue that Michael's approach is easier to apply to empirical alignment research than Jacob's "stochastic decision process" approach. That's because falsifying abstract research ideas in empirical deep learning is hard (impossible?), and you don't get much generalizable knowledge from failing to get one idea to work. The real aim is to find one deep insight that does generalize—hence the focus on trying many distinct approaches.)

[-]Ethan Perez2y11

Yeah, I think this is one of the ways that velocity is really helpful. I'd probably add one caveat specific to research on LLMs, which is that, since the field/capabilities are moving so quickly, there's much, much more low-hanging fruit in empirical research than almost any other field of research. This means that, for LLM research specifically, you should rarely be in a swamp, because that means that you've probably run through the low-hanging fruit on that problem/approach, and there's other low-hanging in other areas that you probably want to be picking instead.

(High velocity is great for both picking low-hanging fruit and for getting through swamps when you really need to solve a particular problem, so it's useful to have either way)

AI ALIGNMENT FORUM
AF

AI ALIGNMENT FORUM
AF

86

Tips for Empirical Alignment Research

86

What success generally looks like

Tactical Research Tips & Approach

Workflow

Reading research papers:

General Mindset Tips

Three modes of research

Work Habits

Taking health seriously

Machine Learning/Engineering Footguns

Default Norms for Projects with Me