This post is part of the work done at Conjecture.
Tl;dr: We need far more conceptual AI alignment research approaches than we have now if we want to increase our chances of solving the alignment problem. However, the conceptual alignment field remains hard to access, and what feedback and mentorship exist focus on a few established research directions rather than stimulating new ideas. This model led to the creation of Refine, a research incubator for potential conceptual alignment researchers, funded by the LTFF and hosted by Conjecture. Its goal is to help conceptual alignment research grow in both number and variety, through some minimal teaching and a lot of iteration and feedback on incubatees’ ideas. The first cohort has been selected and will run from August to October 2022. In the bigger picture, Refine is an experiment within Conjecture to find ways of increasing the number of conceptual researchers and improving the rate at which the field makes productive mistakes.
The Problem: Not Enough Varied Conceptual Research
I believe that in order to solve the alignment problem, we need significantly more people attacking it from many different angles.
Why? First, because none of the current approaches appears to yield a full solution. I expect many of them to be productive mistakes we can and should build on, but they don't appear sufficient, especially with shorter timelines.
In addition, the history of science teaches us that for many important discoveries, especially in difficult epistemic situations, the answers don't come from one lone genius seeing through the irrelevant details, but instead from bits of evidence revealed by many different takes and operationalizations[1] (possibly unified and compressed together at the end). And we should expect alignment to be hard based on epistemological vigilance.
So if we accept that we need more people tackling alignment in more varied ways, why are we falling short of that ideal? Note that I will focus here on conceptual researchers, as they are the source of most variations on the problem, and because they are so hard to come by.
I see three broad issues with getting more conceptual alignment researchers working on wildly different approaches:
Refine, the incubator for conceptual researchers and research bets that I'm running at Conjecture, aims to address these issues.
Description of Refine
Research Incubator
Refine is a research incubator: that is, a program for helping potential conceptual researchers develop and improve relevant ideas and research. It's inspired by startup incubators like Y Combinator, but with a focus on research. As such, the point is not to make participants work on already-trusted research directions, but to give them all the help they need to create exciting new research questions and ideas that are highly relevant to alignment.
In broad strokes, Refine starts with two weeks focused on studying and discussing core ideas in the History and Philosophy of Science and the Epistemology of Alignment, followed by 10 weeks of intense idea-generation, feedback, and writing loops (for a total of 3 months).
At the end, the research produced will be evaluated by established conceptual researchers, and we'll help the incubatees get funding or get hired (at Conjecture or other places).
In more detail, the first cohort of Refine will follow this process:
Generalist Mentors
Rather than having current researchers act as PhD advisors on their own topics, Refine aims to leverage more generalist mentors (currently me) who can see value and issues in almost all approaches, while understanding the problem deeply enough to give relevant feedback. The hope is that this kind of support will minimize ontological commitments while still biasing the work towards the hard problem.
In addition, generalist mentors avoid the overuse of the scarce resource of conceptual researchers, and might be a great fit for thinkers focused on the sort of epistemological work I'm doing at Conjecture.
Selection and Respect
(The Black Swan, Nassim Nicholas Taleb, 2007)
Building and running a program like Refine leads to a conundrum. On the one hand, tests and evaluations are obviously involved: at the beginning to select people, during the program, and at the end to decide if the program was successful. On the other hand, the anxiety of being constantly judged and evaluated is corrosive, as Taleb expresses so clearly.
I don't have a perfect solution. The dark world is that both need to be taken into account for the program to succeed.
My current choice is to use these two different frames in distinct contexts. During the selection process, and when making the post-mortem, I should take an evaluative frame, while remembering that historical progress is incredibly more subtle than the parody we often make of it. And during the actual running of the program, I shouldn't be in an evaluative mindset, but only focus on how to help the participants do the best they can.
Difference with Other Programs
With more and more programs around alignment appearing in the last few years, it makes sense to ask whether the problem we're tackling with Refine hasn't already been addressed. I'm definitely excited about all of these programs; yet they each target different enough problems that I don't think they fully address the lack of varied conceptual research.
Some Concrete Details
The first cohort of Refine, funded by the Long-Term Future Fund, will happen from August to October 2022. The ops are managed by Conjecture, and it will happen in France initially (for administrative reasons), then in London at Conjecture's offices. We pay incubatees a stipend, and also cover all their travel and housing.
The first cohort is composed of Alexander Gietelink Oldenziel, Chin Ze Shen, Tamsin Leake, Linda Linsefors, and Paul Bricman. In terms of statistics, it's interesting to notice that none of the participants are British or American: 4 out of 5 are from continental Europe, and one is from Southeast Asia. In terms of knowledge of alignment, 2 have a deep interaction with the field, 2 have thought independently about it a lot, and one is relatively new to it.
For the final evaluation, Steve Byrnes, Vanessa Kosoy, Evan Hubinger, Ramana Kumar, and John Wentworth have all committed to reviewing and evaluating the output of at least a few participants, and to giving a judgment on whether they are excited by the research produced.
The Long View: Refine and Conjecture
The idea for Refine came mostly from my own frustrations with the slow growth of conceptual alignment research, and from a project with Jessica Cooper for an independent lab.
Yet Conjecture management has been excited about it since even before I joined officially, and Refine fits well within the core mission of Conjecture: to improve and scale alignment research by finding many angles of attack on the problem and then supporting researchers to do the best possible work.
From this perspective, Refine is an experiment to find ways of diversifying alignment research and making more productive mistakes. It's a tentative way of converting resources into more varied and unexplored alignment research directions, and more generally of helping create more and better conceptual alignment researchers.
If Refine is successful at producing exciting new research and researchers, then finding ways to replicate it, improve it, and scale it (maybe in a decentralized way) will become one of Conjecture's priorities. If it isn't successful, then we will learn as much as we can from the failure and iterate on other options to create great and varied conceptual alignment research.
I also see a strong synergy between the needs of Refine-like programs and the epistemology team I'm leading at Conjecture. More specifically, researchers focused on the History and Philosophy of Science and the Epistemology of Alignment seem like great fits for generalist mentors, because they are steeped enough in the details of scientific progress and alignment to provide useful and subtle feedback while minimizing ontological commitments.
I will dig into this in future posts, but if you want pointers now, you can see my post on productive mistakes, Chapter 2 (on electrolysis) and Chapter 3 (on chemical atomism) of Is Water H2O? by Hasok Chang, and Rock, Bone, and Ruin by Adrian Currie.