(Cross-posted to EA Forum.)
I’m a Senior Program Officer at Open Phil, focused on technical AI safety funding. I’m hearing a lot of discussion suggesting funding is very tight right now for AI safety, so I wanted to give my take on the situation.
At a high level: AI safety is a top priority for Open Phil, and we are aiming to grow how much we spend in that area. There are many potential projects we'd be excited to fund, including some potential new AI safety orgs as well as renewals to existing grantees, academic research projects, upskilling grants, and more.
At the same time, it is also not the case that someone who reads this post and tries to start an AI safety org would necessarily have an easy time raising funding from us. This is because:
Thanks for writing this! I agree.
I used to think that starting new AI safety orgs is not useful because scaling up existing orgs is better:
And yet, existing orgs do not just hire more people. After talking to a few people from AIS orgs, I think the main reason is that scaling is a lot harder than I would intuitively think.
We also see the effects of coordination costs/"scaling being hard" in industry, where there is a pressure towards people working longer hours. (It's not common that companies encourage employees to work part-time and just hire more people.)
Thanks for the post! I think it does a good job of describing key challenges in AI field-building and funding.
The talent gap section describes a lack of positions in industry organizations and independent research groups such as SERI MATS. However, there doesn't seem to be much content on the state of academic AI safety research groups. So I'd like to emphasize the current and potential importance of academia for doing AI safety research and absorbing talent. The 80,000 Hours AI risk page says that there are several academic groups working on AI safety including the Algorithmic Alignment Group at MIT, CHAI in Berkeley, the NYU Alignment Research Group, and David Krueger's group in Cambridge.
The AI field as a whole is already much larger than the AI safety field so I think analyzing the AI field is useful from a field-building perspective. For example, about 60,000 researchers attended AI conferences worldwide in 2022. There's an excellent report on the state of AI research called Measuring Trends in Artificial Intelligence. The report says that most AI publications come from the 'education' sector, which is probably mostly universities: 75% of AI publications come from the education sector and the rest are published by non-profits, industry, and governments. Surprisingly, the top 9 institutions by annual AI publication count are all Chinese universities and MIT is in 10th place. However, the US and industry are still far ahead in 'significant' or state-of-the-art ML systems such as PaLM and GPT-4.
What about the demographics of AI conference attendees? At NeurIPS 2021, the top institutions by publication count were Google, Stanford, MIT, CMU, UC Berkeley, and Microsoft which shows that both industry and academia play a large role in publishing papers at AI conferences.
Another way to get an idea of where people work in the AI field is to find out where AI PhD students go after graduating in the US. The number of AI PhD students going to industry jobs has increased over the past several years and 65% of PhD students now go into industry but 28% still go into academic jobs.
Only a few academic groups seem to be working on AI safety and many of the groups working on it are at highly selective universities but AI safety could become more popular in academia in the near future. And if the breakdown of contributions and demographics of AI safety will be like AI in general, then we should expect academia to play a major role in AI safety in the future. Long-term AI safety may actually be more academic than AI since universities are the largest contributor to basic research whereas industry is the largest contributor to applied research.
So in addition to founding an industry org or facilitating independent research, another path to field-building is to increase the representation of AI safety in academia by founding a new research group though this path may only be tractable for professors.
I’m writing this in my own capacity. The views expressed are my own, and should not be taken to represent the views of Apollo Research or any other program I’m involved with.
TL;DR: I argue why I think there should be more AI safety orgs. I’ll also provide some suggestions on how that could be achieved. The core argument is that there is a lot of unused talent and I don’t think existing orgs scale fast enough to absorb it. Thus, more orgs are needed. This post can also serve as a call to action for funders, founders, and researchers to coordinate to start new orgs.
This piece is certainly biased! I recently started an AI safety org and therefore obviously believe that there is/was a gap to be filled.
If you think I’m missing relevant information about the ecosystem or disagree with my reasoning, please let me know. I genuinely want to understand why the ecosystem acts as it does right now and whether there are good reasons for it that I have missed so far.
Why?
Before making the case, let me point out that under most normal circumstances, it is probably not reasonable to start a new organization. It’s much smarter to join an existing organization, get mentorship, and grow the organization from within. Furthermore, building organizations is hard and comes with a lot of risks, e.g. due to a lack of funding or because there isn’t enough talent to join early on.
My core argument is that we’re very much NOT under normal circumstances and that, conditional on the current landscape and the problem we’re facing, we need more AI safety orgs. By that, I primarily mean orgs that can provide full-time employment to contribute to AI safety but I’d also be happy if there were more upskilling programs like SERI MATS, ARENA, MLAB & co.
Talent vs. capacity
Frankly, the level of talent applying to AI safety organizations and getting rejected is too high. We have recently started a hiring round and we estimate that a lot more candidates meet a reasonable bar than we could hire. I don’t want to go into the exact details since the round isn’t closed but from the current applications alone, you could probably start a handful of new orgs.
Many of these people could join top software companies like Google, Meta, etc. or even already are at these companies and looking to transition into AI safety. Apollo is a new organization without a long track record, so I expect the applications for other alignment organizations to be even stronger.
The talent supply is so high that a lot of great people even have to be rejected from SERI MATS, ARENA, MLAB, and other skill-building programs that are supposed to get more people into the field in the first place. Also, if I look at the people who get rejected from existing orgs like Anthropic, OpenAI, DM, Redwood, etc. it really pains me to think that they can’t contribute in a sustainable full-time capacity. This seems like a huge waste of talent and I think it is really unhealthy for the ecosystem, especially given the magnitude and urgency of AI safety.
Some people point to independent research as an alternative. I think independent research is a temporary solution for a small subset of people. It’s not very sustainable and has a huge selection bias. Almost anyone with a family or with existing work experience is not willing to take the risk. In my experience, women also have a disproportionate preference against independent research compared to men, so the gender balance gets even worse than it already is (this is only anecdotal evidence, I have not looked at this in detail).
Furthermore, many people just strongly prefer working with others in a less uncertain, more regular environment of an organization, even if that organization is fairly new. It’s just more productive and more fun to work with a team than as an independent.
Additionally, getting funding for independent research right now is also quite hard, e.g. the LTFF is currently quite resource-constrained and has a high bar (not their fault!; update: may be less funding constrained now). All in all, independent research just really seems like a band-aid to a much bigger problem.
Lastly, we’re strongly undercounting talent that is not already in our bubble. There are a lot of people concerned about existential risks from AI that are already working at existing tech companies. They would be willing to “take the jump” and join AI safety orgs but they wouldn’t take the risk to do independent research. These people typically have a solid ML background and could easily contribute if they skilled up in alignment a bit.
In almost all normal industries, organizations are aware that their hires don’t really contribute for the first couple of months and see this education as an upfront investment. In AI safety, we currently have the luxury that we can hire people who can contribute from day one. While this is very comfortable for the organizations, it really is a bad sign for the ecosystem as a whole.
The opportunity costs are minimal
Even if more than half of new AI safety orgs failed, it seems plausible that the opportunity costs of funding them are minimal. A lot of the people who would be enabled by more organizations would just be unable to contribute in the counterfactual world. In fact, more people might join capability teams at existing tech companies for lack of alternatives.
In the cases where a new org would fail, their employees could likely try to join another AI safety org that is still going strong. That seems totally fine; they probably learned valuable skills in the meantime. This just seems like how normal start-ups and companies operate and should not discourage us from trying.
I have heard an argument along the lines of “new orgs might lock up great talent that then can’t contribute at the frontier safety labs” which would imply some opportunity costs. This doesn’t seem like a strong argument to me. The people in the new orgs are free to move to big orgs and Anthropic/DM/OpenAI have a lot more money, status, compute, and mentorship available. If they want to snatch someone, they probably could. This feels like something that the people themselves should decide--right now they just don’t have that option in the first place.
We are wasting valuable time
Warning: This section is fairly speculative.
I know that some people have longer timelines than I do but even under more conservative timelines, it seems reasonable that there should be more people working on AI safety right now.
If we think that “solving” alignment is about as hard as the Manhattan project (130k people) or the Apollo project (400k people) then we need to scale the size of the community by about 3 orders of magnitude, assuming there are currently 100-500 full-time employees working on alignment. Let’s assume, for simplicity, Ajeya Cotra’s latest public median estimate for TAI of ~2040. This would imply about 3 OOMs in ~15 years or 1 OOM every 5 years. If you think it’s harder than the two named projects, 4 OOMs may be more accurate.
There are a couple of additional changes I’d personally like to make to this trajectory. First, I’d much rather eat up the first 2 OOMs early (maybe in the first 3-5 years) so we can build relevant expertise and engage in research agendas with longer payoff times. Second, I just don’t think the timelines are accurate and would rather calculate with 2033 or earlier. Under these assumptions, 10x every 2 years seems more appropriate.
A 10x growth every ~4 years could be done with a handful of existing orgs but a 10x growth every ~2 years probably requires more orgs. Since my personal timelines are much closer to the latter, I’m advocating for more organizations. Even on the slower end of the spectrum, we should probably not bank on existing orgs being able to scale at the required pace, and should instead diversify our bets.
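As a rough sketch of the arithmetic above, the snippet below recomputes the growth gap and the implied time per 10x step. The headcounts (100-500 today, 130k and 400k for the two projects) and the 15-year horizon come from the text. The function names, the 10-year horizon (standing in for "2033 or earlier") and the 4-year window for the first 2 OOMs are illustrative assumptions, and growth is assumed to be spread evenly on a log scale.

```python
import math


def ooms_needed(current_staff: float, target_staff: float) -> float:
    """Orders of magnitude of growth needed to go from current to target headcount."""
    return math.log10(target_staff / current_staff)


def years_per_10x(ooms: float, years_available: float) -> float:
    """Years per 10x step if growth is spread evenly (on a log scale) over the period."""
    return years_available / ooms


# Headcounts from the text: 100-500 people today, Manhattan ~130k, Apollo ~400k.
low = ooms_needed(500, 130_000)   # smallest gap: large field today, smaller target
high = ooms_needed(100, 400_000)  # largest gap: small field today, larger target
print(f"OOMs needed: {low:.1f} to {high:.1f}")

# The text rounds the gap to 3 OOMs.
print(f"3 OOMs over 15 years: 10x every {years_per_10x(3, 15):.1f} years")
# Assumed 10-year horizon, standing in for "2033 or earlier".
print(f"3 OOMs over 10 years: 10x every {years_per_10x(3, 10):.1f} years")
# Assumed front-loading: the first 2 OOMs within 4 years (the text says 3-5).
print(f"2 OOMs over 4 years: 10x every {years_per_10x(2, 4):.1f} years")
```

Under these assumptions the gap comes out at roughly 2.4 to 3.6 OOMs, which is consistent with "about 3". The implied pace is 10x every 5 years on the 2040 estimate, about every 3.3 years on a 10-year horizon, and every 2 years if the first 2 OOMs are front-loaded into 4 years.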
I sometimes hear a view that doing research now is almost irrelevant and we should keep a big war chest for “the end game”. I understand the intuition but it just feels wrong to me. Lots of research is relevant today and we do already get relevant feedback from empirical work on current AI systems. Evals and interpretability are obvious examples but research into scalable oversight or adversarial training can be done today as well and seems relevant for the future. Furthermore, if we wanted to spend >50% of the budget in the last year (assuming we knew when that was), we still need people to spend that money on. Building up the research and engineering capacity today seems already justified from a skill- and tool-building perspective alone.
The funding exists (maybe?)
I find it hard to get a good sense of the funding landscape right now (see e.g. this post). For example, I currently don’t have a good estimate of how much money OpenPhil has available and how much of that is earmarked for AI safety. Thus, I won’t speculate too much on the multi-year funding plans of existing AI safety funders.
However, historically funding for AI safety has looked ~like this (copied from this post):
This indicates that OpenPhil, SFF, and others allocate high double-digit millions into AI safety every year and the trend is probably rising. My best guess is, therefore, that funders would be willing to support new orgs with reasonably-sized seed grants if they met their bar. I’m not very sure where that bar currently is, but I personally think it totally makes sense to fully fund a fairly junior group of researchers for a year who want to give it a go as long as they have a somewhat reasonable plan (as stated above, the opportunity costs just aren’t that high). Funders like OpenPhil might be hesitant to fund a new org but they may be more willing to fund a preliminary working group or collective that could then transition into an org if needed.
If funders agree with me here, I think it would be great if they signaled this willingness, e.g. by having a “starter package” for an org where a group of up to 5 people gets $100-300k (includes salary, compute, food, office, equipment, etc.) per person to do research for ~a year (where the exact amount depends on experience and promisingness). To give concrete examples, Jesse’s work on SLT and Kaarel’s work on DLK (or to be more precise, the agendas that followed from those works; the actual agendas are probably many months ahead of the public writing by now) are promising enough that I would totally give them $500k+ for a year to develop it further if I were in a position to move these amounts of money. If the project doesn’t work out, they could still close the org/working group and move on.
Another option is to look for grants from other sources, e.g. non-EA funders or VCs.
AI safety is becoming more and more mainstream, and philanthropic and private high-net-worth individuals are becoming interested in allocating money to the space. Furthermore, there are likely going to be government grants available for AI safety in the near future. These funding schemes are often opaque and the probability of success is lower than with traditional EA funders but it is now at least possible at all.
Another source of funding that I was not really considering until recently is VC funding. There are a couple of VCs who are interested in AI safety for the right reasons and it seems to me that there will be a large market around AI safety in some form. People just want to understand what’s going on in their models and what their limitations are, so there surely is a way to create products and services to satisfy these needs. It’s very important though to check if your strategy is compatible with the VCs’ vision and to be honest about your goals. Otherwise, you’ll surely end up in conflict and won’t be able to achieve the goal of reducing catastrophic risks from AI.
VC backing obviously reduces your option space because you need to eventually make a product. On the other hand, there are many more VCs than there are donors, so it may be worth the trade-off (also getting VC backing doesn’t exclude getting donations, it just makes them less likely).
How big is the talent gap?
I don’t know exactly how big the gap between available spots and available talent is but my best guess is ~2-20x depending on where you set the bar.
I don’t have any solid methodology here and I’d love for someone to do this estimate properly but my personal intuitions come from the following:
I can’t put any direct numbers on that because it would reveal information I’m not comfortable sharing in public but I can say that my intuitive aggregate conclusion from this results in a 2-20x talent-to-capacity gap. The 2x roughly corresponds to “could meaningfully contribute within a month of employment” and the 20x roughly corresponds to “could meaningfully contribute within half a year or less if provided with decent mentorship”.
How did we end up here?
My current feeling is that AI safety is underfunded and desperately needs more capacity to allow more talent to contribute. I’m not sure how we ended up here but I could imagine the following reasons playing a role.
Potentially this has led to a “stand-off” situation where funders are waiting for good opportunities and people who could start something see the change of the funding situation and are more hesitant to start something and as a result nothing happens.
I think a couple of new organizations and programs from the last 2 years, e.g. Redwood, FAR, CAIS, Epoch, SERI MATS, ARENA, etc., look like promising bets so far but I’d like to see many more orgs of the same caliber in the coming years.
Redwood has recently been criticized but I think their story to date is a mostly successful experiment for the AI safety community. In the worst-case interpretation, Redwood produced a lot of really talented AI safety researchers; in the best case, their scientific outputs were also quite valuable. I personally think, for example, that causal scrubbing is an important step for interpretability. Furthermore, I think trying something like MLAB and REMIX was very valuable both for the actual content value and for the information value. So Redwood is mostly a win for AI safety funding in my book and should not be a reason for more conservative funding strategies just because not everything worked out as intended.
Some common counterarguments
“We don’t need more orgs, we need more great agendas”
There is a common criticism that the lack of orgs is due to the lack of agendas and if there were more great agendas, there would be more orgs. This criticism is often coupled with pointing out the lack of research leads who could execute such an agenda. While I think this is technically true, it doesn’t seem quite right.
The obvious question is how good agendas are developed in the first place. It may be through mentorship at another org or years of experience in academia/independent research. Typically, the people who have this experience are hired by the big labs and therefore rarely start a new org. So if you think that only people who already have a great agenda should start an org, not starting new orgs is probably reasonable under current conditions. However, the assumption that you can only get good at research leadership through this path just seems wrong to me.
First, research leadership is obviously hard but you can learn it. I know lots of people who don’t have research leadership experience yet but who I would judge to be competent at running an agenda that could serve 3-10 people. Surely they would make many mistakes early on but they would grow into the position. Not trying seems like a strictly worse proposal than trying and failing (see section on opportunity costs).
Second, a great agenda just doesn't seem like a necessary requirement. It seems totally fine to me to replicate other people’s work, extend existing agendas, or ask other orgs if they have projects to outsource (usually they do) for a year or so and build skills during that time. After a while, people naturally develop their own new ideas and then start developing their own agendas.
“The bottleneck is mentorship”
It’s clearly true that mentorship is a huge bottleneck but great mentors don’t fall from the sky. All senior people started junior and, in my personal opinion, lots of fairly junior people in the AI safety space would already make pretty good mentors early on, e.g. because they have prior management experience outside of AI safety or because they just have good research intuitions and ideas from the start.
Furthermore, a lot of the people who are in the position to provide mentorship are in full-time positions at big labs where they get access to compute, great team members, a big salary, etc. From that position, it just doesn’t seem that exciting to mentor lots of new people even if it had more impact in some cases. Therefore, a lot of very capable mentors are locked in positions where they don’t provide a lot of mentorship to newcomers because existing AI safety labs aren’t scaling fast enough to soak up the incoming talent streams. Thus, I personally think creating more orgs and thus places where mentorship can be learned and provided would be a good response.
Similar to the point about great agendas, it seems fine to me to just let people try, especially if the alternative is independent research or not contributing at all.
“The bottleneck is operations”
Before starting Apollo Research, I was really worried that operations would be a bottleneck. Now I’m not worried anymore. There are a lot of good operations people looking to get into AI safety and there is a lot of external help available.
Historically, EA orgs have sometimes struggled with finding operations talent but I think that was largely due to these orgs not providing an interesting value proposition to the talent they were looking for. For example, ops was sometimes framed as “everything that nobody else wanted to do” (see this post for details). So if you make a reasonable proposition, operations talent will come.
Furthermore, there is a lot of external help for operations in the EA sphere. Some organizations provide fiscal sponsorship and some operations help, e.g. Rethink Priorities or Effective Ventures. Impact ops (new org) may also be able to provide help early on and significantly help with operations.
“The downside risks are too high”
Another counterargument has been that there are large downside risks in funding AI safety orgs, e.g. because they may pull a switcheroo and cause further acceleration. This seems true to me and warrants increased caution but there seem to be lots of organizations that don’t have a high risk of falling into that category. Right now, training models and making progress on them is so expensive that any org with less than ~$50M of funding probably can’t train models close to the frontier anyway.
I personally think that many agendas have some capabilities externalities, e.g. building LM agents for evals or finding new insights for interpretability, but there are ways to address this. Orgs should carefully estimate the safety-capabilities balance of their outputs, including consulting with trusted external parties and then use responsible disclosure and differential publishing to circulate their outputs.
It feels like this is solvable by funding organizations where the leadership has a track record of caring about AI safety for the right reasons and has an agenda that isn’t too entangled with capabilities. It’s a hard problem but we should be able to make reasonable positive EV bets to address it.
“The best talent get jobs, the rest doesn’t matter”
I’ve heard the notion that AI safety talent is power law distributed, and therefore most impact will come from a small number of people anyways. This argument implies that it is fine to keep the number of AI safety researchers as low as it currently is (or scale very slowly) as long as the few relevant ones can contribute. I think there are a couple of problems with this argument.
How?
I can’t provide a perfect recipe that will work for everyone but here are some scattered thoughts and suggestions about starting an org.
What now?
I don’t think everyone should try to start an org but I think there are some ways in which we, as a community, could make it easier for those who want to.
Starting an org isn’t easy and lots of efforts will fail. However, given the lack of existing full-time AI safety capacity, it seems like we should try creating more orgs nonetheless. In the best case, a bunch of them will succeed; in the worst case, a “failed org” provides a lot of upskilling opportunities and leadership experience for the people involved.
I think the high quality of rejected candidates in AI safety is a very bad sign for the health of the community at the moment. The fact that lots of people with years of ML research and engineering experience with a solid understanding of alignment aren’t picked up is just a huge waste of talent. As an intuitive benchmark, I would like to get to a world where at least half of all SERI MATS scholars are immediately hired after the program and we aren’t even close to that yet.