The main value to me is being updated on all the research that is going on in this field. If the newsletter went away and nothing else changes, I don't know how I would find all the new relevant papers and posts that come out.
I think I've commented on your newsletters a few times, but haven't commented more because it seems like the number of people who would read and be interested in such a comment would be relatively small, compared to a comment on a more typical post. A lot of people who read your newsletters are doing so by email and won't even see my comment, and someone who does read them through LW/AF might not be interested in the particular paper (or your opinion of it) that I want to discuss. Plus, the fact that you avoid giving strong negative opinions (which BTW seems sensible to me for a newsletter format) makes it less likely that I feel an urgent need to correct something.
One idea you can consider is to create individual link posts on AF for the most important papers/posts that you include in the newsletter (with your summaries and opinions) that haven't already been posted to AF, which would create focal points for discussing them. I think if I had a thought on some paper that is mentioned in your newsletter, I'd be more inclined to write a comment for it under its own link post as opposed to under your newsletter post. I would also be more inclined to comment on your summaries and opinions if there was a chance to correct something before it went out to your email subscribers. This could also be a way for you to solicit summaries from random readers.
Thanks! Link posts on AF are an interesting idea; my current expectation is that very few people apart from you would comment on them, but it seems worth trying.
I would also be more inclined to comment on your summaries and opinions if there was a chance to correct something before it went out to your email subscribers.
This makes sense, will think about how to make it happen.
One option that's smaller than link posts might be to mention in the AF/LW version of the newsletter which entries are new to AIAF/LW as far as you know; or make comment threads in the newsletter for those entries. I don't know how useful these would be either, but it'd be one way to create common knowledge 'this is currently the one and only place to discuss these things on LW/AIAF'.
Copied from my answer in the feedback form:
I'm a layman, attempting to help with infrastructure for technical people, who reads the newsletter sporadically to keep up with the overall trends in AI and AI Safety.
Right now I read the newsletter fairly sporadically. I think it might benefit me to read, once a year or maybe once a quarter, a higher-level summary that goes over which papers seemed most important that year, and which overall research trends seemed most significant. I'm not sure if this is worth the opportunity cost for you, but it'd be helpful to me and probably others.
(I'd be interested in that both from the standpoint of my own personal knowledge, as well as tracking how stable your opinions are over time – when you list something as particularly interesting or important do you tend to still think so a year later?)
I also think it'd make more sense for LessWrong to curate a "highlights of the highlights" post once every 3-12 months, than what we currently do, which is every so often randomly decide that a recent Newsletter was particularly good and curate that.
I think it might benefit me to read, once a year or maybe once a quarter, a higher-level summary that goes over which papers seemed most important that year, and which overall research trends seemed most significant. I'm not sure if this is worth the opportunity cost for you, but it'd be helpful to me and probably others.
A slightly different option would be to read the yearly AI alignment literature review, use that to find the top N most interesting papers, and read their summaries in the spreadsheet. This also has the benefit of showing you a perspective other than mine on what's important -- there could be an Agent Foundations paper in the list that I haven't summarized.
(I'd be interested in that both from the standpoint of my own personal knowledge, as well as tracking how stable your opinions are over time – when you list something as particularly interesting or important do you tend to still think so a year later?)
I think that the stability of my opinions is going up over time, mainly because I started the newsletter while still new to the field.
I also think it'd make more sense for LessWrong to curate a "highlights of the highlights" post once every 3-12 months, than what we currently do, which is every so often randomly decide that a recent Newsletter was particularly good and curate that.
This seems good; I'm currently thinking I could write something like that once every 25 newsletters (which is about half a year), which should also help me evaluate the stability of my opinions.
Comment thread for the question: Am I underestimating the risk of causing information cascades? Regardless, how can I mitigate this risk?
Comment thread for the question: What can I do to get more feedback on the newsletter on an ongoing basis (rather than having to survey people at fixed times)?
Comment thread for the question: How should I deal with the growing amount of AI safety research?
On April 9, 2018, the first Alignment Newsletter was sent out to me and one test recipient. A year later, it has 889 subscribers and two additional content writers, and is the thing for which I’m best known. In this post I look at the impact of the newsletter and try to figure out what, if anything, should be changed in the future.
(If you don’t know about the newsletter, you can learn about it and/or sign up here.)
Summary
In which I badger you to take the 3-minute survey, and summarize some key points.
Actions I’d like you to take
Everything else
Newsletter updates
In which I tell you about features of the newsletter that you probably didn’t know about.
Spreadsheet
Many of you probably know me as the guy who summarizes a bunch of papers every week. I claim you should instead think of me as the guy who maintains a giant spreadsheet of alignment-related papers, and incidentally also sends out a changelog of the spreadsheet every week. You could use the spreadsheet by reading the changelog every week, but you could also use it in other ways:
I find myself using the spreadsheet a couple of times a week, often to remind me of what I thought about a paper or post that I had read a long time ago, but also for literature reviews and finding papers that I vaguely remember that are relevant to what I’m currently thinking about. Of course, I have a better grasp of the spreadsheet, which makes search easy; the categories make intuitive sense to me; and I read far more than the typical researcher, so I’d expect it to be significantly more useful to me than to other people. (On the other hand, I don’t benefit from discovering new material in the spreadsheet, since I’m usually the one who put it there.)
Translation
Xiaohu Zhu has offered to translate the Alignment Newsletter to Mandarin! His translations can be found here; I also copy them over to the main Alignment Newsletter page. I’d be excited to see more Chinese AI researchers reading the newsletter content.
Newsletter stats
In which I present raw data and questions of uncertainty. This might be useful to understand newsletters broadly, but I won’t be drawing any big conclusions. The main takeaway is that lots of people read the newsletter; in particular, there are more subscribers than researchers in the field. Knowing that, you can skip ahead to “Impact of the newsletter” and things should still make sense.
Growth
As of Friday April 5, according to Mailchimp, there are 889 subscribers to the newsletter. Typically, the open rate is just over 50%, and the click-through rate is 10-15%. My understanding is that this is very high relative to other online mailing lists; but that could be because of online shopping mailing lists, where you are incentivized to send lots of emails at the expense of open and click-through rates. There are probably also readers who read the newsletter on the Alignment Forum, LessWrong, or Twitter.
The newsletter typically gets a steady trickle of 0-25 new subscribers each week, and sometimes gets a large increase. Here are all of the weeks in which there were >25 new subscribers:
AN #1 -> AN #2: 2 -> 141 subscribers (+139), because of the initial announcement.
AN #3 -> AN #4: 148 -> 238 subscribers (+90), probably still because of the initial announcement, though I don’t know why it grew so little between #2 and #3.
AN #14 -> AN #15: 328 -> 405 subscribers (+77), don’t know why (though I think I did know at the time)
AN #16 -> AN #17: 412 -> 524 subscribers (+112), because of Miles Brundage’s tweet on July 23 about his favorite newsletters.
AN #17 -> AN #18: 524 -> 553 subscribers (+29), because of this SSC post on July 30 and the LessWrong curation of AN #13 on Aug 1.
AN #18 -> AN #19: 553 -> 590 subscribers (+37), because of residual effects from the past two weeks.
AN #30 -> AN #31: 653 -> 689 subscribers (+36), because of Rosie Campbell’s blog post on Oct 29 about her favorite newsletters.
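As a quick sanity check on the list above, here is a minimal sketch that recomputes each reported gain and estimates how much of the total growth came from these seven weeks. The numbers are copied from the list; the variable and function names are made up for illustration, and the share calculation assumes growth is measured from the initial 2 recipients to the 889 subscribers reported above.

```python
# Each tuple: (from_issue, to_issue, subscribers_before, subscribers_after, reported_gain)
# Values are copied from the list of large-growth weeks in this post.
JUMPS = [
    (1, 2, 2, 141, 139),
    (3, 4, 148, 238, 90),
    (14, 15, 328, 405, 77),
    (16, 17, 412, 524, 112),
    (17, 18, 524, 553, 29),
    (18, 19, 553, 590, 37),
    (30, 31, 653, 689, 36),
]

INITIAL_SUBSCRIBERS = 2    # the author plus one test recipient
CURRENT_SUBSCRIBERS = 889  # Mailchimp count quoted in the post


def computed_gains(jumps):
    """Return the gain implied by the before/after counts for each week."""
    return [after - before for (_, _, before, after, _) in jumps]


def share_of_growth_from_jumps(jumps):
    """Fraction of total growth that happened in the listed weeks."""
    total_growth = CURRENT_SUBSCRIBERS - INITIAL_SUBSCRIBERS
    return sum(computed_gains(jumps)) / total_growth


if __name__ == "__main__":
    for (a, b, _, _, reported), gain in zip(JUMPS, computed_gains(JUMPS)):
        status = "ok" if gain == reported else "MISMATCH"
        print(f"AN #{a} -> AN #{b}: +{gain} ({status})")
    print(f"Share of growth from these weeks: {share_of_growth_from_jumps(JUMPS):.0%}")
```

Under these assumptions the seven listed weeks account for somewhat more than half of all growth, with the steady weekly trickle making up the rest.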
Over time, the opens and clicks have gone down as a percentage of subscribers, but have gone up in absolute numbers. I would guess that the biggest effect is that the most interested people subscribed early, and so as time goes on the marginal subscriber is less interested and ends up bringing down the percentages. Another effect would be that over time people get less interested in the newsletter, and stop opening/clicking on it, but don’t unsubscribe. However, over the last few months, rates have been fairly stable, which suggests this effect is negligible.
On the other hand, during the last few months growth has been organic / word-of-mouth rather than through “publicity” like Miles’s tweet and Rosie’s blog post, so it’s possible that organic growth leads to more interested subscribers who bring up the rates, and this effect approximately cancels the decrease in rates from people getting bored of the newsletter. I could test this with more fine-grained data about individual subscribers but I don’t care enough.
So far, I have not been trying to publicize the newsletter beyond the initial announcement. I'm still not sure of the value of a marginal reader obtained via “publicity”. The newsletter seems to me to be both technical and insider-y (i.e. it assumes familiarity with basic AI safety arguments), while the marginal reader from “publicity” seems not very likely to be either. That said, I have heard from a few readers that the newsletter is reasonably easy to follow, so maybe I'm putting too much weight on this concern. I’d love to hear thoughts in the comments.
Composition of subscribers
I don’t know who these 889 subscribers are; it’s much larger than the size of the field of AI safety. Even if most of the technical safety researchers and strategy/policy researchers have subscribed, that would only get us to 100-200 subscribers. Some guesses on who the remaining people are:
Regardless of the answer, I’m surprised that these people find the newsletter valuable. Most of the time I’m writing to technical safety researchers, and relying on an assumption of shared jargon and underlying intuitions that I don’t explain. It’s not as bad as it could be, since I try to make my explanations accessible both to people working in traditional AI as well as people at MIRI, but I would have guessed that it was still not easy to understand from the outside. Some hypotheses, only the first of which seems plausible:
I sampled 25 people uniformly at random from the subscribers. Of these, I have met 8, and have heard of 2 more. I would categorize the 25 people in the following rough categories: x-risk community (4), AI researchers sympathetic to x-risk (2), students (3), people interested in AI and x-risk (3), people involved with AI startups (2), researchers with no publicly obvious interest in x-risk (6), and could not be found easily (5). But really the most salient outcome was that for anyone I didn’t already know, I found it very hard to figure out why they were subscribed to the newsletter.
Impact of the newsletter
In which I try and fail to figure out whether the benefits outweigh the costs.
Benefits
Here are the main sources of value from the newsletter that I see:
When I started the newsletter, I was aiming primarily for the first one, by telling researchers what they should be reading. I continue to optimize mainly for that, though now I often try to provide enough information that researchers don’t have to read the original paper/post. I knew about the second source of value, but didn’t think it would be very large; I’m now more uncertain about how important it is. The reputational effects were more unexpected, since I didn’t think the newsletter would become as large as it currently is. I don’t know much about the last source of value and am basically ignoring it (i.e. pretending it is zero) in the rest of the analysis.
I’m actually quite uncertain about how much value comes from each of these subpoints, mainly because there’s a striking lack of comments or feedback on the newsletter. Excluding one person at CHAI who I talk to frequently, I get a comment on the content of the newsletter maybe once every 3-4 weeks. I can understand that people who get it as an email newsletter may not see an obvious way to comment (replying to a newsletter email is an unusual thing to do), but the newsletter is crossposted to LessWrong, the Alignment Forum, and Twitter. Why aren’t there comments there?
One possibility is that people treat the newsletter as a curation of interesting papers and posts, in which case there isn’t much need to comment. However, I’m fairly confident that many readers also find value in the summaries and opinions. You could instead interpret this as evidence that the things I’m saying are reasonable -- after all, if I was wrong on the Internet, surely someone would let me know. On the other hand, if I’m only saying things that people already believe, am I actually accomplishing anything? It’s hard to say.
I think the most likely story is that I say things that people didn’t know but agree with once I say them -- but I share Raemon’s intuition that people aren’t really learning much if that’s the case. (The rest of that post has many more thoughts on comments that apply to the newsletter.)
Overall it still feels like in expectation most of the value comes from widening the set of fields that any individual technical researcher is following, but it seems entirely possible that the newsletter does not do that at all and as a result only has reputational benefits. (I am fairly confident that the reputational benefits are positive and non-zero.) I’d really like to get more clarity on this, so if you read the newsletter, please take the survey!
Costs
The main cost of the newsletter is the opportunity cost of our time. Each newsletter takes about 15 hours of my time. The newsletter has gotten more detailed over time, but this isn’t reflected in the total hours I put in because it has been approximately offset by new content writers (Richard Ngo and Dan Hendrycks) who took some of the burden of summarizing off of me. Currently I’d estimate that the newsletter takes 15-20 hours in total (with 2-5 hours from Richard and Dan). This can be broken down into time I would have spent reading and summarizing papers anyway, and time that I spent only because the newsletter exists, which we could call “extra hours”. Initially, I wanted to read and summarize a lot of papers for my own benefit, so the newsletter took about 4-5 extra hours per week. Now, I’m less inclined to read a ton of papers, and it takes 8-10 extra hours per week.
This means in aggregate I’ve spent 700-800 hours on the newsletter, of which about 300-400 were hours that I wouldn’t have spent otherwise. Even only counting the 300-400 hours, this is comparable to the time I spent on state of the world and learning biases projects together, including all of the time spent on paper writing, blog posts, and talks in addition to the research itself.
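For readers who want to see where numbers like these could come from, here is a back-of-the-envelope sketch. It is a reconstruction under stated assumptions, not necessarily how the estimate was actually made: it assumes roughly 50 issues in the first year (25 newsletters is about half a year), and that about half of those issues cost the early "extra hours" rate and half the current rate. All names in the code are illustrative.

```python
# Assumption: roughly 50 issues in the first year (25 issues is about half a year).
ISSUES = 50

# From the post: each newsletter takes about 15 hours of the author's time.
HOURS_PER_ISSUE = 15

# From the post: "extra hours" per issue as (low, high) estimates.
EXTRA_EARLY = (4, 5)   # when the newsletter started
EXTRA_NOW = (8, 10)    # currently


def total_hours(issues=ISSUES, per_issue=HOURS_PER_ISSUE):
    """Total author hours across all issues."""
    return issues * per_issue


def extra_hours_range(issues=ISSUES, early=EXTRA_EARLY, now=EXTRA_NOW):
    """(low, high) estimate of extra hours.

    Assumption: half the issues at the early rate and half at the current
    rate, which is the same as using the average of the two rates.
    """
    low = issues * (early[0] + now[0]) / 2
    high = issues * (early[1] + now[1]) / 2
    return low, high


if __name__ == "__main__":
    print("Total hours:", total_hours())
    print("Extra hours (low, high):", extra_hours_range())
```

With these assumptions the total lands inside the 700-800 range and the extra hours land at roughly 300-375, consistent with the "about 300-400" figure above.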
In addition to time costs, the newsletter could do harm. While there are many ways this could happen, the only one that feels sufficiently important to consider is the risk of causing information cascades. Since nearly everyone in the field is reading the newsletter, we may all end up with some belief B just because it was in a newsletter. We might then have way too much confidence in B since everyone else also believes B.
Overall I’m not too worried. There’s so much content in the newsletter that I seriously doubt a single idea could spread widely as a result of the newsletter -- inevitably some people won’t remember that particular idea. So we only need to worry about “big” ideas that are repeated often in the newsletter. The most salient example of that would be my general opposition to the Bostrom/Yudkowsky paradigm of AI safety, but it still seems quite prevalent amongst researchers. In addition I’d be really surprised if existing researchers were convinced of a “big” idea or paradigm solely because other researchers believed it (though they might put undue weight on it).
Is the newsletter worth it?
If the only benefit of the newsletter were the reputational effects, it would not be worth my time (even ignoring Richard and Dan’s time). However, I get enough thanks from people in the field that the newsletter must be providing value to them, even though I don’t have a great model of what the value is. My current best guess is that there is a lot of value, which makes the newsletter worth the cost, but I think there is a non-negligible chance that this would be reversed if I had a good model of what value everyone was getting from it.
Going forward
In which I figure out what about the newsletter should change in the future.
Structure of the newsletter
So far I’ve only talked about whether the newsletter is worthwhile as a whole. But of course we can also analyze individual aspects of the newsletter and figure out how important they are.
Opinions are probably the key feature of the newsletter. Many papers and blog posts are aimed more at appearing impressive than at conveying facts. Even the ones that are truth seeking are subject to publication bias: they are written by people who think that the ideas within are important, and so will be biased towards positivity. As a result, an opinion from a researcher who didn't do the work can help contextualize the results in a way that makes it easier for less involved readers to figure out the importance of the ideas. (As a corollary, I worry about the lack of a fresh perspective on posts that I write, but don’t see an obvious easy solution to that problem.) I think this also contributes to the success of Import AI and ChinAI, which are also quite heavy on opinions.
I think the summaries are also quite important. I aim for the longer summaries to be sufficiently informative that you don’t have to read the blog post / paper unless you want to do a deep dive and really understand the results. For papers, I often roughly aim for it to be more useful to read my summary than to read the abstract, intro, and conclusion of the paper. In the world where the newsletter didn’t have summaries, I think researchers would not keep up as much with the state of the field.
Overall, I think I’m pretty happy with the current structure of the newsletter, and don’t currently intend to change it. But if I get more clarity on what value the newsletter provides to researchers, I wouldn’t be surprised if I would change the structure as a result.
Scaling up
In the year that I’ve been writing the newsletter, the amount of writing that I want to cover has gone up quite a lot, especially with the launch of the Alignment Forum. I expect this will continue, and I won’t be able to keep up.
By default, I would cover less and less of it. However, it would be nice for the spreadsheet to be a somewhat comprehensive database of the AI safety literature. This is not what we currently have: I often don’t cover good Agent Foundations work because it’s hard for me to understand, and I don’t have pre-2018 content. But it is pretty good for the subfields of AI safety that I’m most knowledgeable about.
There has been some outsourcing of work as Richard Ngo and Dan Hendrycks have joined, but it still does not seem sustainable to continue this long-term, due to coordination challenges and challenges with maintaining quality. That said, it’s not impossible that this could work:
That said, in all of these cases, it feels better to instead just summarize a smaller fraction of all the work, especially since the newsletter is already long enough that people probably don’t read all of it, while still adding links to papers that I haven’t read to the spreadsheet. The main value of summarizing everything is having a more comprehensive spreadsheet, but I don’t think this is sufficiently valuable to warrant the approaches above. That said, I could imagine this conclusion being overturned by having a better model of how the newsletter adds value for technical safety researchers.
Sourcing
So far, I have found papers and articles from newsletters, blogs, Arxiv Sanity and Twitter. However, Twitter has become worse over time, possibly because it has learned to show me non-academic stuff that is more attention-grabbing or controversial, despite me trying not to click on those sorts of things. Arxiv Sanity was my main source for academic work, but recently it’s been getting worse, and is basically not working any more, and I’m not sure why. So I’m now trying to figure out a new way to find relevant literature -- does anyone have suggestions?
If I continue to have trouble, I might summarize random academic papers I’m interested in instead of the ones that have come out very recently.
Appearance
It’s rather annoying that the newsletter is a giant wall of text; it’s probably not fun to read as a result. In addition to the categories, which were partly meant to give structure to the wall of text, I’ve been trying to break things into more paragraphs, but really it needs something much more drastic. However, I also don’t want it to be even more work to get a newsletter out.
So, if anyone wants to volunteer to make the newsletter visually nicer that would be appreciated, but it shouldn’t cost me too much more time (maybe half an hour a week, if it was significantly nicer). One easy possibility would be to include an image at the beginning of the newsletter -- any suggestions for what should go there?
Future of the newsletter
Given the uncertainty of the value of the newsletter, it’s not inconceivable that I decide to stop writing it in the future, or scale back significantly. That said, I think there is value in stability. It is generally bad for a project to have “fits and starts” where its quality varies with the motivation of the person running it, or for the project to potentially be cancelled solely based on how valuable the creator thinks it is. (I’m aware I haven’t argued for this; feel free to ask me about it if it seems wrong.)
Due to this and related reasons, when I started the newsletter, I had an internal commitment to continue writing it for at least six months, as long as most other people thought it was still valuable. Obviously, if everyone agreed that the newsletter was not useful or actively harmful, then I’d stop writing it: this is more to deal with the case where I no longer think the newsletter is useful, even though other people think it is useful.
Now I’m treating it as an ongoing three-month commitment: that is, I am always committing to continue writing the newsletter for at least three months as long as most other people think it is valuable. At any point I can decide to stop the ongoing commitment (presumably when I think it is no longer worth my time to write it); there would then be three months where I would continue to write the newsletter for stability, and figure out what would happen with the newsletter after the three months.
Feedback I’d like
There are a bunch of questions I have, that I’d love to get opinions on either anonymously in the 3-minute survey (which you should fill out!) or in the comments. (Comments preferred because then other people can build off of them.) I’ve listed the questions roughly in order of importance:
Appendix: Alignment Newsletter FAQ
All of this is in the appendix because I don’t particularly care if people read it or not. It’s not very relevant to any of the content in the main post. It is relevant to anyone who might want to start their own newsletter, or their own project more generally.
What’s the history of the Alignment Newsletter?
During one of the CHAI seminars, someone suggested that we each take turns finding and collecting new research papers and sending them out to each other. I already had a system in place doing exactly this, so I volunteered to do this myself (rather than taking turns). I also figured that to save even more CHAI-researcher-time, it would make sense to give a quick summary and then tell people under what circumstances they should read the paper. (I was already summarizing papers for my own notes.)
This pretty quickly proved to be valuable, and I thought about making it public for even more time savings. However, it still seemed pretty nascent and in flux, so I continued iterating on it within CHAI, while thinking about how it could be made to be public-facing. (See also the “Things done right” section.) After a little under two months of writing the newsletter within CHAI, I made it public. At that time, the goal was to provide a list of relevant readings for technical AI safety researchers that had been published each week; and help them decide whether or not they should read them.
Over time, my summaries and opinions became longer and more detailed. I don’t know exactly why this happened. Regardless, at some point I started aiming for some of my summaries to be detailed enough that researchers could just read the summary and not read the paper/post itself.
In September, Richard Ngo volunteered to contribute summaries to the newsletter on a variety of topics, and Dan Hendrycks joined soon after focusing on robustness and uncertainty.
Why do you never have strong negative opinions?
One of the design decisions made at the beginning of the newsletter was to avoid strong critiques of any particular piece of research. This was for a few reasons:
Of course, this decision has downsides as well:
While the first downside seems like a real cost, the second downside is about inhibiting intellectual progress in AI safety research. I think this is okay: intellectual progress does not need to happen in the newsletter. In most of these cases I express stronger disagreements in channels more conducive to intellectual progress (e.g. the Alignment Forum, emails/messages, talking in person, the version of the newsletter internal to CHAI).
Another probable effect of avoiding negativity is reduced readership, since it is likely much more interesting to read a newsletter with active disagreements and arguments than one that dryly summarizes a research paper. I don’t yet know whether this is a pro or a con (even ignoring other effects of negativity).
Mistakes
I don’t know of very many mistakes, even in hindsight. I think this is primarily because I don’t get feedback on the newsletter, not because everything has gone perfectly. It seems quite likely that I have made mistakes that I don’t yet know about, because I don’t have the data to tell.
Analyzing other newsletters. The one thing that I wish I had done was to analyze other newsletters like Import AI in more detail before starting this one. I think it’s plausible that I could have realized the value of opinions and more detailed summaries right at the beginning, rather than evolving in that direction over a couple of months.
Delays. I did fall over a week behind on the newsletter over the last month or two. While this is bad, I wouldn’t really call it a Mistake: I don’t think of the newsletter as a weekly commitment or obligation. I very much value the flexibility to allocate time to whatever seems most pressing; if the newsletter was more of a commitment (such that falling behind is a Mistake), I think I would have to be much more careful about what I agree to do, and this would prevent me from doing other important things. Instead, my approach is to have the newsletter as a fairly important goal that I try to schedule enough time for, but if I find myself running out of time and have to cut something, it’s not a tragedy if it means the newsletter is delayed. That’s essentially what happened over the last month or two.
Things done right
I spent a decent amount of time thinking about the design of the newsletter before implementing it, and I think this was in hindsight a very good idea. Here I list a few things that worked out well.
A polished product. I was particularly conscious of the fact that at launch the newsletter would be using up the limited common resource of “people’s willingness to try out new things”. Both in order to make sure people stuck with the project, and in order to not use up the common resource unnecessarily, I wanted to be fairly confident that this would be a good product before launching. As a result, I iterated for a little under two months within CHAI, in order to figure out product-market fit. You can see the evolution over time -- this is the first internal newsletter, whereas this is the first public newsletter. (They’re all available here.)
This is not to say that the newsletter has been static since launch; it has changed significantly. Most notably, while originally I was aiming to give people enough information to decide whether or not to read the paper/post, I now sometimes aim for including enough detail that people don’t need to read the paper/post. But the point is that a lot of the early improvements happened within CHAI without consuming the common resource.
I’m not sure to what extent this is different from standard startup advice of iterating quickly and testing product-market fit: it depends on whether it counts as testing for product-market fit to trial the newsletter within CHAI. To the extent that there is a difference, it’s mainly that I’m arguing for more planning, especially before consuming common resources (whereas with startups, the fierce competition means that you do not worry about consuming common resources).
Considered stability and commitment. As I mentioned above, I had an internal commitment to continue writing the newsletter for at least six months, as long as other people thought it was valuable. In addition to the value of stability, I viewed this as part of cooperatively using the common resource of people’s willingness to try things. If you’re going to use the resource and fail, ideally you would have learned that it is actually infeasible to succeed in that domain, as opposed to e.g. lack of motivation on the author’s part.
Here’s another way to see this. I think it would have been a lot harder for the newsletter to be successful if there had been 2-5 attempts to create a newsletter in the past that had then fizzled out, because people would expect newsletters to fail and wouldn’t subscribe. My initial commitment helps prevent me from being one of those failures for “bad” reasons (e.g. me losing motivation) while still allowing me to fail for “good” reasons (e.g. no one actually wants to read a newsletter about AI alignment).
I can’t point to any actually good outcomes that resulted from this policy; nonetheless I think it was a good thing to have done.
Investing in flexible automated systems. I had created the private version of the spreadsheet before the first public newsletter, in order to have a database of readings for myself (replacing my previous Google Doc database), and I wrote a script to generate the email from this database. While lots of ink has been spilled on the value of automation, it doesn’t usually emphasize flexibility. By not using a technology meant for one specific purpose, I was able to do a few things that I wouldn’t expect to be able to do with a more specialized version:
But really, the key value of flexibility is that it allows you to adapt to circumstances that you had never even considered when creating the system:
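To make the "script that generates the email from the database" idea concrete, here is a minimal sketch of what such a script could look like. It is not the actual script used for the newsletter: the column names (`category`, `title`, `authors`, `url`, `summary`, `opinion`), the function names, and the sample rows are all hypothetical placeholders, and it assumes the spreadsheet has been exported as CSV.

```python
import csv
import io

# Hypothetical CSV export of the spreadsheet; the columns and rows are
# placeholders for illustration, not real newsletter entries.
SAMPLE_CSV = """category,title,authors,url,summary,opinion
Category A,Example paper 1,Author One,https://example.com/1,Placeholder summary of paper 1.,Placeholder opinion.
Category B,Example paper 2,Author Two,https://example.com/2,Placeholder summary of paper 2.,
Category A,Example paper 3,Author Three,https://example.com/3,Placeholder summary of paper 3.,
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def render_newsletter(rows, issue_number):
    """Render rows as a plain-text email body, grouped by category.

    Categories appear in the order they are first seen in the rows.
    """
    by_category = {}
    for row in rows:
        by_category.setdefault(row["category"], []).append(row)

    lines = [f"Newsletter #{issue_number}", ""]
    for category, entries in by_category.items():
        lines.append(category)
        lines.append("")
        for entry in entries:
            lines.append(f"{entry['title']} ({entry['authors']}): {entry['summary']}")
            lines.append(entry["url"])
            if entry["opinion"]:
                # Only entries with a non-empty opinion get an opinion line.
                lines.append(f"Opinion: {entry['opinion']}")
            lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_newsletter(load_rows(SAMPLE_CSV), issue_number=1))
```

Because the rows are ordinary dicts, the same database could feed other outputs (a web page, a different grouping, a filtered view) by writing another small render function, which is the kind of flexibility described above.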
Thought about potential negative effects. I’m pretty sure I thought of most of the points about negativity (listed above) before publicizing the newsletter. This is discussed a lot; I don’t think I have anything significant to add.
This section seems to indicate that I thought of things initially and they were all important -- this is almost certainly not the case. I’m sure I’m rationalizing some of these with hindsight and didn’t actually think of all the benefits then, and I also probably thought of other considerations that didn’t end up being important that I’ve now forgotten.