All of Yulu Pi's Comments + Replies

I have been wondering whether neural networks (or more specifically, transformers) will become the ultimate form of AGI. If not, will the existing research on interpretability become obsolete?

Stephen Casper
I do not worry much about this. It would be a problem, but some methods are model-agnostic and would transfer fine, and others have close analogs for other architectures. For example, ROME is specific to transformers, but causal tracing and rank-one editing are more general principles that are not.
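To make the "rank-one editing is a general principle" point concrete, here is a minimal sketch in plain Python. It is not ROME itself (ROME, as I understand it, additionally weights the update by a key covariance estimate and targets a specific transformer MLP layer). It is only the generic rank-one update W' = W + (v - Wk)kᵀ / (kᵀk), which makes any linear map send a chosen key k to a chosen value v. The function names `rank_one_edit` and `matvec` are made up for illustration.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def rank_one_edit(W, k, v):
    """Return W' = W + (v - W k) k^T / (k^T k), so that W' k equals v.

    Hypothetical helper: a plain rank-one update, not the ROME algorithm.
    Inputs orthogonal to k are mapped exactly as before.
    """
    kk = sum(k_j * k_j for k_j in k)
    if kk == 0:
        raise ValueError("key vector k must be non-zero")
    # Residual between the desired output v and the current output W k.
    residual = [v_i - wk_i for v_i, wk_i in zip(v, matvec(W, k))]
    # Add the outer product residual * k^T / (k^T k) to W.
    return [
        [w_ij + r_i * k_j / kk for w_ij, k_j in zip(row, k)]
        for row, r_i in zip(W, residual)
    ]


W = [[1.0, 0.0], [0.0, 1.0]]   # any linear layer, not just a transformer's
k = [1.0, 2.0]                 # the "key" whose output we want to change
v = [3.0, 4.0]                 # the new "value" we want W' to return for k
W_edited = rank_one_edit(W, k, v)
```

Nothing here depends on the architecture: any layer that is a linear map can be edited this way, which is the sense in which the principle would transfer.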

Hey Neel,

Great post!

I am trying to look into the code here

But the links don't work anymore! It would be nice if you could update them!

I don't know if this link works for the original content: https://colab.research.google.com/github/neelnanda-io/Easy-Transformer/blob...

Neel Nanda
Ah, thanks! I haven't looked at this post in a while, so I've updated it a bit. I've since made my own transformer tutorial which (in my extremely biased opinion) is better, especially for interpretability. It comes with a template notebook (with tests!) to fill out alongside part 2, and by the end you'll have implemented your own GPT-2. More generally, my getting started in mech interp guide is a better place to start than this guide, and has more on transformers!