michael_mjd

I think we are getting some information. For example, we can see that token-level attention is actually quite powerful for understanding language and also images, and we have some understanding of scaling laws. I think the next step is a deeper understanding of how world modeling fits in with action generation -- how much can you get with just world modeling, versus world modeling plus reward/action learning combined?

If the transformer architecture is enough to get us there, that would give us a sort of null hypothesis for intelligence: that the structure of predicting sequences by comparing all pairs of elements of a limited-length sequence is general.
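
To make "comparing all pairs of elements of a limited sequence" concrete, here is a minimal sketch of single-head scaled dot-product self-attention in numpy. The shapes and variable names are just illustrative assumptions on my part, not taken from any particular implementation.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Minimal single-head self-attention: every position attends to every other.

    x: (seq_len, d_model) token embeddings for a limited-length sequence
    w_q, w_k, w_v: (d_model, d_head) projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v            # project tokens to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])         # compare all pairs of positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ v                               # mix values by pairwise similarity

# Toy usage: 8 tokens, 16-dim embeddings, 16-dim head (sizes are arbitrary)
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
w_q, w_k, w_v = (rng.normal(size=(16, 16)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)  # (8, 16): each token is a weighted mix of all tokens
```

The point is just that the whole mechanism is pairwise comparison within a bounded window; everything else in the architecture is bookkeeping around that.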

Not rhetorically: what kinds of questions do you think would better lead to understanding how AGI works?

I think training a transformer with an internal thought process (predicting the next tokens over a part of the sequence that is "showing its work") would give an interesting insight into how intelligence might work. I thought of this a little while back, but discovered it is also a long-standing MIRI research direction on transparency. I wouldn't be surprised if Google took it up at this point.
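
As a sketch of what I mean by "showing its work": one could format training examples with an explicit scratchpad span that the model predicts token-by-token before the final answer. The tag names and loss-masking scheme below are my own assumptions for illustration, not a description of anyone's actual setup.

```python
# Hypothetical training-example format for a "show your work" transformer.
# The <scratch>...</scratch> span is intermediate reasoning the model is
# trained to predict, just like the answer itself.

def build_example(question: str, scratchpad: str, answer: str) -> dict:
    """Assemble one training string; the tag names are illustrative assumptions."""
    text = (
        f"Q: {question}\n"
        f"<scratch>{scratchpad}</scratch>\n"
        f"A: {answer}"
    )
    # Take the loss over the scratchpad and answer tokens, so the model
    # learns to produce its intermediate reasoning, not just the result.
    return {"text": text, "loss_on": ["scratchpad", "answer"]}

example = build_example(
    question="What is 17 * 24?",
    scratchpad="17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408",
    answer="408",
)
```

The transparency angle is that the scratchpad gives you a human-readable trace of the computation, rather than having all of it buried in the activations.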