AI ALIGNMENT FORUM
All of technicalities's Comments + Replies
Shallow review of live agendas in alignment & safety
technicalities
1y
1
0
I like this. It's like a structural version of control evaluations. Will think about where to put it.
3
Lawrence Chan
1y
Expanding on this -- this whole area is probably best known as "AI Control", and I'd lump it under "Control the thing" as its own category. I'd also move Control Evals to this category, though someone at RR would know better than I.