AI ALIGNMENT FORUM
Verification
Posts tagged Verification
0
79
Formal verification, heuristic explanations and surprise accounting
Jacob Hilton
6mo
3
1
44
Compact Proofs of Model Performance via Mechanistic Interpretability
Lawrence Chan, rajashree, Adrià Garriga-Alonso, Jason Gross
6mo
2
0
7
Making it harder for an AGI to "trick" us, with STVs
Tor Økland Barstad
2y
0
1
7
Alignment with argument-networks and assessment-predictions
Tor Økland Barstad
2y
0