AI ALIGNMENT FORUM
Practice & Philosophy of Science
• Applied to A Defense of Peer Review by Niko McCarty (1mo ago)
• Applied to Standard SAEs Might Be Incoherent: A Choosing Problem & A “Concise” Solution by Kola Ayonrinde (1mo ago)
• Applied to My Number 1 Epistemology Book Recommendation: Inventing Temperature by Raymond Arnold (2mo ago)
• Applied to Simon DeDeo on Explore vs Exploit in Science by Raymond Arnold (2mo ago)
• Applied to We Should Try to Directly Measure the Value of Scientific Papers by ohmurphy (3mo ago)
• Applied to AI x Human Flourishing: Introducing the Cosmos Institute by Brendan McCord (3mo ago)
• Applied to Quick look: applications of chaos theory by Elizabeth (3mo ago)
• Applied to Honest science is spirituality by Pavel Chvykov (5mo ago)
• Applied to "What the hell is a representation, anyway?" | Clarifying AI interpretability with tools from philosophy of cognitive science | Part 1: Vehicles vs. contents by IwanWilliams (6mo ago)
• Applied to "Successful language model evals" by Jason Wei by Arjun Panickssery (6mo ago)
• Applied to On what research policymakers actually need by Tobias D. (7mo ago)
• Applied to How I select alignment research projects by Ethan Perez (7mo ago)
• Applied to Sparsify: A mechanistic interpretability research agenda by Lee Sharkey (8mo ago)
• Applied to Metascience of the Vesuvius Challenge by Maxwell Tabarrok (8mo ago)
• Applied to Scientific Method by Andrij “Androniq” Ghorbunov (9mo ago)
• Applied to Tips for Empirical Alignment Research by Ethan Perez (9mo ago)
• Applied to Quantum Darwinism, social constructs, and the scientific method by Pavel Chvykov (10mo ago)
• Applied to Leading The Parade by Arun Jose (10mo ago)
• Applied to The case for more ambitious language model evals by Arun Jose (10mo ago)