AI ALIGNMENT FORUM
Rationalization
• Applied to So you want to be a witch by Levi Ackerman, fac. 1mo ago
• Applied to Implications—How Conscious Significance Could Inform Our lives by James Stephen Brown 2mo ago
• Applied to On Intentionality, or: Towards a More Inclusive Concept of Lying by Cornelius Dybdahl 4mo ago
• Applied to Inquisitive vs. adversarial rationality by gb 5mo ago
• Applied to Lessons from Failed Attempts to Model Sleeping Beauty Problem by Ape in the coat 1y ago
• Applied to Refusal mechanisms: initial experiments with Llama-2-7b-chat by Roger Dearnaley 1y ago
• Applied to Rationalization Maximizes Expected Value by Kevin Dorst 2y ago
• Applied to Clever arguers give weak evidence, not zero by dkl9 2y ago
• Applied to My Time As A Goddess by Evenstar 2y ago
• Applied to Going Crazy and Getting Better Again by Evenstar 2y ago
• Applied to Morality is Accidental & Self-Congratulatory by Kaj Sotala 2y ago
• Applied to A "super-intelligence" unintended consequences "preserve life" scenario by Punken Drublic 2y ago
• Applied to Asking for a name for a symptom of rationalization by Ruben Bloom 2y ago
• Applied to Slack matters more than any outcome by Malcolm Ocean 2y ago
• Applied to Understanding and avoiding value drift by Alex Turner 2y ago
• Applied to Post hoc justifications as Compression Algorithm by Ruben Bloom 3y ago
• Applied to The horror of what must, yet cannot, be true by Kaj Sotala 3y ago