Outer Alignment
Applied to PRISM: Perspective Reasoning for Integrated Synthesis and Mediation (Interactive Demo) by Anthony Diamond, 3d ago
Applied to A Universal Prompt as a Safeguard Against AI Threats by Zhaiyk Sultan, 4d ago
Applied to Maintaining Alignment during RSI as a Feedback Control Problem by Beren Millidge, 12d ago
Applied to The Theoretical Reward Learning Research Agenda: Introduction and Motivation by Joar Skalse, 14d ago
Applied to Does human (mis)alignment pose a significant and imminent existential threat? by jr, 19d ago
Applied to Unaligned AGI & Brief History of Inequality by ank, 20d ago
Applied to Intelligence–Agency Equivalence ≈ Mass–Energy Equivalence: On Static Nature of Intelligence & Physicalization of Ethics by ank, 20d ago
Applied to Places of Loving Grace [Story] by ank, 23d ago
Applied to Artificial Static Place Intelligence: Guaranteed Alignment by ank, 1mo ago
Applied to Static Place AI Makes Agentic AI Redundant: Multiversal AI Alignment & Rational Utopia by ank, 1mo ago
Applied to Rational Utopia & Narrow Way There: Multiversal AI Alignment, Non-Agentic Static Place AI, New Ethics... (V. 4) by ank, 1mo ago
Applied to How will we update about scheming? by raztronaut, 1mo ago
Applied to Alignment Can Reduce Performance on Simple Ethical Questions by Daan Henselmans, 1mo ago
Applied to Tetherware #1: The case for humanlike AI with free will by Jáchym Fibír, 1mo ago
Applied to The Road to Evil Is Paved with Good Objectives: Framework to Classify and Fix Misalignments. by Shivam Arora, 1mo ago
Applied to Disproving the "People-Pleasing" Hypothesis for AI Self-Reports of Experience by rife, 2mo ago
Applied to Popular materials about environmental goals/agent foundations? People wanting to discuss such topics? by Q Home, 2mo ago
Applied to "Pick Two" AI Trilemma: Generality, Agency, Alignment. by Black Flag, 2mo ago
Applied to I Recommend More Training Rationales by Gianluca Calcagni, 2mo ago
Dakara, v1.7.0, Dec 30th 2024 GMT (+150/-57)