AI ALIGNMENT FORUM

Embedded Agency

Applied to Meaning: duality and self genesis by Davi_ CRAFT 9h ago

Applied to Infra-Bayesian physicalism: a formal theory of naturalized induction by Vanessa Kosoy 20d ago

Applied to Non-Monotonic Infra-Bayesian Physicalism by Vanessa Kosoy 21d ago

Applied to Mistral Large 2 (123B) exhibits alignment faking by Gunnar Zarncke 1mo ago

Applied to Reducing LLM deception at scale with self-other overlap fine-tuning by Gunnar Zarncke 1mo ago

Applied to Unaligned AGI & Brief History of Inequality by ank 2mo ago

Applied to Static Place AI Makes Agentic AI Redundant: Multiversal AI Alignment & Rational Utopia by ank 2mo ago

Applied to Rational Effective Utopia & Narrow Way There: Multiversal AI Alignment, Place AI, New Ethicophysics... (Updated) by ank 2mo ago

Applied to Subjective Naturalism in Decision Theory: Savage vs. Jeffrey–Bolker by Daniel Herrmann 3mo ago

Applied to Rebuttals for ~all criticisms of AIXI by Cole Wyeth 3mo ago

Applied to Are You More Real If You're Really Forgetful? by Thane Ruthenis 5mo ago

Applied to Can subjunctive dependence emerge from a simplicity prior? by Daniel C 7mo ago

Applied to Open Problems in AIXI Agent Foundations by Cole Wyeth 7mo ago

Applied to Jonothan Gorard: The territory is isomorphic to an equivalence class of its maps by Daniel C 8mo ago

Applied to All the Following are Distinct by Gunnar Zarncke 9mo ago

Applied to Self-Other Overlap: A Neglected Approach to AI Alignment by Keenan Pepper 9mo ago

Applied to Live Theory Part 0: Taking Intelligence Seriously by Sahil 10mo ago