Power Seeking (AI)
- Applied to "Steering Llama-2 with contrastive activation additions" by Alex Turner, 5 months ago
- Applied to "Natural Abstraction: Convergent Preferences Over Information Structures" by paulom, 7 months ago
- Applied to "You can't fetch the coffee if you're dead: an AI dilemma" by hennyge, 9 months ago
- Applied to "The Game of Dominance" by Karl von Wendt, 9 months ago
- Applied to "Incentives from a causal perspective" by Tom Everitt, 10 months ago
- Applied to "Instrumental Convergence? [Draft]" by Dan H, 1 year ago
- Applied to "Categorical-measure-theoretic approach to optimal policies tending to seek power" by Victoria Krakovna, 1 year ago
- Applied to "My Overview of the AI Alignment Landscape: Threat Models" by Michelle Viotti, 1 year ago
- Applied to "Ideas for studies on AGI risk" by dr_s, 1 year ago
- Applied to "Instrumental convergence in single-agent systems" by Jacob Pfau, 1 year ago
- Applied to "Risks from GPT-4 Byproduct of Recursively Optimizing AIs" by Ben Hayum, 1 year ago
- Applied to "[Linkpost] Shorter version of report on existential risk from power-seeking AI" by Ruben Bloom, 1 year ago
- Applied to "The Waluigi Effect (mega-post)" by Cleo Nardo, 1 year ago
- Applied to "Power-seeking can be probable and predictive for trained agents" by Victoria Krakovna, 1 year ago
- Applied to "Power-Seeking = Minimising free energy" by Jonas Hallgren, 1 year ago