Distributional Shifts
Applied to Ambiguous out-of-distribution generalization on an algorithmic task by Wilson Wu, 1mo ago
Applied to Inevitable Growth and Consequences of AI by baleful pokemon, 2mo ago
Applied to Why do we need RLHF? Imitation, Inverse RL, and the role of reward by Ran W, 1y ago
Applied to Nonlinear limitations of ReLUs by magfrump, 1y ago
Applied to how 2 tell if ur input is out of distribution given only model weights by Daniel Kirmani, 2y ago
Applied to Speculative inferences about path dependence in LLM supervised fine-tuning from results on linear mode connectivity and model souping by Robert Kirk, 2y ago
Applied to Requirements for a STEM-capable AGI Value Learner (my Case for Less Doom) by Roger Dearnaley, 2y ago
Applied to We are misaligned: the saddening idea that most of humanity doesn't intrinsically care about x-risk, even on a personal level by Christopher King, 2y ago
Applied to Have you heard about MIT's "liquid neural networks"? What do you think about them? by Christopher King, 2y ago
Applied to Is there a ML agent that abandons it's utility function out-of-distribution without losing capabilities? by Christopher King, 2y ago
Applied to Causal representation learning as a technique to prevent goal misgeneralization by Pablo Antonio Moreno Casares, 2y ago
Applied to Disentangling inner alignment failures by Erik Jenner, 2y ago
Applied to Distribution Shifts and The Importance of AI Safety by RobertM, 2y ago
Applied to Breaking down the training/deployment dichotomy by Erik Jenner, 3y ago
ojorgensen, v1.4.0, Aug 25th 2022 GMT (+4/-4)
Abram Demski, v1.3.0, Aug 24th 2022 GMT
Abram Demski, v1.2.0, Aug 24th 2022 GMT: changed name from "Out of Distribution" to "Distributional Shifts"
Abram Demski, v1.1.0, Aug 24th 2022 GMT (+3060/-88)