AI ALIGNMENT FORUM
Distillation & Pedagogy
• Applied to An Illustrated Summary of "Robust Agents Learn Causal World Model" by Dalcy 7d ago
• Applied to Concrete Methods for Heuristic Estimation on Neural Networks by Oliver Daniels 1mo ago
• Applied to Graceful Degradation by Screwtape 2mo ago
• Applied to Distillation Of DeepSeek-Prover V1.5 by IvanLin 2mo ago
• Applied to Podcast: "How the Smart Money teaches trading with Ricki Heicklen" (Patrick McKenzie interviewing) by Raymond Arnold 5mo ago
• Applied to Video Intro to Guaranteed Safe AI by Mike Vaiana 5mo ago
• Applied to DIY RLHF: A simple implementation for hands on experience by Mike Vaiana 5mo ago
• Applied to Poker is a bad game for teaching epistemics. Figgie is a better one. by RobertM 5mo ago
• Applied to Dialogue introduction to Singular Learning Theory by Olli Järviniemi 5mo ago
• Applied to An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2 by Neel Nanda 5mo ago
• Applied to How ARENA course material gets made by CallumMcDougall 6mo ago
• Applied to Distillation of 'Do language models plan for future tokens' by TheManxLoiner 6mo ago
• Applied to Failure Modes of Teaching AI Safety by Eleni Angelou 6mo ago
• Applied to AI Safety Strategies Landscape by Charbel-Raphael Segerie 7mo ago
• Applied to Observations on Teaching for Four Weeks by ClareChiaraVincent 8mo ago
• Applied to Ironing Out the Squiggles by Lauren (often wrong) 8mo ago
• Applied to Superposition is not "just" neuron polysemanticity by Lawrence Chan 8mo ago
• Applied to "Deep Learning" Is Function Approximation by Lauren (often wrong) 9mo ago