AI ALIGNMENT FORUM
AXRP - the AI X-risk Research Podcast
Transcripts of AXRP episodes.
7
AXRP Episode 1 - Adversarial Policies with Adam Gleave
DanielFilan
4y
3
7
AXRP Episode 2 - Learning Human Biases with Rohin Shah
DanielFilan
4y
0
11
AXRP Episode 3 - Negotiable Reinforcement Learning with Andrew Critch
DanielFilan
4y
0
21
AXRP Episode 4 - Risks from Learned Optimization with Evan Hubinger
DanielFilan
4y
10
17
AXRP Episode 5 - Infra-Bayesianism with Vanessa Kosoy
DanielFilan
4y
2
13
AXRP Episode 6 - Debate and Imitative Generalization with Beth Barnes
DanielFilan
4y
3
23
AXRP Episode 7 - Side Effects with Victoria Krakovna
DanielFilan
4y
6
14
AXRP Episode 8 - Assistance Games with Dylan Hadfield-Menell
DanielFilan
4y
1
34
AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant
DanielFilan
3y
2
20
AXRP Episode 10 - AI’s Future and Impacts with Katja Grace
DanielFilan
3y
2
13
AXRP Episode 11 - Attainable Utility and Power with Alex Turner
DanielFilan
3y
5
22
AXRP Episode 12 - AI Existential Risk with Paul Christiano
DanielFilan
3y
0
15
AXRP Episode 13 - First Principles of AGI Safety with Richard Ngo
DanielFilan
3y
1
15
AXRP Episode 14 - Infra-Bayesian Physicalism with Vanessa Kosoy
DanielFilan
3y
10
18
AXRP Episode 15 - Natural Abstractions with John Wentworth
DanielFilan
3y
0
14
AXRP Episode 16 - Preparing for Debate AI with Geoffrey Irving
DanielFilan
2y
0
10
AXRP Episode 17 - Training for Very High Reliability with Daniel Ziegler
DanielFilan
2y
0
8
AXRP Episode 18 - Concept Extrapolation with Stuart Armstrong
DanielFilan
2y
0
25
AXRP Episode 19 - Mechanistic Interpretability with Neel Nanda
DanielFilan
2y
0
11
AXRP Episode 20 - ‘Reform’ AI Alignment with Scott Aaronson
DanielFilan
2y
0
10
AXRP Episode 21 - Interpretability for Engineers with Stephen Casper
DanielFilan
2y
1
30
AXRP Episode 22 - Shard Theory with Quintin Pope
DanielFilan
2y
4
15
AXRP Episode 23 - Mechanistic Anomaly Detection with Mark Xu
DanielFilan
1y
0
19
AXRP Episode 24 - Superalignment with Jan Leike
DanielFilan
1y
3
17
AXRP Episode 25 - Cooperative AI with Caspar Oesterheld
DanielFilan
1y
0
8
AXRP Episode 26 - AI Governance with Elizabeth Seger
DanielFilan
1y
0
38
AXRP Episode 27 - AI Control with Buck Shlegeris and Ryan Greenblatt
DanielFilan
8mo
6
6
AXRP Episode 28 - Suing Labs for AI Risk with Gabriel Weil
DanielFilan
8mo
0
15
AXRP Episode 29 - Science of Deep Learning with Vikrant Varma
DanielFilan
8mo
1
11
AXRP Episode 30 - AI Security with Jeffrey Ladish
DanielFilan
8mo
0
37
AXRP Episode 31 - Singular Learning Theory with Daniel Murfet
DanielFilan
8mo
0
13
AXRP Episode 32 - Understanding Agency with Jan Kulveit
DanielFilan
7mo
0
20
AXRP Episode 33 - RLHF Problems with Scott Emmons
DanielFilan
6mo
0
13
AXRP Episode 34 - AI Evaluations with Beth Barnes
DanielFilan
5mo
0
13
AXRP Episode 35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
DanielFilan
4mo
0
14
AXRP Episode 36 - Adam Shai and Paul Riechers on Computational Mechanics
DanielFilan
3mo
0
11
AXRP Episode 37 - Jaime Sevilla on Forecasting AI
DanielFilan
3mo
1
9
AXRP Episode 38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems
DanielFilan
1mo
0
7
AXRP Episode 38.1 - Alan Chan on Agent Infrastructure
DanielFilan
1mo
0
20
AXRP Episode 38.2 - Jesse Hoogland on Singular Learning Theory
DanielFilan
24d
0
25
AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment
DanielFilan
20d
0
12
AXRP Episode 38.3 - Erik Jenner on Learned Look-Ahead
DanielFilan
9d
0