AI ALIGNMENT FORUM
Exercises / Problem-Sets
• Applied to The Cognitive Bootcamp Agreement by Raymond Arnold, 2mo ago
• Applied to OODA your OODA Loop by Raymond Arnold, 2mo ago
• Applied to Forecasting One-Shot Games by Raymond Arnold, 4mo ago
• Applied to Brief notes on the Wikipedia game by Olli Järviniemi, 5mo ago
• Applied to Prompts for Big-Picture Planning by Raymond Arnold, 8mo ago
• Applied to Exercise: Planmaking, Surprise Anticipation, and "Baba is You" by Raymond Arnold, 10mo ago
• Applied to Meetup In a Box: Year In Review by Czynski, 10mo ago
• Applied to D&D.Sci(-fi): Colonizing the SuperHyperSphere by abstractapplic, 1y ago
• Applied to Interpretability with Sparse Autoencoders (Colab exercises) by CallumMcDougall, 1y ago
• Applied to Staying Split: Sabatini and Social Justice by Alex Vermillion, 1y ago
• Applied to Game Theory without Argmax [Part 2] by Cleo Nardo, 1y ago
• Applied to Game Theory without Argmax [Part 1] by Cleo Nardo, 1y ago
• Applied to How well does your research adress the theory-practice gap? by Jonas Hallgren, 1y ago
• Applied to Impact stories for model internals: an exercise for interpretability researchers by Tassilo Neubauer, 1y ago
• Applied to Exercise: Solve "Thinking Physics" by Raymond Arnold, 1y ago
• Applied to Mech Interp Puzzle 2: Word2Vec Style Embeddings by Raymond Arnold, 1y ago
• Applied to Rationality !== Winning by Raymond Arnold, 1y ago
• Applied to Tiny Mech Interp Projects: Emergent Positional Embeddings of Words by Raymond Arnold, 1y ago
• Applied to Mech Interp Puzzle 1: Suspiciously Similar Embeddings in GPT-Neo by Raymond Arnold, 1y ago
• Applied to An Exercise to Build Intuitions on AGI Risk by Lauro Langosco, 2y ago