AI ALIGNMENT FORUM
Chain-of-Thought Alignment
Applied to The Language Bottleneck in AI Reasoning: Are We Forgetting to Think? by Wotaker, 21d ago
Applied to We should start looking for scheming "in the wild" by Marius Hobbhahn, 23d ago
Applied to Artificial Static Place Intelligence: Guaranteed Alignment by ank, 1mo ago
Applied to Seven sources of goals in LLM agents by Seth Herd, 2mo ago
Applied to DeepSeek-R1 for Beginners by Anton Razzhigaev, 2mo ago
Applied to Post-hoc reasoning in chain of thought by Kyle Cox, 2mo ago
Applied to Worries about latent reasoning in LLMs by Caleb Biddulph, 2mo ago
Applied to System 2 Alignment by Seth Herd, 2mo ago
Applied to Inference-Time-Compute: More Faithful? A Research Note by James Chua, 2mo ago
Applied to Generating Cognateful Sentences with Large Language Models by Vijay Kethanaboyina, 3mo ago
Applied to Reduce AI Self-Allegiance by saying "he" instead of "I" by Knight Lee, 3mo ago
Applied to AGI with RL is Bad News for Safety by Nadav Brandes, 3mo ago
Applied to Simple Steganographic Computation Eval - gpt-4o and gemini-exp-1206 can't solve it yet by Filip Sondej, 3mo ago
Applied to Testing which LLM architectures can do hidden serial reasoning by Filip Sondej, 3mo ago
Applied to LLMs Do Not Think Step-by-step In Implicit Reasoning by Bogdan Ionut Cirstea, 4mo ago
Applied to Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts? by Bogdan Ionut Cirstea, 4mo ago
Applied to A Little Depth Goes a Long Way: the Expressive Power of Log-Depth Transformers by Bogdan Ionut Cirstea, 4mo ago