AI ALIGNMENT FORUM
Chain-of-Thought Alignment
• Applied to Generating Cognateful Sentences with Large Language Models by Vijay Kethanaboyina 8h ago
• Applied to Reduce AI Self-Allegiance by saying "he" instead of "I" by Knight Lee 14d ago
• Applied to AGI with RL is Bad News for Safety by Nadav Brandes 16d ago
• Applied to Simple Steganographic Computation Eval - gpt-4o and gemini-exp-1206 can't solve it yet by Filip Sondej 18d ago
• Applied to Testing which LLM architectures can do hidden serial reasoning by Filip Sondej 22d ago
• Applied to LLMs Do Not Think Step-by-step In Implicit Reasoning by Bogdan Ionut Cirstea 1mo ago
• Applied to Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts? by Bogdan Ionut Cirstea 1mo ago
• Applied to A Little Depth Goes a Long Way: the Expressive Power of Log-Depth Transformers by Bogdan Ionut Cirstea 2mo ago
• Applied to ~80 Interesting Questions about Foundation Model Agent Safety by Rohan Subramani 2mo ago
• Applied to Meta AI (FAIR) latest paper integrates system-1 and system-2 thinking into reasoning models. by happy friday 2mo ago
• Applied to the case for CoT unfaithfulness is overstated by Rohan Subramani 3mo ago
• Applied to Thinking LLMs: General Instruction Following with Thought Generation by Bogdan Ionut Cirstea 3mo ago
• Applied to 5 ways to improve CoT faithfulness by CBiddulph 3mo ago
• Applied to To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning by Bogdan Ionut Cirstea 4mo ago
• Applied to Understanding Hidden Computations in Chain-of-Thought Reasoning by rokosbasilisk 4mo ago