AI ALIGNMENT FORUM

Chain-of-Thought Alignment

• Applied to Generating Cognateful Sentences with Large Language Models by Vijay Kethanaboyina 8h ago
• Applied to Reduce AI Self-Allegiance by saying "he" instead of "I" by Knight Lee 14d ago
• Applied to AGI with RL is Bad News for Safety by Nadav Brandes 16d ago
• Applied to Simple Steganographic Computation Eval - gpt-4o and gemini-exp-1206 can't solve it yet by Filip Sondej 18d ago
• Applied to Testing which LLM architectures can do hidden serial reasoning by Filip Sondej 22d ago
• Applied to LLMs Do Not Think Step-by-step In Implicit Reasoning by Bogdan Ionut Cirstea 1mo ago
• Applied to Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts? by Bogdan Ionut Cirstea 1mo ago
• Applied to A Little Depth Goes a Long Way: the Expressive Power of Log-Depth Transformers by Bogdan Ionut Cirstea 2mo ago
• Applied to ~80 Interesting Questions about Foundation Model Agent Safety by Rohan Subramani 2mo ago
• Applied to Meta AI (FAIR) latest paper integrates system-1 and system-2 thinking into reasoning models. by happy friday 2mo ago
• Applied to the case for CoT unfaithfulness is overstated by Rohan Subramani 3mo ago
• Applied to Thinking LLMs: General Instruction Following with Thought Generation by Bogdan Ionut Cirstea 3mo ago
• Applied to 5 ways to improve CoT faithfulness by CBiddulph 3mo ago
• Applied to To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning by Bogdan Ionut Cirstea 4mo ago
• Applied to Understanding Hidden Computations in Chain-of-Thought Reasoning by rokosbasilisk 4mo ago