AI ALIGNMENT FORUM

Transformers

This page is a stub.

Posts tagged Transformers

[61] How LLMs are and are not myopic

2y

7

1

[70] Modern Transformers are AGI, and Human-Level

1y

30

2

[33] Residual stream norms grow exponentially over the forward pass

Stefan Heimersheim, Alex Turner

2y

6

1

[27] Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind

2y

9

0

[17] Concrete Steps to Get Started in Transformer Mechanistic Interpretability

2y

5

1

[8] AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them

1y

0

1

[143] Transformers Represent Belief State Geometry in their Residual Stream

1y

4

1

[28] Attention SAEs Scale to GPT-2 Small

Connor Kissane, Robert Krzyzanowski, Arthur Conmy, Neel Nanda

1y

0

1

[20] Brief Notes on Transformers

3y

2

1

[21] Understanding mesa-optimization using toy models

tilmanr, rusheb, Guillaume Corlouer, Dan Valentine, Alex Spies, Michael Ivanitskiy, Can

2y

0

1

[17] Building a transformer from scratch - AI safety up-skilling challenge

Marius Hobbhahn

3y

0

0

[13] Deconfusing In-Context Learning

Arjun Panickssery

1y

0

1

[16] New Tool: the Residual Stream Viewer

2y

1

1

[9] The positional embedding matrix and previous-token heads: how do they actually work?

2y

1