| Comment Author | Post | Deleted By User | Deleted Date | Deleted Public | Reason |
|---|---|---|---|---|---|
| | How We Picture Bayesian Agents | LawrenceC | 1d | false | Whoops, Gwern already mentioned this work, my bad. |
| | LLMs for Alignment Research: a safety priority? | lukehmiles | 17d | false | |
| | Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small | leogao | 2mo | false | |
| | Discussion: Challenges with Unsupervised LLM Knowledge Discovery | Clément Dumas | 4mo | true | Sorry I didn't understand you were confused because of the visualization |
| | Evaluating the historical value misspecification argument | Daniel Kokotajlo | 5mo | true | Accidental duplicate |
| | Evaluating the historical value misspecification argument | Daniel Kokotajlo | 5mo | true | Accidental duplicate |
| | TurnTrout's shortform feed | Ben Pace | 5mo | false | |
| | TurnTrout's shortform feed | Ben Pace | 5mo | false | |
| | Coup probes: Catching catastrophes with probes trained off-policy | Fabien Roger | 5mo | false | |
| | Preventing Language Models from hiding their reasoning | Fabien Roger | 6mo | true | |

| Author | Post | Banned Users |
|---|---|---|
| | Asymptotically Unambitious AGI | |

| ID | Banned From Frontpage | Banned from Personal Posts |
|---|---|---|

| User | Ended at | Type |
|---|---|---|
| | 18d | allComments |
| | 16d | allComments |
| | 13d | allComments |
| | 20d | allComments |
| | 1y | allComments |
| | 1mo | allPosts |
| | 1mo | allComments |