AI ALIGNMENT FORUM

Jacob_Hilton's Shortform

by Jacob_Hilton
1st May 2025

I recently gave a talk at the Safety-Guaranteed LLMs workshop.

The talk is about ARC's work on low probability estimation (LPE), covering:

  • Theoretical motivation for LPE and (towards the end) activation modeling approaches (both described here)
  • Empirical work on LPE in language models (described here)
  • Recent work-in-progress on theoretical results
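
To make the LPE problem concrete, here is a minimal toy sketch of my own (it is not the method from the talk; the score function, threshold, and sample count are all made up for illustration). The rare "behaviour" is a scalar score exceeding a high threshold: naive Monte Carlo cannot resolve probabilities much below one over the number of samples, whereas fitting a simple distributional model to the score (in the spirit of the activation modeling approaches mentioned above) gives an analytic tail estimate.

```python
# A toy sketch of the low probability estimation (LPE) problem, under
# assumptions of my own (this is NOT ARC's method): the "behaviour" is a
# scalar score exceeding a high threshold, and we compare naive Monte Carlo
# with fitting a Gaussian to the score distribution and reading off the tail.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Stand-in for a model-internal scalar (e.g. a logit) as a function of inputs.
w = np.array([0.6, -0.3, 0.4])

def score(x):
    return x @ w

# The score is N(0, ||w||^2), so a threshold of 4 standard deviations gives a
# true event probability of about 3e-5 -- far below 1 / n_samples.
threshold = 4.0 * np.linalg.norm(w)
n_samples = 10_000

x = rng.normal(size=(n_samples, len(w)))
s = score(x)

# Naive Monte Carlo: the event almost never occurs in 10k samples, so the
# estimate is typically exactly 0, with no useful error bar.
p_mc = np.mean(s > threshold)

# Distribution-fitting estimate (in the spirit of activation modelling): fit a
# Gaussian to the observed scores and compute the tail mass analytically.
# This can resolve probabilities far below 1/n, but only if the fitted family
# actually matches the score distribution.
mu, sigma = s.mean(), s.std()
p_fit = norm.sf(threshold, loc=mu, scale=sigma)

print(f"true probability     : {norm.sf(4.0):.2e}")
print(f"Monte Carlo estimate : {p_mc:.2e}")
print(f"Gaussian-fit estimate: {p_fit:.2e}")
```

The Gaussian fit is exact here by construction; for real model activations or logits, the quality of the distributional fit, rather than the sample budget, becomes the limiting factor.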