Note: you are ineligible to complete this challenge if you’ve studied Ancient or Modern Greek, or if you natively speak Modern Greek, or if for other reasons you know what mistakes I’m claiming Opus 4.6 makes. If you’re ineligible, please don’t help other people complete the challenge.
I recently started using Claude Opus 4.6 to study Ancient Greek. Specifically, I initially used it to grade problem sets at the end of the textbook I’ve been using, but then I got worried about it being sycophantic towards my answers, so I started having it just write out the answers itself.
I recently gave it this prompt, from the end of Chapter 3 of my textbook:
...Can you write out the answers to this Ancient Greek fill-in-the-blanks exercise so
In this post, I'll go through some of my best guesses for the current situation in AI as of the start of April 2026. You can think of this as a scenario forecast, but for the present (which is already uncertain!) rather than the future. I will generally state my best guess without argumentation and without explaining my level of confidence: some of these claims are highly speculative while others are better grounded, and some will certainly be wrong. I tried to make it clear which claims are relatively speculative by saying something like "I guess", "I expect", etc. (but I may have missed some).
You can think of this post as more like a list of my current views rather than a structured post with a thesis, but I think it...
The amount of compute determines the optimal number of active params; the available systems determine how the number of total params maps to inference efficiency. Chinese models had to make strange choices on both counts: not having enough compute to train models with a lot of active params, they then needed to compensate with so many total params that the available systems couldn't serve them efficiently, and so gave up completely on fitting them in a few scale-up worlds. As a result, you get things like DeepSeek-V3 with 37B active and 671B t...
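A back-of-the-envelope sketch of the tradeoff above: total params set the memory footprint (whether the model fits in one scale-up domain at all), while active params set the compute per token. The numbers and byte sizes below are illustrative assumptions, not measured figures.

```python
# Illustrative arithmetic for a DeepSeek-V3-like shape: 671B total, 37B active.
# Assumes FP8 (1 byte/param) weights; ignores KV cache and activations.

def weight_memory_gb(total_params_b: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GB."""
    return total_params_b * bytes_per_param  # params in billions * bytes each

def flops_per_token_b(active_params_b: float) -> float:
    """Rough forward-pass cost: ~2 FLOPs per active param per token (in GFLOPs x 1e0, i.e. billions)."""
    return 2 * active_params_b

mem = weight_memory_gb(671, 1.0)      # ~671 GB of weights alone at FP8
compute = flops_per_token_b(37)       # ~74 GFLOPs-scale per token (billions of FLOPs)
print(f"weights: {mem:.0f} GB, per-token compute: {compute:.0f}B FLOPs")
```

Even at 1 byte per param, 671 GB of weights exceeds a single 8x80GB node (640 GB), which is the kind of mismatch with available systems the snippet is pointing at.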
TLDR: The first in a planned series of three or more papers, which together constitute the first major inroad into the compositional learning programme, and a substantial step towards bridging agent foundations theory with practical algorithms.
Official Abstract: We propose novel algorithms for sequence prediction based on ideas from stringology. These algorithms are time and space efficient and satisfy mistake bounds related to particular stringological complexity measures of the sequence. In this work (the first in a series) we focus on two such measures: (i) the size of the smallest straight-line program that produces the sequence, and (ii) the number of states in the minimal automaton that can compute any symbol in the sequence when given its position in base
I haven't tried LZP in practice, but you can guess what results to expect by looking at the size of the LZ77 compression of the text. I expect that any remotely decent text prediction algorithm would be based on stochastic process prediction; the deterministic setting is just a toy model.
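To make the "look at the LZ77 size" heuristic concrete, here is a minimal sketch of a greedy LZ77 factorization that just counts phrases (fewer phrases = more repetitive, hence more predictable, text). The function name and the O(n^2) brute-force matching are my own simplifications, not anything from the paper.

```python
def lz77_factor_count(s: str) -> int:
    """Count phrases in a greedy LZ77 factorization (self-overlap allowed).

    Each phrase is either a single fresh character or the longest match
    whose source starts somewhere in the already-seen prefix.
    """
    n, i, count = len(s), 0, 0
    while i < n:
        best = 0
        for j in range(i):  # candidate match start within the prefix
            l = 0
            while i + l < n and s[j + l] == s[i + l]:  # overlap permitted
                l += 1
            best = max(best, l)
        i += best if best > 0 else 1  # fall back to a literal character
        count += 1
    return count

print(lz77_factor_count("abababab"))  # → 3  (a, b, then "ababab" copied)
print(lz77_factor_count("abcd"))      # → 4  (no repetition: all literals)
```

A highly repetitive string factors into very few phrases, while text with no repeats needs one phrase per character, which is the sense in which the LZ77 size bounds what a compression-based predictor can exploit.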
Thanks for the catch!
I've recently updated towards substantially shorter AI timelines and much faster progress in some areas. [1] The largest updates I've made are (1) an almost 2x higher probability of full AI R&D automation by EOY 2028 (I'm now a bit below 30% [2] while I was previously expecting around 15%; my guesses are pretty reflectively unstable) and (2) I expect much stronger short-term performance on massive and pretty difficult but easy-and-cheap-to-verify software engineering (SWE) tasks that don't require that much novel ideation [3] . For instance, I expect that by EOY 2026, AIs will have a 50%-reliability...
AI stack + conflict parity requires lots of robots (or crazy novel tech) but doesn't require AIs as capable as TEDAI. TEDAI is a very high capabilities bar. So, in worlds without a software only singularity and especially with slower takeoff, I think you may reach AI stack + conflict parity prior to TEDAI. (It's certainly possible to have great military robots and robot industrial capacity with AIs that are well within the human range on key skills.) TEDAI probably follows reasonably quickly, because economic doubling times are so fast in such a world. In ...
This post reflects my personal opinion and not necessarily that of other members of Apollo Research.
TLDR: I think funders should heavily incentivize AI safety work that enables spending $100M+ in compute or API budgets on automated AI labor that directly and differentially translates to safety.
I think we are in a short timeline world (and we should take the possibility seriously even if we don't have full confidence yet). This means that I think funders should aim to allocate large amounts of money (e.g. $1-50B per year across the ecosystem) on AI safety in the next 2-3 years.
I think that AI safety funders have been allocating way too little funding, and their spending has been far too conservative over the past 5 years. So, in my opinion,...
Datasets might be nice.
I used my agent orchestrator with Opus 4.6 and told it:
I ran one version w...