This is a linkpost for https://blog.google/technology/ai/google-gemini-ai/
I wonder why Gemini used RLHF instead of Direct Preference Optimization (DPO). DPO was written up 6 months ago; it's simpler and apparently more compute-efficient than RLHF.
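To make the "simpler" point concrete, here is a minimal sketch of the per-example DPO loss as I understand it from the DPO paper. The function and argument names are my own and purely illustrative, and the inputs are assumed to be summed log-probabilities of a whole response under the policy being trained and under a frozen reference model. Nothing here says anything about how Gemini was actually trained.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio))).

    All names are hypothetical; beta=0.1 is an illustrative default, not a recommendation.
    """
    # How much more likely the policy makes each response, relative to the reference.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed without overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The point of the sketch is that the whole objective is a classification-style loss over preference pairs: there is no separately trained reward model and no sampling loop during training, which is where RLHF pipelines typically spend extra effort and compute.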
"in each of the 50 different subject areas that we tested it on, it's as good as the best expert humans in those areas"
That sounds like an incredibly strong claim, but I suspect that the phrasing is very misleading. What kind of tests is Hassabis talking about here? Maybe those are tests that rely on remembering known facts much more than on making novel inferences? Surely Gemini is not (say) as good as the best mathematicians at solving open problems in mathematics?
Google just announced Gemini, and Hassabis claims that "in each of the 50 different subject areas that we tested it on, it's as good as the best expert humans in those areas."
It also seems like it can understand video, which is new for multimodal models (GPT-4 cannot do this currently).