AI ALIGNMENT FORUM

Google Gemini Announced

by Jacob G-W
6th Dec 2023
1 min read

This is a linkpost for https://blog.google/technology/ai/google-gemini-ai/
2 comments, sorted by top scoring
Nisan

I wonder why Gemini used RLHF instead of Direct Preference Optimization (DPO). DPO was written up 6 months ago; it's simpler and apparently more compute-efficient than RLHF.

  • Is the Gemini org structure so sclerotic that it couldn't switch to a more efficient training algorithm partway through a project?
  • Is DPO inferior to RLHF in some way? Lower quality, less efficient, more sensitive to hyperparameters?
  • Maybe they did use DPO, even though they claimed it was RLHF in their technical report?
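For context on why DPO is often described as simpler: the whole objective is one supervised loss over preference pairs, with no separately trained reward model and no RL rollout loop. A minimal PyTorch-style sketch (the tensor names and the beta value are illustrative, not taken from the DPO paper's code or anything Gemini-related):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss on a batch of preference pairs.

    Each input is a 1-D tensor of summed per-token log-probabilities of the
    chosen / rejected completion under the policy being trained and under a
    frozen reference model (e.g. the SFT checkpoint).
    """
    # Implicit "reward" of each completion: beta-scaled log-probability ratio
    # between the policy and the reference model.
    chosen_reward = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (policy_rejected_logps - ref_rejected_logps)

    # Bradley-Terry preference likelihood: push the chosen completion to
    # out-score the rejected one. No separate reward model, no RL rollout
    # loop -- just a classification-style loss over preference pairs.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```

The claimed compute savings come from skipping both the reward-model training stage and the on-policy sampling that PPO-style RLHF needs.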
Vanessa Kosoy

"in each of the 50 different subject areas that we tested it on, it's as good as the best expert humans in those areas"

That sounds like an incredibly strong claim, but I suspect that the phrasing is very misleading. What kind of tests is Hassabis talking about here? Maybe those are tests that rely on remembering known facts much more than on making novel inferences? Surely Gemini is not (say) as good as the best mathematicians at solving open problems in mathematics?


Google just announced Gemini, and Hassabis claims that "in each of the 50 different subject areas that we tested it on, it's as good as the best expert humans in those areas".

State-of-the-art performance

We've been rigorously testing our Gemini models and evaluating their performance on a wide variety of tasks. From natural image, audio and video understanding to mathematical reasoning, Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development.

With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities.

Our new benchmark approach to MMLU enables Gemini to use its reasoning capabilities to think more carefully before answering difficult questions, leading to significant improvements over just using its first impression.
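The blog doesn't say what this "new benchmark approach" actually is; the Gemini technical report describes it as uncertainty-routed chain-of-thought: sample a number of reasoning chains, take the majority answer only when the samples agree strongly, and otherwise fall back to the ordinary greedy answer (the "first impression"). A rough sketch of that idea, with placeholder function names rather than any real API:

```python
from collections import Counter
from typing import Callable, List

def uncertainty_routed_answer(sample_cot_answers: Callable[[str, int], List[str]],
                              greedy_answer: Callable[[str], str],
                              question: str,
                              k: int = 32,
                              consensus_threshold: float = 0.7) -> str:
    """Sketch of uncertainty-routed chain-of-thought.

    `sample_cot_answers` and `greedy_answer` stand in for model calls
    (sampled chain-of-thought decoding vs. plain greedy decoding); they are
    placeholders, and k / consensus_threshold are illustrative values.
    """
    answers = sample_cot_answers(question, k)            # k sampled reasoning chains
    top_answer, votes = Counter(answers).most_common(1)[0]
    if votes / k >= consensus_threshold:
        return top_answer                                # high agreement: trust the vote
    return greedy_answer(question)                       # low agreement: fall back to greedy
```

The specific values of k and the threshold are assumptions for illustration; the point is only that extra test-time reasoning is used when the model's own samples disagree.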

It also seems like it can understand video, which is new for multimodal models (GPT-4 cannot do this currently).