Gemini 1.5 Pro/Advanced on par with OpenAI's GPT-4o model, latest benchmark scores reveal

Key notes

  • Google’s Gemini 1.5 Pro, which supports a context window of up to 1 million tokens, will soon have a 2M token version.
  • It performs on par with OpenAI’s GPT-4o in some categories, based on the Arena Elo system.
  • Gemini 1.5 Flash, available on Google AI Studio, competes well with Microsoft’s latest Phi-3 models.

Google’s Gemini 1.5 Pro arrived some time ago with a context window of up to 1 million tokens. More recently, the Mountain View tech giant announced at its Google I/O 2024 event that a 2M token version was coming soon for developers to try.

But how good is Gemini 1.5 Pro, actually? Benchmark numbers are usually a good place to start, although they do not necessarily paint a complete picture. It turns out that Gemini 1.5 Pro, and even its “Advanced” tier, is on par with OpenAI’s latest GPT-4o in certain categories.

In the overall leaderboard comparison from LMSYS Org, both Gemini-1.5-Pro-API-0514 and Gemini-Advanced-0514 come close to GPT-4o as measured by the Arena Elo system. The two models also rank highly in the Chinese-language category, and Gemini 1.5 Pro comes close in the “hard prompts” category as well.

The Arena Elo system measures the skill of large language models (LLMs) by having users vote on which of two anonymous models performs better in random battles, then updating the models' ratings much like the Elo system in chess. LMSYS Org, the non-profit AI-oriented organization behind the leaderboard, focuses on comparing models side by side.
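To make the rating mechanism more concrete, here is a minimal sketch of the classic chess-style Elo update applied to head-to-head votes. It is an illustration only: the K-factor of 32, the starting rating of 1000, the model names, and the vote list are all assumptions made up for this example, and the leaderboard's actual rating computation may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one battle.

    score_a is 1.0 if A won the vote, 0.0 if B won, 0.5 for a tie.
    k (assumed to be 32 here) controls how far one vote moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical models and votes, for illustration only.
ratings = {"model_x": 1000.0, "model_y": 1000.0}
votes = [
    ("model_x", "model_y", 1.0),  # a user preferred model_x
    ("model_x", "model_y", 0.5),  # a user called it a tie
    ("model_x", "model_y", 0.0),  # a user preferred model_y
]

for a, b, score_a in votes:
    ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], score_a)

print(ratings)
```

Because every point one model gains is taken from its opponent, the total of the two ratings stays constant; an upset win against a higher-rated model moves the ratings more than an expected win does.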

Gemini 1.5 Flash, which is now available to try on Google AI Studio and Vertex AI, also comes close. For a small, lightweight model, it holds its own against Microsoft’s latest additions to the Phi-3 family, Phi-3-vision and Phi Silica.

When OpenAI unveiled the new GPT-4o model and a ChatGPT desktop app a little while ago, expectations were high. The latest model makes the AI chatbot sound remarkably human when conversing with users.