Gemini Models: All Google Models at a Glance

All Google Gemini models compared: From Gemini 1.0 to 3.0 with prices, context windows, and use cases. Find the best model.

Finn Hillebrandt
March 13, 2026
Links marked with * are affiliate links. If a purchase is made through such links, we receive a commission.

13 models across 4 generations. Some of them free. Some of them genuinely excellent. And some already discontinued.

The Gemini model lineup has gotten confusing fast since Google launched the first version in December 2023. Pro, Flash, Flash-Lite, Nano, Ultra. What does each one do? Which one should you actually use?

I use Gemini models regularly (mostly via the API and the Gemini CLI), and I keep this overview updated as new models drop. Here is everything you need to know about the features, prices, and availability of Google's answers to ChatGPT and Claude.

TL;DR: Key Takeaways
  • Gemini 3.1 Pro (February 2026) is Google's current flagship; the still-active Gemini 2.5 Pro offers a 1 million token context and strong performance in code and analysis for $1.25-2.50/$10-15 per million tokens
  • Gemini 2.5 Flash-Lite is the cheapest powerful LLM on the market ($0.10/$0.40 per million tokens) and offers the best balance of speed, cost, and quality
  • All modern Gemini models (from 1.5) are natively multimodal and process text, images, audio, and video simultaneously with up to 1 million token context

What are Gemini Models?

Gemini models are Google's advanced Large Language Models developed by DeepMind and Google Research.

What makes Gemini different? A few things stand out immediately:

First: Native multimodality from the start. Google trained Gemini on text, images, audio, and video together, rather than bolting the extra modalities on later as other providers did. This gives Gemini a much deeper joint understanding of all these modalities.

Then there's the context window: Gemini 2.5 Pro processes up to 1 million tokens (experimentally even 2 million). That's approximately 700,000 words, or over 1,400 book pages, in a single request.
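The words-and-pages figures are back-of-envelope arithmetic. A quick sketch, assuming roughly 0.7 English words per token and 500 words per book page (both rules of thumb, not official figures):

```python
# Back-of-envelope: what fits in a 1M-token context window.
# Assumptions (rules of thumb, not official figures):
# ~0.7 English words per token, ~500 words per book page.
WORDS_PER_TOKEN = 0.7
WORDS_PER_PAGE = 500

def context_capacity(tokens: int) -> tuple[int, int]:
    """Return (approx. words, approx. book pages) for a token budget."""
    words = int(tokens * WORDS_PER_TOKEN)
    pages = words // WORDS_PER_PAGE
    return words, pages

words, pages = context_capacity(1_000_000)
print(words, pages)  # 700000 1400
```

The same math shows why the old 32,000-token window of Gemini 1.0 felt cramped: it covers only about 44 pages.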

Google also doesn't have a one-model strategy. Instead: Nano for smartphones, Flash for most standard tasks, Pro for demanding stuff. Each has its place. And because Gemini is deeply integrated into Google Search, Workspace, and Android, it works particularly well there.

Google has taken a different approach with Gemini than OpenAI: Instead of focusing on maximum benchmark performance, the focus is on practical versatility, multimodality, and integration into the Google ecosystem.

Tip
If you want to get the most out of Gemini, I recommend our guides on prompting techniques and our comparison of the best AI tools.

Comparison of All Gemini Models

Here's a detailed overview of all Gemini models with their key properties:

| Model | Release | Context Window | Multimodal | Status |
|---|---|---|---|---|
| Gemini 1.0 Pro | 12/2023 | 32,000 tokens | No | Discontinued |
| Gemini 1.0 Ultra | 12/2023 | 32,000 tokens | No | Discontinued |
| Gemini 1.5 Pro | 02/2024 | 2M tokens | Yes | Discontinued · 04/2025 |
| Gemini 1.5 Flash | 05/2024 | 1M tokens | Yes | Discontinued · 04/2025 |
| Gemini 2.0 Flash | 09/2024 | 1M tokens | Yes | Discontinued · 06/2026 |
| Gemini 2.5 Flash-Lite | 11/2024 | 1M tokens | Yes | Active |
| Gemini 2.5 Flash | 11/2024 | 1M tokens | Yes | Active |
| Gemini 2.5 Pro | 11/2024 | 1M tokens | Yes | Active |
| Gemini 3 Pro | 11/2025 | 1M tokens | Yes | Discontinued · 03/2026 |
| Gemini 3 Flash | 12/2025 | 1M tokens | Yes | Active |
| Gemini 3.1 Pro | 02/2026 | 1M tokens | Yes | Active |
| Gemini Nano-1 | 12/2023 | 4,000 tokens | No | Active |
| Gemini Nano-2 | 05/2024 | 4,000 tokens | Yes | Active |

Gemini 3.0 Pro

Released: November 2025

Gemini 3.0 Pro launched as a preview in November 2025, initially limited to an early access program, before becoming generally available.

Key Features:

  • Latest AI generation from Google DeepMind
  • Preview access for selected developers and companies
  • Improved reasoning capabilities compared to Gemini 2.5
  • Multimodal improvements especially in video understanding
  • 1 million token context window (input), up to 64,000 tokens output
  • Tiered API pricing: $2.00 / $12.00 per million tokens (under 200k context), $4.00 / $18.00 (over 200k context)
  • Access via Google AI Studio Early Access Program

Availability: Gemini 3.0 Pro became generally available via Google AI Studio and the Gemini API. Google has since released Gemini 3.1 Pro (February 2026) as its latest flagship and scheduled 3.0 Pro for retirement in March 2026.

Gemini 2.5 Pro

Released: November 2024

Gemini 2.5 Pro is Google's current premium variant. It has the highest performance of the entire family. If you need to solve complex tasks, this is your model. (More about the Gemini API in our separate guide.)

What does Pro offer specifically?

  • Top-tier performance on complex reasoning and code tasks
  • 1 million token context window (experimentally also 2 million)
  • Tiered pricing: $1.25 / $10 for standard prompts (≤ 200K tokens), $2.50 / $15 for longer ones
  • Native multimodality (process text, images, audio, video together)
  • Prompt caching with 75% discount on cached inputs ($0.3125 instead of $1.25-2.50)
  • API model string: gemini-2.5-pro

What makes Gemini 2.5 Pro special?

Gemini 2.5 Pro is Google's answer to Claude 4 Opus and GPT-4o. It offers comparable performance on complex reasoning tasks and surpasses both competitors in processing very long contexts. The 1 million token window enables analysis of complete books, large codebases, or hours of video transcripts in a single API call.

The tiered pricing structure makes it economical: For most standard prompts (≤ 200K tokens) you pay only $1.25 / $10, significantly cheaper than Claude 4 Opus ($15 / $75) with comparable performance.
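A small sketch of how the tiered pricing plays out in practice, using the rates from the price table below (tier threshold at 200K prompt tokens). That the whole request is billed at the tier its prompt size falls into is my reading of tiered pricing, so treat the helper as illustrative:

```python
# Illustrative Gemini 2.5 Pro cost estimate with tiered pricing.
# Rates per million tokens from the price table; assumes the whole
# request is billed at the tier the prompt size falls into.
def gemini_25_pro_cost(input_tokens: int, output_tokens: int,
                       threshold: int = 200_000) -> float:
    """Return the estimated request cost in USD."""
    if input_tokens <= threshold:
        in_rate, out_rate = 1.25, 10.00   # standard tier
    else:
        in_rate, out_rate = 2.50, 15.00   # long-context tier
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

print(gemini_25_pro_cost(100_000, 5_000))  # 0.175
print(gemini_25_pro_cost(500_000, 5_000))  # 1.325
```

Note how crossing the threshold roughly doubles the per-token rate: a 500K-token prompt costs far more than five 100K-token prompts.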

Where can you get it? The Google AI API, Google AI Studio, Vertex AI, or Google Cloud.

When do you need Pro? When you want to analyze entire codebases, comb through long research papers, summarize thick contracts, or process hours of video in one shot. This isn't meant for chatbots. That's what Flash is for and it's cheaper.
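To make API access concrete, here's a minimal sketch that builds (but doesn't send) a REST request against the Gemini API's generateContent endpoint. The v1beta endpoint shape follows Google's public REST docs; the helper name and prompt are my own:

```python
# Build (but don't send) a generateContent request for the Gemini API.
# Endpoint shape per Google's public v1beta REST docs; actually sending
# it requires a real API key (e.g. via urllib.request or curl).
import json

def build_generate_request(model: str, prompt: str, api_key: str):
    """Return (url, json_body) for a generateContent call."""
    url = (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"models/{model}:generateContent?key={api_key}"
    )
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, json.dumps(body)

url, body = build_generate_request(
    "gemini-2.5-pro", "Summarize this contract in five bullets.", "YOUR_KEY"
)
print(url)
```

Swapping the model string (e.g. to gemini-2.5-flash) is the only change needed to target a different model.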

Gemini 2.5 Flash

Released: November 2024

Gemini 2.5 Flash is the balanced variant, the workhorse of the 2.5 series. It delivers roughly 90% of Pro's performance at a fraction of the cost, and it's significantly faster.

The key specs:

  • 90% of Pro performance at a fraction of the cost
  • 2-3x faster than Pro (inference speed)
  • 1 million token context
  • $0.30 input / $2.50 output per million tokens
  • Prompt caching: $0.075 for cached inputs
  • Multimodal: text, images, audio, video
  • API string: gemini-2.5-flash

What makes Gemini 2.5 Flash special?

Gemini 2.5 Flash is the ideal production model for 90% of all use cases. It offers nearly the same quality as Pro (90% performance) at 80% lower cost and 2-3x faster response time. This makes it perfect for chatbots, content generation, and automation workflows where fast responses matter more than absolute highest precision.

Compared to OpenAI's GPT-4o ($2.50 / $10 per million tokens), Gemini 2.5 Flash cuts costs by 75-88% at similar quality. A hard-to-beat price-performance ratio.

You can find Flash via Google AI API, Google AI Studio, Vertex AI, Google Cloud, and it's also the backend model for many Google products.

Specific use cases: Chatbots that need to respond quickly. Content generation (articles, marketing copy, social posts). Data extraction from unstructured sources. Email classification, sentiment analysis, summaries. Screenshot understanding and OCR. For all this, you don't need Pro, Flash is sufficient and saves money.

Tip
Before deciding on a Gemini model, use our API cost calculator to calculate the actual costs for your application.

Gemini 2.5 Flash-Lite

Released: November 2024

Gemini 2.5 Flash-Lite is exactly what the name suggests: the cheapest usable LLM on the market. And extremely fast at the same time.

The key numbers:

  • $0.10 input / $0.40 output per million tokens (cheapest on the market)
  • 5x faster than Pro models
  • Still 70-80% of Flash performance
  • 1 million token context
  • Prompt caching: $0.025 for cached inputs
  • Multimodal: text, images, audio, video
  • API string: gemini-2.5-flash-lite

Why is this so interesting? It's about a third cheaper than GPT-4o-mini ($0.15 / $0.60) and 60-70% cheaper than Claude 3 Haiku ($0.25 / $1.25). And it's not slow. Quite the opposite.

The quality? 70-80% of Flash performance for chatbot responses, simple text generation, and classification. If you need millions of API calls daily, the cost savings are enormous.

Where can you find it? Google AI API, Google AI Studio, Vertex AI.

Use cases: Chatbots serving millions of requests daily. Large-scale content moderation. Sentiment analysis, categorization, tagging. Real-time applications where low latency matters. Massive batch processing on a small budget.
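"Millions of calls daily" sounds expensive, so here's the arithmetic, using the per-million-token rates quoted above. The request sizes (500 input / 100 output tokens per call) are illustrative assumptions:

```python
# Monthly API cost at scale, comparing Flash-Lite and Flash rates
# (per million tokens, as quoted above). Request sizes are
# illustrative assumptions, not measured values.
def monthly_cost(calls_per_day: int, in_tok: int, out_tok: int,
                 in_rate: float, out_rate: float, days: int = 30) -> float:
    per_call = (in_tok * in_rate + out_tok * out_rate) / 1_000_000
    return calls_per_day * per_call * days

# 1M calls/day, ~500 input + 100 output tokens each
lite = monthly_cost(1_000_000, 500, 100, 0.10, 0.40)   # Flash-Lite
flash = monthly_cost(1_000_000, 500, 100, 0.30, 2.50)  # Flash
print(round(lite), round(flash))  # 2700 12000
```

At this volume the model choice alone is worth thousands of dollars per month, which is the whole argument for Flash-Lite.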

Gemini 2.0 Flash

Released: September 2024

Gemini 2.0 Flash is the older version of Flash. The advantage: It's free within rate limits (though it's scheduled for retirement in June 2026).

Quick info:

  • 100% free (rate limits: 15 req./min, 1,500/day, 1M/month)
  • ~80% of 2.5 Flash performance
  • 1 million token context
  • Multimodal: text, images, audio
  • API string: gemini-2.0-flash

Use case: Prototyping, quick tests, low-volume applications. For production use without rate limits, upgrade to 2.5 Flash.
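Whether a workload fits the free tier comes down to three checks against the limits quoted above. A quick sketch; reading the "1M/month" figure as tokens per month is my interpretation of the quoted limit:

```python
# Check a workload against Gemini 2.0 Flash's free-tier limits as
# quoted above: 15 requests/min, 1,500 requests/day, and 1M/month
# (interpreted here as tokens per month -- an assumption).
def fits_free_tier(req_per_min: int, req_per_day: int,
                   tokens_per_month: int) -> bool:
    return (req_per_min <= 15
            and req_per_day <= 1_500
            and tokens_per_month <= 1_000_000)

print(fits_free_tier(5, 400, 600_000))     # True
print(fits_free_tier(20, 1_000, 500_000))  # False: burst rate too high
```

Note that the per-minute limit bites first for bursty traffic: 20 requests in one minute fails even though the daily and monthly budgets would be fine.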

Gemini 1.5 Pro

Released: February 2024

Gemini 1.5 Pro was a big deal in 2024: First model with 2 million token context. That was a world record at the time.

Today: It was shut down on April 30, 2025. If you're still using 1.5 Pro, migrate to 2.5 Pro or newer. Better performance, less hassle.

What 1.5 had: 2 million tokens (impressive back then). Native multimodality. Strong video and document analysis. But that was 2024.

Gemini 1.5 Flash

Released: May 2024

Gemini 1.5 Flash was basically the cheaper, faster version of 1.5 Pro. Also deprecated.

The facts: 1 million token context. Fast, low cost. Multimodal. But it went offline on April 30, 2025. Users should switch to 2.5 Flash or newer.

Gemini 1.0 Pro and Ultra

Released: December 2023

Gemini 1.0 was the first attempt. Today: No longer relevant.

What was it? 32,000 token context. Text-only, no images/videos. Pro was standard, Ultra was premium. Both are long gone. Google quickly replaced them with 1.5 and 2.x, much better models.

Gemini Nano

Released: December 2023 / May 2024

Gemini Nano is different: On-device AI for smartphones. Runs locally, no cloud.

What's important:

  • On-device: Directly on smartphones, no cloud call
  • Two variants: Nano-1 (text-only) and Nano-2 (multimodal)
  • 4,000 token context (small, but sufficient for smartphone tasks)
  • Privacy: Everything stays local
  • Hardware: Pixel smartphones, Samsung Galaxy S24+, other Android devices
  • Use cases: Smart reply, live transcription, offline translation, photo editing

Availability: Already integrated in various Android phones. Google rolls it out via system updates. Developers can use the AICore API.

Price Comparison of All Gemini Models

The following table shows a detailed overview of all Gemini prices (all figures in $ per million tokens). For a detailed analysis, we recommend our API cost calculator:

| Model | Status | Input (Standard) | Output (Standard) | Input (Cached) | Output (Cached) |
|---|---|---|---|---|---|
| Gemini 3.1 Pro | Active | $2 ≤200K / $4 >200K | $12 ≤200K / $18 >200K | $0.20 ≤200K / $0.40 >200K | — |
| Gemini 3 Pro | Discontinued · 03/2026 | $2 ≤200K / $4 >200K | $12 ≤200K / $18 >200K | $0.20 ≤200K / $0.40 >200K | — |
| Gemini 3 Flash | Active | $0.50 | $3 | $0.05 | — |
| Gemini 2.5 Pro | Active | $1.25 ≤200K / $2.50 >200K | $10 ≤200K / $15 >200K | $0.31 ≤200K / $0.63 >200K | — |
| Gemini 2.5 Flash | Active | $0.30 | $2.50 | $0.075 | — |
| Gemini 2.5 Flash-Lite | Active | $0.10 | $0.40 | $0.025 | — |
| Gemini 2.0 Flash | Discontinued · 06/2026 | $0.10 | $0.40 | $0.025 | — |

Important notes on the price table:

  • Gemini 2.5 Pro has tiered pricing: Lower prices for prompts ≤ 200,000 tokens ($1.25 / $10), higher prices for longer prompts (greater than 200,000 tokens: $2.50 / $15)
  • Context caching (prompt caching) enables a 75% discount on cached input tokens with repeated use. Example: Gemini 2.5 Flash input normally costs $0.30, cached only $0.075. Gemini 3 models get an even steeper 90% caching discount
  • Gemini 2.0 Flash is completely free with rate limits: 15 requests per minute, 1,500 per day, 1 million per month
  • Output prices for cached prompts remain the same as standard (no discount on output)
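Caching savings depend on how much of each prompt is actually served from cache. A small sketch using the Gemini 2.5 Flash rates quoted in the notes above ($0.30 standard, $0.075 cached); the 90% cache-hit rate in the example is an illustrative assumption:

```python
# Input cost with prompt caching, Gemini 2.5 Flash rates as quoted
# above ($ per million tokens). Only input tokens get the discount;
# output pricing is unchanged.
def input_cost(prompt_tokens: int, cached_fraction: float,
               std: float = 0.30, cached: float = 0.075) -> float:
    hit = prompt_tokens * cached_fraction
    fresh = prompt_tokens - hit
    return (fresh * std + hit * cached) / 1_000_000

# 100K-token prompt with 90% of it served from cache (assumed):
print(round(input_cost(100_000, 0.9), 5))  # 0.00975
```

For comparison, the same prompt with no cache hits costs $0.03, so a high hit rate cuts the input bill by roughly two thirds.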



Finn Hillebrandt

AI Expert & Blogger

Finn Hillebrandt is the founder of Gradually AI, an SEO and AI expert. He helps online entrepreneurs simplify and automate their processes and marketing with AI. Finn shares his knowledge here on the blog in 50+ articles as well as through his ChatGPT Course and the AI Business Club.


