model brief

Google DeepMind

balanced tier · 2024

Gemini 2.0 Pro

Balanced multimodal Gemini model that blends quality, speed, and long-context reasoning.

Context window

1M tokens (streaming) / 128k tokens (cached context)

Peak context for this model.

Availability

Google AI Studio, Vertex AI

Where you can run it.

Modalities

Text · Vision · Audio · Code

Input/output coverage.

Pricing

$0.35 / 1M input tokens, $1.05 / 1M output tokens

Latency: interactive, with streaming enabled by default
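
The per-token rates listed under Pricing can be turned into a per-request estimate with simple arithmetic. The sketch below is illustrative only: the function name `estimate_cost_usd` and the constant names are hypothetical, the rates are copied from this brief, and it assumes billing is strictly linear in token count with no caching discounts or minimums.

```python
# Rates copied from the Pricing entry in this brief (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.35
OUTPUT_USD_PER_MILLION = 1.05


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming purely linear pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(200_000, 2_000):.4f}")
```

Under these assumptions, a 200,000-token prompt with a 2,000-token reply comes to roughly seven cents, almost all of it from input tokens.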

Strengths

  • Strong grounding in web-scale knowledge with low-latency streaming.
  • Handles mixed-modality inputs across screenshots, PDFs, and audio snippets.
  • Reliable JSON modes for structured output and function calling.
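
Even with a reliable JSON mode, callers usually validate a structured reply before handing it to a function. The following is a minimal sketch of that check using only the Python standard library; it does not call any Gemini API, and the helper name `parse_tool_call`, the sample reply, and the `name`/`args` keys are all hypothetical.

```python
import json


def parse_tool_call(reply_text: str, required_keys: set) -> dict:
    """Parse a JSON reply and confirm it carries the keys a caller expects."""
    try:
        payload = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = required_keys - set(payload)
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return payload


if __name__ == "__main__":
    # Hypothetical model reply requesting a function call.
    reply = '{"name": "lookup_order", "args": {"order_id": 7}}'
    call = parse_tool_call(reply, {"name", "args"})
    print(call["name"], call["args"])
```

Raising `ValueError` on malformed replies lets the caller retry or fall back rather than execute a function with partial arguments.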

Best for

  • Production chat and copilots that must stay within latency limits.
  • Long-context analysis with mixed media attachments.
  • Retrieval-augmented generation and analytics over customer data.

Summary

  • Tier: balanced
  • Release: 2024
  • Latency: interactive, with streaming enabled by default