balanced tier · 2024
Google DeepMind
Gemini 2.0 Pro
Balanced multimodal Gemini model that blends quality, speed, and long-context reasoning.
Context window
1M tokens (streaming) / 128k cached context
Peak context for this model.
Availability
Google AI Studio, Vertex AI
Where you can run it.
Modalities
Text · Vision · Audio · Code
Input/output coverage.
Pricing
$0.35 / 1M input tokens, $1.05 / 1M output tokens
Latency: Interactive, with streaming enabled by default
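As a rough illustration of how the per-token rates above translate into a per-request cost, here is a minimal sketch. The function name and the sample token counts are hypothetical; only the two rates come from the pricing line, and actual billing may differ (for example for cached context).

```python
# Rates copied from the pricing line above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.35
OUTPUT_USD_PER_MILLION = 1.05


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 2,000 output tokens.
cost = estimate_cost_usd(200_000, 2_000)
print(f"${cost:.4f}")  # prints $0.0721
```

At these rates an output token costs three times as much as an input token, so long-context requests with short answers are dominated by the input side.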
Strengths
- Strong grounding in web-scale knowledge, with low-latency streaming.
- Handles mixed-modality inputs such as screenshots, PDFs, and audio snippets.
- Reliable JSON modes for structured calls and function execution.
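Even with a reliable JSON mode, callers usually validate a structured reply before executing a function from it. The sketch below shows one way to do that with the Python standard library only. It does not call any Gemini API, and the reply shape (`name`, `arguments`), the helper name, and the sample reply are all assumptions made for illustration.

```python
import json

# Hypothetical schema: the keys we assume a function-call reply carries.
REQUIRED_KEYS = {"name", "arguments"}


def parse_function_call(reply_text: str) -> dict:
    """Parse and sanity-check a JSON function-call reply (hypothetical shape)."""
    try:
        payload = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("reply must be a JSON object")
    missing = REQUIRED_KEYS - payload.keys()
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    if not isinstance(payload["arguments"], dict):
        raise ValueError("'arguments' must be a JSON object")
    return payload


# Made-up reply text standing in for model output.
sample_reply = '{"name": "lookup_order", "arguments": {"order_id": "A-1001"}}'
call = parse_function_call(sample_reply)
print(call["name"], call["arguments"])
```

Raising `ValueError` on any malformed reply keeps the dispatch code simple: it either receives a dictionary with the expected keys or nothing at all.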
Best for
- Production chat and copilots that need latency caps.
- Long-context analysis with mixed media attachments.
- Retrieval-augmented generation and analytics over customer data.
Summary
- Tier: balanced
- Release: 2024
- Latency: Interactive, with streaming enabled by default