fast tier · 2024
Anthropic
Claude 3.5 Haiku
Lightweight Claude tuned for speed while keeping surprisingly strong reasoning and extraction quality.
Context window
200k tokens
Peak context for this model.
Availability
Anthropic API, Amazon Bedrock, Google Cloud Vertex
Where you can run it.
Modalities
Text · Vision · Code
Input/output coverage.
Pricing
$0.80 / 1M input tokens, $4.00 / 1M output tokens
Latency: Low; tuned for interactive experiences
Strengths
- Fast responses with grounded summaries and low hallucination rates.
- Excellent structured extraction across tables, docs, and receipts.
- Cost-efficient while still supporting reliable tool use.
Best for
- High-traffic chat surfaces and internal assistants.
- Knowledge extraction, enrichment, and tagging.
- Routing and pre-processing before heavier models or pipelines.
Summary
- Tier: fast
- Release: 2024
- Latency: Low; tuned for interactive experiences
Other models