open-weight tier · 2024
Meta
Llama 3.2 90B
Open-weight Llama 3.2 model offering strong reasoning under Meta's community license, with weights you can deploy on your own infrastructure.
Context window
128k tokens
Maximum context length supported by the model.
Availability
Self-hosted, via cloud marketplaces, and through major GPU hosting providers
Where you can run it.
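As a rough illustration of the self-hosted path, the sketch below queries an OpenAI-compatible endpoint (for example, one served by vLLM or TGI) that you run yourself; the localhost URL, API key, and served model name are placeholder assumptions, not part of this listing.

```python
# Minimal sketch: query a self-hosted, OpenAI-compatible endpoint
# (e.g. a vLLM or TGI server you run yourself). The URL, port, API key,
# and served model name below are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused-for-local")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-90B-Vision-Instruct",  # assumed served-model ID
    messages=[{"role": "user", "content": "Name one benefit of self-hosting an open-weight model."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```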
Modalities
Text · Code · Image input
Input/output coverage.
Pricing
Open-weight (no per-token licensing)
Weights are free to download; you pay for your own hosting and compute.
Latency
Varies by host; throughput scales with the GPU cluster you provision.
Strengths
- Competitive reasoning quality among open-weight models.
- Supports fine-tuning and RAG pipelines on self-hosted infrastructure (see the retrieval sketch after this list).
- Transparent licensing for on-prem or VPC deployments.
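A minimal sketch of the retrieval step in a self-hosted RAG pipeline, assuming a sentence-transformers embedding model and a small in-memory corpus; the embedding model name, document snippets, and top-k value are illustrative assumptions, and the assembled prompt would be sent to whatever Llama 3.2 90B endpoint you host.

```python
# Minimal sketch of the retrieval step in a self-hosted RAG pipeline.
# The embedding model, corpus snippets, and top-k value are illustrative
# assumptions; the assembled prompt goes to whatever Llama 3.2 endpoint you host.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

docs = [
    "Llama 3.2 weights can run on-prem under Meta's community license.",
    "The model supports a 128k-token context window.",
    "Self-hosting keeps retrieval corpora inside your own VPC.",
]
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

query = "Can we keep our documents inside our own VPC?"
query_vec = embedder.encode([query], normalize_embeddings=True)[0]

# Vectors are L2-normalized, so a dot product is cosine similarity.
scores = doc_vecs @ query_vec
top_k = np.argsort(scores)[::-1][:2]

context = "\n".join(docs[i] for i in top_k)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # send `prompt` to your self-hosted Llama 3.2 90B endpoint
```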
Best for
- Teams that need vendor-neutral, controllable deployments.
- Private RAG stacks with custom tuning and observability.
- Cost-controlled batch inference across dedicated GPUs (sketched below).
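A minimal offline batch-inference sketch using vLLM on dedicated GPUs; the Hugging Face model ID, tensor_parallel_size, and sampling settings are assumptions to adapt to your own cluster, and the 90B vision variant may need additional serving flags.

```python
# Minimal offline batch-inference sketch with vLLM on dedicated GPUs.
# The model ID, tensor_parallel_size, and sampling settings are assumptions;
# size them to your own cluster (the 90B vision variant may need extra flags).
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.2-90B-Vision-Instruct",  # assumed HF repo name
    tensor_parallel_size=8,  # shard the 90B weights across 8 GPUs
)

prompts = [
    "Classify the sentiment of: 'The deployment went smoothly.'",
    "Summarize in one sentence: open-weight models avoid per-token licensing fees.",
]
params = SamplingParams(temperature=0.2, max_tokens=128)

# generate() runs the whole batch in one pass, keeping the GPUs saturated.
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```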
Summary
- Tier: open-weight
- Release: 2024
- Latency: Varies by host; throughput scales with the GPU cluster you provision