Llama 3.2 90B
Open-weight Llama 3.2 model offering strong reasoning under a permissive open license.
Released
2024
Performance
Varies by host; scales across GPU clusters
Context
128k tokens
Modalities
Text · Image · Code
Pricing
Open-weight (no per-token licensing)
Availability
Self-hosted, cloud marketplaces, supported by major GPU providers
Strengths
- Competitive reasoning quality among open-weight models.
- Supports fine-tuning and RAG pipelines on self-hosted infrastructure.
- Transparent licensing for on-prem or VPC deployments.
Best for
- Teams that need vendor-neutral, controllable deployments.
- Private RAG stacks with custom tuning and observability.
- Cost-controlled batch inference across dedicated GPUs.
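Because the weights are open, a common deployment pattern is to serve the model behind an OpenAI-compatible endpoint (for example, vLLM's built-in server) on your own GPUs and call it over HTTP. The sketch below builds such a chat-completion request payload; the endpoint URL and served model name are illustrative assumptions, not values specified by this card.

```python
import json

# Assumed self-hosted, OpenAI-compatible endpoint (e.g. vLLM's server).
# Both the URL and the model identifier are placeholders for illustration.
ENDPOINT = "http://localhost:8000/v1/chat/completions"
MODEL = "meta-llama/Llama-3.2-90B-Vision-Instruct"


def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-compatible chat-completion payload for a self-hosted model."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


if __name__ == "__main__":
    payload = build_chat_request("Summarize our deployment options.")
    # POST this JSON to ENDPOINT with any HTTP client to run inference.
    print(json.dumps(payload, indent=2))
```

Keeping the request shape OpenAI-compatible means existing client libraries and observability tooling work unchanged against the private deployment, which is the main draw of the vendor-neutral setups this card describes.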