4032 · provider brief

Meta

1 model · 2 modalities · 1 tier

market

Meta

Meta lineup overview: capabilities, latency profiles, and where each model fits inside the 4032.ai bridge.

Modalities

Code · Text

Coverage across the lineup.

Max context

128k tokens

Largest window offered by this provider.

Tiers

open-weight

Blend of speed, reasoning, and openness.

lineup

Meta models

Compare Meta's models side by side: tiers, latency, pricing, and where they slot into your workloads.

2024 · open-weight · Latency varies by host; scales across GPU clusters

Llama 3.2 90B

Open-weight Llama 3.2 model with strong reasoning and a permissive, open-license footprint.


Context

128k tokens

Modalities

Text · Code

Pricing

Open-weight (no per-token licensing)

Availability

Self-hosted, available through cloud marketplaces, and supported by major GPU providers

Strengths

  • Competitive reasoning quality among open-weight models.
  • Supports fine-tuning and RAG pipelines on self-hosted infra (see the serving sketch after this list).
  • Transparent licensing for on-prem or VPC deployments.
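
Because the weights are open, a common serving pattern is to host Llama 3.2 90B behind an OpenAI-compatible endpoint (for example with vLLM or a similar inference server) and call it from application code. A minimal Python sketch, assuming a self-hosted gateway at http://llama.internal:8000 and a model id of llama-3.2-90b, both hypothetical names standing in for your own deployment:

# Minimal sketch: query a self-hosted, OpenAI-compatible Llama 3.2 90B endpoint.
# The base URL and model id are placeholders for your own deployment.
import requests

BASE_URL = "http://llama.internal:8000/v1"   # hypothetical self-hosted gateway
MODEL_ID = "llama-3.2-90b"                   # hypothetical model id on that gateway

def ask(question: str, context: str) -> str:
    """Send one retrieval-augmented prompt: retrieved context plus the user question."""
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        json={
            "model": MODEL_ID,
            "messages": [
                {"role": "system", "content": "Answer using only the provided context."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
            "max_tokens": 512,
            "temperature": 0.2,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("What does the licence allow?", "Llama 3.2 ships with open weights..."))

The same request shape works for plain chat or for a RAG pipeline; only the context string assembled from your retriever changes.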

Best for

  • Teams that need vendor-neutral, controllable deployments.
  • Private RAG stacks with custom tuning and observability.
  • Cost-controlled batch inference across dedicated GPUs (see the batch sketch below).
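
For the batch-inference case, the open weights can also be loaded directly with an offline engine and run over a list of prompts in one pass, so cost tracks GPU hours rather than per-token fees. A rough sketch using vLLM's offline API; the checkpoint name, tensor-parallel degree, and sampling values are illustrative assumptions, and a 90B model needs several GPUs in practice:

# Rough sketch: offline batch inference over the open weights with vLLM.
# Checkpoint name, parallelism, and sampling values are illustrative assumptions.
from vllm import LLM, SamplingParams

prompts = [
    "Summarise the ticket: customer cannot reset their password.",
    "Summarise the ticket: invoice totals do not match the order.",
]

sampling = SamplingParams(temperature=0.2, max_tokens=256)

# tensor_parallel_size spreads the 90B weights across several GPUs on one node.
llm = LLM(model="meta-llama/Llama-3.2-90B-Vision-Instruct", tensor_parallel_size=8)

for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text.strip())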