Improvement · 0.8.1 · Models · API · Transparency

See the quantization each model runs at

June 3, 2026

Model listings now tell you the numeric precision a model is served at, so you can pick with eyes open.

  • GET /v1/models and the public models list now include a quantization field (e.g. fp8, int8, bf16) when the precision is known.
  • Most large models run at fp8; smaller or budget routes may be int8. Models with unknown precision simply omit the field.
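
Because the field is omitted when precision is unknown, client code should treat it as optional. Here is a minimal Python sketch of one way to read it. The response shape (a "data" list of objects with an "id" key) and the model names are assumptions made for illustration; only the optional quantization field is taken from this entry.

```python
import json

# Hypothetical response body. The "data" wrapper, the "id" key and the model
# names are assumptions; only the optional "quantization" field is from this entry.
SAMPLE_RESPONSE = json.loads("""
{
  "data": [
    {"id": "example-large", "quantization": "fp8"},
    {"id": "example-budget", "quantization": "int8"},
    {"id": "example-unknown"}
  ]
}
""")


def quantization_by_model(response):
    """Map each model id to its quantization, or None when the field is omitted."""
    return {
        model["id"]: model.get("quantization")  # .get() tolerates a missing field
        for model in response.get("data", [])
    }


def models_at(response, precision):
    """Return the ids of models whose reported quantization equals `precision`."""
    return [
        model_id
        for model_id, quant in quantization_by_model(response).items()
        if quant == precision
    ]


if __name__ == "__main__":
    for model_id, quant in quantization_by_model(SAMPLE_RESPONSE).items():
        print(f"{model_id}: {quant or 'unknown'}")
```

Reading the field with .get() rather than indexing means the same code works for models that report a precision and for models that leave it out.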

No action needed — it's purely additive on the responses you already get.