Times promptStreaming() over a few fixed prompts. Prefill is input-token throughput up to the first streamed chunk, i.e. input tokens divided by the time to first token (TTFT). Decode is output-token throughput from the first chunk to the last. Token counts come from session.measureContextUsage(), which uses the model's real tokenizer.
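
As a rough illustration of how the two throughput figures follow from those definitions, here is a minimal sketch. The `RunTiming` shape and the `computeMetrics` helper are hypothetical names introduced for this example, not part of the benchmark's code or of the Prompt API; it assumes the timestamps were captured in milliseconds around the streaming loop and that the token counts were obtained separately.

```typescript
// Hypothetical helper: names and shapes are illustrative assumptions,
// not taken from the benchmark's source.
interface RunTiming {
  startMs: number;      // when the streaming call was issued
  firstChunkMs: number; // when the first streamed chunk arrived
  lastChunkMs: number;  // when the last streamed chunk arrived
  inputTokens: number;  // prompt tokens, counted by the tokenizer
  outputTokens: number; // generated tokens, counted by the tokenizer
}

interface RunMetrics {
  ttftMs: number;
  prefillTokPerSec: number;
  decodeTokPerSec: number;
}

function computeMetrics(t: RunTiming): RunMetrics {
  // Time to first token: request start to first streamed chunk.
  const ttftMs = t.firstChunkMs - t.startMs;
  // Decode window: first streamed chunk to last streamed chunk.
  const decodeMs = t.lastChunkMs - t.firstChunkMs;
  return {
    ttftMs,
    // Guard against zero-length windows (e.g. a single-chunk response).
    prefillTokPerSec: ttftMs > 0 ? t.inputTokens / (ttftMs / 1000) : NaN,
    decodeTokPerSec: decodeMs > 0 ? t.outputTokens / (decodeMs / 1000) : NaN,
  };
}
```

With these assumptions, a run that sees its first chunk after 500 ms on a 100-token prompt reports a prefill rate of 200 tok/s. A response that arrives as a single chunk has no decode window, so the sketch returns NaN rather than a misleading rate.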
| Test | In tok | Out tok | TTFT (ms) | Prefill (tok/s) | Decode (tok/s) |
|---|---|---|---|---|---|