Model Fleet Overview
Monitor model health, latency, and safety posture across the active LLM inventory. Track evaluation trends, signal grading, and cost metrics in real time.
- Suites Tracked: 12
- Daily Evals: 847
- Block Rate: 2.3%
- Median Score: 0.92
- Evaluation Trends: 14-day suite performance
- Signal Grading Heatmap: Model × Signal performance
- Emergent Behavior: 14-day signal trends
- Latency Profile: p50/p95 latency & error rates
- Scenario × Model Matrix: Pass rate by scenario and model
Cost Analysis: Model usage & spend
| Model | Tokens | Avg Latency | Total Spend |
|---|---|---|---|
| GPT-4 | 11,858,937 | 595ms | $3,025.92 |
| GPT-4-Turbo | 17,540,252 | 355ms | $1,637.02 |
| Claude-3-Opus | 48,434,037 | 1150ms | $2,560.91 |
| Claude-3-Sonnet | 48,907,152 | 1425ms | $1,971.29 |
| Gemini-Pro | 32,907,598 | 641ms | $4,908.27 |
| Llama-3-70B | 44,475,747 | 525ms | $1,782.31 |
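Raw spend alone is hard to compare across models with very different token volumes. A minimal sketch of one way to normalize the table above into a blended cost per million tokens, using only the token counts and spend figures shown; the helper name and the per-million metric are illustrative additions, not part of the dashboard itself:

```python
# Token counts and total spend copied from the Cost Analysis table above.
usage = {
    "GPT-4": (11_858_937, 3025.92),
    "GPT-4-Turbo": (17_540_252, 1637.02),
    "Claude-3-Opus": (48_434_037, 2560.91),
    "Claude-3-Sonnet": (48_907_152, 1971.29),
    "Gemini-Pro": (32_907_598, 4908.27),
    "Llama-3-70B": (44_475_747, 1782.31),
}

def cost_per_million(tokens: int, spend: float) -> float:
    # Blended $ per 1M tokens; input/output rates are not separated
    # because the table reports only totals.
    return spend / tokens * 1_000_000

per_million = {m: cost_per_million(t, s) for m, (t, s) in usage.items()}

# Most expensive per token first.
for model, rate in sorted(per_million.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model:<16} ${rate:>7.2f} / 1M tokens")
```

On these figures the blended rate spreads by more than 6x across the fleet, so a per-token view changes the ranking that raw spend suggests.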