Wrap your Groq client with wrapGroq() to trace every inference call, including latency breakdowns that are useful given Groq’s fast inference speeds.
Installation
Setup
Example
What gets captured
| Field | Details |
|---|---|
| Model | llama-3.3-70b-versatile, mixtral-8x7b-32768, gemma2-9b-it, etc. |
| Input tokens | From usage.prompt_tokens |
| Output tokens | From usage.completion_tokens |
| Cost | Calculated from token counts and Groq pricing |
| Latency | Total request duration (Groq latency is typically under 1s) |
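
As a rough illustration of how the fields in the table relate to a response, the sketch below derives a trace record from a `usage` block and a measured duration. It is not the implementation behind wrapGroq(): the names `Pricing`, `TraceRecord`, `estimateCost`, `buildRecord`, and `traceCall` are hypothetical, and the prices are placeholders rather than Groq's actual rates. Only the field names `usage.prompt_tokens` and `usage.completion_tokens` come from the table above.

```typescript
// Token counts as reported in the `usage` block of a chat completion response.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

// HYPOTHETICAL pricing shape: USD per one million tokens.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// One captured record, mirroring the rows of the table above.
interface TraceRecord {
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  latencyMs: number;
}

// Cost = input tokens * input price + output tokens * output price.
function estimateCost(usage: Usage, pricing: Pricing): number {
  const inputCost = usage.prompt_tokens * pricing.inputPerMillion;
  const outputCost = usage.completion_tokens * pricing.outputPerMillion;
  return (inputCost + outputCost) / 1e6;
}

// Assemble the captured fields from a model name, usage block, and duration.
function buildRecord(
  model: string,
  usage: Usage,
  latencyMs: number,
  pricing: Pricing,
): TraceRecord {
  return {
    model,
    inputTokens: usage.prompt_tokens,
    outputTokens: usage.completion_tokens,
    costUsd: estimateCost(usage, pricing),
    latencyMs,
  };
}

// Time an async call and build a record from whatever it returns.
async function traceCall<T extends { model: string; usage: Usage }>(
  call: () => Promise<T>,
  pricing: Pricing,
): Promise<{ response: T; record: TraceRecord }> {
  const start = Date.now();
  const response = await call();
  const latencyMs = Date.now() - start; // total request duration
  return {
    response,
    record: buildRecord(response.model, response.usage, latencyMs, pricing),
  };
}

// Usage with a stubbed response (no network call) and PLACEHOLDER prices.
const placeholderPricing: Pricing = { inputPerMillion: 2, outputPerMillion: 4 };
traceCall(
  async () => ({
    model: "llama-3.3-70b-versatile",
    usage: { prompt_tokens: 1000, completion_tokens: 500 },
  }),
  placeholderPricing,
).then(({ record }) => console.log(record));
```

Keeping the cost calculation separate from the timing makes it possible to re-price stored records later if rates change, without re-running any requests.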

