Wrap your Groq client with wrapGroq() to trace every inference call, including the latency data you need to monitor Groq’s fast inference speeds.

Installation

npm install @zespan/sdk groq-sdk

Setup

import Groq from "groq-sdk";
import { zespan } from "@zespan/sdk";

zespan.init({ apiKey: process.env.ZESPAN_API_KEY! });

const groq = zespan.wrapGroq(new Groq({ apiKey: process.env.GROQ_API_KEY! }));

Example

const completion = await groq.chat.completions.create({
  model: "llama-3.3-70b-versatile",
  messages: [{ role: "user", content: "Summarize this in one sentence." }],
});
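To make the idea of tracing a call more concrete, here is a minimal sketch of what a wrapper could do around a request: time it and read the token counts from the response's usage block. This is an illustration only, not how wrapGroq() is implemented. The names traceCall, CompletionLike, and TraceRecord are hypothetical, and the fake response stands in for a real API call so the snippet runs without network access.

```typescript
// Hypothetical shapes that mirror the parts of a chat completion response
// used for tracing. They are assumptions for this sketch, not SDK types.
interface CompletionLike {
  model: string;
  usage?: { prompt_tokens: number; completion_tokens: number };
}

interface TraceRecord {
  model: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

// Runs any async call that resolves to a completion-shaped object,
// measures its total duration, and collects the traced fields.
async function traceCall<T extends CompletionLike>(
  call: () => Promise<T>,
): Promise<{ result: T; trace: TraceRecord }> {
  const start = Date.now();
  const result = await call();
  const latencyMs = Date.now() - start;
  return {
    result,
    trace: {
      model: result.model,
      // Fall back to 0 when the response carries no usage block.
      inputTokens: result.usage?.prompt_tokens ?? 0,
      outputTokens: result.usage?.completion_tokens ?? 0,
      latencyMs,
    },
  };
}

// Usage with a stand-in response instead of a real Groq request.
async function demo(): Promise<TraceRecord> {
  const fakeRequest = async (): Promise<CompletionLike> => ({
    model: "llama-3.3-70b-versatile",
    usage: { prompt_tokens: 12, completion_tokens: 7 },
  });
  const { trace } = await traceCall(fakeRequest);
  return trace;
}
```

With the wrapped client from Setup, none of this is needed in application code; the sketch only shows the kind of bookkeeping a tracing wrapper takes over.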

What gets captured

| Field | Details |
| --- | --- |
| Model | llama-3.3-70b-versatile, mixtral-8x7b-32768, gemma2-9b-it, etc. |
| Input tokens | From usage.prompt_tokens |
| Output tokens | From usage.completion_tokens |
| Cost | Calculated from token counts and Groq pricing |
| Latency | Total request duration (Groq latency is typically under 1s) |
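The cost field is derived from token counts and per-token pricing. The sketch below shows one way such a calculation could look, assuming prices are quoted per million tokens. The function estimateCostUsd, the price table, and the model name "example-model" are hypothetical, and the prices are placeholders rather than Groq's actual rates.

```typescript
// Token counts as they appear in a response's usage block.
interface TokenUsage {
  prompt_tokens: number;
  completion_tokens: number;
}

// Hypothetical price entry in USD per million tokens.
interface ModelPrice {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Placeholder prices for illustration only; not real Groq pricing.
const EXAMPLE_PRICES: Record<string, ModelPrice> = {
  "example-model": { inputPerMillion: 0.5, outputPerMillion: 1.0 },
};

// Returns the estimated cost in USD, or undefined when the model
// has no entry in the price table.
function estimateCostUsd(
  model: string,
  usage: TokenUsage,
  prices: Record<string, ModelPrice> = EXAMPLE_PRICES,
): number | undefined {
  const price = prices[model];
  if (price === undefined) {
    return undefined;
  }
  const inputCost = usage.prompt_tokens * price.inputPerMillion;
  const outputCost = usage.completion_tokens * price.outputPerMillion;
  return (inputCost + outputCost) / 1_000_000;
}

// 1,000,000 input tokens and 500,000 output tokens at the placeholder rates.
const exampleCost = estimateCostUsd("example-model", {
  prompt_tokens: 1_000_000,
  completion_tokens: 500_000,
});
```

Returning undefined for an unknown model, rather than 0, keeps a missing price from being mistaken for a free request.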