LiteLLM

Available for: Python and TypeScript.

LiteLLM provides a unified interface for 100+ LLM providers. In Python, patch the litellm module directly with patch_litellm(). In TypeScript, wrapLiteLLM() returns a wrapped OpenAI-compatible client pointed at your LiteLLM proxy server. The Node ecosystem has no litellm client package, so TypeScript apps talk to LiteLLM through its OpenAI-compatible proxy endpoint rather than through an in-process SDK.

Installation

# TypeScript
npm install @zespan/sdk openai

# Python
pip install zespan litellm

TypeScript only needs the openai package: wrapLiteLLM() builds an OpenAI client internally and points it at your running LiteLLM proxy server. Python calls the litellm package's module-level completion() / acompletion() functions directly, so no proxy is required.

Setup

// TypeScript
import { zespan } from "@zespan/sdk";

zespan.init({ apiKey: process.env.ZESPAN_API_KEY! });

const litellm = zespan.wrapLiteLLM({
  baseURL: "http://localhost:4000", // your LiteLLM proxy server
  apiKey: process.env.LITELLM_API_KEY,
});

# Python
import os
import litellm
import zespan

zespan.init(api_key=os.environ["ZESPAN_API_KEY"])
zespan.patch_litellm()

wrapLiteLLM() takes a { baseURL, apiKey } options object (not a client instance) and returns a wrapped OpenAI-compatible client pointed at that proxy. patch_litellm() monkey-patches litellm.completion and litellm.acompletion at the module level, so there is no client object to construct.
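To make "monkey-patches at the module level" concrete, here is a minimal sketch of the general technique. It is not Zespan's implementation: `fake_litellm`, `patch_module`, and `recorded_spans` are hypothetical names, and a stand-in namespace replaces the real litellm module so the example runs without network access.

```python
import functools
import time
import types

# Stand-in for the litellm module (assumption: only completion() is modeled).
fake_litellm = types.SimpleNamespace()


def _completion(model, messages, **kwargs):
    """Pretend provider call that returns an OpenAI-style response dict."""
    return {"model": model, "choices": [{"message": {"role": "assistant", "content": "ok"}}]}


fake_litellm.completion = _completion

recorded_spans = []  # hypothetical in-memory sink; a real SDK would export these


def patch_module(module):
    """Replace module.completion with a wrapper that records one span per call."""
    original = module.completion
    if getattr(original, "_is_patched", False):
        return  # already wrapped; patching twice would double-count calls

    @functools.wraps(original)
    def traced_completion(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            # Runs on success and on exceptions, so failed calls are recorded too.
            recorded_spans.append({
                "model": kwargs.get("model", args[0] if args else None),
                "latency_s": time.perf_counter() - start,
            })

    traced_completion._is_patched = True
    module.completion = traced_completion


patch_module(fake_litellm)
patch_module(fake_litellm)  # second call is a no-op
result = fake_litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is observability?"}],
)
```

One general property of this technique: replacing a module attribute only affects lookups made after the patch. Code that ran `from some_module import completion` beforehand still holds the original function, which is a reason to patch early and to call through the module, as the examples on this page do.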

Example

// TypeScript
const response = await litellm.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What is observability?" }],
});

# Python
response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is observability?"}],
)

# Async (call from inside an async function)
response = await litellm.acompletion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is observability?"}],
)

Streaming calls are traced end-to-end, including time-to-first-token. This covers litellm.completion(..., stream=True) and litellm.acompletion(..., stream=True) in Python, as well as streaming calls through the wrapped TypeScript client.
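As an illustration of what time-to-first-token means for a streamed response, the sketch below wraps an iterator of chunks and records two timings. It is a generic pattern, not the SDK's code: `trace_stream`, `fake_stream`, and the `timings` dict are hypothetical, and the chunks are plain strings rather than real LiteLLM stream objects.

```python
import time
from typing import Iterable, Iterator


def trace_stream(chunks: Iterable[str], timings: dict) -> Iterator[str]:
    """Pass chunks through unchanged while recording TTFT and total duration."""
    start = time.perf_counter()  # taken when the call is made, before any chunk arrives

    def _generate() -> Iterator[str]:
        first_seen = False
        try:
            for chunk in chunks:
                if not first_seen:
                    timings["ttft_s"] = time.perf_counter() - start
                    first_seen = True
                yield chunk
        finally:
            # Also runs if the consumer stops early or the stream raises.
            timings["total_s"] = time.perf_counter() - start

    return _generate()


def fake_stream() -> Iterator[str]:
    """Stand-in for a streaming response; sleeps to simulate network delay."""
    for piece in ["Obs", "erv", "ability"]:
        time.sleep(0.01)
        yield piece


timings: dict = {}
text = "".join(trace_stream(fake_stream(), timings))
```

The caller still consumes the stream as usual. The total duration is only known once the stream is exhausted or closed, which is why it is recorded in the `finally` clause.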

What gets captured

Field	Details
Model	The model string passed to LiteLLM (e.g. `gpt-4o`, `claude-sonnet-4-6`, `gemini/gemini-2.5-flash`)
Input tokens	From LiteLLM’s normalized `usage` field
Output tokens	From LiteLLM’s normalized `usage` field
Cost	Calculated from token counts and Zespan’s model pricing registry
Latency	Total call duration
Finish reason	LiteLLM’s normalized `finish_reason` (e.g. `stop`, `length`, `tool_calls`)
Tool calls	Tool name and parsed arguments, both languages
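
The fields in the table map onto the OpenAI-format response that LiteLLM returns. The following is a rough sketch of that mapping under stated assumptions: the response is shown as a plain dict (the real call returns a response object), `summarize` and `PRICING` are hypothetical names, and the prices are invented for illustration and do not come from Zespan's pricing registry. Latency is omitted because it is measured around the call, not read from the response.

```python
import json

# Hypothetical prices in USD per million tokens, for illustration only.
PRICING = {"example-model": {"input": 2.0, "output": 8.0}}


def summarize(response: dict, pricing: dict = PRICING) -> dict:
    """Extract model, token counts, cost, finish reason, and tool calls."""
    usage = response["usage"]
    choice = response["choices"][0]

    # Cost is token count times the per-token rate; unknown models get no cost.
    rates = pricing.get(response["model"])
    cost = None
    if rates is not None:
        cost = (
            usage["prompt_tokens"] * rates["input"]
            + usage["completion_tokens"] * rates["output"]
        ) / 1_000_000

    # Tool arguments arrive as a JSON string and are parsed into a dict here.
    tool_calls = [
        {
            "name": call["function"]["name"],
            "arguments": json.loads(call["function"]["arguments"]),
        }
        for call in choice["message"].get("tool_calls") or []
    ]

    return {
        "model": response["model"],
        "input_tokens": usage["prompt_tokens"],
        "output_tokens": usage["completion_tokens"],
        "cost_usd": cost,
        "finish_reason": choice["finish_reason"],
        "tool_calls": tool_calls,
    }


sample = {
    "model": "example-model",
    "usage": {"prompt_tokens": 1000, "completion_tokens": 500},
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "type": "function",
                "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
            }],
        },
    }],
}

record = summarize(sample)
```

Returning `None` for the cost of an unrecognized model, instead of zero, keeps "no price known" distinguishable from "free" in this sketch.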
