The Zespan Python SDK instruments your Python LLM application with a single init() call. It uses a background daemon thread to flush events without blocking your application, and registers an atexit handler to flush on process exit.
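The flushing pattern described above can be sketched with the standard library alone. This is an illustration of the general technique (buffer, daemon worker, atexit hook), not Zespan's actual implementation; the EventBuffer class and its send callback are hypothetical names.

```python
import atexit
import threading
from typing import Callable


class EventBuffer:
    """Illustrative batching buffer. Hypothetical; not Zespan's real internals."""

    def __init__(self, send: Callable[[list], None],
                 batch_size: int = 50, flush_interval: float = 2.0) -> None:
        self._send = send
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._events: list = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        # daemon=True: the worker never keeps the process alive on its own.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()
        # Last-chance flush when the interpreter exits normally.
        atexit.register(self.flush)

    def add(self, event: dict) -> None:
        """Queue an event; flush right away once the batch is full."""
        with self._lock:
            self._events.append(event)
            full = len(self._events) >= self._batch_size
        if full:
            self.flush()

    def flush(self) -> None:
        """Swap the pending list out under the lock, then send outside it."""
        with self._lock:
            batch, self._events = self._events, []
        if batch:
            self._send(batch)

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop is set.
        while not self._stop.wait(self._flush_interval):
            self.flush()

    def close(self) -> None:
        """Stop the worker and send whatever is still queued."""
        self._stop.set()
        self._worker.join()
        self.flush()
```

Because the worker is a daemon thread, a process that is frozen or killed before the next interval can lose queued events, which is why an explicit flush matters in short-lived environments.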

Installation

pip install zespan

Initialization

Call zespan.init() once at startup before making any LLM calls. All parameters are keyword arguments.
import zespan

zespan.init(
    api_key="zsp_your_api_key_here",
    environment="production",
    store_prompts=True,
    sample_rate=1.0,
    debug=False,
)
Parameters:
  • api_key (string, required): Your Zespan API key. Must start with zsp_. Find this in your project settings.
  • environment (string, default "production"): Environment label attached to every event. Use "staging" or "development" to separate traces by environment.
  • store_prompts (boolean, default True): When True, prompt and completion text are stored alongside traces, with PII redaction applied before transmission. Set to False to disable prompt storage entirely.
  • sample_rate (float, default 1.0): Fraction of events to send, between 0.0 and 1.0. Set to 0.1 to trace 10% of calls.
  • redact_keys (list[str]): Keys whose values are redacted before any data is stored. Applied regardless of store_prompts.
  • batch_size (int, default 50): Number of events to accumulate before flushing.
  • flush_interval (float, default 2.0): Seconds between automatic flushes. The SDK also flushes on process exit.
  • base_url (string, default "https://api.zespan.com"): Override the API base URL. Use this only for self-hosted deployments.
  • enable_otel (boolean, default False): When True, also exports spans to an OpenTelemetry-compatible backend. Requires otel_endpoint.
  • otel_endpoint (string): OTel collector endpoint URL. Required when enable_otel=True.
  • debug (boolean, default False): When True, logs internal flush errors to stdout. Enable during integration testing.
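The constraints listed above (the zsp_ prefix, the sample_rate range, and otel_endpoint being required with enable_otel) can be checked before calling init(). The helper below is a minimal sketch; validate_init_options is a hypothetical name and not part of the SDK, which may perform its own validation.

```python
def validate_init_options(options: dict) -> list[str]:
    """Return the problems found in keyword arguments meant for zespan.init().

    Hypothetical pre-flight check based only on the constraints documented above.
    """
    problems = []

    api_key = options.get("api_key")
    if not isinstance(api_key, str) or not api_key.startswith("zsp_"):
        problems.append("api_key is required and must start with 'zsp_'")

    sample_rate = options.get("sample_rate", 1.0)
    if not 0.0 <= sample_rate <= 1.0:
        problems.append("sample_rate must be between 0.0 and 1.0")

    if options.get("enable_otel", False) and not options.get("otel_endpoint"):
        problems.append("otel_endpoint is required when enable_otel=True")

    return problems
```

A typical use is to build the options dict from environment variables, call the helper, and fail fast at startup if the returned list is not empty.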

Auto-patching all providers

autopatch() detects which LLM libraries are installed and patches them all in one call. Use this instead of calling individual patch functions.
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.autopatch()
Covers: OpenAI, Anthropic, Google Generative AI, AWS Bedrock, Mistral, Groq, LiteLLM.
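To preview which of these libraries are present in an environment, a standard-library check like the following may help. It is a sketch only: how autopatch() actually detects libraries is not documented here, and the module names are taken from the import statements in the provider examples on this page.

```python
import importlib.util

# Import names as used in the provider examples on this page.
PROVIDER_MODULES = {
    "OpenAI": "openai",
    "Anthropic": "anthropic",
    "Google Generative AI": "google.generativeai",
    "AWS Bedrock": "boto3",
    "Mistral": "mistralai",
    "Groq": "groq",
    "LiteLLM": "litellm",
}


def is_installed(module_name: str) -> bool:
    """Report whether a module can be found, without importing the module itself.

    Note: for dotted names, find_spec imports the parent package as a side effect.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError: the parent package of a dotted name is missing.
        # ValueError: the module is already loaded but has no usable spec.
        return False


def installed_providers() -> list[str]:
    """Return the provider labels whose libraries are importable here."""
    return [name for name, module in PROVIDER_MODULES.items() if is_installed(module)]
```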

Provider patches

OpenAI

import openai
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

client = openai.OpenAI()

# Sync
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

# Async
import asyncio

async def main():
    async_client = openai.AsyncOpenAI()
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Translate 'hello' to Spanish."}],
    )

asyncio.run(main())

Anthropic

import anthropic
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_anthropic()

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain monads in plain English."}],
)

Google Generative AI

import google.generativeai as genai
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_google()

genai.configure(api_key="your_google_api_key")
model = genai.GenerativeModel("gemini-2.5-flash")
response = model.generate_content("What is quantum entanglement?")

AWS Bedrock

import boto3
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_bedrock()

client = boto3.client("bedrock-runtime", region_name="us-east-1")
response = client.converse(
    modelId="amazon.nova-lite-v1:0",
    messages=[{"role": "user", "content": [{"text": "Summarize this document."}]}],
)

Mistral

from mistralai import Mistral
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_mistral()

client = Mistral(api_key="your_mistral_api_key")
response = client.chat.complete(
    model="mistral-small-latest",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

Groq

from groq import Groq
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_groq()

client = Groq(api_key="your_groq_api_key")
response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Explain gradient descent."}],
)

LiteLLM

import litellm
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_litellm()

response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from LiteLLM!"}],
)

OpenRouter

import openai
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openrouter()

client = openai.OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="your_openrouter_api_key",
)
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello from OpenRouter!"}],
)

Context enrichment

Use with_zespan_context to attach a user_id, session_id, or custom tags to all traces generated inside its with block.
import zespan
from zespan import with_zespan_context

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

import openai
client = openai.OpenAI()

def handle_request(user_id: str, session_id: str, message: str) -> str:
    with with_zespan_context(user_id=user_id, session_id=session_id, tags={"feature": "chat"}):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": message}],
        )
        return response.choices[0].message.content
Set user_id and session_id on every request that involves a logged-in user. This enables per-user cost breakdown and session replay in the Zespan dashboard.
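Scoped context of this kind is commonly built on Python's contextvars module, which keeps values isolated per thread and per asyncio task. The sketch below shows the general technique; whether Zespan uses contextvars internally is an assumption, and trace_context and current_trace_context are hypothetical names.

```python
import contextvars
from contextlib import contextmanager

# Holds the fields active for the current thread or asyncio task.
# The default dict is never mutated; each scope sets a fresh merged copy.
_current_context = contextvars.ContextVar("trace_context", default={})


@contextmanager
def trace_context(**fields):
    """Hypothetical stand-in for with_zespan_context: merge fields for one scope."""
    merged = {**_current_context.get(), **fields}
    token = _current_context.set(merged)
    try:
        yield merged
    finally:
        # Restore whatever was active before this block, even on exceptions.
        _current_context.reset(token)


def current_trace_context() -> dict:
    """Return a copy of the fields active at the call site."""
    return dict(_current_context.get())
```

Nested blocks merge with the outer scope and restore it on exit, so an inner block that sets only session_id still sees the user_id set by an outer one.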

Agent tracing with with_agent

Use the with_agent context manager to trace a multi-step agent workflow. It creates an agent span and exposes methods to log plans, trace tool calls, and record handoffs to other agents.
import zespan
from zespan import with_agent

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

with with_agent(
    name="CustomerSupportAgent",
    role="specialist",
    framework="custom",
    tools=[{"name": "lookup_order", "description": "Lookup order by id"}],
) as agent:
    agent.log_plan(["Lookup order", "Check refund policy", "Draft response"])

    order = agent.trace_tool(
        "lookup_order",
        {"order_id": "123"},
        lambda: {"id": "123", "status": "delivered", "total": 49.99},
    )

    agent.delegate_to("RefundPolicyAgent", "refund requested")
with_agent parameters:
  • name (string, required): Display name for this agent in traces and the agent registry.
  • role (string, default "specialist"): Role label such as "coordinator", "specialist", or "planner".
  • framework (string, default "custom"): Framework name, e.g. "custom", "langchain", "google-adk".
  • tools (object[]): List of tool definition objects with name and description fields.
AgentContext methods:
  • agent.log_plan(steps: list[str]) — records a planning span
  • agent.trace_tool(name, args, callable) — wraps a callable, records args and return value as a tool span
  • agent.delegate_to(target_name, reason) — records a handoff span
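Since trace_tool takes the recorded args separately from the callable, the two can drift apart. A small helper can keep them in sync. This is a sketch: call_traced_tool is a hypothetical name, and it assumes, as the example above suggests, that trace_tool invokes the callable with no arguments and returns its result.

```python
from typing import Any, Callable


def call_traced_tool(
    trace_tool: Callable[[str, dict, Callable[[], Any]], Any],
    name: str,
    func: Callable[..., Any],
    **kwargs: Any,
) -> Any:
    """Record kwargs as the tool args and call func with those same kwargs.

    trace_tool is expected to be a bound method such as agent.trace_tool.
    """
    return trace_tool(name, kwargs, lambda: func(**kwargs))
```

Inside a with_agent block this would be called as call_traced_tool(agent.trace_tool, "lookup_order", lookup_order, order_id="123"), where lookup_order is your own function.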

Manual spans with start_span

Use start_span to instrument any function as a custom span and attach evaluation scores.
import zespan
from zespan import start_span

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

import openai
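client = openai.OpenAI()

def retrieve_documents(query: str) -> list[str]:
    # Hypothetical placeholder: replace with your own retrieval logic.
    return ["Example document text."]
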
with start_span(name="rag-pipeline") as span:
    docs = retrieve_documents("user query")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Use these docs: {docs}"},
            {"role": "user", "content": "user query"},
        ],
    )
    span.set_eval_score("relevance", 0.92)

Prompt management

The PromptClient fetches versioned prompts from the Zespan prompt library at runtime. Results are cached locally for 5 minutes.
from zespan import PromptClient, get_client
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

prompts = PromptClient(get_client())

prompt = prompts.get("support-reply", label="production")
text = prompts.compile(prompt, {
    "customer_name": "Alex",
    "order_id": "ORD-7821",
})

import openai
client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": text},
        {"role": "user", "content": "I need help with my order."},
    ],
)
See the TypeScript prompt management docs for the full API reference — the Python PromptClient exposes the same methods: get, list, create, update_labels, compile, and clear_cache.
In Python, method and parameter names use snake_case: update_labels, clear_cache, prompt_type, commit_message.
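The 5-minute cache means a prompt edited in the library may not be picked up until the cached entry expires or the cache is cleared. The time-to-live behavior can be sketched as follows; TTLCache is a hypothetical illustration of the general technique, not the PromptClient's actual cache.

```python
import time
from typing import Any, Callable, Hashable


class TTLCache:
    """Illustrative time-to-live cache. Hypothetical; not the SDK's implementation."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests do not have to sleep
        self._entries: dict = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        """Return the cached value if it is younger than the TTL, else refetch."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value

    def clear(self) -> None:
        """Drop every entry so the next lookup fetches fresh data."""
        self._entries.clear()
```

In this sketch the key would be something like the (name, label) pair, so "support-reply" at "production" and at another label are cached independently.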

Flushing in serverless environments

In short-lived processes such as AWS Lambda, Vercel Functions, or Cloud Run, call zespan.flush() explicitly before the handler returns to guarantee delivery.
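import openai
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

# Create the client once, outside the handler, so warm invocations reuse it.
openai_client = openai.OpenAI()

def lambda_handler(event, context):
    response = openai_client.chat.completions.create(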
        model="gpt-4o",
        messages=[{"role": "user", "content": event["prompt"]}],
    )
    result = response.choices[0].message.content

    zespan.flush()
    return {"statusCode": 200, "body": result}
Omitting zespan.flush() in serverless environments is the most common cause of missing traces. Always call it before your handler returns.
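A flush placed at the end of the handler is skipped if the handler raises or returns early. One way to cover every exit path is a decorator with a try/finally block. This is a minimal sketch; flush_on_exit is a hypothetical helper, not part of the SDK, and it takes the flush function as an argument.

```python
import functools
from typing import Any, Callable


def flush_on_exit(flush: Callable[[], None]) -> Callable:
    """Build a decorator that calls flush() after the handler, even on errors."""

    def decorator(handler: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(handler)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            try:
                return handler(*args, **kwargs)
            finally:
                # Runs on normal return, early return, and exceptions alike.
                flush()

        return wrapper

    return decorator
```

With the SDK this would be applied as @flush_on_exit(zespan.flush) above the handler definition. Note that an exception raised by flush itself would replace the handler's own exception.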

PII redaction

Zespan automatically redacts values from tags and metadata fields before they leave your application. The key is preserved; the value is replaced with "[REDACTED]".
Default redacted keys (applied unless you explicitly replace them): password, secret, token, api_key, apikey, auth, authorization, access_token, refresh_token, private_key, credential, ssn, credit_card, card_number, cvv.
Add custom keys at initialization. They are merged with the defaults:
import zespan

zespan.init(
    api_key="zsp_your_api_key_here",
    redact_keys=["email", "phone", "address", "ip_address", "dob"],
)
To replace the defaults entirely with your own list:
zespan.init(
    api_key="zsp_your_api_key_here",
    redact_keys=["ssn", "account_number"],
    replace_default_redact_keys=True,
)
Replacing the defaults removes protection for common sensitive field names like password and token. Only do this if your custom list covers all sensitive fields your application may produce.
Redaction applies to tags and metadata fields, and to stored prompt and completion text. Prompt storage is on by default — set store_prompts=False to disable it entirely.
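The merge and replace behavior for tags and metadata can be sketched as below. The default key list is copied from this page; everything else is an assumption made for illustration, in particular exact, case-insensitive matching on top-level keys and the helper names build_redact_keys and redact. The SDK's real matching rules may differ.

```python
# Default keys as listed on this page.
DEFAULT_REDACT_KEYS = frozenset({
    "password", "secret", "token", "api_key", "apikey", "auth", "authorization",
    "access_token", "refresh_token", "private_key", "credential", "ssn",
    "credit_card", "card_number", "cvv",
})


def build_redact_keys(custom=(), replace_defaults: bool = False) -> frozenset:
    """Merge custom keys with the defaults, or use the custom keys alone."""
    custom_keys = frozenset(key.lower() for key in custom)
    return custom_keys if replace_defaults else DEFAULT_REDACT_KEYS | custom_keys


def redact(data: dict, keys: frozenset) -> dict:
    """Return a copy of data with matching values replaced; keys are preserved."""
    return {
        key: "[REDACTED]" if key.lower() in keys else value
        for key, value in data.items()
    }
```

Running a sample tags dict through redact with both key sets is a quick way to see what replacing the defaults would expose, for example that password is no longer covered.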

Guardrails

Guardrails run content safety checks before sending a prompt to the LLM (pre-check) and before returning the completion (post-check). Pass guardrails=True to any patch function to enable both phases.
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai(guardrails=True)
For fine-grained control, pass a dict:
zespan.patch_openai(guardrails={
    "pre": True,         # Check prompt before sending to LLM
    "post": True,        # Check completion before returning to app
    "fail_closed": False # If guardrail service errors, allow call through
})
Handle blocks with GuardrailBlockedError:
import zespan
from zespan import GuardrailBlockedError

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai(guardrails=True)

import openai
client = openai.OpenAI()

def generate_response(user_message: str) -> str:
    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": user_message}],
        )
        return response.choices[0].message.content
    except GuardrailBlockedError as e:
        print(f"Guardrail blocked {e.phase} content:", e.results)
        return "I'm sorry, I can't help with that request."
GuardrailBlockedError properties:
  • phase: "pre" (input blocked) or "post" (output blocked)
  • results: list of per-policy results with passed, action, reason, modified_text
Configure guardrail policies in the Zespan dashboard under Settings → Guardrails. Policies are evaluated server-side — update them without redeploying your application.
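The fail_closed option decides what happens when the guardrail check itself cannot run, as opposed to when it runs and rejects the content. The sketch below illustrates that distinction; guardrail_allows and its check callback are hypothetical, and the behavior for fail_closed=True (block the call) is inferred from the option name.

```python
from typing import Callable


def guardrail_allows(check: Callable[[str], bool], text: str,
                     fail_closed: bool = False) -> bool:
    """Return True if the call may proceed.

    check returns True when the text passes policy and raises if the
    guardrail service cannot be reached or errors out.
    """
    try:
        return check(text)
    except Exception:
        # The check could not run at all.
        # fail_closed=False: allow the call through (availability first).
        # fail_closed=True: block the call (safety first).
        return not fail_closed
```

The trade-off is between availability and safety: failing open keeps the application responsive during a guardrail outage, while failing closed guarantees nothing unchecked reaches the model or the user.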

Config propagation

Zespan can push configuration changes (model overrides, sample rate, guardrail toggles) to your running application without a redeployment. Changes made via ZespanPilot or the dashboard take effect within the next flush cycle (default 2 seconds).

What can be propagated:

| Rule type | Effect |
| --- | --- |
| model_override | Redirect calls for a given model to a different model |
| sample_rate | Increase or decrease the fraction of events traced |
| guardrail_enable | Enable or disable guardrails on a wrapper |
| pii_redact_keys | Add keys to the redaction list |
| log_level | Toggle debug logging |
| disable_tracing | Stop all tracing immediately |

Force an immediate config refresh:
import zespan

zespan.init(api_key="zsp_your_api_key_here")

# Force refresh — useful during startup
zespan.get_client().refresh_config()
To disable remote config propagation entirely:
zespan.init(
    api_key="zsp_your_api_key_here",
    disable_config_sync=True,
)
With disable_config_sync=True, the SDK ignores all remote config changes. All behavior is determined solely by the options passed to init().