The Zespan Python SDK instruments your Python LLM application with a single init() call. It uses a background daemon thread to flush events without blocking your application, and registers an atexit handler to flush on process exit.
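The flushing model described above can be illustrated with a small, generic sketch. This is not Zespan's implementation: the `EventBuffer` class, its `send` callable, and the `flush_interval` argument are names invented here to show how a daemon thread and an `atexit` handler can cooperate.

```python
import atexit
import threading


class EventBuffer:
    """Illustrative buffer: a daemon thread flushes queued events periodically."""

    def __init__(self, send, flush_interval=2.0):
        self._send = send                  # hypothetical callable that ships a list of events
        self._interval = flush_interval
        self._events = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        # daemon=True: the thread never keeps the interpreter alive on its own.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        # Final flush when the interpreter exits normally.
        atexit.register(self.flush)

    def add(self, event):
        with self._lock:
            self._events.append(event)

    def flush(self):
        # Swap the list under the lock and send outside it, so add() never waits on I/O.
        with self._lock:
            batch, self._events = self._events, []
        if batch:
            self._send(batch)

    def _run(self):
        # wait() returns False on timeout and True once stop() sets the event.
        while not self._stop.wait(self._interval):
            self.flush()

    def stop(self):
        self._stop.set()
        self._thread.join()
        self.flush()
```

Because `atexit` handlers do not run when a process is killed or frozen abruptly, a design like this still benefits from an explicit flush in short-lived environments.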
Installation
Initialization
Call zespan.init() once at startup before making any LLM calls. All parameters are keyword arguments.
import zespan
zespan.init(
api_key="zsp_your_api_key_here",
environment="production",
store_prompts=True,
sample_rate=1.0,
debug=False,
)
Parameters:
api_key
Your Zespan API key. Must start with zsp_. Find this in your project settings.
environment
string
default:"production"
Environment label attached to every event. Use "staging" or "development" to separate traces by environment.
store_prompts
When True (default), prompt and completion text are stored alongside traces, with PII redaction applied before transmission. Set to False to disable prompt storage entirely.
sample_rate
Fraction of events to send, between 0.0 and 1.0. Set to 0.1 to trace 10% of calls.
redact_keys
Keys whose values are redacted before any data is stored. Applied regardless of store_prompts.
Number of events to accumulate before flushing.
Seconds between automatic flushes. The SDK also flushes on process exit.
base_url
string
default:"https://api.zespan.com"
Override the API base URL. Use this only for self-hosted deployments.
enable_otel
When True, also exports spans to an OpenTelemetry-compatible backend. Requires otel_endpoint.
otel_endpoint
OTel collector endpoint URL. Required when enable_otel=True.
debug
When True, logs internal flush errors to stdout. Enable during integration testing.
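The constraints in the parameter list (the zsp_ key prefix, the 0.0 to 1.0 sample-rate range, and enable_otel requiring otel_endpoint) can be checked before init() is called. The sketch below is one way to do that. The helper `build_init_kwargs` and the environment variable names are assumptions made for illustration and are not part of the SDK.

```python
import os


def build_init_kwargs(env=None):
    """Collect and sanity-check init() options from environment variables."""
    env = os.environ if env is None else env

    # Variable names below are hypothetical; use whatever your deployment defines.
    api_key = env.get("ZESPAN_API_KEY", "")
    if not api_key.startswith("zsp_"):
        raise ValueError("api_key must start with 'zsp_'")

    sample_rate = float(env.get("ZESPAN_SAMPLE_RATE", "1.0"))
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")

    kwargs = {
        "api_key": api_key,
        "environment": env.get("APP_ENV", "production"),
        "sample_rate": sample_rate,
    }

    # enable_otel requires otel_endpoint, so only set the two together.
    otel_endpoint = env.get("OTEL_ENDPOINT")
    if otel_endpoint:
        kwargs["enable_otel"] = True
        kwargs["otel_endpoint"] = otel_endpoint
    return kwargs
```

The result can then be passed straight through as `zespan.init(**build_init_kwargs())`, which keeps misconfiguration errors at startup rather than at the first flush.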
Auto-patching all providers
autopatch() detects which LLM libraries are installed and patches them all in one call. Use this instead of calling individual patch functions.
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.autopatch()
Covers: OpenAI, Anthropic, Google Generative AI, AWS Bedrock, Mistral, Groq, LiteLLM.
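How autopatch() detects installed libraries is not documented here. As a rough illustration of the general technique, the standard library can report whether a module is importable without importing it. The mapping and the function name below are assumptions for this sketch, not SDK internals.

```python
from importlib.util import find_spec

# Hypothetical mapping from a provider label to the module that provider ships.
PROVIDER_MODULES = {
    "openai": "openai",
    "anthropic": "anthropic",
    "google": "google.generativeai",
    "bedrock": "boto3",
    "mistral": "mistralai",
    "groq": "groq",
    "litellm": "litellm",
}


def installed_providers(modules=PROVIDER_MODULES):
    """Return provider labels whose module can be located in this environment."""
    found = []
    for label, module in modules.items():
        try:
            if find_spec(module) is not None:
                found.append(label)
        except (ImportError, ValueError):
            # A dotted name raises ModuleNotFoundError when its parent package
            # is missing; treat that the same as "not installed".
            continue
    return found
```

Calling `installed_providers()` before `zespan.autopatch()` can be a quick way to log which libraries are present in a given environment.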
Provider patches
OpenAI
import openai
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()
client = openai.OpenAI()
# Sync
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "What is the capital of France?"}],
)
# Async
import asyncio
async def main():
    async_client = openai.AsyncOpenAI()
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Translate 'hello' to Spanish."}],
    )
asyncio.run(main())
Anthropic
import anthropic
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_anthropic()
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": "Explain monads in plain English."}],
)
Google Generative AI
import google.generativeai as genai
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_google()
genai.configure(api_key="your_google_api_key")
model = genai.GenerativeModel("gemini-2.5-flash")
response = model.generate_content("What is quantum entanglement?")
AWS Bedrock
import boto3
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_bedrock()
client = boto3.client("bedrock-runtime", region_name="us-east-1")
response = client.converse(
modelId="amazon.nova-lite-v1:0",
messages=[{"role": "user", "content": [{"text": "Summarize this document."}]}],
)
Mistral
from mistralai import Mistral
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_mistral()
client = Mistral(api_key="your_mistral_api_key")
response = client.chat.complete(
model="mistral-small-latest",
messages=[{"role": "user", "content": "What is the capital of France?"}],
)
Groq
from groq import Groq
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_groq()
client = Groq(api_key="your_groq_api_key")
response = client.chat.completions.create(
model="llama-3.3-70b-versatile",
messages=[{"role": "user", "content": "Explain gradient descent."}],
)
LiteLLM
import litellm
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_litellm()
response = litellm.completion(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello from LiteLLM!"}],
)
OpenRouter
import openai
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openrouter()
client = openai.OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="your_openrouter_api_key",
)
response = client.chat.completions.create(
model="anthropic/claude-sonnet-4-6",
messages=[{"role": "user", "content": "Hello from OpenRouter!"}],
)
Context enrichment
Use the with_zespan_context context manager to attach a user_id, session_id, or custom tags to all traces generated inside its with block.
import zespan
from zespan import with_zespan_context
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()
import openai
client = openai.OpenAI()
def handle_request(user_id: str, session_id: str, message: str) -> str:
    with with_zespan_context(user_id=user_id, session_id=session_id, tags={"feature": "chat"}):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": message}],
        )
    return response.choices[0].message.content
Set user_id and session_id on every request that involves a logged-in user. This enables per-user cost breakdown and session replay in the Zespan dashboard.
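Scoped context of this kind is commonly built on Python's `contextvars` module, which keeps values separate per thread and per asyncio task. The sketch below shows that general pattern under the assumption that nested scopes merge their fields. `trace_context` and `current_context` are names invented for the example, and nothing here describes how with_zespan_context is actually implemented.

```python
import contextvars
from contextlib import contextmanager

# Holds the context dict visible to the current thread or asyncio task.
_current = contextvars.ContextVar("trace_context", default={})


@contextmanager
def trace_context(**fields):
    """Merge fields into the active context for the duration of the block."""
    merged = {**_current.get(), **fields}
    token = _current.set(merged)
    try:
        yield merged
    finally:
        # Restore whatever was active before, even if the block raised.
        _current.reset(token)


def current_context():
    """Return a copy of the fields active at the call site."""
    return dict(_current.get())
```

With this shape, an inner block sees the outer block's fields plus its own, and leaving the inner block restores the outer view.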
Agent tracing with with_agent
Use the with_agent context manager to trace a multi-step agent workflow. It creates an agent span and exposes methods to log plans, trace tool calls, and record handoffs to other agents.
import zespan
from zespan import with_agent
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()
with with_agent(
    name="CustomerSupportAgent",
    role="specialist",
    framework="custom",
    tools=[{"name": "lookup_order", "description": "Lookup order by id"}],
) as agent:
    agent.log_plan(["Lookup order", "Check refund policy", "Draft response"])
    order = agent.trace_tool(
        "lookup_order",
        {"order_id": "123"},
        lambda: {"id": "123", "status": "delivered", "total": 49.99},
    )
    agent.delegate_to("RefundPolicyAgent", "refund requested")
with_agent parameters:
name
Display name for this agent in traces and the agent registry.
role
string
default:"specialist"
Role label such as "coordinator", "specialist", or "planner".
framework
Framework name, e.g. "custom", "langchain", "google-adk".
tools
List of tool definition objects with name and description fields.
AgentContext methods:
agent.log_plan(steps: list[str]) — records a planning span
agent.trace_tool(name, args, callable) — wraps a callable, records args and return value as a tool span
agent.delegate_to(target_name, reason) — records a handoff span
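The trace_tool signature above takes a zero-argument callable, so a tool function that needs arguments has to be bound first, for example with `functools.partial`. The stand-alone sketch below mimics the wrapping behavior to show the idea. The `trace_tool` function here is a local illustration that appends to a plain list. It is not the SDK method, and the span fields are assumptions.

```python
import time
from functools import partial


def trace_tool(spans, name, args, fn):
    """Run fn(), append a span dict describing the call, and return fn's result."""
    started = time.perf_counter()
    span = {"type": "tool", "name": name, "args": args}
    try:
        result = fn()
        span["result"] = result
        return result
    except Exception as exc:
        # Record the failure, then let the caller see the original exception.
        span["error"] = repr(exc)
        raise
    finally:
        span["duration_s"] = time.perf_counter() - started
        spans.append(span)


def lookup_order(order_id):
    # Stand-in for a real order lookup.
    return {"id": order_id, "status": "delivered"}


spans = []
args = {"order_id": "123"}
# partial() binds the arguments so the wrapper can call the tool with none.
order = trace_tool(spans, "lookup_order", args, partial(lookup_order, **args))
```

Passing the same `args` dict to both the span and `partial` keeps the recorded arguments in step with what the tool actually received.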
Manual spans with start_span
Use the start_span context manager to wrap any block of code in a custom span and attach evaluation scores.
import zespan
from zespan import start_span
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()
import openai
client = openai.OpenAI()
def retrieve_documents(query: str) -> list[str]:
    # Placeholder for illustration: replace with your own retrieval logic.
    return []
with start_span(name="rag-pipeline") as span:
    docs = retrieve_documents("user query")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Use these docs: {docs}"},
            {"role": "user", "content": "user query"},
        ],
    )
    span.set_eval_score("relevance", 0.92)
Prompt management
The PromptClient fetches versioned prompts from the Zespan prompt library at runtime. Results are cached locally for 5 minutes.
from zespan import PromptClient, get_client
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()
prompts = PromptClient(get_client())
prompt = prompts.get("support-reply", label="production")
text = prompts.compile(prompt, {
"customer_name": "Alex",
"order_id": "ORD-7821",
})
import openai
client = openai.OpenAI()
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "system", "content": text},
{"role": "user", "content": "I need help with my order."},
],
)
See the TypeScript prompt management docs for the full API reference — the Python PromptClient exposes the same methods: get, list, create, update_labels, compile, and clear_cache.
In Python, method and parameter names use snake_case: update_labels, clear_cache, prompt_type, commit_message.
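The five-minute local cache mentioned at the start of this section follows a common time-to-live pattern. The sketch below shows that pattern in isolation. `TTLCache` and its `fetch` callable are names assumed for the example, and the real PromptClient cache may differ in keying and eviction.

```python
import time


class TTLCache:
    """Cache values for ttl seconds, then fetch them again on the next get()."""

    def __init__(self, fetch, ttl=300.0, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable: key -> value
        self._ttl = ttl          # 300 seconds = 5 minutes
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        # Missing or expired: fetch again and restart the timer.
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def clear(self):
        self._entries.clear()
```

A cache like this means a prompt relabeled on the server can take up to one TTL to reach a running process, which is the situation a method such as clear_cache is useful for.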
Flushing in serverless environments
In short-lived processes such as AWS Lambda, Vercel Functions, or Cloud Run, call zespan.flush() explicitly before the handler returns to guarantee delivery.
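import openai
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()
# Create the client once, outside the handler, so warm invocations reuse it.
openai_client = openai.OpenAI()
def lambda_handler(event, context):
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": event["prompt"]}],
    )
    result = response.choices[0].message.content
    # Flush before returning: the runtime may freeze the process afterwards.
    zespan.flush()
    return {"statusCode": 200, "body": result}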
Omitting zespan.flush() in serverless environments is the most common cause of missing traces. Always call it before your handler returns.
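A handler that raises before reaching its flush call would still lose its traces. One way to cover that case is to wrap the handler so the flush runs in a `finally` clause. The decorator below is a sketch of that idea: `flush_after` is a name invented here, and it takes the flush function as an argument so it makes no assumptions about the SDK beyond the zespan.flush() call already shown.

```python
import functools


def flush_after(flush):
    """Decorator factory: call flush() after the handler, even if it raises."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            try:
                return handler(*args, **kwargs)
            finally:
                # Runs on both the success path and the exception path.
                flush()
        return wrapper
    return decorator
```

Applied as `@flush_after(zespan.flush)` above `lambda_handler`, this removes the need to remember the call on every return path.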
PII redaction
Zespan automatically redacts values from tags and metadata fields before they leave your application. The key is preserved; the value is replaced with "[REDACTED]".
Default redacted keys (always applied): password, secret, token, api_key, apikey, auth, authorization, access_token, refresh_token, private_key, credential, ssn, credit_card, card_number, cvv.
Add custom keys at initialization — they are merged with the defaults:
import zespan
zespan.init(
api_key="zsp_your_api_key_here",
redact_keys=["email", "phone", "address", "ip_address", "dob"],
)
To replace the defaults entirely with your own list:
zespan.init(
api_key="zsp_your_api_key_here",
redact_keys=["ssn", "account_number"],
replace_default_redact_keys=True,
)
Replacing the defaults removes protection for common sensitive field names like password and token. Only do this if your custom list covers all sensitive fields your application may produce.
Redaction applies to tags and metadata fields, and to stored prompt and completion text. Prompt storage is on by default — set store_prompts=False to disable it entirely.
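Key-based redaction as described in this section can be pictured with a short sketch. It assumes case-insensitive exact key matching and recursion into nested dicts and lists. Those matching rules are assumptions, since the text above does not specify them, and the key set is abbreviated.

```python
# Abbreviated subset of the default keys listed above.
DEFAULT_KEYS = {"password", "secret", "token", "api_key"}


def redact(data, keys=DEFAULT_KEYS):
    """Return a copy of data with values under matching keys replaced."""
    lowered = {k.lower() for k in keys}
    if isinstance(data, dict):
        # The key is preserved; only the value is replaced.
        return {
            k: "[REDACTED]" if str(k).lower() in lowered else redact(v, keys)
            for k, v in data.items()
        }
    if isinstance(data, list):
        return [redact(item, keys) for item in data]
    return data
```

Running a function like this over sample tags in a unit test is a cheap way to confirm that a custom key list covers the fields an application actually emits.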
Guardrails
Guardrails run content safety checks before sending a prompt to the LLM (pre-check) and before returning the completion (post-check). Pass guardrails=True to any patch function to enable both phases.
import zespan
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai(guardrails=True)
For fine-grained control, pass a dict:
zespan.patch_openai(guardrails={
    "pre": True,           # Check prompt before sending to LLM
    "post": True,          # Check completion before returning to app
    "fail_closed": False,  # If guardrail service errors, allow call through
})
Handle blocks with GuardrailBlockedError:
import zespan
from zespan import GuardrailBlockedError
zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai(guardrails=True)
import openai
client = openai.OpenAI()
def generate_response(user_message: str) -> str:
    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": user_message}],
        )
        return response.choices[0].message.content
    except GuardrailBlockedError as e:
        print(f"Guardrail blocked {e.phase} content:", e.results)
        return "I'm sorry, I can't help with that request."
GuardrailBlockedError properties:
phase — "pre" (input blocked) or "post" (output blocked)
results — list of per-policy results with passed, action, reason, modified_text
Configure guardrail policies in the Zespan dashboard under Settings → Guardrails. Policies are evaluated server-side — update them without redeploying your application.
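The fail_closed option shown earlier decides what happens when the guardrail service itself errors. The sketch below illustrates the two behaviors. It assumes that fail_closed=True blocks the call on a service error, which is the usual meaning of the term but is not stated above. `Blocked` and `run_guardrail` are stand-ins defined for the example.

```python
class Blocked(Exception):
    """Stand-in for a guardrail block; not the SDK's GuardrailBlockedError."""


def run_guardrail(check, text, fail_closed=False):
    """Return text if allowed; raise Blocked on a failed check or a closed failure."""
    try:
        passed = check(text)   # hypothetical callable: True means the text is allowed
    except Exception as exc:
        if fail_closed:
            # Fail closed: treat an unavailable service as a block.
            raise Blocked("guardrail service unavailable") from exc
        return text            # Fail open: let the call through
    if not passed:
        raise Blocked("policy violation")
    return text
```

Failing open favors availability and failing closed favors safety, so the right default depends on how costly an unchecked response would be for the application.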
Config propagation
Zespan can push configuration changes — model overrides, sample rate, guardrail toggles — to your running application without a redeployment. Changes made via ZespanPilot or the dashboard take effect within the next flush cycle (default 2 seconds).
What can be propagated:
| Rule type | Effect |
|---|---|
| model_override | Redirect calls for a given model to a different model |
| sample_rate | Increase or decrease the fraction of events traced |
| guardrail_enable | Enable or disable guardrails on a wrapper |
| pii_redact_keys | Add keys to the redaction list |
| log_level | Toggle debug logging |
| disable_tracing | Stop all tracing immediately |
Force an immediate config refresh:
import zespan
zespan.init(api_key="zsp_your_api_key_here")
# Force refresh — useful during startup
zespan.get_client().refresh_config()
To disable remote config propagation entirely:
zespan.init(
api_key="zsp_your_api_key_here",
disable_config_sync=True,
)
With disable_config_sync=True, the SDK ignores all remote config changes. All behavior is determined solely by the options passed to init().
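To make the rule types in the table above more concrete, the sketch below applies a few of them to a local config dict. The rule payload shape (`type` and `value` keys) and the config field names are assumptions made for illustration. The actual wire format is not documented here.

```python
def apply_rules(config, rules):
    """Return a new config dict with each remote rule applied in order."""
    updated = dict(config)
    # Copy the mutable fields so the caller's config is left untouched.
    updated["model_overrides"] = dict(config.get("model_overrides", {}))
    updated["redact_keys"] = list(config.get("redact_keys", []))

    for rule in rules:
        kind, value = rule["type"], rule["value"]
        if kind == "sample_rate":
            # Clamp to the documented 0.0 to 1.0 range.
            updated["sample_rate"] = min(1.0, max(0.0, float(value)))
        elif kind == "model_override":
            updated["model_overrides"][value["from"]] = value["to"]
        elif kind == "pii_redact_keys":
            # Keys are only ever added, matching the table's description.
            for key in value:
                if key not in updated["redact_keys"]:
                    updated["redact_keys"].append(key)
        elif kind == "disable_tracing":
            updated["tracing_enabled"] = not value
        # Unknown rule types are skipped so newer rules do not break older clients.
    return updated
```

Returning a new dict instead of mutating in place makes it straightforward to compare the configuration before and after a refresh.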