Python SDK

The Zespan Python SDK instruments your Python LLM application with a single init() call. It uses a background daemon thread to flush events without blocking your application, and registers an atexit handler to flush on process exit.

Installation

pip install zespan

Initialization

Call zespan.init() once at startup before making any LLM calls. All parameters are keyword arguments.

import zespan

zespan.init(
    api_key="zsp_your_api_key_here",
    environment="production",
    store_prompts=True,
    sample_rate=1.0,
    debug=False,
)

Parameters:

api_key (string, required)

Your Zespan API key. Must start with zsp_. Find this in your project settings.

environment (string, default: "production")

Environment label attached to every event. Use "staging" or "development" to separate traces by environment.

store_prompts (boolean, default: True)

When True (default), prompt and completion text are stored alongside traces with PII redaction applied before transmission. Set to False to disable prompt storage entirely.

sample_rate (float, default: 1.0)

Fraction of events to send, between 0.0 and 1.0. Set to 0.1 to trace 10% of calls.

redact_keys (list[str])

Keys whose values are redacted before any data is stored. Applied regardless of store_prompts. Passing this option replaces the default list rather than adding to it; see PII redaction below.

redact_pii (boolean, default: False)

Opt-in pattern-based PII detection (emails, SSNs, credit cards, and similar) layered on top of key-based redact_keys matching. Tune it with pii_preset, pii_redaction_mode, pii_whitelist, pii_categories, and pii_confidence_threshold.

int, default: 50

Number of events to accumulate before flushing.

float, default: 2.0

Seconds between automatic flushes. The SDK also flushes on process exit.

int, default: 1000

Maximum number of events held in the in-memory queue. Once exceeded, the oldest queued event is dropped to make room for new ones.

endpoint (string, default: "https://api.zespan.com/v1/ingest")

Full ingest URL. Prompts, datasets, and guardrail checks all derive their base URL from this value. Override it for self-hosted deployments.

string, default: "https://api.zespan.com"

Root API URL used only by the config-propagation client (see Config propagation). For a self-hosted deployment, set this to the same host as endpoint.

project_id (string)

Your Zespan project ID. Required, together with enable_zespan_pilot=True, to receive live config changes pushed from ZespanPilot or the dashboard.

enable_zespan_pilot (boolean, default: False)

When True and project_id is set, the SDK applies config changes (model overrides, sample rate, guardrail toggles) pushed from ZespanPilot or the dashboard. See Config propagation.

enable_otel (boolean, default: False)

When True, also exports spans to an OpenTelemetry-compatible backend. Requires otel_endpoint.

otel_endpoint (string)

OTel collector endpoint URL. Required when enable_otel=True.

string, default: "zespan-sdk"

Service name attached to exported OTel spans.

debug (boolean, default: False)

When True, logs internal flush errors to stdout. Enable during integration testing.
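
The batch size, flush interval, and queue limit described above interact with the background daemon thread and the atexit handler mentioned in the introduction. The following is a minimal sketch of that kind of buffer, not the SDK's actual implementation. The class name EventBuffer, the argument names batch_size, flush_interval, and max_queue, and the send callback are all hypothetical; only the default values are taken from the parameter descriptions.

```python
import atexit
import threading
from collections import deque
from typing import Callable


class EventBuffer:
    """Illustrative bounded buffer with batched, timed flushing (hypothetical)."""

    def __init__(
        self,
        send: Callable[[list], None],
        batch_size: int = 50,         # hypothetical name
        flush_interval: float = 2.0,  # hypothetical name
        max_queue: int = 1000,        # hypothetical name
    ) -> None:
        self._send = send
        self._batch_size = batch_size
        self._interval = flush_interval
        # deque(maxlen=...) discards the oldest item when a new one arrives at capacity.
        self._queue: deque = deque(maxlen=max_queue)
        self._lock = threading.Lock()
        self._wake = threading.Event()
        self._stop = threading.Event()
        # A daemon thread never keeps the interpreter alive on its own.
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()
        # Drain whatever is left when the process exits normally.
        atexit.register(self.flush)

    def add(self, event: dict) -> None:
        """Queue an event; wake the worker early once a full batch is waiting."""
        with self._lock:
            self._queue.append(event)
            if len(self._queue) >= self._batch_size:
                self._wake.set()

    def flush(self) -> None:
        """Send everything currently queued as one batch."""
        with self._lock:
            batch = list(self._queue)
            self._queue.clear()
        if batch:
            self._send(batch)

    def _loop(self) -> None:
        # Flush when woken by a full batch or when the interval elapses.
        while not self._stop.is_set():
            self._wake.wait(timeout=self._interval)
            self._wake.clear()
            self.flush()

    def close(self) -> None:
        """Stop the worker thread and send any remaining events."""
        self._stop.set()
        self._wake.set()
        self._thread.join(timeout=1.0)
        self.flush()
```

Because add() only appends to the queue and signals the worker, the calling thread never waits on network I/O. The price is that a burst larger than the queue limit loses its oldest events.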

Auto-patching all providers

autopatch() detects which LLM libraries are installed and patches them all in one call. Use this instead of calling individual patch functions.

import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.autopatch()

Covers: OpenAI, Anthropic, Google Generative AI (legacy google-generativeai package), AWS Bedrock, Mistral, Groq, LiteLLM — whichever of these packages are importable in your environment.

autopatch() does not patch the new Google GenAI SDK (google-genai) or OpenRouter. Call zespan.patch_google_genai() or zespan.patch_openrouter() explicitly for those.
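
Detecting which libraries are installed can be done without importing them. The sketch below shows one common way to do that with the standard library. It is an illustration of the idea, not a description of how autopatch() is implemented, and the importable helper is a hypothetical name. The module names are the import names used in the provider examples on this page.

```python
import importlib.util


def importable(module_names: list[str]) -> list[str]:
    """Return the subset of module_names that can be imported in this environment."""
    found = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when the parent package of a dotted name is missing,
            # or when an already-imported module has no usable __spec__.
            spec = None
        if spec is not None:
            found.append(name)
    return found


CANDIDATES = [
    "openai",
    "anthropic",
    "google.generativeai",
    "boto3",
    "mistralai",
    "groq",
    "litellm",
]
print(importable(CANDIDATES))
```

Running this in the target environment is a quick way to predict which providers a one-call patch could cover there.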

Provider patches

OpenAI

import openai
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

client = openai.OpenAI()

# Sync
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

# Async
import asyncio

async def main():
    async_client = openai.AsyncOpenAI()
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Translate 'hello' to Spanish."}],
    )

asyncio.run(main())

Anthropic

import anthropic
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_anthropic()

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain monads in plain English."}],
)

Google Generative AI (legacy SDK)

import google.generativeai as genai
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_google()

genai.configure(api_key="your_google_api_key")
model = genai.GenerativeModel("gemini-2.5-flash")
response = model.generate_content("What is quantum entanglement?")

Also patches genai.embed_content() automatically — embedding calls emit span_kind: embedding. Image generation models (e.g. gemini-3.1-flash-image) are detected from the response and emit span_kind: image_gen.

Google GenAI (new SDK — recommended)

The google-genai package is Google’s current Python SDK. patch_google_genai() patches Client.__init__ so every client instance you create is automatically traced.

pip install google-genai

from google import genai
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_google_genai()

client = genai.Client(api_key="your_google_api_key")

# Text generation
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain quantum entanglement.",
)

# Embeddings (span_kind: embedding)
embed_response = client.models.embed_content(
    model="gemini-embedding-2",
    contents="The quick brown fox.",
)

# Image generation (span_kind: image_gen)
img_response = client.models.generate_images(
    model="imagen-3.0-generate-001",
    prompt="A photorealistic sunset over the ocean.",
)

# Video generation — Veo (span_kind: video_gen)
vid_op = client.models.generate_videos(
    model="veo-3.1-generate-preview",
    prompt="A slow-motion close-up of rain hitting a still lake.",
)
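
Patching a constructor so that every new instance is traced can be sketched as follows. This is a generic illustration of the technique using a stand-in class, not the SDK's code. Client, generate, recorded, and patch_client_init are hypothetical names, and the real google-genai client has a different surface.

```python
import functools


class Client:
    """Stand-in for a third-party SDK client (hypothetical)."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


recorded: list[dict] = []


def patch_client_init() -> None:
    """Wrap Client.__init__ so each new instance gets a traced generate()."""
    original_init = Client.__init__
    if getattr(original_init, "_is_patched", False):
        return  # idempotent: never wrap twice

    @functools.wraps(original_init)
    def patched_init(self, *args, **kwargs):
        original_init(self, *args, **kwargs)
        original_generate = self.generate  # bound method of this instance

        @functools.wraps(original_generate)
        def traced_generate(prompt: str) -> str:
            result = original_generate(prompt)
            recorded.append({"prompt": prompt, "result": result})
            return result

        # Shadow the class method on this instance only.
        self.generate = traced_generate

    patched_init._is_patched = True
    Client.__init__ = patched_init


patch_client_init()
client = Client(api_key="demo")
client.generate("hi")
print(recorded)
```

One consequence of this approach is that only clients created after the patch are traced, which is why the patch call belongs at startup, before any client is constructed.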

AWS Bedrock

import boto3
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_bedrock()

client = boto3.client("bedrock-runtime", region_name="us-east-1")
response = client.converse(
    modelId="amazon.nova-lite-v1:0",
    messages=[{"role": "user", "content": [{"text": "Summarize this document."}]}],
)

Mistral

from mistralai import Mistral
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_mistral()

client = Mistral(api_key="your_mistral_api_key")
response = client.chat.complete(
    model="mistral-small-latest",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

Groq

from groq import Groq
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_groq()

client = Groq(api_key="your_groq_api_key")
response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Explain gradient descent."}],
)

LiteLLM

import litellm
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_litellm()

response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from LiteLLM!"}],
)

OpenRouter

import openai
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openrouter()

client = openai.OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="your_openrouter_api_key",
)
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello from OpenRouter!"}],
)

Framework and agent integrations

Beyond the direct provider patches above, the SDK integrates with popular agent and orchestration frameworks. Each has its own dedicated guide:

LangChain — ZespanCallbackHandler traces chains, agents, tools, and retrievers. See LangChain.
LlamaIndex — ZespanLlamaIndexCallbackHandler traces query engines and retrievers. See LlamaIndex.
AutoGen / AG2 — wrap_autogen_agent traces ConversableAgent replies. See AutoGen.
CrewAI — wrap_crew traces crew task execution. See CrewAI.
Google ADK — wrap_adk_agent (or ZespanADKTracer) traces ADK agent runs. See Google ADK.
Haystack — ZespanHaystackTracer traces pipeline components. See Haystack.
Semantic Kernel — instrument_semantic_kernel traces kernel function invocations. See Semantic Kernel.

Context enrichment

Use the with_zespan_context context manager to attach a user_id, session_id, or custom tags to all traces generated inside its with block.

import zespan
from zespan import with_zespan_context

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

import openai
client = openai.OpenAI()

def handle_request(user_id: str, session_id: str, message: str) -> str:
    with with_zespan_context(user_id=user_id, session_id=session_id, tags={"feature": "chat"}):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": message}],
        )
        return response.choices[0].message.content

Set user_id and session_id on every request that involves a logged-in user. This enables per-user cost breakdown and session replay in the Zespan dashboard.
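
Scoped context of this kind is commonly built on the standard contextvars module, which keeps values separate per thread and per asyncio task. The sketch below illustrates that pattern under the assumption that the SDK behaves similarly, which this page does not confirm. The names trace_context and emit are hypothetical.

```python
import contextvars
from contextlib import contextmanager
from typing import Iterator, Optional

# Holds the context for the current thread or asyncio task.
# The default dict is never mutated; each scope stores a fresh copy.
_current = contextvars.ContextVar("trace_context", default={})


@contextmanager
def trace_context(
    user_id: Optional[str] = None,
    session_id: Optional[str] = None,
    tags: Optional[dict] = None,
) -> Iterator[None]:
    """Attach identifiers to every event emitted inside the with block."""
    merged = dict(_current.get())
    if user_id is not None:
        merged["user_id"] = user_id
    if session_id is not None:
        merged["session_id"] = session_id
    if tags:
        merged["tags"] = {**merged.get("tags", {}), **tags}
    token = _current.set(merged)
    try:
        yield
    finally:
        # Restore the previous context, even if the body raised.
        _current.reset(token)


def emit(name: str) -> dict:
    """Build an event that carries whatever context is active right now."""
    return {"name": name, **_current.get()}


with trace_context(user_id="u1", tags={"feature": "chat"}):
    inner = emit("llm_call")
    with trace_context(session_id="s1"):
        nested = emit("llm_call")
outer = emit("llm_call")
```

Nested scopes inherit from the enclosing one, and everything is restored on exit, so a request handler cannot leak one user's identifiers into the next request.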

Agent tracing with `with_agent`

Use the with_agent context manager to trace a multi-step agent workflow. It creates an agent span and exposes methods to log plans, trace tool calls, and record handoffs to other agents.

import zespan
from zespan import with_agent

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

with with_agent(
    name="CustomerSupportAgent",
    role="specialist",
    framework="custom",
    tools=[{"name": "lookup_order", "description": "Lookup order by id"}],
) as agent:
    agent.log_plan(["Lookup order", "Check refund policy", "Draft response"])

    order = agent.trace_tool(
        "lookup_order",
        {"order_id": "123"},
        lambda: {"id": "123", "status": "delivered", "total": 49.99},
    )

    agent.delegate_to("RefundPolicyAgent", "refund requested")

with_agent parameters:

name (string, required)

Display name for this agent in traces and the agent registry.

role (string, default: "specialist")

Role label such as "coordinator", "specialist", or "planner".

framework (string, default: "custom")

Framework name, e.g. "custom", "langchain", "google-adk".

tools (object[])

List of tool definition objects with name and description fields.

AgentContext methods:

agent.log_plan(steps: list[str]) — records a planning span
agent.trace_tool(name, args, callable) — wraps a callable, records args and return value as a tool span
agent.delegate_to(target_name, reason) — records a handoff span
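
To make the trace_tool contract concrete, here is a standalone sketch of a function that wraps a callable and records its arguments and return value. It is written against a plain list rather than the SDK, and the span field names are hypothetical. Recording an error and a duration is an assumption about what a tool span might usefully contain, not documented SDK behavior.

```python
import time
from typing import Any, Callable


def trace_tool(spans: list[dict], name: str, args: dict, fn: Callable[[], Any]) -> Any:
    """Run fn, append a tool span describing the call, and return fn's result."""
    started = time.perf_counter()
    span: dict = {"kind": "tool", "name": name, "args": args}
    try:
        result = fn()
        span["result"] = result
        return result
    except Exception as exc:
        # Keep the failure on the span, then let the caller handle it.
        span["error"] = repr(exc)
        raise
    finally:
        span["duration_s"] = time.perf_counter() - started
        spans.append(span)


spans: list[dict] = []
order = trace_tool(
    spans,
    "lookup_order",
    {"order_id": "123"},
    lambda: {"id": "123", "status": "delivered"},
)
```

Passing a zero-argument callable, as the agent example does with a lambda, lets the wrapper control exactly when the tool runs and time it accurately.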

Manual spans with `start_span`

Use start_span to instrument any function as a custom span and attach evaluation scores. It returns a ManualSpan — enter span.run() to propagate trace context to nested calls, then call span.end() when the work completes to emit the event.

import zespan
from zespan import start_span

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

import openai
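client = openai.OpenAI()

def retrieve_documents(query: str) -> list[str]:
    # Placeholder for your own retrieval step (hypothetical helper).
    return []

span = start_span(name="rag-pipeline")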
with span.run():
    docs = retrieve_documents("user query")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Use these docs: {docs}"},
            {"role": "user", "content": "user query"},
        ],
    )
    span.set_eval_score("relevance", 0.92)
span.end()
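
In the example above, span.end() is skipped if the body raises. One way to guard against that is a small helper that always ends the span. The sketch below relies only on the run() and end() methods described in this section. The helper name ended is hypothetical, and FakeSpan is a test double so that the snippet runs without the SDK.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def ended(span: Any) -> Iterator[Any]:
    """Enter span.run() and always call span.end(), even if the body raises."""
    try:
        with span.run():
            yield span
    finally:
        span.end()


class FakeSpan:
    """Test double exposing only run() and end() (hypothetical)."""

    def __init__(self) -> None:
        self.events: list[str] = []

    @contextmanager
    def run(self) -> Iterator[None]:
        self.events.append("run-enter")
        try:
            yield
        finally:
            self.events.append("run-exit")

    def end(self) -> None:
        self.events.append("end")


fake = FakeSpan()
try:
    with ended(fake):
        raise RuntimeError("retrieval failed")
except RuntimeError:
    pass
print(fake.events)
```

With the real SDK the same helper would be used as `with ended(start_span(name="rag-pipeline")) as span:`. Whether the SDK records the exception on the span is not covered here.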

Prompt management

The PromptClient fetches versioned prompts from the Zespan prompt library at runtime. Results are cached locally for 5 minutes.

from zespan import PromptClient, get_client
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

prompts = PromptClient(get_client())

prompt = prompts.get("support-reply", label="production")
text = prompts.compile(prompt, {
    "customer_name": "Alex",
    "order_id": "ORD-7821",
})

import openai
client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": text},
        {"role": "user", "content": "I need help with my order."},
    ],
)

See the Prompt management page for the full API reference — the Python PromptClient exposes the same methods: get, list, create, update_labels, compile, and clear_cache. get() also accepts a fallback dict so calls degrade gracefully if the prompt library is unreachable, and compile() accepts a placeholders dict for prompts that splice in reusable message-list segments.

In Python, method and parameter names use snake_case: update_labels, clear_cache, prompt_type, commit_message, placeholders, fallback.
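
The five-minute cache and the fallback behavior can be pictured as a time-to-live cache wrapped around a fetch function. The sketch below illustrates that idea only; it is not PromptClient's implementation. TTLCache and its get signature are hypothetical, and the clock is injectable so that expiry can be tested without waiting.

```python
import time
from typing import Any, Callable, Hashable, Optional


class TTLCache:
    """Cache fetched values for ttl_seconds; optionally fall back on fetch errors."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,  # five minutes
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict = {}

    def get(
        self,
        key: Hashable,
        fetch: Callable[[], Any],
        fallback: Optional[Any] = None,
    ) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh
        try:
            value = fetch()
        except Exception:
            if fallback is not None:
                return fallback  # degrade gracefully; the fallback is not cached
            raise
        self._entries[key] = (now, value)
        return value

    def clear(self) -> None:
        self._entries.clear()


fake_now = [0.0]
cache = TTLCache(clock=lambda: fake_now[0])
calls = []


def fetch_prompt() -> str:
    calls.append(1)
    return f"version {len(calls)}"


first = cache.get(("support-reply", "production"), fetch_prompt)
fake_now[0] = 299.0
second = cache.get(("support-reply", "production"), fetch_prompt)
fake_now[0] = 300.0
third = cache.get(("support-reply", "production"), fetch_prompt)
```

A practical implication of any cache like this is that a label change in the prompt library can take up to the full TTL to reach a running process unless the cache is cleared.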

Dataset experiment runs

Use DatasetsClient, available as get_client().datasets, to run your own pipeline against a Zespan dataset, link each result back to Zespan via the trace ID your code already produces, and then score and compare runs in the dashboard.

import zespan
from zespan import get_client, get_current_context, start_span

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

datasets = get_client().datasets

items = datasets.get_items("support-qa-v2")

run = datasets.create_run(
    "support-qa-v2",
    "gpt-4o-baseline",
    description="Baseline run before the prompt rewrite",
)

import openai
client = openai.OpenAI()

for item in items:
    span = start_span(name="dataset-item")
    with span.run():
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": item["input"]}],
        )
        trace_id = get_current_context()["trace_id"]
    span.end()

    run.link(dataset_item_id=item["id"], trace_id=trace_id)

DatasetsClient methods:

get_items (method)

Lists a dataset’s items by dataset name.

create_run (method)

Creates or fetches a named run against a dataset. Idempotent, so it is safe to call every time a job starts. Returns a run handle.

Run handle methods:

link (method)

Links a dataset item to this run via a trace ID your own code already produced. Pass observation_id to link a specific span instead of the whole trace.

This links raw traces to a run — it doesn’t score or compare anything itself. See Datasets for the full dashboard-side walkthrough of scoring linked runs and comparing them against each other.

Flushing in serverless environments

In short-lived processes such as AWS Lambda, Vercel Functions, or Cloud Run, call zespan.get_client().flush() explicitly before the handler returns to guarantee delivery.

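import openai
import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai()

# Create the client once, outside the handler, so warm invocations reuse it.
openai_client = openai.OpenAI()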

def lambda_handler(event, context):
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": event["prompt"]}],
    )
    result = response.choices[0].message.content

    zespan.get_client().flush()
    return {"statusCode": 200, "body": result}

Omitting zespan.get_client().flush() in serverless environments is the most common cause of missing traces. Always call it before your handler returns.
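
A handler that raises before reaching the flush call loses its traces just as surely as one that never calls it. A decorator with try/finally covers both cases. The sketch below is a generic pattern rather than an SDK feature: flush_after is a hypothetical name, and a list stands in for the real flush so that the snippet runs on its own.

```python
import functools
from typing import Any, Callable


def flush_after(flush: Callable[[], None]) -> Callable:
    """Decorator factory: call flush() after the handler, on success or failure."""

    def decorator(handler: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(handler)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            try:
                return handler(*args, **kwargs)
            finally:
                flush()

        return wrapper

    return decorator


flushed: list[str] = []


@flush_after(lambda: flushed.append("flush"))  # stand-in for the real flush
def lambda_handler(event: dict, context: Any) -> dict:
    return {"statusCode": 200, "body": event["prompt"].upper()}


print(lambda_handler({"prompt": "hi"}, None), flushed)
```

With the SDK, the decorator would be applied as `@flush_after(lambda: zespan.get_client().flush())`, which defers the get_client() lookup until the handler actually runs.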

PII redaction

Zespan automatically redacts values from tags and metadata fields before they leave your application. The key is preserved; the value is replaced with "[REDACTED]". Default redacted keys (applied whenever redact_keys is not passed at all): password, secret, token, api_key. Pass your own redact_keys list at initialization to protect additional fields:

import zespan

zespan.init(
    api_key="zsp_your_api_key_here",
    redact_keys=["email", "phone", "address", "ip_address", "dob"],
)

redact_keys replaces the default list rather than adding to it. If you pass your own list, include password, secret, token, and api_key explicitly if you still want them redacted.

Key-based redaction applies to tags and metadata fields, and to stored prompt and completion text. Prompt storage is on by default — set store_prompts=False to disable it entirely.
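
The replace-rather-than-extend behavior of redact_keys is easy to get wrong, so here is a small standalone model of it. The redact function is hypothetical and assumes exact, case-sensitive matching on top-level keys; the SDK's actual matching rules are not specified on this page.

```python
from typing import Iterable, Optional

DEFAULT_REDACT_KEYS = ("password", "secret", "token", "api_key")


def redact(fields: dict, redact_keys: Optional[Iterable[str]] = None) -> dict:
    """Return a copy of fields with the values of redacted keys replaced."""
    # A caller-supplied list replaces the defaults; it is not merged with them.
    keys = set(DEFAULT_REDACT_KEYS if redact_keys is None else redact_keys)
    return {k: ("[REDACTED]" if k in keys else v) for k, v in fields.items()}


metadata = {"email": "jane@example.com", "token": "abc123", "plan": "pro"}

print(redact(metadata))                                    # defaults: token hidden
print(redact(metadata, ["email"]))                         # token now exposed
print(redact(metadata, ["email", *DEFAULT_REDACT_KEYS]))   # both hidden
```

The third call shows the safe pattern: list the default keys explicitly alongside your own whenever you pass redact_keys.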

Pattern-based PII detection (opt-in)

Set redact_pii=True to layer pattern-based detection (emails, SSNs, credit cards, and similar) on top of key-based redact_keys matching:

import zespan

zespan.init(
    api_key="zsp_your_api_key_here",
    redact_pii=True,
    pii_preset="gdpr",             # gdpr | hipaa | ccpa | pci-dss | soc2 | finance | education | transportation
    pii_redaction_mode="mask-middle",  # placeholder | mask-middle | mask-all
    pii_confidence_threshold=0.7,
    pii_whitelist=["support@yourcompany.com"],
)

pii_preset (string)

Named bundle of PII categories to detect, e.g. "gdpr" or "hipaa".

pii_categories (list[str])

Explicit list of PII categories to detect, as an alternative to pii_preset.

pii_redaction_mode (string, default: "placeholder")

How matched values are redacted: "placeholder" (replace entirely), "mask-middle", or "mask-all".

pii_whitelist (list[str])

Values that should never be redacted even if they match a PII pattern.

pii_confidence_threshold (float, default: 0.7)

Minimum detector confidence, between 0.0 and 1.0, required before a match is redacted.
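
The three redaction modes and the whitelist can be illustrated with a single email pattern. This sketch is a guess at plausible semantics, not the SDK's detector. The regular expression is deliberately simple, the choice to keep two characters at each end in "mask-middle" is an assumption, and confidence scoring is left out entirely.

```python
import re
from typing import Iterable

# Simplified email pattern for illustration; real detectors are more careful.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(value: str, mode: str = "placeholder") -> str:
    """Redact one matched value according to the chosen mode."""
    if mode == "placeholder":
        return "[REDACTED]"
    if mode == "mask-all":
        return "*" * len(value)
    if mode == "mask-middle":
        if len(value) <= 4:
            return "*" * len(value)
        # Assumption: keep the first and last two characters.
        return value[:2] + "*" * (len(value) - 4) + value[-2:]
    raise ValueError(f"unknown mode: {mode}")


def redact_emails(text: str, mode: str = "placeholder", whitelist: Iterable[str] = ()) -> str:
    """Replace every email in text unless it appears in the whitelist."""
    allowed = set(whitelist)

    def replace(match: re.Match) -> str:
        found = match.group(0)
        return found if found in allowed else mask(found, mode)

    return EMAIL.sub(replace, text)


text = "Mail jane@example.com or support@yourcompany.com"
keep = ["support@yourcompany.com"]
for mode in ("placeholder", "mask-middle", "mask-all"):
    print(mode, "->", redact_emails(text, mode, keep))
```

Masking modes preserve the length and sometimes the shape of the original value, which helps debugging but leaks more than a fixed placeholder does.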

Guardrails

Guardrails run content safety checks before sending a prompt to the LLM (pre-check) and before returning the completion (post-check). Pass guardrails=True to any patch function to enable both phases.

import zespan

zespan.init(api_key="zsp_your_api_key_here")
zespan.patch_openai(guardrails=True)

See Guardrails for the full reference — parameter defaults, all fields, the direct check_guardrails() API, and a multi-wrapper example.

Config propagation

Zespan can push configuration changes — model overrides, fallback models, retry/timeout policies, sample rate, guardrail toggles, and more — to your running application without a redeployment. Changes made via ZespanPilot or the dashboard are picked up on the next event flush (default every 2 seconds), as long as both project_id and enable_zespan_pilot=True are set at init:

import zespan

zespan.init(
    api_key="zsp_your_api_key_here",
    project_id="proj_your_project_id",
    enable_zespan_pilot=True,
)

See Config propagation for the full list of rule types and the programmatic ConfigClient API.
