TypeScript SDK
Complete reference for the Retrace TypeScript SDK.
Installation
npm install retrace-sdk
Requires Node.js 20+. This is an ESM-only package with a single lightweight runtime dependency (ws, used for WebSocket transport).
Configuration
Configure the SDK at application startup:
import { configure } from "retrace-sdk";
configure({
apiKey: "rt_live_...",
baseUrl: "https://api.retraceai.tech",
projectId: "my-project",
});
Or rely on environment variables:
| Variable | Default | Description |
|---|---|---|
| RETRACE_API_KEY | (none) | Your API key (required) |
| RETRACE_BASE_URL | https://api.retraceai.tech | API endpoint |
| RETRACE_PROJECT_ID | (none) | Default project identifier |
| RETRACE_ENABLED | true | Set to false to disable tracing |
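To make the defaults in the table concrete, here is a minimal sketch of how these variables could be resolved into a settings object. The helper `resolveConfigFromEnv` and the `ResolvedConfig` type are hypothetical and not part of retrace-sdk; treating only the literal string "false" as disabling tracing is also an assumption.

```typescript
// Hypothetical settings shape; the SDK's real configuration type may differ.
interface ResolvedConfig {
  apiKey: string;
  baseUrl: string;
  projectId: string | undefined;
  enabled: boolean;
}

// Hypothetical helper, not part of retrace-sdk: applies the defaults
// listed in the environment variable table.
function resolveConfigFromEnv(
  env: Record<string, string | undefined>,
): ResolvedConfig {
  const apiKey = env["RETRACE_API_KEY"];
  if (!apiKey) {
    // The table marks RETRACE_API_KEY as required, so fail early.
    throw new Error("RETRACE_API_KEY is required");
  }
  return {
    apiKey,
    baseUrl: env["RETRACE_BASE_URL"] ?? "https://api.retraceai.tech",
    projectId: env["RETRACE_PROJECT_ID"],
    // Assumption: only the literal string "false" disables tracing.
    enabled: env["RETRACE_ENABLED"] !== "false",
  };
}

const resolved = resolveConfigFromEnv({ RETRACE_API_KEY: "rt_live_example" });
console.log(resolved.baseUrl, resolved.enabled);
```

In an application you would pass `process.env` instead of the literal object used here.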
record() Function
Create a recorder instance for manual control over the trace lifecycle:
import { record } from "retrace-sdk";
const recorder = record({ name: "my-agent" });
recorder.start();
// ... your agent logic ...
recorder.end(result);
If an error occurs, call recorder.fail(error) to mark the trace as failed and record the exception.
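The start/end/fail lifecycle can be wrapped in a small helper so that failures are always recorded. The sketch below is illustrative only: `RecorderLike` is an assumed structural type covering just the three methods named in this section, `runRecorded` is a hypothetical helper, and a stand-in recorder is used so the example runs without the SDK.

```typescript
// Structural type covering only the recorder methods named in this section.
// The exact retrace-sdk typings may differ; this shape is an assumption.
interface RecorderLike {
  start(): void;
  end(result?: unknown): void;
  fail(error: unknown): void;
}

// Hypothetical helper: runs `work` between start() and end(), and calls
// fail() before re-throwing if `work` rejects.
async function runRecorded<T>(
  recorder: RecorderLike,
  work: () => Promise<T>,
): Promise<T> {
  recorder.start();
  let result: T;
  try {
    result = await work();
  } catch (error) {
    recorder.fail(error);
    throw error;
  }
  // end() sits outside the try block so an error from end() is not
  // reported through fail().
  recorder.end(result);
  return result;
}

// Stand-in recorder that logs lifecycle events instead of sending a trace.
const events: string[] = [];
const fakeRecorder: RecorderLike = {
  start: () => { events.push("start"); },
  end: () => { events.push("end"); },
  fail: () => { events.push("fail"); },
};

runRecorded(fakeRecorder, async () => "done").then(() => console.log(events));
```

With the real SDK, the object returned by `record({ name: "my-agent" })` would take the place of `fakeRecorder`, provided its methods match the assumed shape.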
trace() Helper
Wrap any async function for automatic trace recording. Input arguments and return values are captured:
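import OpenAI from "openai";
import { trace } from "retrace-sdk";
// Assumes the official openai package; the client reads OPENAI_API_KEY from the environment.
const openai = new OpenAI();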
const myAgent = trace(async (prompt: string) => {
const response = await openai.chat.completions.create({
model: "gpt-5.5",
messages: [{ role: "user", content: prompt }],
});
return response.choices[0].message.content;
}, { name: "my-agent" });
const result = await myAgent("hello");
Exceptions thrown inside the wrapped function are recorded and re-thrown.
Manual Spans
Add child spans for fine-grained observability into individual steps:
import { record, SpanType } from "retrace-sdk";
const recorder = record({ name: "agent" });
recorder.start();
const span = recorder.startSpan("web_search", SpanType.TOOL_CALL, { query: "test" });
const results = await search("test");
recorder.endSpan(span, results);
recorder.end();
Supported span types: LLM_CALL, TOOL_CALL, TOOL_RESULT, REASONING, ACTION, ERROR, FORK_POINT.
Auto-Instrumentation
Retrace can capture LLM calls from OpenAI, Anthropic, and Gemini once the matching interceptor is installed.
OpenAI
import { installOpenAIInterceptor } from "retrace-sdk";
installOpenAIInterceptor((span) => recorder.addSpan(span));
Captures all openai.chat.completions.create() calls. Here, recorder is a recorder instance created with record().
Anthropic
import { installAnthropicInterceptor } from "retrace-sdk";
installAnthropicInterceptor((span) => recorder.addSpan(span));
Captures all anthropic.messages.create() calls.
Gemini
Automatically capture all Gemini SDK calls as spans:
import { installGeminiInterceptor } from "retrace-sdk";
installGeminiInterceptor((span) => recorder.addSpan(span));
Records model, messages, token usage, latency, and estimated cost per call.
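For intuition about what an interceptor with an `onSpan` callback does, the sketch below wraps a method on a fake client, times each call, and reports a span. It is not the SDK's implementation: `installSketchInterceptor`, `SketchSpan`, and the fake client are all hypothetical, and the real interceptors also record fields such as token usage and estimated cost that are omitted here.

```typescript
// Span shape used only in this sketch; the SDK's real span type is not shown here.
interface SketchSpan {
  name: string;
  input: unknown;
  output: unknown;
  latencyMs: number;
}

type AsyncMethod = (...args: unknown[]) => Promise<unknown>;

// Hypothetical interceptor: replaces target[method] with a wrapper that
// times the call, reports a span, and returns the original result.
// Failed calls are not reported in this simplified version.
function installSketchInterceptor(
  target: Record<string, AsyncMethod>,
  method: string,
  onSpan: (span: SketchSpan) => void,
): () => void {
  const original = target[method];
  if (!original) throw new Error(`No method named ${method}`);
  target[method] = async (...args: unknown[]) => {
    const startedAt = Date.now();
    const output = await original.apply(target, args);
    onSpan({ name: method, input: args, output, latencyMs: Date.now() - startedAt });
    return output;
  };
  // Returns an uninstall function that restores the original method.
  return () => { target[method] = original; };
}

// Fake client standing in for a provider SDK.
const fakeClient: Record<string, AsyncMethod> = {
  create: async (...args) => `echo:${String(args[0])}`,
};
const captured: SketchSpan[] = [];
const uninstall = installSketchInterceptor(fakeClient, "create", (span) => {
  captured.push(span);
});
```

Calling `uninstall()` restores the original method, which is useful in tests that should not leak patched clients between cases.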
Transport Modes
| Mode | Protocol | Use Case |
|---|---|---|
| auto | WebSocket with HTTP fallback | Default for most environments |
| ws | WebSocket only | Real-time streaming, persistent processes |
| http | HTTP batch (native fetch) | Serverless (Vercel, Cloudflare Workers) |
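The auto mode's "WebSocket with HTTP fallback" behaviour can be pictured as one transport wrapping another. This is a simplified sketch under assumptions: the `Send` type and `withFallback` function are hypothetical, transports are reduced to plain async functions, and the SDK's actual retry and reconnection logic is not shown.

```typescript
// A transport here is any function that delivers a batch of events.
type Send = (batch: string[]) => Promise<void>;

// Hypothetical "auto" behaviour: try the primary transport (standing in for
// WebSocket) and fall back to the secondary (standing in for HTTP) on failure.
function withFallback(primary: Send, fallback: Send): Send {
  return async (batch) => {
    try {
      await primary(batch);
    } catch {
      await fallback(batch);
    }
  };
}

// Stand-in transports so the sketch runs without a network.
const delivered: string[] = [];
const brokenSocket: Send = async () => {
  throw new Error("socket unavailable");
};
const httpBatch: Send = async (batch) => {
  delivered.push(...batch);
};

const autoSend = withFallback(brokenSocket, httpBatch);
autoSend(["span-1", "span-2"]).then(() => console.log(delivered));
```

In serverless environments, where a persistent socket is rarely available, selecting http directly avoids paying for the failed first attempt on every invocation.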
Configure explicitly:
configure({ apiKey: "rt_live_...", transport: "http" });
Metadata and Tags
Attach custom metadata to any trace for filtering and search:
const recorder = record({
name: "agent",
metadata: { environment: "production", version: "2.1.0" },
tags: ["customer-facing", "high-priority"],
});
Disabling in Tests
process.env.RETRACE_ENABLED = "false";
Or pass the flag directly:
configure({ apiKey: "rt_live_...", enabled: false });
[!NOTE] The TypeScript SDK is ESM-only. For CommonJS projects, use a dynamic import inside an async function, since CommonJS modules do not support top-level await:
const retrace = await import("retrace-sdk");
Sampling
Control what percentage of traces are recorded to reduce costs at high volume:
configure({
apiKey: "rt_live_...",
sampleRate: 0.5, // Record 50% of traces
});
Or via environment variable:
RETRACE_SAMPLE_RATE=0.1 # Record 10% of traces
Sampled-out traces execute normally with zero SDK overhead.
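A rate-based sampling decision can be sketched as a single function. The version below also covers the seeded variant described under Deterministic Sampling, where the same seed and trace name always give the same decision. Everything here is an assumption for illustration: `shouldSample` and `fnv1a` are hypothetical names, and the SDK's actual hashing scheme is not documented on this page.

```typescript
// 32-bit FNV-1a-style hash over UTF-16 code units, returned as an unsigned integer.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Hypothetical decision function; not the SDK's documented algorithm.
function shouldSample(rate: number, traceName: string, seed?: string): boolean {
  if (rate >= 1) return true;
  if (rate <= 0) return false;
  // With a seed, map (seed, trace name) to a stable point in [0, 1).
  // Without one, draw a fresh random number for each trace.
  const point = seed === undefined
    ? Math.random()
    : fnv1a(`${seed}:${traceName}`) / 0x100000000;
  return point < rate;
}

// The seeded decision is stable across calls and processes.
console.log(shouldSample(0.5, "my-agent", "my-stable-seed"));
console.log(shouldSample(0.5, "my-agent", "my-stable-seed"));
```

Because the seeded point is fixed for a given trace name, raising the rate can only add traces to the sampled set; it never drops one that was already included.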
Deterministic Sampling
For reproducible sampling decisions (useful in tests and replay), provide a seed:
configure({
apiKey: "rt_live_...",
sampleRate: 0.5,
sampleSeed: "my-stable-seed", // Same seed + trace name = same decision
});
Or via environment variable:
RETRACE_SAMPLE_SEED=my-stable-seed
W3C Traceparent Propagation
Inject distributed tracing headers into outgoing HTTP requests:
import { setTraceContext, injectTraceparent } from "retrace-sdk";
// Inside a traced function, set the active context
setTraceContext(traceId, spanId);
// When making HTTP calls, inject the traceparent header
const headers = injectTraceparent({ "Content-Type": "application/json" });
// headers now includes: { traceparent: "00-{trace_id}-{span_id}-01", ... }
await fetch("https://downstream-service.com/api", { headers });
Parse incoming traceparent from upstream services:
import { parseTraceparent } from "retrace-sdk";
const tp = parseTraceparent(req.headers["traceparent"]);
// { traceId: "abc...", parentId: "def...", sampled: true }
Streaming Interception
The SDK automatically captures streaming responses from OpenAI and Anthropic:
// Streaming is intercepted transparently — no extra config needed
const stream = await openai.chat.completions.create({
model: "gpt-5.5",
messages: [{ role: "user", content: "Hello" }],
stream: true, // ← SDK wraps the iterator, emits span on completion
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
// Span is emitted here with full output, tokens, and cost
Per-Span-Type Truncation
Configure payload size limits per span type:
import { setTruncationLimits } from "retrace-sdk";
setTruncationLimits({
llm_call: 51200, // 50KB for LLM prompts
tool_call: 10240, // 10KB for tool args
tool_result: 10240,
reasoning: 20480,
action: 5120,
});
Token ID Capture
The SDK automatically captures output token IDs and log-probabilities when the LLM provider returns them. This enables speculative decoding during replay, which reduces replay latency.
// Token IDs are captured automatically from OpenAI when logprobs are enabled
const response = await openai.chat.completions.create({
model: "gpt-5.5",
messages: [{ role: "user", content: "Hello" }],
logprobs: true, // ← Enables token ID capture
});
// The span will include:
// - token_ids: [1234, 5678, ...] (output token IDs)
// - logprobs: [-0.1, -0.05, ...] (per-token log probabilities)
Token IDs are stored in the span and used during fork replay to seed the draft model, achieving 90%+ acceptance rates and 3x throughput improvement.
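To illustrate what an acceptance rate measures, here is a simplified calculation under greedy verification: recorded token IDs act as the draft, and draft tokens count as accepted up to the first position where the replayed output differs. This is a sketch for intuition only; `acceptedPrefixLength` and `acceptanceRate` are hypothetical names, and the replay engine's actual verification procedure is not described on this page.

```typescript
// Counts how many leading draft tokens match the tokens produced on replay.
// Under greedy verification, those leading tokens are the accepted ones.
function acceptedPrefixLength(draft: number[], target: number[]): number {
  let accepted = 0;
  while (
    accepted < draft.length &&
    accepted < target.length &&
    draft[accepted] === target[accepted]
  ) {
    accepted++;
  }
  return accepted;
}

// Fraction of draft tokens accepted; defined as 0 for an empty draft.
function acceptanceRate(draft: number[], target: number[]): number {
  return draft.length === 0 ? 0 : acceptedPrefixLength(draft, target) / draft.length;
}

const recordedIds = [1234, 5678, 42, 7]; // token IDs stored in the span
const replayedIds = [1234, 5678, 42, 9]; // tokens produced on replay
console.log(acceptanceRate(recordedIds, replayedIds)); // 0.75
```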
[!NOTE] Token ID capture increases span storage size by roughly 2-3x. The data is stored only when the provider returns it (for OpenAI, this requires logprobs: true).