TypeScript SDK

Installation

npm install retrace-sdk

Requires Node.js 20+. This is an ESM-only package with a single lightweight runtime dependency (ws for WebSocket transport).

Configuration

Configure the SDK at application startup:

import { configure } from "retrace-sdk";

configure({
  apiKey: "rt_live_...",
  baseUrl: "https://api.retraceai.tech",
  projectId: "my-project",
});

Or rely on environment variables:

Variable	Default	Description
`RETRACE_API_KEY`	—	Your API key (required)
`RETRACE_BASE_URL`	`https://api.retraceai.tech`	API endpoint
`RETRACE_PROJECT_ID`	—	Default project identifier
`RETRACE_ENABLED`	`true`	Set to `false` to disable tracing
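
As a rough illustration of how the variables in the table might map onto configuration values, the sketch below resolves them using the documented defaults. The names resolveConfigFromEnv and ResolvedConfig are hypothetical and exist only for this example; the SDK's own resolution logic may differ.

```typescript
// Hypothetical shape; field names mirror the configure() options shown above.
interface ResolvedConfig {
  apiKey: string;
  baseUrl: string;
  projectId: string | undefined;
  enabled: boolean;
}

// Illustrative helper, not part of retrace-sdk.
function resolveConfigFromEnv(env: Record<string, string | undefined>): ResolvedConfig {
  const apiKey = env["RETRACE_API_KEY"];
  if (!apiKey) {
    throw new Error("RETRACE_API_KEY is required");
  }
  return {
    apiKey,
    baseUrl: env["RETRACE_BASE_URL"] ?? "https://api.retraceai.tech",
    projectId: env["RETRACE_PROJECT_ID"],
    // Assumption: only the literal string "false" disables tracing.
    enabled: env["RETRACE_ENABLED"] !== "false",
  };
}
```

In a Node.js process you would pass process.env as the argument.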

record() Function

Create a recorder instance for manual control over the trace lifecycle:

import { record } from "retrace-sdk";

const recorder = record({ name: "my-agent" });
recorder.start();

// ... your agent logic, which produces `result` ...

recorder.end(result);

If an error occurs, call recorder.fail(error) to mark the trace as failed and record the exception.
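
One way to make sure every trace is closed with either end() or fail() is to route the agent logic through a small helper. The sketch below is an assumption-laden example: RecorderLike is a structural type that covers only the three methods described above, and runRecorded is a hypothetical helper, not an SDK export.

```typescript
// Minimal structural type covering only the recorder methods described above.
interface RecorderLike {
  start(): void;
  end(result?: unknown): void;
  fail(error: unknown): void;
}

// Hypothetical helper: starts the trace, then closes it with end() on success
// or fail() on error. The original error is re-thrown to the caller.
async function runRecorded<T>(recorder: RecorderLike, work: () => Promise<T>): Promise<T> {
  recorder.start();
  try {
    const result = await work();
    recorder.end(result);
    return result;
  } catch (error) {
    recorder.fail(error);
    throw error;
  }
}
```

Assuming the object returned by record() exposes these methods as shown, it could be passed in directly: await runRecorded(recorder, () => myAgentLogic()).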

trace() Helper

Wrap any async function for automatic trace recording. Input arguments and return values are captured:

import OpenAI from "openai";
import { trace } from "retrace-sdk";

// The OpenAI client reads OPENAI_API_KEY from the environment by default.
const openai = new OpenAI();

const myAgent = trace(async (prompt: string) => {
  const response = await openai.chat.completions.create({
    model: "gpt-5.5",
    messages: [{ role: "user", content: prompt }],
  });
  return response.choices[0].message.content;
}, { name: "my-agent" });

const result = await myAgent("hello");

Exceptions thrown inside the wrapped function are recorded and re-thrown.
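
To make the capture-and-re-throw behavior concrete, here is a minimal stand-in for a trace()-style wrapper. It is a sketch only: traceSketch, CallRecord, and the in-memory sink are hypothetical, and the real trace() sends data to Retrace rather than to an array.

```typescript
// One recorded invocation; a hypothetical shape used only in this sketch.
interface CallRecord {
  name: string;
  input: unknown[];
  output?: unknown;
  error?: unknown;
}

// Illustrative stand-in for a trace()-style wrapper, not the SDK implementation.
// The returned function keeps the argument and return types of `fn`.
function traceSketch<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
  options: { name: string },
  sink: CallRecord[],
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    const entry: CallRecord = { name: options.name, input: args };
    sink.push(entry);
    try {
      const output = await fn(...args);
      entry.output = output; // capture the return value
      return output;
    } catch (error) {
      entry.error = error; // record the failure, then re-throw it unchanged
      throw error;
    }
  };
}
```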

Manual Spans

Add child spans for fine-grained observability into individual steps:

import { record, SpanType } from "retrace-sdk";

const recorder = record({ name: "agent" });
recorder.start();

const span = recorder.startSpan("web_search", SpanType.TOOL_CALL, { query: "test" });
const results = await search("test"); // search() stands in for your own tool function
recorder.endSpan(span, results);

recorder.end();

Supported span types: LLM_CALL, TOOL_CALL, TOOL_RESULT, REASONING, ACTION, ERROR, FORK_POINT.

Auto-Instrumentation

Retrace captures LLM calls automatically once you install the interceptor for your provider. Interceptors are available for OpenAI, Anthropic, and Gemini.

OpenAI

import { installOpenAIInterceptor } from "retrace-sdk";
installOpenAIInterceptor((span) => recorder.addSpan(span));

Captures all openai.chat.completions.create() calls.

Anthropic

import { installAnthropicInterceptor } from "retrace-sdk";
installAnthropicInterceptor((span) => recorder.addSpan(span));

Captures all anthropic.messages.create() calls.

Gemini

Automatically capture all Gemini SDK calls as spans:

import { installGeminiInterceptor } from "retrace-sdk";

installGeminiInterceptor((span) => recorder.addSpan(span));

Records model, messages, token usage, latency, and estimated cost per call.
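
All three interceptors take a callback that receives a span. The sketch below shows one common way such an interceptor can be built, by replacing a client's create() method with a timing wrapper. It is not the SDK's implementation: SketchSpan, Creatable, and installSketchInterceptor are hypothetical names, and the span here carries only a few of the fields listed above.

```typescript
// Hypothetical span payload; the SDK's real span shape is not shown in this document.
interface SketchSpan {
  name: string;
  input: unknown;
  output: unknown;
  latencyMs: number;
}

// Any object exposing an async create() method, such as a chat-completions client.
interface Creatable<P, R> {
  create(params: P): Promise<R>;
}

// Replaces create() with a wrapper that reports a span to the callback.
// Returns a function that restores the original method.
function installSketchInterceptor<P, R>(
  target: Creatable<P, R>,
  name: string,
  onSpan: (span: SketchSpan) => void,
): () => void {
  const original = target.create.bind(target);
  target.create = async (params: P): Promise<R> => {
    const startedAt = Date.now();
    const output = await original(params);
    onSpan({ name, input: params, output, latencyMs: Date.now() - startedAt });
    return output;
  };
  return () => {
    target.create = original;
  };
}
```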

Transport Modes

Mode	Protocol	Use Case
`auto`	WebSocket with HTTP fallback	Default for most environments
`ws`	WebSocket only	Real-time streaming, persistent processes
`http`	HTTP batch (native fetch)	Serverless (Vercel, Cloudflare Workers)

Configure explicitly:

configure({ apiKey: "rt_live_...", transport: "http" });
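
The auto mode is described as WebSocket with HTTP fallback. A minimal sketch of that idea follows, assuming that any WebSocket send failure triggers the fallback; the SDK's actual fallback conditions are not documented here, and SendBatch and makeAutoSender are hypothetical names.

```typescript
// A sender delivers one serialized batch of spans; both arguments are stand-ins
// for real WebSocket and HTTP senders.
type SendBatch = (payload: string) => Promise<void>;

function makeAutoSender(viaWebSocket: SendBatch, viaHttp: SendBatch): SendBatch {
  return async (payload: string): Promise<void> => {
    try {
      await viaWebSocket(payload);
    } catch {
      // Assumption: any WebSocket failure triggers the HTTP fallback.
      await viaHttp(payload);
    }
  };
}
```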

Metadata and Tags

Attach custom metadata to any trace for filtering and search:

const recorder = record({
  name: "agent",
  metadata: { environment: "production", version: "2.1.0" },
  tags: ["customer-facing", "high-priority"],
});

Disabling in Tests

Set the environment variable to turn tracing off:

process.env.RETRACE_ENABLED = "false";

Or pass the flag directly:

configure({ apiKey: "rt_live_...", enabled: false });

[!NOTE] The TypeScript SDK is ESM-only. For CommonJS projects, use a dynamic import: const retrace = await import("retrace-sdk").

Sampling

Control what percentage of traces are recorded to reduce costs at high volume:

configure({
  apiKey: "rt_live_...",
  sampleRate: 0.5, // Record 50% of traces
});

Or via environment variable:

RETRACE_SAMPLE_RATE=0.1  # Record 10% of traces

Sampled-out traces execute normally with zero SDK overhead.
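
For intuition, a probabilistic sampling decision can be as simple as comparing a random number against the rate. The sketch below also parses a RETRACE_SAMPLE_RATE-style string. Both function names are hypothetical, and the choice to fall back to 1 (record everything) on missing or invalid input is an assumption, not documented SDK behavior.

```typescript
// Parses a sample rate string and clamps it to [0, 1].
// Assumption: missing or non-numeric input falls back to 1 (record everything).
function parseSampleRate(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return 1;
  const value = Number(raw);
  if (!Number.isFinite(value)) return 1;
  return Math.min(1, Math.max(0, value));
}

// Returns true when the trace should be recorded.
// `random` is injectable so the decision can be tested without real randomness.
function shouldRecord(sampleRate: number, random: () => number = Math.random): boolean {
  return random() < sampleRate;
}
```

Because Math.random() returns a value in [0, 1), a rate of 0 never records and a rate of 1 always records.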

Deterministic Sampling

For reproducible sampling decisions (useful in tests and replay), provide a seed:

configure({
  apiKey: "rt_live_...",
  sampleRate: 0.5,
  sampleSeed: "my-stable-seed", // Same seed + trace name = same decision
});

Or via environment variable:

RETRACE_SAMPLE_SEED=my-stable-seed
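
A seeded decision like this is typically made by hashing the seed together with the trace name and comparing the result against the sample rate. The sketch below uses a 32-bit FNV-1a hash over UTF-16 code units purely as an illustration; the hash function the SDK actually uses is not documented here, and both function names are hypothetical.

```typescript
// FNV-1a 32-bit hash over UTF-16 code units: a simple, stable string hash.
// An illustrative choice, not necessarily what the SDK uses.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Same seed + same trace name always lands in the same bucket,
// so the sampling decision is reproducible across runs.
function shouldSampleDeterministic(seed: string, traceName: string, sampleRate: number): boolean {
  const bucket = fnv1a(`${seed}:${traceName}`) / 0x100000000; // maps the hash into [0, 1)
  return bucket < sampleRate;
}
```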

W3C Traceparent Propagation

Inject distributed tracing headers into outgoing HTTP requests:

import { setTraceContext, injectTraceparent } from "retrace-sdk";

// Inside a traced function, set the active context
setTraceContext(traceId, spanId);

// When making HTTP calls, inject the traceparent header
const headers = injectTraceparent({ "Content-Type": "application/json" });
// headers now includes: { traceparent: "00-{trace_id}-{span_id}-01", ... }

await fetch("https://downstream-service.com/api", { headers });

Parse incoming traceparent from upstream services:

import { parseTraceparent } from "retrace-sdk";

const tp = parseTraceparent(req.headers["traceparent"]);
// { traceId: "abc...", parentId: "def...", sampled: true }
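
For reference, the traceparent header defined by W3C Trace Context has four dash-separated lowercase hex fields: a 2-character version, a 32-character trace ID, a 16-character parent ID, and 2-character flags whose lowest bit marks the trace as sampled. The sketch below formats and parses version 00 headers. It is a standalone illustration with hypothetical function names, not the SDK's parseTraceparent, and it only accepts headers with exactly four fields.

```typescript
interface TraceparentSketch {
  traceId: string;
  parentId: string;
  sampled: boolean;
}

const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Builds a version-00 header from a 32-hex trace ID and a 16-hex span ID.
function formatTraceparentSketch(traceId: string, spanId: string, sampled = true): string {
  return `00-${traceId}-${spanId}-${sampled ? "01" : "00"}`;
}

// Returns null for missing or malformed headers instead of throwing.
function parseTraceparentSketch(header: string | undefined): TraceparentSketch | null {
  if (!header) return null;
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return null;
  const [, version = "", traceId = "", parentId = "", flags = "00"] = match;
  if (version === "ff") return null;             // version ff is forbidden by the spec
  if (/^0+$/.test(traceId)) return null;         // an all-zero trace ID is invalid
  if (/^0+$/.test(parentId)) return null;        // an all-zero parent ID is invalid
  const sampled = (parseInt(flags, 16) & 0x01) === 0x01;
  return { traceId, parentId, sampled };
}
```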

Streaming Interception

The SDK automatically captures streaming responses from OpenAI and Anthropic:

// Streaming is intercepted transparently — no extra config needed
const stream = await openai.chat.completions.create({
  model: "gpt-5.5",
  messages: [{ role: "user", content: "Hello" }],
  stream: true, // ← SDK wraps the iterator, emits span on completion
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
// Span is emitted here with full output, tokens, and cost
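
The comment about wrapping the iterator can be pictured with an async generator that passes each chunk through while keeping a copy, then fires a callback once the stream is exhausted. This is a sketch of the general technique with a hypothetical captureStream function, not the SDK's code. One caveat of this simple version: if the consumer stops iterating early, the callback never runs.

```typescript
// Yields every chunk from `source` unchanged and collects a copy of each one.
// `onComplete` fires once, after the source is fully consumed.
async function* captureStream<T>(
  source: AsyncIterable<T>,
  onComplete: (chunks: T[]) => void,
): AsyncGenerator<T, void, undefined> {
  const chunks: T[] = [];
  for await (const chunk of source) {
    chunks.push(chunk);
    yield chunk;
  }
  onComplete(chunks);
}
```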

Per-Span-Type Truncation

Configure payload size limits, in bytes, per span type:

import { setTruncationLimits } from "retrace-sdk";

setTruncationLimits({
  llm_call: 51200,    // 50 KB for LLM prompts
  tool_call: 10240,   // 10 KB for tool arguments
  tool_result: 10240, // 10 KB for tool output
  reasoning: 20480,   // 20 KB for reasoning text
  action: 5120,       // 5 KB for action payloads
});
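
Truncating to a byte limit needs a little care with multi-byte characters. The sketch below shows one way to cut a string at a UTF-8 byte budget without splitting a character. truncateUtf8 is a hypothetical helper, and how the SDK itself truncates payloads (for example, whether it appends a marker) is not documented here.

```typescript
// Cuts `text` so that its UTF-8 encoding is at most `maxBytes` long,
// backing up if the cut would land in the middle of a multi-byte character.
function truncateUtf8(text: string, maxBytes: number): string {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) return text;
  let end = Math.max(0, maxBytes);
  // Continuation bytes look like 10xxxxxx; if the first excluded byte is one,
  // the cut is mid-character, so move back to the start of that character.
  while (end > 0 && ((bytes[end] ?? 0) & 0xc0) === 0x80) {
    end--;
  }
  return new TextDecoder().decode(bytes.subarray(0, end));
}
```

For example, "héllo" is 6 bytes in UTF-8 because "é" takes two, so a 2-byte budget keeps only "h" while a 3-byte budget keeps "hé".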

Token ID Capture

The SDK automatically captures output token IDs and log-probabilities when the LLM provider returns them. This enables speculative decoding during replay, which reduces replay latency.

// Token IDs are captured automatically from OpenAI when logprobs are enabled
const response = await openai.chat.completions.create({
  model: "gpt-5.5",
  messages: [{ role: "user", content: "Hello" }],
  logprobs: true, // ← Enables token ID capture
});

// The span will include:
// - token_ids: [1234, 5678, ...] (output token IDs)
// - logprobs: [-0.1, -0.05, ...] (per-token log probabilities)

Token IDs are stored in the span and used during fork replay to seed the draft model, achieving 90%+ acceptance rates and 3x throughput improvement.

[!NOTE] Token ID capture increases span storage size by roughly 2-3x. The data is stored only when the provider returns it (for OpenAI, this requires logprobs: true).
