Fork & Replay

Fork from any span in a trace, modify the input, and watch the entire agent re-execute from that point forward. Context from the fork flows into subsequent LLM calls automatically.

Creating a Fork

curl -X POST https://api.retraceai.tech/api/v1/forks \
  -H "x-retrace-key: rt_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "trace_id": "abc-123",
    "fork_point_span_id": "span-456",
    "modified_input": "What about using a different approach?"
  }'

Replaying a Fork

curl -X POST https://api.retraceai.tech/api/v1/forks/:id/replay \
  -H "x-retrace-key: rt_live_..."

The SDK re-executes the full function from the fork point. Every LLM call after the fork receives context about the modified path.

Viewing the Diff

curl https://api.retraceai.tech/api/v1/forks/:id/diff \
  -H "x-retrace-key: rt_live_..."

Returns a side-by-side comparison: original vs forked execution with cost, latency, and token deltas.
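
The replay and diff calls above can be wrapped in a small script. This is a minimal sketch, not an official client: `fork_url`, `replay_and_diff`, and the `RETRACE_API_KEY` environment variable are hypothetical names, and the fork ID is passed in by hand because the shape of the create response is not described here.

```shell
API_BASE="https://api.retraceai.tech/api/v1"

# fork_url <fork_id> <action>
# Builds the URL for a fork sub-resource such as replay or diff.
fork_url() {
  printf '%s/forks/%s/%s\n' "$API_BASE" "$1" "$2"
}

# replay_and_diff <fork_id>
# Triggers a replay, then fetches the diff. RETRACE_API_KEY is an assumed
# variable holding your rt_live_... key. If replay runs asynchronously,
# the diff may need to be requested again once the replay has finished.
replay_and_diff() {
  curl -fsS -X POST "$(fork_url "$1" replay)" \
    -H "x-retrace-key: $RETRACE_API_KEY" &&
  curl -fsS "$(fork_url "$1" diff)" \
    -H "x-retrace-key: $RETRACE_API_KEY"
}
```

Reading the key from the environment keeps it out of shell history and out of the script itself.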

Fork Sweeps (Batch)

Test multiple input variants simultaneously:

curl -X POST https://api.retraceai.tech/api/v1/forks/sweep \
  -H "x-retrace-key: rt_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "trace_id": "abc-123",
    "fork_point_span_id": "span-456",
    "variants": [
      { "label": "concise", "modified_input": "Be brief." },
      { "label": "detailed", "modified_input": "Explain in detail." }
    ]
  }'

Sweeps run with a concurrency of 3 and checkpoint progress in Valkey, so an interrupted sweep can resume where it left off.
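
For longer sweeps it can be easier to generate the `variants` array than to write it by hand. The helper below is a sketch under stated assumptions: `build_variants` is a hypothetical name, and it does no JSON escaping, so labels and inputs must not contain quotes or backslashes.

```shell
# build_variants "label=input" ...
# Prints a JSON array in the shape used by the sweep request above.
# Sketch only: values are inserted verbatim, without JSON escaping.
build_variants() {
  sep=""
  printf '['
  for pair in "$@"; do
    label=${pair%%=*}   # text before the first "="
    input=${pair#*=}    # text after the first "="
    printf '%s{ "label": "%s", "modified_input": "%s" }' "$sep" "$label" "$input"
    sep=", "
  done
  printf ']\n'
}
```

The output can be spliced into the request body, for example `-d "{\"trace_id\": \"abc-123\", \"fork_point_span_id\": \"span-456\", \"variants\": $(build_variants "concise=Be brief.")}"`.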

Sensitivity Analysis

Score how sensitive an agent's output is to input perturbations:

curl -X POST https://api.retraceai.tech/api/v1/forks/sensitivity \
  -H "x-retrace-key: rt_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "trace_id": "abc-123", "fork_point_span_id": "span-456" }'

CLI

retrace forks create --trace <id> --span <id> --input "new prompt"
retrace forks replay <id> --wait
retrace forks diff <id> --json

Non-Determinism in Replay

Fork replay re-executes your actual agent function with modified input. Because LLMs are inherently non-deterministic (temperature, sampling, model updates), the forked path may produce different outputs even with identical input. This is by design: Retrace shows you what would happen, not what must happen.

To increase reproducibility:

Set temperature=0 in your LLM calls
Use a fixed seed parameter if your provider supports it
Pin a specific model version (e.g., gemini-2.5-flash-001 instead of gemini-2.5-flash)

The divergence score in the diff view accounts for this: it measures the semantic difference between outputs, not exact string matching.
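
As an illustration of the three reproducibility settings, the sketch below builds a request body for the Gemini REST API, chosen only because the pinned-model example names a Gemini model. It assumes the `generateContent` endpoint and the `generationConfig.temperature` and `generationConfig.seed` fields; the seed value, the prompt, and the `GEMINI_API_KEY` variable are placeholders, and other providers name these settings differently.

```shell
# Pinned model version from the example above, rather than a floating alias.
MODEL="gemini-2.5-flash-001"

# Prints a request body with sampling fixed as far as the provider allows.
deterministic_body() {
  cat <<'EOF'
{
  "contents": [{ "parts": [{ "text": "What about using a different approach?" }] }],
  "generationConfig": { "temperature": 0, "seed": 42 }
}
EOF
}

# Not invoked here: sends the body to the pinned model.
# GEMINI_API_KEY is an assumed environment variable name.
call_llm() {
  deterministic_body | curl -fsS -X POST \
    "https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:generateContent" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -H "Content-Type: application/json" \
    -d @-
}
```

Providers generally treat seeded sampling as best effort, so these settings make identical replays more likely rather than guaranteed.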

Cost Estimation

Before executing a fork, estimate the cost:

curl https://api.retraceai.tech/api/v1/forks/:id/estimate \
  -H "x-retrace-key: rt_live_..."

Returns a breakdown showing naive cost vs optimized cost with savings from semantic caching, context compression, and differential replay. See Replay Optimizations for details.
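
The field names in the estimate response are not listed here, so the sketch below takes the two totals as plain arguments and shows only the arithmetic behind a savings figure. `savings_pct` is a hypothetical helper, not part of the Retrace CLI.

```shell
# savings_pct <naive_cost> <optimized_cost>
# Prints the percentage saved relative to the naive cost, to one decimal place.
savings_pct() {
  awk -v naive="$1" -v opt="$2" 'BEGIN {
    if (naive <= 0) { print "n/a"; exit }   # avoid dividing by zero
    printf "%.1f\n", (naive - opt) / naive * 100
  }'
}
```

For example, a naive cost of 2 and an optimized cost of 1 gives `savings_pct 2 1`, which prints `50.0`.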

Inline Editing

Edit a span's input directly and create a fork in one step:

curl -X POST https://api.retraceai.tech/api/v1/traces/:id/inline-edit \
  -H "x-retrace-key: rt_live_..." \
  -H "Content-Type: application/json" \
  -d '{"span_id": "span-456", "modified_input": "Try a different approach"}'

After replay, view the step-by-step diff:

curl https://api.retraceai.tech/api/v1/traces/:id/diff/:forkId \
  -H "x-retrace-key: rt_live_..."

Commit a successful fork as the new baseline:

curl -X POST https://api.retraceai.tech/api/v1/forks/:id/commit \
  -H "x-retrace-key: rt_live_..."
