Trace Data Flow
Trace how a value, field, or variable flows through the codebase from source to sink.
/trace-data-flow<variable, field, or value to trace>npx agentscamp add commands/trace-data-flowInstall to ~/.claude/commands/trace-data-flow.md
Read-only command that traces a value, field, or variable across the codebase: where it originates, every place it is transformed, validated, or renamed, where it is read, and where it is persisted or sent out — following renames across boundaries — then reports a directed source-to-sink flow map with file:line citations, flagging missing validation, plaintext logging, or unit/type mismatches.
Trace how $ARGUMENTS moves through this codebase — from where it enters, through every transform, to where it lands. Build a directed flow map (source → transforms → sinks) with file:line citations, and flag anything notable on the path. Do not change any files; the map and the observations are the whole deliverable.
Scope
$ARGUMENTS is the value to trace — a request/response field (order.shippingAddress), a config key (STRIPE_WEBHOOK_SECRET), a DB column (users.email_verified_at), a query param, an event property, or a plain variable. Trace the data, not just the name: the same value is often spelled differently at each layer (snake_case column → camelCase model attr → kebab-case JSON key), so you are following an identity across renames, not grepping one literal.
If $ARGUMENTS is empty, do not guess. Ask one focused question: "Which value should I trace — name a field, config key, column, or variable?"
WARNING
Read-only mode. Use only Read, Grep, and Glob. Do not edit files, run code, or hit a database to "follow" the value. The flow map is reconstructed from source, not from a live trace.
Step 1 — Pin down the value and its aliases
Find every name this value can wear before you start tracing, or you will lose it at the first layer boundary.
# Seed search on the literal name and its common case variants
rg -n "shippingAddress|shipping_address|shipping-address" src- Note the declared type/shape at each spelling (string, cents-int, ISO-8601 string, enum, nullable).
- Watch for destructuring and renames —
const { email: userEmail } = body,address AS shippingin SQL,@SerializedName,@JsonProperty, ORM column maps, GraphQL field resolvers, protobuf/zod/pydanticschemas. Each is a rename you must carry forward.
NOTE
Aliases hide at every boundary: HTTP body → DTO, DTO → domain model, model → ORM entity, entity → table column, and back out through serializers. Build the alias set first; trace second.
Step 2 — Find the source(s)
Locate where the value first enters this system. Typical origins:
- Inbound request — route/controller param, request body field, header, query string.
- Config / environment —
process.env.X, a config file, a secrets loader. - Storage read — a column selected from a query, a cache
get, a file read. - External call / event — a webhook payload, a queue message, a third-party API response.
Record each source as file:line with the type it has at the moment of entry. If there are multiple independent sources, the value has multiple origins — trace each.
Step 3 — Walk every transform and validation
From each source, follow the value forward. At each hop, classify what happens to it:
| Hop kind | What to capture |
|---|---|
| Validation / parse | the rule (schema, regex, range, enum) and what passes through unchecked |
| Transform | the function and the type/unit change (cents↔dollars, ms↔s, trim/normalize, encrypt/hash) |
| Rename / remap | old name → new name across the boundary |
| Branch / default | conditionals that drop, substitute, or fork the value |
| Aggregation | merged into another object, array, or computed field |
Follow it through function calls and across files — when it is passed as an argument, jump into the callee and keep going. Stop a branch only when the value is consumed (read into a decision and not propagated) or reaches a sink.
NOTE
A transform that changes units or type without a matching change at the consumer is the highest-value bug this command finds — e.g. cents stored but dollars displayed, or a UTC timestamp compared against a local one. Record the unit/type at every hop so mismatches between layers are visible.
Step 4 — Identify the sinks
A sink is where the value leaves your control. Find all of them:
- Persistence — DB write/upsert, cache
set, file write. - Outbound — API response body, third-party request, queue publish, email/SMS.
- Logs / telemetry —
console.log, logger calls, metrics tags, error reporters.
For each sink, record the name and type the value has as it leaves, so the entry shape and exit shape can be compared end to end.
Step 5 — Assemble the flow map and the observations
Compose the hops into a single directed map, then list what you noticed along the way.
## Flow: `$ARGUMENTS`
**Source** → `routes/orders.ts:31` — `body.shipping_address` (string, unvalidated)
→ **validate** `schemas/order.ts:18` — zod `.string().min(1)` (rejects empty only)
→ **rename** `services/order.ts:74` — `shipping_address` → `shippingAddress`
→ **transform** `services/geo.ts:22` — normalized + uppercased country code
→ **persist** `repo/orders.ts:55` — `INSERT orders.shipping_addr`
→ **outbound** `clients/shipping.ts:40` — POSTed to carrier API as `destination`
→ **log** `services/order.ts:80` — full address written to info log
## Observations
- [validation gap] `routes/orders.ts:31` — accepts any non-empty string; no postal-code/country check before it reaches the carrier API.
- [type mismatch] `repo/orders.ts:55` vs `clients/shipping.ts:40` — column is `varchar(120)`; carrier rejects > 100 chars, no truncation between.
- [sensitive log] `services/order.ts:80` — PII (full address) logged at info level in plaintext.Use → to show direction; indent or fork the arrows when the value branches. Every node carries a file:line and the value's name+type at that point.
Report
Deliver the flow map and the observations as your message — that is the whole deliverable. Make sure:
- Each node has a real
file:linecitation; never invent a path you did not open. - Every rename across a layer boundary is shown explicitly.
- Observations are tagged (
[validation gap],[type mismatch],[sensitive log],[unit mismatch],[dead path]) and each cites the exact line.
End with the single most important finding — the one hop a reviewer should look at first — or, if the path is clean, say so plainly and name the source and primary sink.
Related
- Find N+1 QueriesScan code read-only for N+1 query patterns — loops that query per iteration and handlers that fan out per-row — and report each with a location, why it is N+1, and the concrete eager-load/batch/set-based fix.
- Rename SymbolSafely rename a symbol project-wide, distinguishing the real symbol from coincidental substring matches.
- Auth Flow ReviewerRead-only review of authentication AND authorization flows — session/token model, cookie flags, CSRF, token rotation, password-reset/email-verification, OAuth redirect/state, and per-route object-level access checks — for exploitable gaps. Use before shipping login/session/token code, when adding a protected route or sharing-by-URL feature, or during a security pass. Reports findings by severity with location, impact, and the concrete fix; never edits code.
- Webhook Handler ScaffolderScaffold a robust inbound webhook handler that verifies the signature on the raw body first, dedupes on the provider's event id, acknowledges fast, and processes asynchronously — the four things naive handlers get wrong. Use when wiring up events from a third party (Stripe, GitHub, Shopify, Slack, Twilio), when a provider keeps retrying because your endpoint times out or 500s, or when duplicate events are double-charging or double-creating records.
- Extract ModuleSplit an overgrown file into cohesive, well-bounded modules — find the natural seams, design each new module's public interface before moving a line, then relocate one unit at a time keeping tests green. Use when a file has grown too large, mixes unrelated responsibilities, or every change to it forces unrelated diffs and merge conflicts.
- Type Coverage ImproverRaise TypeScript type strictness incrementally — measure the any/implicit-any baseline, enable one strict sub-flag at a time, and fix the fallout per flag instead of all at once, keeping the typecheck green at every step. Use when a codebase is loosely typed, when you want strict mode on without a big-bang break, or when `any` keeps hiding bugs that surface in production.
- Estimate EffortProduce a grounded effort and complexity estimate for a task by exploring the codebase read-only.
- Write Design DocExplore the codebase and write a decision-oriented design doc / RFC for a feature or system change.
- Hunt Flaky TestsReproduce a flaky test, find the real source of nondeterminism, and fix the cause.