Contract Test Designer
Design consumer-driven contract tests between services so an API provider can't break its consumers unnoticed — without slow, flaky full end-to-end environments. Use when independent services or teams integrate over an API, when integration bugs only surface in staging or prod, or when E2E suites are too slow and brittle to catch breaking API changes.
npx agentscamp add skills/contract-test-designer (installs to ~/.claude/skills/contract-test-designer/SKILL.md)
Cross-service E2E tests are slow, flaky, and catch breaking API changes too late. This skill flips it: the consumer declares the requests it sends and the exact fields it depends on, the provider replays those expectations in its own CI, and a contract-violating change fails the provider's build before deploy — interface shape only, business logic stays in unit tests.
Cross-service E2E suites are slow, flaky, and tell you a provider broke a consumer only after both are deployed to a shared environment. This skill designs consumer-driven contract tests instead: the consumer declares the exact requests it sends and the precise response fields and types it actually reads, and the provider replays those expectations against its real handler in its own CI. A provider change that violates any consumer's contract fails the provider's build — before merge, before deploy, with no other service running. The deliverable is the consumer-defined contract(s), the provider-side verification wired into CI, and a sharing-plus-versioning approach so the two sides can evolve.
When to use this skill
- Two or more independently deployed services (often owned by different teams) integrate over HTTP/JSON, gRPC, or a message queue, and a provider can ship a change that silently breaks a consumer.
- Integration regressions only appear in staging or prod because nothing in either repo's CI exercises the actual cross-service shape.
- The cross-service E2E suite is too slow or flaky to gate merges, so breaking API changes slip through.
- You're standing up a new client against an existing API and want to lock the dependency to exactly the fields you read, not the whole payload.
Instructions
- Let the CONSUMER define the contract, and only the part it uses. Write the contract from the consumer's test suite, not the provider's spec. For each interaction, state the request the consumer sends (method, path, query/body, headers that matter) and the response shape it actually depends on: the status code, the fields it reads, and their types. If the consumer parses order.id (string) and order.total (number) and ignores the other 20 fields, the contract asserts those two fields and nothing else. The contract is a description of this consumer's needs, never the provider's full API surface.
- Match on type and structure, not frozen example values. Use matchers, not literals: assert total is a number, status is one of a set, items is a non-empty array of objects with sku/qty, not total == 4250. Frozen example values turn the contract into a snapshot test that breaks on every data change. Reserve exact-value matching for fields whose literal value is part of the contract (an enum the consumer branches on, a fixed Content-Type).
- Pick a tool/pattern and generate the artifact. Match what the stack already uses before adding a dep. Pact (pact-js / pact-jvm / pact-python / pact-go) is the default for HTTP and async messages: the consumer test runs against a mock provider and emits a pact JSON file. Spring Cloud Contract suits a JVM-heavy shop. For simpler needs, a shared JSON Schema / OpenAPI fragment committed to both repos, validated on each side, is a legitimate lightweight contract. Whatever the tool, the output is a machine-checkable artifact of the consumer's expectations.
- Verify the PROVIDER against the contract in the PROVIDER's own CI. This is the half teams skip and the half that earns the value. The provider's pipeline fetches every consumer contract and replays each recorded request against the real running provider (no consumer process involved), asserting the live response satisfies the matchers. Wire it as a required check: a provider change that drops order.total or renames status fails the provider build, so the break is caught at the source before merge. Use provider states (Pact) to set up the data each interaction needs (given "order 42 exists" → seed that fixture) rather than depending on ambient DB state.
- Share contracts via a broker or committed artifacts, and gate deploys on verification. For more than a couple of services, run a Pact Broker (or PactFlow): consumers publish contracts tagged by branch/version, providers fetch and verify, and can-i-deploy blocks a release whose verified contracts don't cover the consumer versions currently in prod. For a small, co-located set, committing the contract artifact into a shared repo or the provider repo and verifying in CI is simpler and adequate. Pick the lightest mechanism that still makes verification a required, automated gate, not a manual step.
- Version contracts so provider and consumer can evolve independently. Tag each contract with the consumer's version and the environment where that consumer version runs. Additive provider changes (new optional field) keep old contracts passing; that's the point of matching only what the consumer reads. For a breaking change, support both shapes until every consumer has published a contract for the new one (verified via the broker), then retire the old. Never edit a published contract in place to make a failing provider build go green, because that defeats the gate.
- Keep contracts to interface shape; push behavior into unit tests. A contract verifies the integration surface (fields, types, status codes, error envelopes), not that the provider computes the right total or applies the right discount. That logic belongs in the provider's own unit/integration tests. A contract bloated with business assertions becomes a second, worse copy of the provider's logic suite and breaks on unrelated correct changes.
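The breaking-change rule in the versioning step (serve both shapes until every consumer has moved, then retire the old one) can be sketched in plain JavaScript. This is a minimal illustration, not Pact's contract format: the field-to-predicate representation, the totalCents rename, and the consumer version numbers are all assumptions made up for the example.

```javascript
// Contracts as field -> predicate maps; a hypothetical representation, not Pact's format.
const isNumber = (v) => typeof v === "number";
const contracts = {
  "checkout-web@1.8.0": { total: isNumber },      // old shape, assumed still running in prod
  "checkout-web@2.0.0": { totalCents: isNumber }, // new shape, published after migrating
};

// True when the response body satisfies every field the contract lists.
// Fields the contract does not mention are ignored.
function satisfies(contract, body) {
  return Object.entries(contract).every(([field, matches]) => matches(body[field]));
}

// Names of the active contracts that a candidate provider response would break.
function brokenContracts(body, activeContractNames) {
  return activeContractNames.filter((name) => !satisfies(contracts[name], body));
}

// Two stages of the provider's response during a hypothetical total -> totalCents rename.
const expanded = { id: "ord_42", total: 4250, totalCents: 4250 }; // serves both shapes
const contracted = { id: "ord_42", totalCents: 4250 };            // old field retired

const allConsumers = ["checkout-web@1.8.0", "checkout-web@2.0.0"];
const migratedOnly = ["checkout-web@2.0.0"];

console.log(brokenContracts(expanded, allConsumers));   // nothing broken while both shapes are served
console.log(brokenContracts(contracted, allConsumers)); // the old consumer's contract is broken
console.log(brokenContracts(contracted, migratedOnly)); // nothing broken once only migrated consumers remain
```

The old field becomes safe to drop only when the set of active contracts no longer includes one that reads it, which is the condition a broker-backed deploy gate checks for you.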
WARNING
Contract tests verify the INTERFACE shape, not end-to-end behavior. They replace brittle cross-service E2E for catching breaking API changes — but they do not prove the provider's logic is correct or that the wired-up system works. Keep the provider's own logic tests, and a thin smoke E2E for the critical path; contracts shrink the E2E suite, they don't delete it.
WARNING
A contract that asserts the provider's entire response — every field, exact values — instead of only the fields this consumer reads is an anti-pattern: it produces false breakages on unrelated, backward-compatible changes (a new field, a reordered key, a changed value the consumer never reads), and trains teams to ignore red builds. Assert the minimum the consumer actually depends on.
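To make the anti-pattern concrete, here is a small sketch contrasting a minimal, matcher-style contract with a frozen snapshot of the full response. It uses plain predicates rather than a real contract library; the violations and matchesSnapshot helpers and the currency and notes fields are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: returns the names of contract fields the body violates.
// `contract` maps field name -> predicate; only the listed fields are checked.
function violations(contract, body) {
  return Object.entries(contract)
    .filter(([field, matches]) => !matches(body[field]))
    .map(([field]) => field);
}

// Minimal contract: only the fields this consumer reads, matched by type or enum.
const minimalContract = {
  id: (v) => typeof v === "string",
  total: (v) => typeof v === "number",
  status: (v) => ["open", "closed"].includes(v),
};

// Over-specified "contract": a frozen snapshot of one full example response.
const snapshot = { id: "ord_42", total: 4250, status: "open", currency: "USD" };
function matchesSnapshot(body) {
  return JSON.stringify(body) === JSON.stringify(snapshot);
}

// A backward-compatible provider change: a new field plus a changed value the consumer never reads.
const compatible = { id: "ord_42", total: 4250, status: "open", currency: "EUR", notes: "" };
// A genuinely breaking change: total becomes a string.
const breaking = { ...compatible, total: "42.50" };

const results = {
  minimalOnCompatible: violations(minimalContract, compatible), // no violations
  snapshotOnCompatible: matchesSnapshot(compatible),            // fails: a false alarm
  minimalOnBreaking: violations(minimalContract, breaking),     // flags only total
};
console.log(results);
```

The snapshot goes red on a change no consumer cares about, while the minimal contract stays green until a field the consumer reads actually changes type.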
Output
For the integration, the skill produces:
- The consumer-defined contract(s): for each interaction, the request (method, path, body, key headers) and the response expectations as matchers (status code + only the fields/types this consumer reads), in the chosen tool's format.
- The provider-side verification setup: the CI step that fetches the contract(s) and replays them against the real provider, the provider-state fixtures each interaction needs, and the required-check wiring so a violation fails the provider build.
- The sharing + versioning approach: broker vs. committed artifact, how contracts are tagged by consumer version/environment, and the deploy gate (e.g. can-i-deploy) plus the rule for evolving through a breaking change.
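The decision behind a deploy gate can be sketched as a lookup over recorded verification results. The following is a rough model of that rule only, not the Pact Broker's API or the real can-i-deploy CLI: the data structures, the billing-worker consumer, and the version identifiers are assumptions for illustration.

```javascript
// Hypothetical in-memory stand-ins for what a broker tracks.
// Consumer versions currently deployed to each environment.
const deployed = {
  production: [
    { consumer: "checkout-web", version: "1.8.0" },
    { consumer: "billing-worker", version: "3.2.1" },
  ],
};

// Verification results recorded by the provider's CI runs.
const verifications = [
  { provider: "order-service", providerVersion: "abc123", consumer: "checkout-web", consumerVersion: "1.8.0", success: true },
  { provider: "order-service", providerVersion: "abc123", consumer: "billing-worker", consumerVersion: "3.2.1", success: false },
  { provider: "order-service", providerVersion: "def456", consumer: "checkout-web", consumerVersion: "1.8.0", success: true },
  { provider: "order-service", providerVersion: "def456", consumer: "billing-worker", consumerVersion: "3.2.1", success: true },
];

// A provider version may deploy only if it has a successful verification
// against every consumer version currently running in the target environment.
function canDeployProvider(provider, providerVersion, environment) {
  const blockers = (deployed[environment] || []).filter(
    (c) =>
      !verifications.some(
        (v) =>
          v.provider === provider &&
          v.providerVersion === providerVersion &&
          v.consumer === c.consumer &&
          v.consumerVersion === c.version &&
          v.success
      )
  );
  return { ok: blockers.length === 0, blockers };
}

console.log(canDeployProvider("order-service", "abc123", "production")); // blocked by billing-worker
console.log(canDeployProvider("order-service", "def456", "production")); // allowed
```

A missing verification counts the same as a failed one here, which matches the intent of the gate: no evidence of compatibility means no deploy.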
Example — a consumer contract for an order-service client, in pact-js:
```javascript
const { PactV3, MatchersV3: M } = require("@pact-foundation/pact");
// fetchOrder is the consumer's own HTTP client for order-service, defined elsewhere
// in the consumer codebase. A Jest-style runner (it/expect) is assumed.

const provider = new PactV3({ consumer: "checkout-web", provider: "order-service" });

// The consumer reads only id (string), total (number), and status (one of two values).
// It ignores every other field on the order, so the contract asserts only these.
it("fetches an order by id", async () => {
  provider
    .given("order 42 exists") // provider state: seeded in provider CI
    .uponReceiving("a request for an order")
    .withRequest({ method: "GET", path: "/orders/42" })
    .willRespondWith({
      status: 200,
      headers: { "Content-Type": "application/json" },
      body: {
        id: M.string("ord_42"), // type match, not the literal "ord_42"
        total: M.number(4250),  // type match, not the literal 4250
        status: M.regex(/^(open|closed)$/, "open"), // enum the consumer branches on
      },
    });

  // Run the real consumer code against Pact's mock provider.
  await provider.executeTest(async (mock) => {
    const order = await fetchOrder(`${mock.url}/orders/42`);
    expect(order.status).toBe("open");
  });
});
```

This test emits a pact file; the provider's pipeline then replays GET /orders/42 against the real order-service (with the state "order 42 exists" seeded) and fails the provider build if total stops being a number or status leaves the enum. Hand the request/response shapes to openapi-doc-writer to keep the published spec in sync, and use test-scaffolder to flesh out the provider-state fixtures.
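The provider-side replay described above can be sketched without any library, to show what the verification step actually checks. This is an in-process approximation under stated assumptions: a real Pact verifier replays requests over HTTP against the running service, and createProvider, stateHandlers, verifyInteraction, and the currency field are hypothetical names invented for the example.

```javascript
// Hypothetical in-process stand-in for the order-service handler.
// `serialize` lets the example simulate a provider change to the response shape.
function createProvider(serialize = (order) => order) {
  const orders = new Map();
  return {
    orders,
    handle({ method, path }) {
      const m = /^\/orders\/(\w+)$/.exec(path);
      if (method !== "GET" || !m) return { status: 404, body: {} };
      const order = orders.get(m[1]);
      return order ? { status: 200, body: serialize(order) } : { status: 404, body: {} };
    },
  };
}

// Provider states: each named state seeds exactly the data its interaction needs.
const stateHandlers = {
  "order 42 exists": (provider) =>
    provider.orders.set("42", { id: "ord_42", total: 4250, status: "open", currency: "USD" }),
};

// The consumer's recorded expectation, with matchers modelled as predicates.
const interaction = {
  state: "order 42 exists",
  request: { method: "GET", path: "/orders/42" },
  response: {
    status: 200,
    body: {
      id: (v) => typeof v === "string",
      total: (v) => typeof v === "number",
      status: (v) => /^(open|closed)$/.test(v),
    },
  },
};

// Seed the state, replay the request, and collect every mismatch.
function verifyInteraction(interaction, provider) {
  const setUp = stateHandlers[interaction.state];
  if (setUp) setUp(provider);
  const actual = provider.handle(interaction.request);
  const failures = [];
  if (actual.status !== interaction.response.status) failures.push("status code");
  for (const [field, matches] of Object.entries(interaction.response.body)) {
    if (!matches(actual.body[field])) failures.push(field);
  }
  return failures;
}

// Current provider: passes. Provider that renames status to state: fails on status.
const passing = verifyInteraction(interaction, createProvider());
const renamed = createProvider(({ status, ...rest }) => ({ ...rest, state: status }));
const failing = verifyInteraction(interaction, renamed);
console.log({ passing, failing });
```

In CI, a non-empty failure list would exit non-zero so the rename is rejected in the provider's own build, with no consumer process running.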
Related
- Test Scaffolder: Scaffold a test file with sensible cases for a given module or function. Use when adding tests to untested code and you want a fast, structured starting point.
- OpenAPI Doc Writer: Produce and maintain OpenAPI documentation for an HTTP API. Use when documenting endpoints, request/response schemas, or generating API reference docs.
- Coverage Gap Finder: Run the project's coverage tool and identify the highest-value untested paths (error branches, edge cases, and critical modules), then propose specific test cases for each gap. Use when you have a coverage report but don't know where new tests will pay off most.
- Idempotency Designer: Make unsafe, retryable API operations idempotent so a client retry or a network hiccup can't double-charge, double-create, or double-send: design a client-supplied idempotency key, an atomic store-and-check (unique constraint or conditional write), in-flight conflict handling, and a retention policy. Use when a POST/mutation can be retried (payments, order creation, sends, webhooks), or when duplicate side effects have already shown up in production.
- Agent Trajectory Evaluator: Evaluate a multi-step AI agent's whole run (tool calls, intermediate steps, and final result), not just final-answer correctness, so you can pinpoint WHERE it went wrong. Use when building or debugging a tool-using or multi-step agent, when final-answer-only evals can't explain failures, or when a prompt/model change quietly makes the agent less efficient or more error-prone even though the answer still looks right.
- Strangler Fig Migrator: Plan the incremental replacement of a legacy module or service using the strangler-fig pattern: grow new code around the old behind an interception seam until the old is dead, instead of a big-bang rewrite. Use when a legacy system is too risky to rewrite at once, or when migrating off a deprecated framework/dependency gradually while staying shippable and rollback-able at every step.
- Threat Model Builder: Build a practical threat model for a feature or system using STRIDE: diagram the data flow, mark trust boundaries, enumerate concrete threats where data crosses them, and prioritize by likelihood × impact so security is reasoned about before shipping instead of bolted on after. Use when designing a feature that touches auth, money, or sensitive data, running a security design review, or hardening before a launch.
- Integration Test Designer: Design integration tests that exercise components against REAL collaborators (actual database, queue, HTTP boundary) at a deliberately chosen seam, instead of a unit suite that mocks everything or a slow flaky full E2E. Use when bugs slip past green unit tests, when wiring or contracts between layers break in production, or when a mocked DB test passes but the real query/migration/serialization fails.
- Mutation Test Runner: Measure whether a test suite actually catches bugs by running mutation testing: introduce small faults into the code and check which ones a test kills versus which slip through silently. Use when line coverage is high but bugs still ship, when you suspect tests assert weakly, or to find the exact assertions a suite is missing.