Pagination Designer
Design correct, scalable pagination (plus the filtering and sorting that ride with it) for a list endpoint — pick cursor (keyset) vs offset and justify it, define an opaque cursor with a unique tiebreaker so no row is skipped or repeated, return a consistent envelope, bound page size, and name the indexes the sort actually needs. Use when adding a list endpoint, when OFFSET pagination crawls on a large table, or when clients see duplicate or missing rows while paging.
npx agentscamp add skills/pagination-designerInstall to ~/.claude/skills/pagination-designer/SKILL.md
Pagination breaks two quiet ways: OFFSET goes O(n), scanning then discarding every prior row on deep pages; a non-unique sort key without a tiebreaker silently drops or duplicates rows across pages. This skill picks cursor vs offset, designs an opaque cursor on (sort key + tiebreaker), bounds page size, names the required indexes, and defines how filters and sorts compose.
Pagination looks trivial until the table grows or the data moves under the reader. OFFSET 100000 doesn't skip to row 100,000 — the database scans and throws away the first 100,000 matching rows on every request, so latency climbs linearly with depth. And sorting by a non-unique column (created_at, name, score) without a tiebreaker gives a partial order: rows that tie can reorder between requests, so paging skips some and shows others twice. This skill makes the pagination scheme an explicit decision — keyset vs offset, the cursor encoding, the tiebreaker, the page-size bounds, and the indexes — and defines how filtering and sorting compose with it.
When to use this skill
- You're adding a list/collection endpoint and need to decide how clients page through it.
- An existing
OFFSET/LIMITendpoint is fast on page 1 and slow on page 500, or it times out on deep pages. - Clients report seeing the same row twice or missing rows entirely while scrolling — the classic symptom of an unstable sort under concurrent inserts/deletes.
- The list is large, append-heavy, or actively changing (feeds, logs, events, search results) and you need stable paging that doesn't drift as rows are added.
Instructions
-
Choose cursor (keyset) vs offset from the dataset, and justify it.
- Cursor / keyset — the default for large or actively-changing data. Instead of
OFFSET, the next page seeks on the sort key:WHERE (created_at, id) < (:last_created_at, :last_id) ORDER BY created_at DESC, id DESC LIMIT :n. It's stable under inserts/deletes (each page is anchored to a real row, not a positional count) and stays fast at any depth because it uses an index range scan instead of scanning prior rows. Cost: no random page jumps, no total page count. - Offset / limit — acceptable only for small, stable, human-paginated lists where users click numbered pages (an admin table of a few thousand rows). It allows arbitrary jumps and easy "page 7 of 20" UIs. Never use it for infinite scroll, large tables, or feeds. State which you chose and the property (depth performance + stability vs random access) that drove it.
- Cursor / keyset — the default for large or actively-changing data. Instead of
-
Always include a unique tiebreaker so the sort order is total. A cursor seeking on a non-unique column alone (
created_at) can't disambiguate ties: two rows with the same timestamp have no defined relative order, so one can land on both sides of a page boundary. Encode the user-facing sort key plus a unique, monotonic tiebreaker (the primary key) — the cursor compares on the tuple(created_at, id). This makes the order total: every row has exactly one position, so no row is skipped or repeated. Even when the apparent sort is "by id" alone, that already happens to be unique — but any user-chosen sort needs the explicit, idtiebreaker appended. -
Make the cursor opaque. Encode the tuple
(sort_key_value, tiebreaker_value)(and, if filters/sort are part of the page identity, a version or the sort direction) into a single base64url token —next_cursor: "eyJjcmVhdGVkX2F0IjoiMjAyNi0wNi0xN1QwOTozMDowMFoiLCJpZCI6IjQ4ODEyIn0". Opaque means clients treat it as a blob and pass it back verbatim; you keep freedom to change the internal encoding without breaking them. Do not expose raw(timestamp, id)as query params — clients will hand-craft them, couple to your schema, and break on the next change. -
Return one consistent envelope. Every list endpoint returns the same shape:
{ "data": [ ... ], "next_cursor": "…", "has_more": true }next_cursorisnullwhen there are no more rows. Derivehas_morereliably by fetchingLIMIT n + 1: if you getn + 1rows, there's another page — drop the extra row and setnext_cursorfrom the last kept row. This avoids a separateCOUNTand is correct even when the last page is exactly full. Do not return a total count for keyset pagination; computing it scans the whole filtered set and defeats the point. -
Bound page size with a sane default and a hard max. Read the page size from
limit(orpage_size), clamp it: default 20–50, hard max 100–200 — never unbounded. An unboundedlimitlets one client request a million rows and OOM the server or exhaust the DB. Clamp silently (returnmin(requested, max)) and document the cap. -
Name the indexes the sort actually needs — this is non-negotiable for keyset. The
ORDER BY (sort_key, tiebreaker)and theWHERE (sort_key, tiebreaker) < (...)seek are only fast if a composite index on those exact columns in that exact order and direction exists. Sortingcreated_at DESC, id DESCneeds an index supporting that; a plain index oncreated_atalone forces a sort and undoes the win. If filters narrow the set, the index should lead with the equality-filter columns, then the sort columns:(tenant_id, created_at, id)for a query filtered by tenant and sorted by time. Verify the index exists or flag it as required. -
Define how filtering and sorting compose with the cursor. The cursor is only valid for the filter and sort it was issued under — a cursor minted for
?status=active&sort=created_atis meaningless if the next request changesstatusorsort. Specify the contract: which fields are filterable, which are sortable (whitelist them — never interpolate a client-supplied column name intoORDER BY), and that changing any filter or sort param invalidates the cursor and resets to the first page. For multi-column sorts, the tiebreaker is appended after all user sort columns, and the seek predicate must compare the full tuple (row-value comparison(a, b, c) < (:a, :b, :c), nota < :a OR (a = :a AND b < :b) OR …unless your engine lacks tuple comparison).
WARNING
Deep OFFSET is O(n), not O(1). OFFSET 100000 LIMIT 20 makes the database read and discard 100,000 matching rows before returning 20 — every request, getting worse as users page deeper, holding locks and burning IO the whole time. Page 1 being fast tells you nothing about page 5,000. If the table can grow large or users can reach deep pages, use keyset.
WARNING
A non-unique sort key without a tiebreaker silently corrupts paging. With ORDER BY created_at and several rows sharing a timestamp, the engine may return those tied rows in a different order on the next request — so a row sitting on the page boundary gets skipped on one page and the previous boundary row reappears on the next. There is no error, just missing and duplicated data. Always append a unique tiebreaker (, id) to every sort.
NOTE
Offset and keyset can coexist behind one envelope: serve numbered offset pages for a small admin UI and keyset for the public feed, both returning { data, next_cursor, has_more } (offset endpoints simply also accept page/leave next_cursor null). Pick per endpoint from its access pattern, not one rule for the whole API.
Output
A pagination spec stating: the chosen scheme (cursor vs offset) + rationale; the response envelope (data / next_cursor / has_more, with the null-when-done and LIMIT n+1 rules); the cursor encoding — the exact tuple (sort key, unique tiebreaker) and that it's base64url-opaque; the page-size default and hard max; the required indexes (exact columns, order, and direction, leading with equality-filter columns); and the filter/sort contract — the filterable/sortable field whitelist, the tuple seek predicate, and that changing any filter or sort param invalidates the cursor.
Related
- Rate Limiter DesignerDesign and implement API rate limiting that actually holds under load — pick the algorithm (token bucket vs sliding-window-counter vs fixed window) and justify it, choose the limiting key and per-tier limits, use cross-instance atomic storage, and return standard 429 signals. Use when protecting an API from abuse or scrapers, enforcing per-tier quotas, or replacing an in-memory limiter that breaks behind multiple replicas.
- Postgres Index StrategistRecommend the right Postgres index for a query or workload — choosing B-Tree vs. GIN vs. BRIN vs. partial/covering/expression, checking for redundant or unused indexes, and verifying the choice against the query plan. Use when a query needs an index, when deciding an index type for jsonb/array/full-text/time-series data, or when auditing an over-indexed table.
- Structured Logging DesignerDesign a structured (JSON) logging strategy with a stable field schema, correlation-ID propagation, and a disciplined level policy — then migrate ad-hoc string logs toward it. Use when logs are unsearchable plain text, when debugging a request across services means grepping multiple log streams by hand, or when standing up logging for a new service.
- GraphQL Schema DesignerDesign a clean, evolvable GraphQL schema (SDL) that won't paint you into a corner — model the graph around domain types and their relationships rather than as RPC-over-GraphQL, set nullability deliberately, standardize lists with Relay connections, plan DataLoader batching for per-parent fields, and evolve by adding + @deprecated instead of versioning. Use when designing a new GraphQL API, reviewing an SDL, or migrating REST endpoints to a graph.