How We Paginate 22,000 Health Plans: Cursor, Offset, and When to Use Each
Pagination is one of those decisions you make early in an API design and then live with forever. We support both cursor-based and offset-based pagination on the health plan search endpoint, and the dual-mode approach was the right call. Here is the actual reasoning, the implementation details, and an honest account of the tradeoffs.
The Data We’re Paginating
The Opelyx plan search endpoint queries about 22,000 health insurance plans across 51 jurisdictions. Users filter by state, county, metal tier (Bronze/Silver/Gold/Platinum/Catastrophic), insurer, plan type (HMO/PPO/EPO), and premium range. The result set after filtering can range from a handful of plans (small county, specific metal tier) to several thousand (large state, all tiers, all insurers).
Plans join to a rates table with 1.4 million rows. A full search query involves an INNER JOIN, WHERE clause evaluation, sorting, and then pagination. At the default page size of 25 plans, payloads are manageable, but the pagination mechanism determines what happens when clients traverse the full result set.
Two Modes, One Endpoint
The API supports both pagination modes on the same endpoint:
- Offset mode (page parameter): Best for interactive browsing with random page access and backward navigation.
- Cursor mode (cursor parameter): Best for sequential traversal by data pipelines, aggregators, and integrations.
GET /v1/health/plans?zip=33125&age=30&page=3&per_page=25
GET /v1/health/plans?zip=33125&age=30&cursor=eyJ2Ijo0NDUsImlkIjoiNDcxNjNGTDA4MTAwMDItMDEifQ&per_page=25
You pick the mode by which parameter you send. Sending both cursor and page (where page > 1) returns a 400 error, because the two modes are mutually exclusive.
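As a sketch of that rule (selectMode and PaginationMode are illustrative names, not the production handler), the mode selection might look like this, including the page=1 tolerance covered later in the error-handling section:

```typescript
// Illustrative sketch of the mode-selection rule; not the production code.
type PaginationMode =
  | { mode: "cursor"; cursor: string }
  | { mode: "offset"; page: number }
  | { mode: "error"; detail: string };

function selectMode(params: URLSearchParams): PaginationMode {
  const cursor = params.get("cursor");
  // Treat a missing or unparseable page as the default, page 1.
  const page = Number(params.get("page") ?? "1") || 1;

  // cursor + page=1 is tolerated, since many clients send default values;
  // cursor + page>1 is ambiguous and rejected with a 400.
  if (cursor !== null && page > 1) {
    return { mode: "error", detail: "Cannot use both cursor and page parameters." };
  }
  if (cursor !== null) return { mode: "cursor", cursor };
  return { mode: "offset", page };
}
```

The point of returning a discriminated union is that the downstream query builder can switch on mode without re-inspecting raw parameters.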
How OFFSET Actually Performs on D1
OFFSET-based pagination is how most developers learn pagination. SELECT * FROM plans LIMIT 25 OFFSET 200 gives you the 9th page. The concern with OFFSET is that the database scans N + LIMIT rows, discards the first N, and returns the rest.
For our 22,000-plan dataset on D1 (SQLite), here are the real numbers:
- Offset 0: under 10ms
- Offset 750 (page 31 at 25 per page): 60-80ms
- Offset 22,000 (worst case): still under 200ms
These are well within acceptable limits for an API with a 300ms p99 target. D1 responses are typically cached at the edge anyway, so deep-page latency rarely hits end users directly. The principled concern with OFFSET is not the absolute latency at our current scale, but the linear growth characteristic: as the dataset grows, deep-page latency grows proportionally. Cursor pagination eliminates that growth curve entirely.
There is a second, more subtle problem: results can shift during concurrent writes. If a plan gets inserted between page 1 and page 2 of a session, the offset shifts and you either see a duplicate row or skip one. For a dataset that updates via annual batch imports, this matters less in practice. But building a system that is correct by design is better than relying on “our data doesn’t change often.”
Cursor Pagination: How It Works
Instead of tracking position by row number, cursor pagination tracks position by the last item seen. The API returns a next_cursor value alongside each page of results. To get the next page, pass the cursor back as a query parameter. The database query becomes a WHERE clause that seeks past the cursor position using an index, then reads forward.
SQLite serves this seek from its B-tree index: an O(log n) descent to the cursor position, then a short forward read. The cost stays effectively flat regardless of how deep in the result set you are. No scanning, no discarding.
Compound Cursors for Multi-Column Sort
Since the API supports sorting by premium, deductible, max out-of-pocket, or plan name, the cursor must encode the sort field value alongside the plan ID tiebreaker. The cursor payload is:
interface CursorPayload {
/** Sort field value at the cursor position */
v: number | string;
/** Plan ID at the cursor position (tiebreaker) */
id: string;
}
The v field is the value of whichever column the client is sorting by. For sort_by=premium, v is the premium amount. For sort_by=name, v is the plan marketing name. The id field breaks ties when multiple plans share the same sort value.
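For a concrete example, the sample cursor from the request earlier decodes to exactly this shape (the same base64url transform the decoder performs, minus validation):

```typescript
// Decode the sample cursor from the earlier request with plain web APIs.
const sample = "eyJ2Ijo0NDUsImlkIjoiNDcxNjNGTDA4MTAwMDItMDEifQ";
const json = atob(sample.replace(/-/g, "+").replace(/_/g, "/"));
const payload = JSON.parse(json);
// payload is { v: 445, id: "47163FL0810002-01" }:
// a sort value of 445 (a premium sort) plus the plan ID tiebreaker.
```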
The encoding uses base64url (URL-safe, no padding) so the cursor can appear directly in query strings:
function encodeCursor(sortValue: number | string, planId: string): string {
  const payload: CursorPayload = { v: sortValue, id: planId };
  const json = JSON.stringify(payload);
  // Encode the UTF-8 bytes rather than raw UTF-16 code units, so a
  // non-ASCII plan marketing name does not make btoa throw.
  const bytes = new TextEncoder().encode(json);
  return btoa(String.fromCharCode(...bytes))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}
function decodeCursor(
cursor: string,
expectedValueType?: "number" | "string",
): CursorPayload | null {
try {
    const normalized = cursor.replace(/-/g, "+").replace(/_/g, "/");
    // Decode base64 to bytes, then interpret the bytes as UTF-8 JSON.
    const bytes = Uint8Array.from(atob(normalized), (c) => c.charCodeAt(0));
    const json = new TextDecoder().decode(bytes);
const parsed: unknown = JSON.parse(json);
// Validate shape and types
if (typeof parsed !== "object" || parsed === null) return null;
if (!("v" in parsed) || !("id" in parsed)) return null;
const obj = parsed as Record<string, unknown>;
const v = obj["v"];
const id = obj["id"];
if (
(typeof v !== "number" && typeof v !== "string") ||
typeof id !== "string"
) {
return null;
}
// Validate cursor value type matches the expected sort column
if (expectedValueType === "number" && typeof v !== "number") return null;
if (expectedValueType === "string" && typeof v !== "string") return null;
return { v, id };
} catch {
return null;
}
}
Note the use of btoa/atob rather than Buffer.from — Cloudflare Workers use web-standard APIs, not Node.js built-ins.
The expectedValueType parameter catches a class of bug where a cursor from a sort_by=premium query (numeric v) gets reused with a sort_by=name query (string v), or vice versa. Mismatched types would produce nonsensical keyset comparisons.
The SQL Keyset Condition
For ascending sort, the WHERE clause that implements the cursor seek is:
WHERE (sort_column > :cursor_value)
OR (sort_column = :cursor_value AND plan_id > :cursor_plan_id)
For descending sort, the comparison flips:
WHERE (sort_column < :cursor_value)
OR (sort_column = :cursor_value AND plan_id > :cursor_plan_id)
The plan_id tiebreaker always uses > (ascending) because plan IDs are the deterministic secondary sort, regardless of the primary sort direction. This ensures stable, repeatable ordering even when many plans share the same premium or deductible value.
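Assembled in code, the two directions differ only in the primary comparison. A hypothetical helper (buildKeysetClause is not the production name), assuming sortColumn has already been validated against an allowlist:

```typescript
// Hypothetical helper assembling the keyset condition above. sortColumn is
// assumed to be allowlist-validated before reaching this point.
function buildKeysetClause(
  sortColumn: string,
  direction: "asc" | "desc",
): { where: string; orderBy: string } {
  const cmp = direction === "asc" ? ">" : "<";
  return {
    // The plan_id tiebreaker is always ascending, per the stable-ordering rule.
    where: `(${sortColumn} ${cmp} ?1) OR (${sortColumn} = ?1 AND plan_id > ?2)`,
    orderBy: `${sortColumn} ${direction.toUpperCase()}, plan_id ASC`,
  };
}
```

The ORDER BY must match the WHERE condition exactly (primary direction plus ascending plan_id), or the seek and the sort disagree and pages overlap.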
Handling Filters With a Cursor
The cursor encodes position, not the full query context. The client is responsible for sending the same filter parameters on cursor-based page requests. The API treats each request independently: apply filters, then apply cursor constraint, then limit.
There is a trust contract here: cursors are valid only for the same query parameters that produced them. If a client takes a cursor from a Silver-filtered query and uses it on an unfiltered query, the results will be unpredictable (though not dangerous — the worst case is missing or duplicated plans). We chose not to validate cursor-filter consistency because encoding the full query context in the cursor would make cursors large and fragile. The contract is documented in the API reference.
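A sketch of that request-handling order, with illustrative table and column names (plans, monthly_premium) and the sort hardcoded to ascending premium to keep the example short:

```typescript
// Sketch: filters first, then the cursor seek as one more WHERE condition,
// then LIMIT. Names are illustrative, not the production schema.
interface SearchOpts {
  state?: string;
  metalTier?: string;
  cursor?: { v: number; id: string };
  perPage: number;
}

function buildSearchSql(opts: SearchOpts): { sql: string; bindings: (number | string)[] } {
  const where: string[] = [];
  const bindings: (number | string)[] = [];

  if (opts.state) { where.push("state = ?"); bindings.push(opts.state); }
  if (opts.metalTier) { where.push("metal_tier = ?"); bindings.push(opts.metalTier); }

  // The cursor constraint is applied after the filters, never instead of them.
  if (opts.cursor) {
    where.push("(monthly_premium > ? OR (monthly_premium = ? AND plan_id > ?))");
    bindings.push(opts.cursor.v, opts.cursor.v, opts.cursor.id);
  }

  const sql =
    "SELECT * FROM plans" +
    (where.length > 0 ? ` WHERE ${where.join(" AND ")}` : "") +
    " ORDER BY monthly_premium ASC, plan_id ASC LIMIT ?";
  bindings.push(opts.perPage);
  return { sql, bindings };
}
```

Because the cursor is just an extra condition ANDed onto the filter set, a cursor paired with different filters still produces a valid query, which is exactly why mismatches are unpredictable rather than dangerous.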
Error Handling
Malformed cursors return an RFC 9457 Problem Details response:
{
"type": "https://api.opelyx.com/problems/validation-error",
"title": "Validation Error",
"status": 400,
"detail": "Invalid cursor value.",
"errors": [
{
"field": "cursor",
"message": "Cursor is malformed or corrupted. Omit the cursor parameter to restart from page 1."
}
]
}
Sending both cursor and page (where page > 1) also returns 400:
{
"type": "https://api.opelyx.com/problems/validation-error",
"title": "Validation Error",
"status": 400,
"detail": "Cannot use both cursor and page parameters.",
"errors": [
{
"field": "cursor",
"message": "Use cursor for keyset pagination OR page for offset pagination, not both."
}
]
}
The special case of cursor + page=1 is allowed, since Zod defaults page to 1 and many HTTP clients include default values in query strings.
The total Field and COUNT(*)
On the first request (no cursor), the API runs a COUNT(*) query with the same filters and returns the result as total. This lets clients show “247 plans found” and compute total pages for offset navigation.
On cursor-paginated follow-up requests, total is null. The client already has the total from page 1, and repeating the count is redundant work. We make this explicit with null rather than omitting the field or returning 0, because null unambiguously means “not computed” while 0 could be confused with “zero results.”
For 22,000 rows on D1/SQLite, the COUNT query takes about 1ms. The cost is negligible. The reason to skip it on cursor pages is not performance — it is semantic correctness.
{
"_embedded": { "plans": [...] },
"total": null,
"page": null,
"per_page": 25,
"next_cursor": "eyJ2Ijo0NDUsImlkIjoiNDcxNjNGTDA4MTAwMDItMDEifQ"
}
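The envelope above can be sketched as a small helper (buildMeta is hypothetical). The detail worth noting is ?? rather than ||, so a legitimate total of 0 survives and null keeps its distinct "not computed" meaning:

```typescript
// Hypothetical helper for the pagination envelope shown above.
function buildMeta(opts: {
  total?: number;
  page?: number;
  perPage: number;
  nextCursor?: string;
}) {
  return {
    total: opts.total ?? null,            // null = not computed (cursor pages)
    page: opts.page ?? null,              // null on cursor-mode responses
    per_page: opts.perPage,
    next_cursor: opts.nextCursor ?? null, // null on the final page
  };
}
```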
HATEOAS Links and the Link Header
Every response includes HAL-style _links for discoverability:
{
"_links": {
"self": {
"href": "https://api.opelyx.com/v1/health/plans?zip=33125&age=30&per_page=25",
"type": "application/json"
},
"first": {
"href": "https://api.opelyx.com/v1/health/plans?zip=33125&age=30&per_page=25",
"type": "application/json"
},
"next": {
"href": "https://api.opelyx.com/v1/health/plans?zip=33125&age=30&per_page=25&cursor=eyJ2...",
"type": "application/json"
},
"last": {
"href": "https://api.opelyx.com/v1/health/plans?zip=33125&age=30&per_page=25&page=10",
"type": "application/json"
},
"plan_detail": {
"href": "https://api.opelyx.com/v1/health/plans/{plan_id}?year=2026",
"templated": true,
"type": "application/json"
}
}
}
The last link is only present on offset-paginated requests where total is known. The prev link appears when page > 1. The plan_detail link is an RFC 6570 URI Template for discovering individual plan endpoints.
The response also includes an RFC 8288 Link header with standard IANA relations (self, next, prev, first, last) for clients that prefer headers over response body parsing.
next_cursor in the response body is a convenience extension. Standards-oriented consumers should prefer _links.next.href, which embeds the cursor in a ready-to-fetch URL.
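A standards-oriented consumer can therefore traverse the entire result set by following _links.next.href until it disappears. A minimal sketch (iterateAllPlans is illustrative; auth, error handling, and retries omitted):

```typescript
// Walk the _links.next chain until the last page, yielding plans one by one.
interface HalPage {
  _embedded: { plans: unknown[] };
  _links: { next?: { href: string } };
}

async function* iterateAllPlans(firstUrl: string): AsyncGenerator<unknown> {
  let url: string | undefined = firstUrl;
  while (url) {
    const page: HalPage = await (await fetch(url)).json();
    yield* page._embedded.plans;
    url = page._links.next?.href; // absent on the final page
  }
}
```

The client never constructs a cursor itself; it only replays URLs the server handed back, which is the main robustness argument for link-driven pagination.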
Backward Navigation
Consumers who need backward navigation should use offset pagination (page parameter). Cursor pagination is forward-only by design. Rather than adding a prev_cursor (which would require maintaining a reverse cursor chain), we keep the two modes clean: cursor for forward traversal, offset for random access including backward movement.
This means a client building a “Previous / Next” UI is better served by offset pagination. A data pipeline iterating through all plans to build an analytics dataset is better served by cursor pagination.
When to Use Each Mode
Use offset pagination when:
- Building interactive UIs with page number navigation
- Users need to jump to arbitrary pages (“show me page 47”)
- Backward navigation is required
- Result sets are small enough that OFFSET performance is not a concern (for our 22K-plan dataset, this is always true at current scale)
Use cursor pagination when:
- Building data pipelines that iterate through the full result set
- Correctness under concurrent writes matters more than random access
- Future-proofing against dataset growth (cursor performance stays constant regardless of depth)
- Integrating with systems that follow next links automatically
Both modes return the same response shape, the same filters, and the same sorting. The only difference is how you request the next page.