Your integration works. Tests pass. The agent ships. Three weeks later a user pings you: "the price says $0.00, did something break?" You check the dashboard, everything is green. You hit the upstream API directly, the data is there. You read the code, the code is fine. Then you notice it: the field used to be last_price, and now it is lastPrice. No deprecation. No changelog entry. The free API just shipped a "small refactor" on a Tuesday and nothing on your end told you.

This is schema drift. It is the single most reliable way that integrations with free public APIs fail in production, and it is especially nasty for AI agents because agents do not bounce off bad data the way humans do. A human looks at "$0.00 BTC" and laughs. An agent reads "$0.00 BTC" and confidently writes "Bitcoin has crashed to zero" into a research summary. Garbage in, plausible-sounding garbage out, no error surfaced anywhere.

I run the API layer for TerminalFeed. We aggregate 30+ free upstream APIs through a single Cloudflare Worker. We have absorbed every flavor of schema drift you can think of (and a few you can't). What follows is the playbook we converged on after enough 3am alerts to know what works.

Three flavors of drift, ranked by sneakiness

It helps to name the failure modes. Schema drift is not one thing.

1. Shape drift. A field gets renamed, moved, or wrapped. price becomes data.price. users becomes { items: users, total: 50 }. The old key returns undefined. This is the loudest of the three because anything that calls a method on the missing field crashes immediately. You want this kind. It is honest.

2. Type drift. A field keeps its name but changes its type. price goes from a number to a string ("$77,401.23"). timestamp goes from seconds to milliseconds. active goes from a boolean to "true" / "false" strings. Your code does not crash, it just produces nonsense. price + quantity silently concatenates, price * quantity evaluates to NaN. new Date(timestamp) returns a date in 1970 or in the year 4516. The data flows through your system looking valid.

3. Semantic drift. The field, the type, and the value all look right. The meaning has changed. volume used to be 24-hour trailing volume, now it is rolling 1-hour volume. change used to be a percentage, now it is a basis point delta. status values "open" and "closed" got joined by "pre_market" and "post_market" and your switch statement falls through to the default branch. Nothing alerts. Your dashboard just lies.

Shape drift you catch with null checks. Type drift you catch with runtime type checks, because compile-time type assertions are erased long before the drifted payload arrives. Semantic drift you do not catch without a human on the other end of an alert, which is the painful truth nobody wants to write down.
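
Type drift in miniature. This is a toy sketch, not production code; the drifted payload is faked inline with a cast:

// Before drift the upstream sends a number; after drift, same key, now a string.
const before = { price: 77401.23, qty: 2 };
const after = { price: '$77,401.23' as unknown as number, qty: 2 };

function total(t: { price: number; qty: number }): number {
  return t.price * t.qty;
}

console.log(total(before));     // 154802.46
console.log(total(after));      // NaN -- no throw, no log, just nonsense
console.log(after.price + 0.5); // "$77,401.230.5" -- addition concatenates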

Why agents make all of this worse

If a human is consuming the API output, they have priors. They know BTC is not actually at $0.00, they know the S&P 500 is not at 4 points, they know "open" was the old enum value. They will catch a lot of drift just by squinting at the screen.

An agent has no priors that you did not give it. An agent will happily summarize "BTC is at $0.00, suggesting a complete loss of confidence in the asset" because the prompt said "summarize the data." Worse, agents fan out: one bad call gets cached, gets pasted into a system prompt, gets used as context for a tool call, and the bad value lives for the rest of the session. By the time a user notices, you have forty downstream steps to retrace.

So the bar for catching drift in an agent integration is higher than for a dashboard. You want to detect at the boundary, not at the user.

Pattern 1: A typed envelope, not a typed body

The most useful thing we did was wrap every upstream call in a small envelope before any consumer touches the data. Not a parser, not a validator yet, just an envelope:

type Envelope<T> = {
  ok: boolean;
  data: T | null;
  source: string;
  fetched_at: string;       // ISO timestamp
  upstream_status: number;  // HTTP status from upstream
  shape_hash: string;       // sha256 of sorted top-level keys
  age_ms: number;           // 0 if fresh, >0 if from cache
};

The shape_hash is the unsung hero. It is a hash of the top-level keys (and nothing else). The first time we call an upstream we record the hash. Every call after that, we recompute it. If it changes, we log a structured event and ship an alert. The hash is cheap, deterministic, and notices renames before any field-level logic runs.

async function shapeHash(obj: unknown): Promise<string> {
  if (obj === null || typeof obj !== 'object') return 'primitive';
  // Hash only the sorted top-level keys, never the values.
  const keys = Object.keys(obj).sort().join(',');
  const digest = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(keys));
  return [...new Uint8Array(digest)]
    .slice(0, 8)
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('');
}

You do not need all 32 bytes of the hash. The first 8 bytes are plenty for change detection on a known schema.
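
Putting the pieces together, the wrapper looks roughly like this. A minimal sketch, assuming an in-memory Map stands in for our KV store and console.warn stands in for the real alert hook:

// Last-seen shape hash per source (KV in production, a Map for the sketch).
const lastHashes = new Map<string, string>();

async function fetchEnveloped<T>(source: string, url: string): Promise<Envelope<T>> {
  const res = await fetch(url);
  const data = res.ok ? ((await res.json()) as T) : null;
  const hash = await shapeHash(data);

  const previous = lastHashes.get(source);
  if (previous !== undefined && previous !== hash) {
    console.warn(`shape drift on ${source}: ${previous} -> ${hash}`);
  }
  lastHashes.set(source, hash);

  return {
    ok: res.ok && data !== null,
    data,
    source,
    fetched_at: new Date().toISOString(),
    upstream_status: res.status,
    shape_hash: hash,
    age_ms: 0, // fresh fetch; cache reads would stamp the real age here
  };
}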

Pattern 2: Validate at the boundary, never in the middle

The temptation is to add null checks deeper in the stack: "if price is missing, render dash." Resist it. Every "if missing then default" you sprinkle into your render path is a place where bad data gets quietly accepted and propagated. The middle of your code should assume the data is valid because something at the boundary already proved it.

We use zod for runtime validation, but anything similar works (valibot, ajv, custom). The point is to define the schema once, near the fetch:

import { z } from 'zod';

const BTCPriceSchema = z.object({
  symbol: z.literal('BTCUSDT'),
  price: z.coerce.number().positive(),
  ts: z.coerce.number().int().positive(),
});

async function fetchBTC() {
  const res = await fetch('https://api.binance.com/api/v3/ticker/price?symbol=BTCUSDT');
  if (!res.ok) return staleCache.get('btc') ?? null; // upstream down: serve last known good

  const raw = await res.json();
  const parsed = BTCPriceSchema.safeParse(raw);

  if (!parsed.success) {
    logSchemaDrift('binance.btc', raw, parsed.error); // structured drift event
    return staleCache.get('btc') ?? null;             // fall back to last known good
  }
  return parsed.data;
}

z.coerce.number() handles the type-drift case where a number gets stringified. The .positive() guard catches the silent-zero case. If validation fails we log a structured event and return whatever was last known good. The caller sees either valid data or null. Nothing in between. No "$0.00 BTC" ever leaves the boundary.

Pattern 3: Snapshot the contract, fail loud on drift

Once an upstream is in your system, write a snapshot test. The point is not to test logic, the point is to flag the day the upstream changes:

// tests/contracts/binance.test.ts
import { expect, test } from 'vitest'; // jest's globals work the same way

test('binance BTC ticker shape is unchanged', async () => {
  const res = await fetch('https://api.binance.com/api/v3/ticker/price?symbol=BTCUSDT');
  const raw = await res.json();
  const keys = Object.keys(raw).sort();
  expect(keys).toEqual(['price', 'symbol']);
  expect(typeof raw.price).toBe('string');
  expect(raw.symbol).toBe('BTCUSDT');
});

Run these on a schedule, not just on commit. We run ours hourly through a GitHub Action. When Binance ships a refactor, we know within an hour, not when a user notices a broken panel three days later. The test is allowed to fail loudly. It is a smoke detector, not a unit test.
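
If your aggregator is already a Cloudflare Worker, a cron trigger gets you the same hourly check without a separate CI job. A sketch using the standard Workers scheduled handler; notify() is a hypothetical alert hook, not a real API:

// Types come from @cloudflare/workers-types; the schedule itself lives in
// wrangler.toml (e.g. crons = ["0 * * * *"]).
declare function notify(msg: string): Promise<void>; // hypothetical alert hook

export default {
  async scheduled(_controller: ScheduledController, _env: unknown, _ctx: ExecutionContext) {
    const res = await fetch('https://api.binance.com/api/v3/ticker/price?symbol=BTCUSDT');
    const raw = (await res.json()) as Record<string, unknown>;
    const keys = Object.keys(raw).sort().join(',');
    // Same assertions as the snapshot test, as a plain runtime check.
    if (keys !== 'price,symbol' || typeof raw.price !== 'string') {
      await notify(`binance ticker contract drifted: keys=[${keys}]`);
    }
  },
};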

Pattern 4: Log the freshness, not just the value

Every TerminalFeed premium response carries a _meta block with per-source latency and an age_ms stamp. We added it for AI agents, who care intensely about whether the data is real-time, ten minutes old, or stale cache from a failed upstream. The pattern generalizes:

{
  "data": { "btc_price": 77401.23, "fear_greed": 64 },
  "_meta": {
    "btc_price": { "source": "binance", "age_ms": 850, "fresh": true },
    "fear_greed": { "source": "alternative.me", "age_ms": 142000, "fresh": true },
    "endpoint": "/api/briefing"
  }
}

This is critical for agents because it lets them reason about which field in a composite response is stale. A composite endpoint that pulls from five upstreams will eventually have one of those upstreams down. Without per-field freshness, the agent has to assume the entire payload is suspect or assume none of it is. Both are wrong.
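
The harness can enforce this mechanically before anything reaches the model. A hypothetical sketch (the helper name and the five-minute ceiling are mine, not part of the TerminalFeed response contract):

type FieldMeta = { source: string; age_ms: number; fresh: boolean };

// Keep only the fields whose _meta entry says they are fresh enough to trust.
function freshFields(
  data: Record<string, unknown>,
  meta: Record<string, FieldMeta>,
  maxAgeMs = 5 * 60_000,
): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(data).filter(([key]) => {
      const m = meta[key];
      return m !== undefined && m.fresh && m.age_ms <= maxAgeMs;
    }),
  );
}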

Pattern 5: Cache shapes alongside values

This is the one that surprised me most when we shipped it. We cache the parsed value (obvious), and we also cache the last-known-good shape of every upstream as a separate KV entry. When validation fails, we do not just fall back to the last known value, we compare the new shape to the last known shape and emit a diff:

added_keys:    ['preMarketPrice']
removed_keys:  ['price']
changed_types: { 'volume': 'number -> string' }

That diff goes straight into the alert. When I get the page, I do not need to dig through a JSON blob to figure out what broke. I see "binance renamed price to lastPrice" and I know the fix in twenty seconds.
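
The diff itself is a few lines once the stored shape is a key-to-typeof map. A sketch, assuming that storage format (the real KV layout may differ):

type Shape = Record<string, string>; // field name -> typeof of its value

function recordShape(obj: Record<string, unknown>): Shape {
  return Object.fromEntries(Object.entries(obj).map(([k, v]) => [k, typeof v]));
}

function shapeDiff(prev: Shape, next: Shape) {
  return {
    added_keys: Object.keys(next).filter((k) => !(k in prev)),
    removed_keys: Object.keys(prev).filter((k) => !(k in next)),
    changed_types: Object.fromEntries(
      Object.keys(next)
        .filter((k) => k in prev && prev[k] !== next[k])
        .map((k) => [k, `${prev[k]} -> ${next[k]}`]),
    ),
  };
}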

The cost

You are looking at maybe 200 lines of code per upstream once you have the pattern in place: a schema, a fetch wrapper, a snapshot test, a KV write for the shape cache. We have 30 upstreams. The total infrastructure is ~6000 lines, of which maybe 800 are drift-detection code. Of those 800 lines, the parts that have actually saved us from a production incident: the shape hash, the zod schema with safeParse, the snapshot test, and the per-field _meta block. The rest is nice-to-have.

The number that matters is incidents per quarter. Before we had this in place, we were running roughly one schema-drift incident every two weeks across the upstream set, almost always discovered by a user complaint. After: we have caught every drift event before a user reported it. Not because we are smart, because the boundary is loud.

Closing

Free APIs are great, until they aren't. The price of "free" is that you accept a contract that is not under your control and that nobody owes you a notice on. The defense is to treat every upstream like a hostile system at the boundary and like a trusted dependency in the middle. Validate hard once, then stop.

If you are building an AI agent that consumes public APIs, the cost of skipping this work is not a crash. It is a confident, well-formatted answer that is silently wrong. Which is the worst possible failure mode for an agent. Catch the drift at the door.

TerminalFeed exposes 30+ free APIs and 12 paid ones, all behind one normalized envelope. Drift is our problem, not yours.
