Data · June 11, 2024 · 6 min read

Practical data contracts for small teams

How to stop schema breakage without drowning in governance: contracts, lineage, and a 30-minute weekly review.

data-engineering
contracts
analytics
quality

Most data breakage is social, not technical. Contracts help when they live next to the code that emits the data. I store JSON schemas and sample payloads with the service, then run contract tests both in CI and against sampled production traffic.
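As a rough sketch of what running the same contract check in CI and on sampled production traffic could look like, the Python below validates payloads against a tiny contract and then applies that check to a random fraction of events. The contract format, the `order_id`/`amount_cents`/`coupon` fields, and every function name here are hypothetical; a real setup would more likely pass a JSON Schema file to a proper validator library.

```python
import random

# Hypothetical contract: field name -> (expected type, required?)
CONTRACT = {
    "order_id": (str, True),
    "amount_cents": (int, True),
    "coupon": (str, False),
}


def violations(payload: dict) -> list[str]:
    """Return human-readable contract violations for one payload."""
    problems = []
    for name, (expected_type, required) in CONTRACT.items():
        if name not in payload:
            if required:
                problems.append(f"missing required field: {name}")
            continue
        if not isinstance(payload[name], expected_type):
            problems.append(f"wrong type for {name}")
    return problems


def failing_samples(samples: list[dict]) -> list[int]:
    """CI side: indices of stored sample payloads that break the contract."""
    return [i for i, sample in enumerate(samples) if violations(sample)]


def sampled_violation_rate(events: list[dict], sample_rate: float,
                           rng: random.Random) -> float:
    """Production side: check a random fraction of events.

    Returns the share of checked events that violate the contract,
    or 0.0 when nothing was sampled.
    """
    checked = [event for event in events if rng.random() < sample_rate]
    if not checked:
        return 0.0
    bad = sum(1 for event in checked if violations(event))
    return bad / len(checked)
```

In CI, `failing_samples` would run over the happy-path payloads stored with the service and fail the build on any non-empty result. In production, the violation rate would feed an alert threshold rather than block traffic.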

What goes into a contract

  • Schema with optional/required fields and allowed ranges
  • Example payloads for happy path and known bad inputs
  • Escalation: who gets paged and how long they have to respond
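One way to keep those three parts together in code is a single object that holds the schema, the example payloads, and the escalation details. The sketch below is only an illustration under assumptions: the class names, the `signup` event, its fields and ranges, and the team and pager names are all made up, and the hand-rolled validation stands in for whatever schema tooling you already use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldSpec:
    type_: type
    required: bool = True
    minimum: Optional[float] = None  # allowed range, inclusive
    maximum: Optional[float] = None


@dataclass
class Escalation:
    owner: str
    pager: str
    respond_within_minutes: int


@dataclass
class Contract:
    fields: dict[str, FieldSpec]
    good_examples: list[dict]  # happy-path payloads
    bad_examples: list[dict]   # known bad inputs that must be rejected
    escalation: Escalation

    def errors(self, payload: dict) -> list[str]:
        """List every way the payload breaks the schema."""
        found = []
        for name, spec in self.fields.items():
            if name not in payload:
                if spec.required:
                    found.append(f"{name}: missing")
                continue
            value = payload[name]
            if not isinstance(value, spec.type_):
                found.append(f"{name}: expected {spec.type_.__name__}")
                continue
            if spec.minimum is not None and value < spec.minimum:
                found.append(f"{name}: below {spec.minimum}")
            if spec.maximum is not None and value > spec.maximum:
                found.append(f"{name}: above {spec.maximum}")
        return found

    def self_check(self) -> bool:
        """Good examples must pass and known-bad examples must fail."""
        return (all(not self.errors(p) for p in self.good_examples)
                and all(self.errors(p) for p in self.bad_examples))


# Hypothetical contract for a signup event.
signup = Contract(
    fields={
        "user_id": FieldSpec(str),
        "age": FieldSpec(int, minimum=13, maximum=120),
        "referrer": FieldSpec(str, required=False),
    },
    good_examples=[{"user_id": "u-1", "age": 30}],
    bad_examples=[{"user_id": "u-2", "age": 7}, {"age": 30}],
    escalation=Escalation(owner="growth-team", pager="growth-oncall",
                          respond_within_minutes=60),
)
```

Running `signup.self_check()` in CI guards against the schema and its examples drifting apart: a known-bad payload that starts passing is as much a regression as a good payload that starts failing.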

I keep lineage intentionally shallow. Every downstream table lists its upstream owners and how to contact them. The weekly review is a 30-minute session with producers to retire unused fields and check for schema drift.
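Shallow lineage of this kind can be as simple as a lookup table plus a bounded walk. The sketch below assumes a hand-maintained registry and uses the three-hop limit mentioned in the takeaways. The table names, owners, and contact channels are invented for illustration.

```python
# Hypothetical lineage registry: table -> owner, contact, direct upstreams.
LINEAGE = {
    "revenue_dashboard": {
        "owner": "Analytics",
        "contact": "#analytics",
        "upstream": ["daily_orders"],
    },
    "daily_orders": {
        "owner": "Data platform",
        "contact": "#data-platform",
        "upstream": ["orders_raw"],
    },
    "orders_raw": {
        "owner": "Checkout service",
        "contact": "#checkout",
        "upstream": [],
    },
}

MAX_HOPS = 3  # lineage deeper than this is deliberately not tracked


def upstream_owners(table: str,
                    max_hops: int = MAX_HOPS) -> list[tuple[str, str, str]]:
    """Walk upstream breadth-first, stopping after max_hops.

    Returns (table, owner, contact) tuples, nearest upstream first.
    """
    seen = {table}
    frontier = [table]
    result = []
    for _ in range(max_hops):
        next_frontier = []
        for current in frontier:
            for parent in LINEAGE[current]["upstream"]:
                if parent in seen:
                    continue  # avoid listing a shared upstream twice
                seen.add(parent)
                entry = LINEAGE[parent]
                result.append((parent, entry["owner"], entry["contact"]))
                next_frontier.append(parent)
        frontier = next_frontier
    return result
```

When `revenue_dashboard` breaks, `upstream_owners("revenue_dashboard")` gives the short list of people to contact, ordered from the closest upstream outward.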

If it is not in code or a runbook, it is not a contract. It is a wish.

Key takeaways
Highlights you can reuse.
  • Contracts as code: schemas, tests, and alerts in Git
  • Lineage you actually read: a three-hop limit and plain-English owners
  • A 30-minute weekly review keeps producers honest