
Deterministic Ingestion for Regulatory Data

Rebuilt a stalled ingestion platform for a UK regulator, establishing deterministic pipelines and operable data contracts without exposing sensitive data.

Client: The Pensions Regulator (TPR)
Role: Tech Lead & Architect

Context

The programme needed a deterministic ingestion service to bring multiple regulatory data sources under a single operational model. The priority was reliability and explainability rather than raw throughput.

Problem

Ingestion was fragmented and brittle. Each source required bespoke handling, and there was no consistent way to replay a data flow or audit data movement when issues appeared.

Constraints

  • Sensitive data and auditability requirements.
  • Upstream sources that changed formats without warning.
  • A small delivery team that needed clear boundaries and runbooks.

Approach

I focused on building a minimal, deterministic ingestion spine:

  • Contract-first schemas with versioning and validation gates.
  • Idempotent ingestion handlers with explicit failure quarantines.
  • Observability added alongside the first data flows, not after.
  • A simple operational model with runbooks and ownership mapping.
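The first two points above can be sketched together as a single handler. This is a minimal, illustrative sketch only, not the delivered implementation: the contract shape, the field names (`scheme_id`, `member_count`), and the in-memory stores are assumptions standing in for real schema tooling and persistent storage.

```python
import hashlib
import json

# Hypothetical versioned contract: required fields and their types.
CONTRACT_V1 = {"version": 1, "required": {"scheme_id": str, "member_count": int}}


def validate(record: dict, contract: dict) -> list[str]:
    """Validation gate: return a list of contract violations (empty = pass)."""
    errors = []
    for field, ftype in contract["required"].items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], ftype):
            errors.append(f"bad type for {field}: {type(record[field]).__name__}")
    return errors


def record_key(record: dict) -> str:
    """Deterministic key: hashing canonical JSON makes re-ingestion a no-op."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


class IngestionHandler:
    """Idempotent ingestion handler with an explicit failure quarantine."""

    def __init__(self, contract: dict):
        self.contract = contract
        self.accepted: dict[str, dict] = {}  # keyed store => replays are stable
        self.quarantine: list[tuple[dict, list[str]]] = []

    def ingest(self, record: dict) -> str:
        errors = validate(record, self.contract)
        if errors:
            # Failed records are parked with their errors, never silently dropped.
            self.quarantine.append((record, errors))
            return "quarantined"
        key = record_key(record)
        if key in self.accepted:
            return "duplicate"  # same record replayed: no state change
        self.accepted[key] = record
        return "accepted"
```

Because the key is derived from the record itself, replaying a batch after a failure cannot double-count anything, and the quarantine gives operators one place to look when an upstream format shifts.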

Outcomes

  • A repeatable ingestion pipeline that could be replayed for audits.
  • Clear onboarding steps for new sources and data contracts.
  • Operational visibility for both technical and non-technical stakeholders.

Patterns

Event-driven ingestion, schema evolution, idempotent processing, and replayable pipelines.
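The replayable-pipeline pattern can be illustrated with an append-only event log whose state is a pure fold over the events. Again a hedged sketch under assumed names (`ReplayableLog`, `state_hash`) and an assumed event shape, not the production design:

```python
import hashlib
import json


class ReplayableLog:
    """Append-only event log: state is always rebuilt by folding over the
    full history, so an audit replay reproduces the same result every time."""

    def __init__(self):
        self.events: list[dict] = []

    def append(self, event: dict) -> None:
        self.events.append(event)

    def replay(self) -> dict:
        """Deterministically rebuild state from the event history."""
        state: dict[str, int] = {}
        for event in self.events:
            # Idempotent apply: last write per key wins, so re-running
            # the fold never diverges from the first run.
            state[event["scheme_id"]] = event["member_count"]
        return state


def state_hash(state: dict) -> str:
    """Audit fingerprint: identical state yields an identical hash."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()
```

Comparing `state_hash` across two replays is one simple way to demonstrate to an auditor that a pipeline run is reproducible.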