Data Platform Engineer · SF Bay Area

Jordan Lewis

I take data platforms from proof of concept to production. Right now that's a data lakehouse on Kubernetes: one query engine over 40 data sources, backing a customer-facing product and about a dozen internal teams. There's a Kafka streaming platform in staging and AI tooling on top, plus the on-call and disaster recovery that keep it running.

Across the data lifecycle

  1. Data modeling & ETL
  2. Lakehouse & streaming platform
  3. Semantic layer
  4. Data products & APIs
  5. Analytics & BI

The platform I run, by the numbers

99.9% lakehouse uptime
2,500+ analytical queries a day
30K+ datasets, one catalog
40 sources, one engine
10+ teams on the platform
100+ concurrent queries at peak

Selected work

Things I've built

Latest thinking

8 min read

I Built an AI Data Steward. The Hard Part Wasn't the AI.

A pipeline that documents a data catalog with an LLM sounds like a prompt-engineering problem. Almost none of my time went to prompts. It went to making the boring parts trustworthy.

Let's talk.

Happy to talk data platforms, lakehouses, or where AI actually earns its keep in infrastructure. LinkedIn is the fastest way to reach me.