Full-Stack Data Engineer · SF Bay Area

Jordan Lewis

I own the data stack, end to end.

Data modeling and pipelines, a Dremio lakehouse and Kafka platform on Kubernetes, the APIs and semantic layer on top, and the tooling that keeps it all reliable.

Full-stack across the data lifecycle

  1. Data modeling & ETL
  2. Lakehouse & streaming platform
  3. Semantic layer
  4. Data products & APIs
  5. Analytics & BI

99.9% · Production lakehouse uptime
10+ · Teams using the platform
30+ · Data sources in one SQL engine
8 · Platforms I run on Kubernetes

What I do

Platforms, end to end

I work across the whole stack that makes analytics possible: storage, query engines, streaming, identity, and governance.

Data lakehouse platforms

I run Dremio and an Apache Iceberg catalog on Kubernetes. One SQL engine queries HDFS, Hive, object storage, Snowflake, Postgres, and Mongo together, with a semantic layer, SSO, and query acceleration on top.

Reliability & operations

I own the platform end to end: multi-environment CI/CD, on-call, incident response, HA/DR planning, and the dashboards that keep a production service at three nines.

Streaming & AI tooling

Kafka on Confluent for Kubernetes feeds the lakehouse. On top of it, I build LLM and MCP tooling that documents a catalog and answers questions about it from your editor.

Selected work

Things I've built

Writing

Recent posts

Let's talk.

Happy to talk data platforms, lakehouses, or where AI actually earns its keep in infrastructure. LinkedIn is the fastest way to reach me.