Platform
Production data lakehouse on Dremio + Kubernetes
A production-critical service at 99.9% uptime. It powers a customer-facing SaaS product and is queried directly by about a dozen internal teams, including data scientists and engineers.
Data Platform Engineer · SF Bay Area
I take data platforms from proof of concept to production. Right now that's a data lakehouse on Kubernetes: one query engine over 40 data sources, backing a customer-facing product and about a dozen internal teams. There's a Kafka streaming platform in staging and AI tooling on top, plus the on-call and disaster recovery that keep it running.

Across the data lifecycle
The platform I run, by the numbers
Selected work
Streaming
Self-hosted, secured real-time data streaming, running in staging. The streamed data lands straight in the lakehouse, ready to query.
A pipeline that documents a data catalog with an LLM sounds like a prompt-engineering problem. Almost none of my time went to prompts. It went to making the boring parts trustworthy.
Happy to talk data platforms, lakehouses, or where AI actually earns its keep in infrastructure. LinkedIn is the fastest way to reach me.