Flagship platform

Production data lakehouse on Dremio + Kubernetes

A production-critical service at 99.9% uptime that a dozen teams across the company query directly.

Problem

The company needed one place to query data spread across on-prem HDFS, object storage, Snowflake, and relational databases, without copying it around first. It had to be reliable enough to run customer-facing reporting on.

What I built

I took Dremio from a single-VM proof of concept to a full platform on Kubernetes across dev, staging, and prod, with a maintained Helm chart, GitLab CI/CD, F5 ingress, RBAC, and secrets that re-sync from a vault on every deploy. I added an Apache Iceberg catalog, a semantic layer that joins 30+ sources across six backend types in a single query, SSO, query-acceleration reflections, and dedicated compute engines. I run the upgrades and the on-call rotation, and I maintain the HA/DR posture.

Scope

The company's production lakehouse: one SQL engine over data spread across on-prem and cloud, queryable without copying it first. Customer-facing reporting runs on it.

My role

I own it end to end: the Kubernetes platform, the catalog, the upgrades, the on-call rotation, and the HA/DR posture.

Architecture

Dremio on Kubernetes across dev, staging, and prod, deployed from a maintained Helm chart through GitLab CI/CD.
An Apache Iceberg catalog over object storage, plus federated connections to HDFS, Hive, Snowflake, Postgres, and Mongo.
A semantic layer that joins 30+ sources across six backend types in a single query.
F5 ingress, RBAC, SSO, and secrets that re-sync from a vault on every deploy.
Query-acceleration reflections and dedicated compute engines for predictable performance.
Prometheus, Thanos, and Grafana for metrics and alerting.
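
To make "one query over several backends" concrete, here is a minimal sketch of what a federated join can look like. The source, schema, table, and column names (lake, snowflake_prod, pg_billing, and so on) are hypothetical and not taken from the real platform; the only thing assumed about Dremio is that tables are addressed by a dotted source path. The small helper just lists which top-level sources a query touches.

```python
import re

# Hypothetical federated query: every source, table, and column name below is
# illustrative. Each FROM/JOIN target lives in a different backend.
FEDERATED_QUERY = """
SELECT a.account_name,
       SUM(o.amount) AS total_amount
FROM   lake.sales.orders AS o              -- Iceberg table on object storage
JOIN   snowflake_prod.crm.accounts AS a    -- Snowflake source
       ON a.account_id = o.account_id
JOIN   pg_billing.public.invoices AS i     -- Postgres source
       ON i.order_id = o.order_id
WHERE  i.status = 'PAID'
GROUP  BY a.account_name
"""


def referenced_sources(sql: str) -> set:
    """Return the top-level source names that follow FROM or JOIN."""
    return set(re.findall(r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*)\.", sql))


print(sorted(referenced_sources(FEDERATED_QUERY)))
# → ['lake', 'pg_billing', 'snowflake_prod']
```

The point of the sketch is the shape of the SQL: three backends joined in one statement, with no copy step in between.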

Outcomes

Runs as a production-critical service at 99.9% uptime.
A dozen teams query it directly, self-serve.
Performance tuning took multi-minute queries down to seconds.
A written HA/DR posture: RTO/RPO targets, coordinator HA, backups, and cross-region storage.
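
For scale, the 99.9% figure can be turned into a downtime budget. The sketch below is plain arithmetic, assuming a 365-day year and a 30-day month; the function name is illustrative and not part of any tooling on the platform.

```python
def downtime_budget_minutes(availability: float, period_days: float) -> float:
    """Minutes of downtime allowed in a period at the given availability (0 to 1)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_days * 24 * 60


yearly = downtime_budget_minutes(0.999, 365)   # assumed 365-day year
monthly = downtime_budget_minutes(0.999, 30)   # assumed 30-day month

print(f"{yearly / 60:.2f} hours per year")          # → 8.76 hours per year
print(f"{monthly:.1f} minutes per 30-day month")    # → 43.2 minutes per 30-day month
```

At 99.9%, the whole platform, upgrades included, has under nine hours of downtime to spend per year.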

Stack

Dremio, Apache Iceberg, Kubernetes, Helm, HDFS, Snowflake, Grafana