Credential Vending Only Works If Your Storage Speaks STS

Iceberg catalogs have a feature called credential vending. Instead of every query engine carrying long-lived storage keys, the engine asks the catalog to load a table, and the catalog hands back short-lived credentials scoped down to the storage that table lives in. Less key sprawl, tighter blast radius, no secrets baked into a hundred client configs. We wanted it.

We were running an Iceberg REST catalog in front of an on-prem object store that speaks the S3 API. The catalog supports vending. The storage speaks S3. I expected this to be a config flag.

The call the catalog actually makes

When you enable vending, the catalog doesn't invent credentials. It calls AWS STS AssumeRole against the storage backend and passes back whatever STS returns. That one detail is the whole story. Vending is a thin wrapper around AssumeRole, and the catalog hard-codes that specific call.
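To make that concrete, here is a rough sketch of the request a vending catalog ends up sending. The parameter names (Action, Version, RoleArn, RoleSessionName, DurationSeconds, Policy) are the standard STS query parameters. The role ARN, bucket, and table prefix are placeholders, and the session policy is an assumption about how a catalog might scope a token to one table, not a copy of what ours does. Request signing is left out.

```python
import json
from urllib.parse import urlencode


def session_policy(bucket: str, table_prefix: str) -> str:
    """Inline session policy that narrows the role to one table's prefix."""
    prefix = table_prefix.strip("/")
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    })


def assume_role_form(role_arn: str, session_name: str, bucket: str,
                     table_prefix: str, duration_seconds: int = 900) -> str:
    """Form-encoded body for an STS AssumeRole call (unsigned)."""
    return urlencode({
        "Action": "AssumeRole",          # the one verb the catalog relies on
        "Version": "2011-06-15",         # STS API version
        "RoleArn": role_arn,             # placeholder value in this sketch
        "RoleSessionName": session_name,
        "DurationSeconds": str(duration_seconds),
        "Policy": session_policy(bucket, table_prefix),
    })
```

Everything interesting about vending hangs on the Action field. If the endpoint on the other side has no handler for that value, nothing else in the request matters.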

Our object store does implement STS. What it implements is AssumeRoleWithSAML and AssumeRoleWithWebIdentity, the federation flows where an external identity provider vouches for the caller. It does not implement plain AssumeRole, where you already hold credentials and ask to swap them for scoped-down ones.

So the catalog called AssumeRole, the storage had no handler for it, and vending failed before a single byte moved. The S3 data path worked fine the whole time, which made the failure confusing. Reads and writes were healthy. Only the vending handshake was dead, because it depended on an STS verb the object store never shipped.
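One way to tell "the verb is missing" apart from "my credentials are wrong" is to look at the error code the STS endpoint sends back. The sketch below parses an STS-style XML error body and sorts it into buckets. The ErrorResponse/Error/Code layout follows the AWS STS wire format. The set of codes treated as "not implemented" is an assumption, since an on-prem store may answer differently, and the sample body is illustrative, not captured from our store.

```python
import xml.etree.ElementTree as ET

# Assumed codes meaning "no handler for this action"; adjust per storage product.
UNSUPPORTED_CODES = {"InvalidAction", "NotImplemented"}
# Codes that point at credentials or permissions rather than a missing verb.
AUTH_CODES = {"AccessDenied", "InvalidClientTokenId", "SignatureDoesNotMatch"}

# Illustrative error body in the AWS STS format (not a real capture).
SAMPLE = """<ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <Error><Type>Sender</Type><Code>InvalidAction</Code><Message>example</Message></Error>
</ErrorResponse>"""


def _local(tag: str) -> str:
    """Strip an XML namespace: '{https://...}Code' becomes 'Code'."""
    return tag.rsplit("}", 1)[-1]


def sts_error_code(body: str):
    """Return the text of the first <Code> element, or None if absent."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None
    for elem in root.iter():
        if _local(elem.tag) == "Code" and elem.text:
            return elem.text.strip()
    return None


def classify(status: int, body: str) -> str:
    """Sort an STS HTTP response into a coarse diagnosis."""
    if 200 <= status < 300:
        return "ok"
    code = sts_error_code(body)
    if code in UNSUPPORTED_CODES:
        return "verb-not-implemented"
    if code in AUTH_CODES:
        return "auth-problem"
    return "unknown"
```

A "verb-not-implemented" result alongside healthy S3 reads and writes is the signature of this failure: the data plane is fine and the STS surface is incomplete.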

You can't config your way out of a missing verb

I checked whether the catalog could be told to use a federation flow instead. It couldn't. The plain AssumeRole path is wired in, and there's an open upstream issue tracking exactly this gap for S3-compatible stores that only do the federation variants. This is a real interop hole, not a misconfiguration, and recognizing the difference saved me from spending another day "fixing" a settings file.

The boring fix that works

We turned vending off and gave the engines scoped access keys instead, sourced from a secret the platform already manages and rotates. The config switch that matters is the one that tells the catalog to skip the STS step and trust that the caller brought its own credentials. With that set, the catalog stops trying to AssumeRole, the engines authenticate with their keys, and queries run.
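On the engine side, the change amounts to passing static keys alongside the catalog URI. A minimal sketch, assuming a PyIceberg-style client: the property names (type, uri, s3.endpoint, s3.access-key-id, s3.secret-access-key) are PyIceberg's, while the environment variable names are hypothetical stand-ins for whatever the platform's managed secret exposes. Other engines spell these settings differently, and the catalog-side switch is specific to the catalog, so it is not shown here.

```python
import os

# Hypothetical variable names; in practice these come from the managed secret.
REQUIRED = ("CATALOG_URI", "S3_ENDPOINT", "S3_ACCESS_KEY_ID", "S3_SECRET_ACCESS_KEY")


def static_key_properties(env=None) -> dict:
    """Build client properties that carry the engine's own S3 keys."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        # Fail loudly instead of sending a half-configured client to the catalog.
        raise KeyError("missing settings: " + ", ".join(missing))
    return {
        "type": "rest",
        "uri": env["CATALOG_URI"],
        "s3.endpoint": env["S3_ENDPOINT"],
        "s3.access-key-id": env["S3_ACCESS_KEY_ID"],
        "s3.secret-access-key": env["S3_SECRET_ACCESS_KEY"],
    }
```

With PyIceberg installed, the result would be passed along the lines of load_catalog("onprem", **static_key_properties()), where "onprem" is an arbitrary catalog name. Reading the keys at startup rather than baking them into a config file keeps rotation a matter of updating the secret and restarting the engine.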

It's less elegant than vending. The keys live longer than a vended token, and rotation is on us instead of on STS. For a closed platform where every client is one we operate, that tradeoff is fine. I wrote down the exact condition for turning vending back on: either the storage vendor adds plain AssumeRole, or the catalog grows support for a federation flow. Until one of those lands, re-enabling vending just recreates the same dead handshake.
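That re-enable condition is small enough to write down as a check. This is only a sketch of the reasoning: the verb sets mirror what is described above, and the function name is made up for illustration.

```python
# STS verbs the catalog knows how to call when vending is on.
CATALOG_CAN_CALL = {"AssumeRole"}
# STS verbs our object store actually implements.
STORAGE_IMPLEMENTS = {"AssumeRoleWithSAML", "AssumeRoleWithWebIdentity"}


def vending_can_work(catalog_verbs, storage_verbs) -> bool:
    """Vending needs at least one verb both sides agree on."""
    return bool(set(catalog_verbs) & set(storage_verbs))


# Today the two sets do not overlap, so the handshake has nothing to use.
print(vending_can_work(CATALOG_CAN_CALL, STORAGE_IMPLEMENTS))  # False
```

Either fix makes the intersection non-empty: the vendor adding AssumeRole grows the storage set, and the catalog learning a federation flow grows the catalog set.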

What I'd check first next time

Before assuming an Iceberg feature works against non-AWS storage, find the AWS API call hiding under it. "S3-compatible" covers the data plane. It says nothing about which STS verbs exist, and the gaps don't show up until a feature like vending reaches for one that isn't there.