I built my own data product: SteamBangers.com
Contents
- What the Bang Score tells you
- Curated for value
- Sign in, and it gets personal
- Watch the price, not the store page
- The data killed my first idea on day one
- The number that felt like value, but wasn't
- Every input is on the table. The recipe isn't.
- The platform under the number
- The stack, and what it costs
- The bugs that shipped green
- A control station, because it runs unattended
- My day job in miniature
SteamBangers scores every game on Steam from 0 to 100 for one thing: whether it's worth your money. I own every layer of it, made on my own time, outside work, and it's the first product I've ever shipped that's entirely mine! Under the hood it's a real data platform, an Iceberg lakehouse feeding a versioned score, served from the edge for about the price of a domain a year.
I built it for the kid I used to be. Video games are the reason I got into computers in the first place. I needed to learn how they worked! That itch is more or less the same one that put me in platform engineering years later. Games were also something I could barely afford. Twenty dollars was a real decision, so before I bought one I'd put in the hours to figure out if it was worth it: reading reviews, watching reviews, asking friends, doing the math. SteamBangers is that gut-check, automated, for all 7,673 games I've scored so far, with the rest of Steam on the way.
What the Bang Score tells you
The Bang Score is one number, 0 to 100. Above 85 is a banger, buy it. Under 50 is a skip, or wait for a sale. It judges a game against others in its genre, so a tight six-hour indie isn't measured against a hundred-hour RPG, and a short, dense game can beat a long, padded one.
The uncomfortable part is that most games lose. About two-thirds of what I've scored lands in "skip" at today's price. That's the whole point. The number exists to point you at the third that don't, the dense and fairly-priced gems, and away from the padded eighty-hour grind you'll quit at hour nine or the $70 launch you'll regret by the weekend. For the budget version of me, that gap is the difference between a great month and a wasted twenty bucks.
For the short version: Hollow Knight scores a 98, Hades a 99, Slay the Spire a 100. The games people love and don't regret rise to the top. The ones built to waste your time sink. There's a whole value desk for the games on sale right now, ranked by the score instead of the size of the discount.
The value desk: the deals worth taking right now, ranked by the score, not the discount.
Curated for value
You don't have to scroll the whole board. Hand-built collections cut the catalog down to a lens: bangers under $5, hidden gems under $10, games you can beat in a weekend, free and still great. For the budget gamer, that's the shortcut to the good stuff.
Collections: hand-built lists, each a different lens on value.
Sign in, and it gets personal
The public score is half of it. Sign in with your Steam account and SteamBangers turns the same engine on your own library and wishlist.
Rate my wishlist scores everything you're eyeing and tells you what's worth buying now versus waiting for a sale. The library report ranks what you already own by value and shows the real cost per hour of what you've actually played. The recommendations weight the games you've genuinely sunk time into, find the genres and price points you lean toward, and surface high-scoring games you don't own yet, each with a plain reason. And you can tune all of it: a budget cap, focus genres, a stricter review bar, even a Steam Deck mode that re-ranks for the handheld.
All of it is deterministic. The recommender is a transparent taste model, not a black box, so every pick is explainable and reproducible, and it runs zero AI, exactly like the score. The login is read-only Steam OpenID, so I never see a password, and you can wipe your data whenever you want.
Watch the price, not the store page
A lot of games are a banger only on sale. Bastion reviews great and isn't long for the money, so it sits at "worth it" most of the year, then a Steam sale knocks it under the line and it's a banger for a week. Every game page has its full price history, and that's where the pattern shows: the flat stretch up top is full price, the green dips are the sales where the price crossed into banger territory.
Bastion's last year. Full price up top, the green dips are the sales that tipped it into a banger.
You can almost read the next sale off the chart, the seasonal ones land about when you'd expect, but I'd rather not make you camp a store page to catch it. So every game has a watch button: email me when this hits banger on a sale, or when it drops below a price I pick. One email, then it's done.
Set it and forget it. One email when the game is actually worth buying.
The data killed my first idea on day one
My original plan was simpler, and wrong: price divided by hours played. I'd pull the median playtime per game from SteamSpy and divide. So I spiked that one input before building anything, which is a habit that has never let me down, and SteamSpy's median playtime came back as 0. For every game. Steam's 2018 privacy change broke the sampling that fed it, and the field has been dead ever since. My load-bearing input did not exist.
The aggregate that survived Steam's privacy lockdown is global achievement completion percent: of everyone who owns a game, what share actually finished it. So the "how much do you really get" axis stopped being how long people play and became how far they get. Spike the load-bearing assumption first, before you build anything on top of it. The data gets a vote, and it does not care about your design.
The number that felt like value, but wasn't
Here's the rewrite that cost me the most, and the one I'd hand to anyone who designs a metric.
The obvious way to score value is dollars per hour: price divided by playtime. I started there, and it came out backwards.
Dollars per hour rewards length. A four-hour game you love scores worse than an eighty-hour grind you bounce off at hour nine, because the grind has more hours to divide the price into. It literally pays developers to pad, which is the exact opposite of what a budget gamer needs to hear. My first version leaned hard on playtime and completion, and the symptom was stark: games with completion data averaged about 21, games without averaged about 61. It was burying the exact games I most wanted it to surface. Hollow Knight, with a completion rate near 4.6%, looked like a rip-off, which says more about how hard it is than whether it's worth $15.
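To make the inversion concrete, here is a small worked example. The prices and hour counts are made up for illustration; they are not figures from the catalog.

```python
def dollars_per_hour(price: float, hours: float) -> float:
    """Naive value metric: a lower number looks like better value."""
    return price / hours


# Hypothetical games; prices and lengths are illustrative assumptions.
tight = dollars_per_hour(price=15.0, hours=4.0)         # short game you finish and love
grind = dollars_per_hour(price=60.0, hours=80.0)        # long game, judged on its full length
grind_actual = dollars_per_hour(price=60.0, hours=9.0)  # same game if you quit at hour nine

print(f"tight game:          ${tight:.2f}/hour")
print(f"grind (full length): ${grind:.2f}/hour")
print(f"grind (as played):   ${grind_actual:.2f}/hour")
```

On full length the grind looks five times cheaper per hour than the tight game. On the hours actually played, it is the more expensive of the two, which is the part the naive metric never sees.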
So I flipped what leads. Reviews set the ceiling now: a Bayesian-shrunk review score is the base of the number, so a handful of glowing reviews can't spike it. Price only moves you down from there, never up past it, with a per-genre length floor so a tight game isn't punished for being tight. Completion dropped out of the score entirely and became something I just show you.
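The actual formula is private, so the following is only a generic sketch of what Bayesian shrinkage of a review score can look like, not the Bang Score itself. It pulls the raw positive share toward a prior by adding pseudo-reviews; the function name, `prior_mean`, and `prior_weight` are assumptions chosen for illustration.

```python
def shrunk_review_score(positive: int, total: int,
                        prior_mean: float = 0.70,
                        prior_weight: float = 50.0) -> float:
    """Pull the raw positive-review share toward a prior mean.

    Equivalent to adding `prior_weight` pseudo-reviews that score `prior_mean`.
    Both defaults are illustrative assumptions, not SteamBangers' values.
    """
    if total < 0 or positive < 0 or positive > total:
        raise ValueError("need 0 <= positive <= total")
    return (positive + prior_weight * prior_mean) / (total + prior_weight)


few = shrunk_review_score(positive=10, total=10)          # raw share 1.00
many = shrunk_review_score(positive=9_500, total=10_000)  # raw share 0.95

print(f"10 of 10 reviews:       {few:.3f}")
print(f"9,500 of 10,000 reviews: {many:.3f}")
```

With these assumed settings, ten perfect reviews land well below a game with thousands of mostly positive ones, which is the "a handful of glowing reviews can't spike it" behavior described above.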
The lesson holds well outside games: check what a metric rewards, not whether it feels intuitive. People optimize whatever proxy you ship, so ship the one that points at what you actually meant.
Every input is on the table. The recipe isn't.
A number worth citing has to be auditable, so every per-game page shows the score and every input behind it.
A game page, Slay the Spire. The score, the verdict, and every input that fed it.
The score weighs review quality, price, how long the game really runs, how many players actually finish it, and how aggressive its microtransactions are. The data is pulled from seven third-party sources: the Steam storefront for price, the Steam Web API for achievements and completion, Steam reviews, SteamSpy for ownership, HowLongToBeat for length, IsThereAnyDeal for price history and buy links, and PCGamingWiki for the microtransaction grade. None of them is fully reliable, so the pipeline treats all of them as flaky by default. Every one of them is credited in the sidebar, because the score is only as honest as where it comes from.
It's deterministic and runs zero AI in the scoring path, so anyone can reproduce a score from the inputs. The one thing I keep private is the formula that combines them. A published formula gets gamed in a week and is forgettable the whole time. A secret score people argue about is the thing that brings them back. Why is Hades a 99 and not a 100? Come argue with me. Trust comes from showing every input and the honest gaps, not from handing out the weights.
There's one guard that matters more than any single input. SteamBangers earns a small affiliate cut when you buy through it, which makes the conflict of interest the entire brand risk. So the score is computed before any price or buy link is ever attached to a game. It physically can't bend toward the thing that pays me, and a build check fails the whole pipeline unless every buy link carries its disclosure. The guard, not the buy button, is the monetization feature.
The platform under the number
Here's where it ties back to what I do for a living. I'm a data-platform engineer. At work my team runs an open lakehouse that other teams build their products on, and I own the infrastructure, the deploys, the on-call. What I'd never done is own the whole arc by myself: the question, the data, the product, the thing people actually click and argue with. SteamBangers is that arc, solo, and I deliberately built it on the same shape we run in production.
One game's data, left to right: seven flaky sources in, one honest number out. My day job in miniature, sized down to one person.
A claim-based crawler pulls those sources on a tiered schedule. It claims work with SELECT ... FOR UPDATE SKIP LOCKED, so I can run it on several machines at once without two of them grabbing the same game. Every extractor is idempotent, rate-limited from one config file, and wrapped in a per-source circuit breaker, so HowLongToBeat blocking me never stalls the Steam crawl. A page renders with whatever data it has.
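A per-source circuit breaker can be sketched in a few lines. This is a minimal illustration of the pattern, not the crawler's actual code: the class, its thresholds, and `fetch_with_breaker` are all hypothetical names.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Open after `max_failures` consecutive errors; allow a retry after `cooldown_s`."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        """True if a call may be attempted right now."""
        if self.opened_at is None:
            return True
        # Once the cooldown has passed, let a trial call through (half-open).
        return self.clock() - self.opened_at >= self.cooldown_s

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()  # (re)open and restart the cooldown


def fetch_with_breaker(breaker: CircuitBreaker,
                       fetch: Callable[[], dict]) -> Optional[dict]:
    """Return the payload, or None if the source is skipped or the call fails."""
    if not breaker.allow():
        return None  # source is tripped; the caller moves on to other sources
    try:
        payload = fetch()
    except Exception:
        breaker.record_failure()
        return None
    breaker.record_success()
    return payload
```

Keeping one breaker per source name (for example in a dict keyed by source) is what isolates the failures: a tripped breaker for one source returns None immediately, and the crawl for every other source carries on.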
The raw payloads land in a bronze layer, and I put that bronze on an Apache Iceberg table in Cloudflare's R2 Data Catalog, their managed Iceberg REST catalog. That's deliberately more than a side project needs. It's the same open-lakehouse pattern my team runs in production: bronze on an Iceberg catalog with schema, snapshots, and time-travel, queryable by any engine without moving the data. I can point DuckDB at my own crawl and query it directly:
```sql
SELECT source, count(*) FROM cat.steambangers.bronze GROUP BY source;
-- steam_appdetails 167294
-- steam_achievements 9802
-- hltb 9802
-- steam_appreviews 9802
-- steamspy 9802
-- pcgamingwiki 3899
```
Building the side project on the pattern we run at work is what makes it the portfolio piece. Raw JSON blobs in a bucket would have needed a custom manifest and bespoke readers to get a tenth of the reach. Iceberg's batch-commit model did bite me once: commits are table-level, so a per-row crawler can't append a row at a time. I buffer per source and flush a snapshot. The right model for analytical bronze, the wrong one if you forget it's there.
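The buffer-then-flush idea might look roughly like this. It is a sketch under assumptions: `commit` is a hypothetical callable standing in for whatever writes one table-level snapshot, and `flush_at` is an arbitrary batch size.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class BronzeBuffer:
    """Collect rows per source and hand them to `commit` in batches."""

    def __init__(self, commit: Callable[[str, List[dict]], None],
                 flush_at: int = 500) -> None:
        self._commit = commit      # hypothetical: writes one snapshot per call
        self._flush_at = flush_at  # illustrative batch size
        self._rows: Dict[str, List[dict]] = defaultdict(list)

    def add(self, source: str, row: dict) -> None:
        self._rows[source].append(row)
        if len(self._rows[source]) >= self._flush_at:
            self.flush(source)

    def flush(self, source: str) -> None:
        batch = self._rows.get(source)
        if not batch:
            return
        self._commit(source, batch)  # one commit for the whole batch
        del self._rows[source]       # drop the buffer only after the commit succeeded

    def flush_all(self) -> None:
        """Call at the end of a crawl run so no partial batch is left behind."""
        for source in list(self._rows):
            self.flush(source)
```

Clearing the buffer only after `commit` returns means a failed commit keeps its rows for the next attempt. That is safe only because the extractors are idempotent, so a retried batch cannot double-count.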
From bronze, a transform re-scores everything: dbt on DuckDB, running free inside CI and the hourly rebuild, with 13 data-quality tests and a source-freshness check that block a thin or stale pull from ever promoting into a live score. Silver and gold land in Neon Postgres, the build-time source of record, where the typed tables are tiny: the scores are about 50 MB, and the price history behind them is 589,000 rows and still free. Then a publish step copies gold to a Cloudflare D1 SQLite replica at the edge, incrementally, only the rows that changed. The site reads D1 on every request, so a page load never waits on Postgres, and a rebuild busts the edge cache by tag instead of on a blind timer.
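One way to publish only the rows that changed is to keep a content hash next to each row and skip anything whose hash matches. The sketch below uses Python's built-in `sqlite3` as a local stand-in for the D1 replica; the `scores` table, its columns, and the `app_id` key are assumptions for illustration, and deletions are not handled.

```python
import hashlib
import json
import sqlite3
from typing import List


def row_hash(row: dict) -> str:
    """Stable fingerprint of a row's content."""
    return hashlib.sha256(json.dumps(row, sort_keys=True).encode()).hexdigest()


def publish_changed(replica: sqlite3.Connection, gold: List[dict]) -> int:
    """Upsert only rows whose content changed; return how many were written."""
    replica.execute(
        "CREATE TABLE IF NOT EXISTS scores ("
        "app_id INTEGER PRIMARY KEY, payload TEXT NOT NULL, hash TEXT NOT NULL)"
    )
    # Map of app_id -> hash for everything already published.
    existing = dict(replica.execute("SELECT app_id, hash FROM scores"))
    changed = [
        (row["app_id"], json.dumps(row, sort_keys=True), row_hash(row))
        for row in gold
        if existing.get(row["app_id"]) != row_hash(row)
    ]
    with replica:  # one transaction for the whole batch
        replica.executemany(
            "INSERT INTO scores (app_id, payload, hash) VALUES (?, ?, ?) "
            "ON CONFLICT(app_id) DO UPDATE SET "
            "payload = excluded.payload, hash = excluded.hash",
            changed,
        )
    return len(changed)
```

Publishing the same gold rows twice writes nothing the second time, so an hourly rebuild that changes a handful of prices touches a handful of rows.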
The whole thing runs hourly on Modal and serves from Cloudflare Workers. The same gold table feeds the website, a public JSON API, and an MCP server, so an AI assistant can answer "is this game worth it" straight from the Bang Score.
And I watch it from one place. A private control station shows the pipeline stage by stage, queue to crawl to score to publish to serve, with the live catalog counts, the tier mix, and anything that needs attention. Same instinct as the on-call dashboards at work: if I can't see it, I don't trust it's running.
The owner-only control station. The pipeline, the catalog, and the tier mix in one view. Right now two-thirds of the catalog is a skip at full price, which is the whole point.
The stack, and what it costs
Every layer is its own platform, each picked for one job and for staying free until real scale forces a dollar.
Here's the whole thing, and why each piece is where it is, and what it costs:
| Layer | Platform | Why this one | Cost |
|---|---|---|---|
| Ingest + orchestration | Modal (cron) | runs the crawl and the hourly rebuild on a schedule; a recurring free credit covers it | ~$0 |
| Bronze (raw data) | Cloudflare R2 Data Catalog | a managed open Iceberg catalog, no egress fees, readable by any engine | free tier |
| Transform | DuckDB + dbt | a real warehouse plus data-quality tests, run free in CI | $0 |
| Build store of record | Neon Postgres | typed silver and gold, rich SQL for the per-genre percentiles, 0.5 GB free holds it for years | free tier |
| Edge serving | Cloudflare D1 (SQLite) | a read replica at the edge, 5M reads/day free, so a page view never waits on Postgres | free tier |
| Web, API, MCP | Cloudflare Workers (Next.js on OpenNext) | commercial use is allowed, sits right next to R2 and D1, global CDN | free tier |
| Alerts + digest | Resend | the on-sale "this just became a banger" emails and the weekly digest | free tier |
| Domain | steambangers.com | the one thing I actually pay for | ~$12/yr |
Two of those rows came straight from reading the fine print, not the pricing page. Vercel's free tier bans commercial use, buried on its fair-use page, so an ad-or-affiliate site there risks suspension, which is why the site runs on Cloudflare. And MotherDuck, the managed DuckDB I almost reached for, moved its cheap plan to $250 a month overnight, so the transform runs DuckDB and dbt for free in CI instead. Read the fine print, not the headline price.
What makes these pieces safe to run unattended, by one person, is the contract between them. Bronze is immutable and replayable, so I can recompute any score from the raw data without re-crawling. The transform is a deterministic, pure function of bronze. And gold, served from the edge, is the only thing the public ever sees. A bad pull can't corrupt history, and a bad score can't reach the site without passing the data-quality tests first. The crawler can fail, a source can go down, the transform can be rerun, and none of it touches what a visitor loads, because the website only ever reads the published D1 replica, never the build store behind it.
The bill stays near zero because the data is small and the serving is static at the edge. Past the domain, the first forced dollar would be Cloudflare's $5/mo Workers tier, and only if the long tail crosses 100,000 requests a day, which is right about where the affiliate links would start to cover it. Effectively free to run until it works, and self-funding once it does.
The bugs that shipped green
Owning every layer alone means there's no one to catch your mistakes, so the pipeline has to be the second set of eyes. The day I learned that, it cost me the whole catalog.
My first production rebuild shrank the live catalog from 1,491 games to 15. Exit code 0. It printed d1 publish: ok. No error, no stack trace. The compute image I run the rebuild in was missing one file, the list of every game to score, so the build fell back to a 15-game development seed, scored those 15 perfectly, and published them straight over the live data. Fifteen rows is a completely successful build of the wrong input. Every test passed, the math was right, the publish worked. The pipeline had no idea it had just overwritten 99% of the site. I only caught it by reading the run output, because nothing had technically failed. Since the bronze still held every game, I added the missing file, re-ran, and the catalog came back. A try/except catches crashes, and this did not crash. The guard you actually need for "did the build work" isn't about correctness, it's about magnitude: if today's output is a fraction of yesterday's, stop and ask before you publish.
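A magnitude guard of that kind can be very small. This is a minimal sketch, assuming the publish step knows the previous and new row counts; the function name and the `min_ratio` threshold are illustrative, not the values the pipeline uses.

```python
class OutputShrankError(RuntimeError):
    """Raised when a build's output is suspiciously small."""


def check_magnitude(previous_count: int, new_count: int,
                    min_ratio: float = 0.9) -> None:
    """Refuse to publish if the new output is a fraction of the last one.

    `min_ratio` is an illustrative threshold, not the value SteamBangers uses.
    """
    if previous_count <= 0:
        return  # first publish: nothing to compare against
    ratio = new_count / previous_count
    if ratio < min_ratio:
        raise OutputShrankError(
            f"output shrank to {new_count} rows from {previous_count} "
            f"({ratio:.1%}); refusing to publish"
        )
```

Called just before the publish step, `check_magnitude(1491, 15)` raises instead of overwriting the live data, while an ordinary run such as `check_magnitude(1491, 1500)` passes silently.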
The same family of bug bit me again, from the opposite side. I added a statement_timeout to bound a rare database hang, the responsible thing to do. Neon's connection pooler rejects that particular startup parameter at connect time, so every connection in the pipeline threw, and the whole thing froze for hours behind green tests. My regression test had pinned the timeout string in the config, not that the connection actually opened. The fix was the direct, unpooled endpoint. The lesson was sharper than the fix: test the behavior, not the artifact. Assert that the thing connects and the catalog stayed roughly its size, because a string match passes over a dead pipeline.
There was a quieter one too. I'd estimated 25 KB of storage per game and measured 235 KB, because I was hauling 100 full review texts per game that the transform never reads. Storing just the summary cut it by about 5x. Measure the bytes you actually carry.
A control station, because it runs unattended
Because the whole thing runs on a schedule with nobody watching, I gave it the same thing I'd want at work: a control station. It's an owner-only dashboard that shows the pipeline's live state, the crawl, the rebuild, the publish to the edge, each one a stage that's running, healthy, lagging, or failed, with how far along it is. It polls the pipeline's health every few seconds and reads the real freshness straight off the data, never a hardcoded pill.
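Deriving a stage's state from data freshness rather than a hardcoded label could look something like this. The thresholds (healthy within one expected interval, lagging within three) and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def stage_status(last_success: Optional[datetime], now: datetime,
                 expected_every: timedelta, running: bool = False) -> str:
    """Classify a pipeline stage from the newest timestamp in its data."""
    if running:
        return "running"
    if last_success is None:
        return "failed"  # the stage has never produced data
    age = now - last_success
    if age <= expected_every:
        return "healthy"
    if age <= 3 * expected_every:  # illustrative cutoff between lagging and failed
        return "lagging"
    return "failed"


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(stage_status(now - timedelta(minutes=30), now, timedelta(hours=1)))
```

Because the status is computed from the timestamp on the data itself, a stage that stops writing drifts to lagging and then failed on its own, with no one having to flip a flag.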
An operator's first question is "is it running, and how far," and the data already implies the answer, so I surfaced it. For failures like the catalog nuke and the pooler freeze, this is the view that turns "something feels off" into "the publish stage is failing" in one glance. It's what lets one person run the whole thing unattended and trust that a bad run shows up as a red stage, not as a quiet wrong number on the site.
My day job in miniature
The hard parts of a product were not where I expected them from inside my one layer at work. None of them was exotic. They were the ordinary discipline of a platform that runs unattended: the output guard, the metric incentives, the bytes carried for nothing, the guardrail that keeps the money from bending the number. That discipline doesn't show up until you own all of it at once.
Running one layer of a platform at scale teaches you depth. Building every layer of a small one alone teaches you where the seams are. SteamBangers is ingest, an Iceberg lakehouse, a deterministic score, an edge-served product, a public API and an MCP server, and the guardrails that keep it honest, all in one head, on the same open-catalog pattern we run at work.
A few rules survived the build, and each one cost me a rewrite to learn:
- Judge a metric by what it rewards. Dollars per hour rewards length, so it quietly pays for padding.
- Spike the load-bearing assumption before you build on it. Mine was dead on day one.
- A green build can be the worst bug you ship, because every signal you trust says it worked. Guard the size of the output.
- On a trust site, the guard is the feature. Compute the number before the money is in the room.
The Bang Score is live and free for every game on Steam at steambangers.com. Go look up the last game you almost bought, or find out what your wishlist is actually worth. The formula is secret on purpose, so come argue with a score!
It's the first thing I've built end to end that's entirely mine, and it runs on the same instincts I bring to work every day. The rest of what I build is on the projects page.