Korean Price Monitor

What Korean consumers actually pay vs. what the official inflation index reports.

Timeline: 2025 – Present
Company: Independent
Role: Data Engineer
Stack: Python · Docker · cron · Streamlit

Overview

CPI is a national average. It hides what households actually pay at the checkout.

Korean Price Monitor is a unified ingestion pipeline that pulls KOSTAT retail prices (weekly XML) and ECOS CPI (monthly JSON) into a single, comparable store. Each run executes 6 quality checks, detects schema drift, and preserves raw payloads so historical analyses stay reproducible.
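
As a rough illustration of the raw-payload preservation and schema-drift detection described above, here is a minimal sketch. The function names (`store_raw`, `schema_fingerprint`, `has_drifted`), the `.raw` file extension, and the choice of SHA-256 are assumptions made for this example, not the project's actual code.

```python
import hashlib
import json
from pathlib import Path


def store_raw(payload: bytes, raw_dir: Path) -> Path:
    """Write the payload to a file named after its SHA-256 digest.

    Identical payloads map to the same file, so re-running an ingestion
    never duplicates or overwrites what is already on disk.
    """
    digest = hashlib.sha256(payload).hexdigest()
    raw_dir.mkdir(parents=True, exist_ok=True)
    path = raw_dir / f"{digest}.raw"  # hypothetical naming scheme
    if not path.exists():
        path.write_bytes(payload)
    return path


def schema_fingerprint(record: dict) -> str:
    """Hash the sorted field names and value types of one parsed record."""
    shape = sorted((key, type(value).__name__) for key, value in record.items())
    return hashlib.sha256(json.dumps(shape).encode("utf-8")).hexdigest()


def has_drifted(record: dict, expected_fingerprint: str) -> bool:
    """True when a record's shape no longer matches the stored fingerprint."""
    return schema_fingerprint(record) != expected_fingerprint
```

The fingerprint ignores field order and values, so it only changes when a field is added, removed, renamed, or changes type, which is the kind of upstream change that should stop a run.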

Technologies

Language: Python
Infrastructure: Docker · cron
Front-end: Streamlit
Data sources: KOSTAT API · ECOS API

The Problem

Official inflation numbers don't match what's on the receipt.

CPI smooths over category-level shocks (eggs, gasoline, fresh produce) by basket weighting. Households experience inflation at the product level, where weekly price volatility is much larger. There's no public, low-friction way to compare the two — so the gap stays invisible.
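
A small worked example shows how basket weighting dilutes a category-level shock. The weights and price changes below are invented for illustration; they are not actual CPI weights or observed prices, and a real CPI uses a more involved index formula than this plain weighted average.

```python
# Hypothetical basket: weights and price changes are illustrative only.
basket = {
    # category: (basket weight, year-over-year price change in %)
    "eggs": (0.01, 30.0),
    "gasoline": (0.04, 5.0),
    "everything_else": (0.95, 2.0),
}


def weighted_change(items):
    """Weighted average of category-level price changes."""
    total_weight = sum(weight for weight, _ in items.values())
    return sum(weight * change for weight, change in items.values()) / total_weight


headline = weighted_change(basket)
largest = max(change for _, change in basket.values())
print(f"headline: {headline:.2f}%  largest category move: {largest:.1f}%")
```

With these made-up numbers a 30% jump in eggs moves the headline figure to only about 2.4%, because eggs carry 1% of the basket weight.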

How might we build a pipeline that surfaces the gap between official CPI and real consumer prices, weekly and at the category level, and keeps it honest enough to trust?

Product Vision

Inflation, but visible at the basket level — refreshed weekly.

Anyone curious about why their grocery bill keeps climbing should be able to see the answer in two clicks: a per-category trend, a CPI overlay, and a clear delta.

Core architecture

A reproducible two-source pipeline with QC as a first-class citizen

Schema-validated ingestion of KOSTAT XML and ECOS JSON, normalized into a single category × week schema. Raw payloads are preserved by hash. 6 quality checks run per execution (row counts, schema fingerprints, category coverage, null ratios, monotonicity, source-to-source reconciliation). Any failure halts the downstream warehouse write.
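
The "any failure halts the write" behaviour can be sketched as a gate that runs every check before calling the writer. This is a minimal sketch, not the project's implementation: it shows only three of the six checks, and the names (`CheckResult`, `QualityGateError`, `run_gate`), the record shape, and the 5% null threshold are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


class QualityGateError(RuntimeError):
    """Raised when any check fails, so the warehouse write never runs."""


def check_row_count(rows, minimum=1):
    return CheckResult("row_count", len(rows) >= minimum, f"{len(rows)} rows")


def check_null_ratio(rows, field="price", max_ratio=0.05):
    # Assumed threshold: at most 5% of rows may lack a price.
    nulls = sum(1 for row in rows if row.get(field) is None)
    ratio = nulls / len(rows) if rows else 1.0
    return CheckResult("null_ratio", ratio <= max_ratio, f"{ratio:.1%} null {field}")


def check_category_coverage(rows, expected: Iterable[str]):
    missing = set(expected) - {row.get("category") for row in rows}
    return CheckResult("category_coverage", not missing, f"missing: {sorted(missing)}")


def run_gate(rows: list, checks: list, write: Callable[[list], None]):
    """Run every check; call write(rows) only if all of them pass."""
    results = [check(rows) for check in checks]
    failures = [result for result in results if not result.passed]
    if failures:
        raise QualityGateError("; ".join(f"{f.name} ({f.detail})" for f in failures))
    write(rows)
    return results
```

Running all checks before raising, rather than stopping at the first failure, means a single failed run reports every broken invariant at once.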

~600K products / run
124 categories tracked
6 QC checks per run

My Contribution

Built end-to-end as a solo engineer — from API contracts to dashboard.

I owned every layer: schema design, ingestion, validation, storage, and the Streamlit front-end. The challenge wasn't any single component — it was making the seams between KOSTAT and ECOS honest, so a comparison between them is actually apples-to-apples.

What I work on

  • Schema design for a sign-consistent, source-agnostic category × week store
  • Dockerized ingestion + cron scheduling for KOSTAT (weekly) and ECOS (monthly)
  • Quality-check suite: 6 invariants that block bad data from reaching the warehouse
  • Raw-payload preservation so any historical run can be reproduced from disk
  • Streamlit dashboard exposing per-category CPI-vs-retail deltas
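
One way to make the weekly retail series comparable with the monthly CPI series is to average weekly prices within each month and then compare month-over-month changes. The sketch below assumes that approach, assumes each weekly observation already carries a month label, and assumes a category-level CPI index is available; the function names and sample numbers are hypothetical, and the project may align the two sources differently.

```python
from collections import defaultdict
from statistics import mean


def monthly_retail_mean(weekly):
    """Average weekly prices per (category, month).

    weekly: iterable of (category, "YYYY-MM", price) tuples.
    """
    buckets = defaultdict(list)
    for category, month, price in weekly:
        buckets[(category, month)].append(price)
    return {key: mean(prices) for key, prices in buckets.items()}


def pct_change(current, previous):
    return (current / previous - 1.0) * 100.0


def retail_vs_cpi_delta(retail, cpi, category, month, prev_month):
    """Percentage-point gap between retail and CPI month-over-month changes."""
    retail_change = pct_change(retail[(category, month)], retail[(category, prev_month)])
    cpi_change = pct_change(cpi[(category, month)], cpi[(category, prev_month)])
    return retail_change - cpi_change


# Illustrative numbers only, not real KOSTAT or ECOS data.
weekly_prices = [
    ("eggs", "2025-01", 100.0), ("eggs", "2025-01", 100.0),
    ("eggs", "2025-02", 108.0), ("eggs", "2025-02", 112.0),
]
cpi_index = {("eggs", "2025-01"): 100.0, ("eggs", "2025-02"): 102.0}

retail = monthly_retail_mean(weekly_prices)
delta = retail_vs_cpi_delta(retail, cpi_index, "eggs", "2025-02", "2025-01")
print(f"retail minus CPI: {delta:.1f} percentage points")
```

A positive delta means retail prices in that category rose faster than the official index over the same month.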

Where It's Going

From a single-country tracker into a reusable inflation-divergence framework.

Next: backfill 5 years of historical KOSTAT data, add a forecasting layer for category-level deltas, and open up the pipeline as a public Korean inflation dataset. Longer-term, the same architecture should drop in for any country with split official-vs-retail price sources.

5 yr historical backfill
Forecast: category-level deltas
Public: open dataset