game Β· 2 min
Hot Key Horizon
Data skew as gravity: records rain into partition planets, and the hot key's planet accretes mass until it collapses into a black hole β which is roughly what your Spark stage does at 99%. You get two salts. Spend them well.
Hash partitioning is gravity: every record falls toward the planet its key hashes to. Thatβs fine until one key is 55% of the traffic. Watch the NL planet grow, pull harder, and accelerate its own accretion β skew is a feedback loop, which is why the job that was βalmost doneβ at 99% stays there for an hour.
Your only tool is the real one: salting. Hit the salt button and the hot key fans out into NLΒ·0 / NLΒ·1 / NLΒ·2, redistributing future records across partitions. Salt too late and the planet crosses its capacity horizon β collapse, stage failure, the exact error message you have seen before.
The HUD shows live skew = max/mean, the metric youβd actually check in the Spark UI. Finish the batch and your score is the balance you maintained. Best run saved to localStorage. Single HTML file, no dependencies.