Three F1 questions, one pipeline
A portfolio project about turning messy, biased sports data into fairer comparisons, reproducible data products, and long-form visual stories that a non-technical reader can still follow.
Who's really the Greatest Of All Time?
Start with the hardest comparison in the sport: driver ability across eras. The model only compares teammates, because same team, same car, same weekend is the cleanest fairness constraint Formula 1 offers.
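For a flavour of what that constraint looks like in code, here is a minimal Python sketch, assuming a hypothetical per-race results table with race_id, constructor_id, driver_id, and position columns; this is an illustration, not the project's actual schema.

```python
import pandas as pd

def teammate_pairs(results: pd.DataFrame) -> pd.DataFrame:
    """One row per teammate head-to-head: same race, same constructor, two
    different drivers. Assumes columns race_id, constructor_id, driver_id
    and position (lower = better classified finish)."""
    paired = results.merge(
        results, on=["race_id", "constructor_id"], suffixes=("_a", "_b")
    )
    # Keep each unordered pair once, and never pair a driver with themselves.
    paired = paired[paired["driver_id_a"] < paired["driver_id_b"]]
    # The only comparison the model ever uses: did A beat their own teammate?
    paired["a_beat_b"] = paired["position_a"] < paired["position_b"]
    return paired[["race_id", "constructor_id",
                   "driver_id_a", "driver_id_b", "a_beat_b"]]
```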
OK, but who's the Luckiest Of All Time?
Talent is only half the story. The companion question is who benefited most from circumstance: reliability luck, inherited positions, and safety-car timing, each measured against the other side of the same garage.
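One of those components, reliability luck, can be sketched directly from the same teammate framing: count the races where a driver was classified while the car on the other side of the garage retired. The boolean finished column below is an assumption for illustration, not the site's actual mart.

```python
import pandas as pd

def reliability_luck(results: pd.DataFrame) -> pd.Series:
    """Races per driver where they were classified but their teammate retired.
    Assumes columns race_id, constructor_id, driver_id and a hypothetical
    boolean column finished (True if classified at the flag)."""
    pairs = results.merge(
        results, on=["race_id", "constructor_id"], suffixes=("", "_tm")
    )
    pairs = pairs[pairs["driver_id"] != pairs["driver_id_tm"]]
    # "Lucky" here means: you saw the flag, your teammate did not.
    lucky = pairs[pairs["finished"] & ~pairs["finished_tm"]]
    return lucky.groupby("driver_id").size().sort_values(ascending=False)
```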
Has F1 gotten boring?
The history chapter. Have regulation resets ever changed who won? How did reliability transform the sport? Has this cleaner, safer era made races too predictable?
-
Chapter 1 · Regulation
Do rule changes actually shake up the grid?
Every new rule package promises a reset. A season-by-season view of constructor win concentration asks how often the pecking order really changed.
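As an illustration of what win concentration could mean here, a Herfindahl-style index over each season's race wins is one plausible metric; the season and constructor_id columns are assumed, and the site's own measure may be defined differently.

```python
import pandas as pd

def win_concentration(race_winners: pd.DataFrame) -> pd.Series:
    """Herfindahl-style concentration of race wins per season: 1.0 means one
    constructor won everything, lower values mean the wins were spread out.
    Assumes one row per race win with season and constructor_id columns."""
    win_shares = (
        race_winners.groupby("season")["constructor_id"]
        .value_counts(normalize=True)  # each constructor's share of that season's wins
    )
    return win_shares.pow(2).groupby(level="season").sum()
```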
-
Chapter 2 · Reliability
When did F1 cars stop breaking?
Finish rates climbed from roughly two-in-five to nineteen-in-twenty. It is the least glamorous change in the dataset and arguably the most important.
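The metric behind that claim is simple enough to sketch: the share of race entries that were classified finishers in each season. The boolean finished column is again an assumption for illustration.

```python
import pandas as pd

def finish_rate_by_season(results: pd.DataFrame) -> pd.Series:
    """Share of entries classified as finishers, per season. Assumes one row
    per driver per race, a season column, and a hypothetical boolean
    column finished (True if classified)."""
    return results.groupby("season")["finished"].mean().sort_index()
```

On the framing above, that curve moves from roughly 0.4 in the early seasons to around 0.95 today.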
-
Chapter 3 · Racecraft
Did reliability kill the overtakes?
A grid where everyone finishes is a grid where raw position change gets harder to find. This page follows the link between reliability, predictability, and spectacle.
What this case study demonstrates
The site is designed to show judgement as much as implementation: fair comparisons, reproducible data products, and methodology that stays visible instead of hiding behind the charts.
- Fairness over trivia: the central ratings hold up because the comparison design is doing the real work, not because the chart looks sophisticated.
- One repository, full traceability: every public claim on the site is backed by marts, source code, and a rebuildable data pipeline in the same repo.
- Methodology beside the answer: assumptions, caveats, and judgement calls are published in plain sight instead of being treated as an internal note.
- Presentation as part of the job: the same project owns ingestion, modelling, warehouse design, documentation, and the visual layer that turns the data into an argument.
Judgement calls and tradeoffs are written up on the methodology page — because the modelling choices matter at least as much as the final leaderboard.