13 tutorials

Baseball

MLB Statcast is the richest free data in sports. Pull it with pybaseball and build leaderboards, heatmaps and more.

Your first real Statcast pull, cached, with an exit-velocity histogram.
Baseball Beginner

Pull Your First MLB Data with pybaseball

Install pybaseball, turn on caching, and pull a week of real Statcast data. Finish with a histogram of batted-ball exit velocity to confirm the data arrived and looks sensible.

~8 min
A box plot comparing run-differential distributions between the AL and NL.
Baseball Beginner

Box Plots: Comparing Distributions Across Groups

Averages hide spread. Build a box plot to compare whole distributions at once - median, quartiles, and outliers - using American League vs. National League run differential.

~5 min
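As a rough preview of the idea, here is a minimal sketch of a two-group box plot. The run-differential numbers are made up for illustration and are not real AL or NL results; the tutorial itself may build the chart differently.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

# Made-up run differentials, for illustration only (not real league data).
al = [112, 64, 31, 8, -15, -47, -88]
nl = [95, 70, 22, 5, -20, -60, -130]

fig, ax = plt.subplots()
# One box per group: the line inside each box is the median,
# the box spans the quartiles, and the whiskers show the spread.
parts = ax.boxplot([al, nl])
ax.set_xticks([1, 2])
ax.set_xticklabels(["AL", "NL"])
ax.set_ylabel("Run differential")
```

Call `plt.show()` or `fig.savefig(...)` to view the result. Two groups with similar medians can still have very different box heights, which is the spread that an average hides.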
Wins and losses stacked into a single bar per team to show composition.
Baseball Beginner

Stacked Bar Charts: Win-Loss Composition

Stack two series into one bar to show composition rather than just totals. Build a win-loss stacked bar chart and learn when stacking clarifies a comparison and when it muddies it.

~5 min
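For a sense of how the stacking works, here is a minimal sketch that assumes matplotlib's `bottom=` argument is used to start the second series where the first ends. The team names and records are placeholders, not real standings.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

# Placeholder teams and records, for illustration only.
teams = ["Team A", "Team B", "Team C"]
wins = [95, 81, 62]
losses = [67, 81, 100]

fig, ax = plt.subplots()
win_bars = ax.bar(teams, wins, label="Wins", color="tab:blue")
# bottom=wins lifts each loss segment so it sits on top of the win segment.
loss_bars = ax.bar(teams, losses, bottom=wins, label="Losses", color="tab:gray")
ax.set_ylabel("Games")
ax.legend()
```

Because every bar here totals 162 games, the stack reads as pure composition. When totals differ between bars, the upper segments no longer share a baseline and become harder to compare.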
A run-differential line shaded green above the league average and red below it.
Baseball Beginner

fill_between: Shade Above and Below a Baseline

fill_between() paints the area between two lines, and its where= argument masks the fill so you can color above-average and below-average regions differently in a single chart - with interpolate=True closing the wedge exactly at the crossover.

~5 min
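A minimal sketch of the two-color shading, assuming a flat baseline of 0 as the league average and made-up run-differential values. Each `fill_between()` call handles one side of the baseline through its `where=` mask.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Made-up run differentials by game number, for illustration only.
games = np.arange(1, 11)
run_diff = np.array([4, 7, 3, -2, -6, -1, 5, 9, 2, -3], dtype=float)
baseline = np.zeros_like(run_diff)  # assumed league average of 0

fig, ax = plt.subplots()
ax.plot(games, run_diff, color="black")
ax.axhline(0, color="gray", linewidth=0.8)

# where= restricts each fill to one side of the baseline;
# interpolate=True extends the fill to the exact crossing point
# instead of stopping at the nearest data point.
ax.fill_between(games, run_diff, baseline, where=run_diff >= baseline,
                interpolate=True, color="green", alpha=0.4)
ax.fill_between(games, run_diff, baseline, where=run_diff < baseline,
                interpolate=True, color="red", alpha=0.4)
ax.set_xlabel("Game")
ax.set_ylabel("Run differential")
```

Call `plt.show()` or `fig.savefig(...)` to view the result. Without `interpolate=True`, small unshaded gaps appear wherever the line crosses the baseline between two data points.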
A pie and a donut of each division's share of league-wide runs, plus when a bar beats both.
Baseball Beginner

Pie and Donut Charts: Showing Part-to-Whole (and When Not To)

A pie answers one question well: what share of the whole is each slice? Build a pie and a donut of each division's share of runs scored, then learn the situations, when values are close together or there are many of them, where a humble bar chart is the honest choice.

~5 min
A pitch-location heatmap for one pitcher with the strike zone drawn on top.
Baseball Intermediate

Make a Pitch-Location Heatmap in Python

Use a single pitcher's Statcast data to build a 2-D location heatmap, draw the strike zone from the catcher's view, and read what the hot spots tell you.

~8 min
A hitter's batted balls plotted on a baseball field, by outcome.
Baseball Intermediate

Build a Hitter's Spray Chart

Turn Statcast hit-coordinate data into a spray chart on a field you draw yourself, colored by outcome, to see whether a hitter pulls the ball or uses the whole field.

~9 min
A percentile-ranked bar chart of every team, colored into four quartile tiers.
Baseball Intermediate

Percentile Ranks and Tiers: Locate a Team in the Distribution

A raw rank says '8th of 30'; a percentile rank says '77th percentile' - a position between 0 and 1 in the distribution that travels across pools of different sizes. Use rank(pct=True) and qcut() to score teams and bin them into tiers.

~5 min
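A minimal sketch of the scoring and binning step, using pandas `rank(pct=True)` and `qcut()`. The team names, run differentials, and tier labels are hypothetical, chosen only to show the mechanics.

```python
import pandas as pd

# Hypothetical teams and run differentials, for illustration only.
teams = pd.DataFrame({
    "team": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "run_diff": [120, 85, 40, 10, -5, -30, -75, -110],
})

# rank(pct=True) divides each rank by the number of teams, so the
# best team scores 1.0 and the worst scores 1/n.
teams["pct_rank"] = teams["run_diff"].rank(pct=True)

# qcut() splits the scores into four equal-sized bins (quartile tiers).
# The tier labels are placeholders; any four names would work.
teams["tier"] = pd.qcut(
    teams["pct_rank"], q=4,
    labels=["Bottom", "Lower-mid", "Upper-mid", "Top"],
)
print(teams)
```

With eight distinct values, each tier ends up holding two teams. Because the score is a fraction of the pool rather than a raw position, the same code applies unchanged to a 30-team league.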