Continuing my pattern of enjoying the festivities around the All-Star Game the most, I've always loved the Home Run Derby. As a kid, when I'd come home from school, the old reruns of Home Run Derby from the 1960s were on. Born in 1978, I had missed out on the careers of players like Hank Aaron, Willie Mays and Mickey Mantle, and the reruns let me see a lot of them. This was before ESPN Classic launched and before the internet was really a thing.
Recently, in a discussion, I was asked, "Dan, can you predict a Home Run Derby?" My first answer was an extremely helpful "Dunno," but it got me thinking about whether we could predict a Home Run Derby, using both the matchups and results. We don't have a lot of data, but modeling these things always gives an interesting glimpse into how they work.
So, I found variables for the three most recent Home Run Derbys -- I'd love more years, but the formats shift a lot and I wanted to see first if Statcast data would be helpful -- and started from there. A few variables, some surprising, some not, came up as useful predictors of Derby performance. Sixteen of the 21 matchups from 2015-2017 were won by the player with the higher average exit velocity, and the margin tended to be correlated with the players' difference in the rankings. Also helpful were things like home runs per ball hit, and even recent performance seemed to have more predictive value in a Derby than in real life.
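For a rough sense of what 16 of 21 means, here is a minimal sketch of the arithmetic. It is not the model used for the predictions. The coin-flip baseline, the independence of matchups, and the function names are all assumptions made for illustration.

```python
from math import comb


def win_rate(wins: int, matchups: int) -> float:
    """Share of matchups won by the player with the higher exit velocity."""
    return wins / matchups


def binomial_upper_tail(wins: int, matchups: int, p: float = 0.5) -> float:
    """Chance of at least `wins` successes in `matchups` independent trials.

    Assumption: each matchup is an independent coin flip with probability p.
    Real Derby matchups are not independent (winners advance), so this is
    only a sanity check.
    """
    return sum(
        comb(matchups, k) * p**k * (1 - p) ** (matchups - k)
        for k in range(wins, matchups + 1)
    )


# Figures from the text: 16 of the 21 matchups from 2015-2017.
wins, matchups = 16, 21
rate = win_rate(wins, matchups)
tail = binomial_upper_tail(wins, matchups)

print(f"Higher exit velocity won {rate:.1%} of matchups")
print(f"Chance of 16 or more wins from coin flips: {tail:.2%}")
```

That works out to about 76 percent of matchups, a split that pure coin flips would match or beat only a little over 1 percent of the time. With just 21 matchups, that is a hint rather than proof.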
I'll spare the nitty-gritty ins-and-outs of dimensionality reduction and go right to the results, plugged in for this year's Home Run Derby, matchup by matchup.