Getting Into the Weeds With Bat Tracking
Like a lot of other nerds, I've spent a lot of time over the past few weeks slicing and dicing Baseball Savant's new bat tracking data. And like a lot of other nerds, I'm not entirely sure how we'll end up using this wealth of new information. It will take more time, more data, and more brain power to extract whatever sweeping new truths it may hold. Next week, I'll write about bat tracking data in a more focused way. A few things strike me as really interesting: not new information, but the ways bat tracking data can put solid numbers on things we already knew. In this article, I'll be a little more scattershot. I'd like to walk you through how I've processed all the information that has come out over the past few weeks.
First, bat tracking will give us new stats that stabilize faster than existing ones, since that's how granular metrics that separate underlying skills from results tend to work. In small samples, exit velocity turned out to be a better predictor of overall batting performance than wRC+ or wOBA. Now we have swing speed, which in small samples turns out to be a better predictor of exit velocity than exit velocity itself. To check, I pulled the data from the first week of bat tracking, April 3 through April 9, and compared it to each player's total numbers for the season. I excluded any player with fewer than five plate appearances in the first week or fewer than 100 PA over the entire season, leaving me with a sample of 295 players. It wasn't close. Full-season exit velocity had a stronger correlation with first-week swing speed (R = .60) than with first-week exit velocity (R = .40). First-week swing speed also predicted full-season strikeout rate better than first-week strikeout rate did (R = .66 for swing speed, compared to R = .46 for strikeout rate). If, after week one, you want to know who's going to hit the ball hard for the rest of the season, don't look at exit velocity. Look at swing speed.
That said, I'm not sure this particular way of looking at bat tracking data will help anyone. We may be slicing things a little too finely here. After all, swing speed doesn't correlate strongly with overall success at the plate; its correlation is much weaker than exit velocity's. Going back to our first-week stats, swing speed has a slightly lower correlation with full-season wOBA (R = .19) than either exit velocity or batting average (R = .21 for both). It can tell us very quickly how hard a player is capable of hitting the ball, but it is much slower to tell us how well he actually hits.
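For anyone who wants to try this kind of comparison, here is a minimal sketch of how it might be set up. It is not the code behind the numbers above: the field names (`week1_pa`, `season_pa`, `week1_swing_speed`, `week1_ev`, `season_ev`) are hypothetical, and the rows are made up purely to show the shape of the data. The only things taken from the text are the playing-time cutoffs (at least five first-week PA, at least 100 season PA) and the idea of correlating first-week stats with full-season outcomes.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def qualified(players, min_week1_pa=5, min_season_pa=100):
    """Keep players who meet both playing-time cutoffs described in the text."""
    return [
        p for p in players
        if p["week1_pa"] >= min_week1_pa and p["season_pa"] >= min_season_pa
    ]


def compare_predictors(players, target, predictors):
    """Correlate each first-week predictor with one full-season target stat."""
    sample = qualified(players)
    ys = [p[target] for p in sample]
    return {name: pearson_r([p[name] for p in sample], ys) for name in predictors}


# Made-up rows for illustration only; these are not real players or real stats.
players = [
    {"week1_pa": 22, "season_pa": 410, "week1_swing_speed": 74.1, "week1_ev": 91.0, "season_ev": 91.5},
    {"week1_pa": 18, "season_pa": 350, "week1_swing_speed": 69.8, "week1_ev": 90.2, "season_ev": 87.9},
    {"week1_pa": 25, "season_pa": 480, "week1_swing_speed": 76.3, "week1_ev": 88.7, "season_ev": 92.8},
    {"week1_pa": 20, "season_pa": 300, "week1_swing_speed": 71.5, "week1_ev": 86.4, "season_ev": 89.0},
    {"week1_pa": 12, "season_pa": 220, "week1_swing_speed": 72.9, "week1_ev": 92.1, "season_ev": 90.1},
    {"week1_pa": 3, "season_pa": 400, "week1_swing_speed": 75.0, "week1_ev": 95.0, "season_ev": 90.0},  # too few week-one PA
    {"week1_pa": 15, "season_pa": 60, "week1_swing_speed": 68.0, "week1_ev": 84.0, "season_ev": 85.0},  # too few season PA
]

results = compare_predictors(players, "season_ev", ["week1_swing_speed", "week1_ev"])
for name, r in results.items():
    print(f"{name} vs. season_ev: R = {r:.2f}")
```

Swapping `"season_ev"` for a hypothetical `season_woba` field would give the second comparison, where swing speed looks much less impressive.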
Second, I've heard smart people say that this data could help prevent injuries. If fatigue, tightness, or soreness is keeping you from swinging as hard as you normally would, a careful analyst could see it in the numbers and give you a break before you hurt yourself. While this makes a certain amount of sense, I'm skeptical at the moment. People have been trying to do the same thing with pitchers for years, monitoring stride length, extension, release point, velocity, spin rate, and rest for indicators of fatigue or compensation. As far as I know, no one has cracked that code yet. For some quick anecdotal research, I looked at two prominent players with recent injuries: Ronald Acuña Jr. and Steven Kwan. Not that this means anything, especially with two lower-body injuries, but both Acuña and Kwan were actually swinging a little harder against four-seamers in the week before they were injured than they had earlier in the season.
So far, my biggest takeaway is an obvious one: Bat tracking is incredibly complex. Many factors affect swing speed and swing length, and if you're trying to learn anything, you need to choose your variables very carefully to make sure you're comparing apples to apples. If you want to analyze swing speed, you need to account for pitch type. As Ben Clemens has discussed, faster pitches tend to produce slower swings. Of course, swing speed is also related to swing length, and swing length is related to pitch location, and location is related to pitch type, and now we're back where we started. Since the sweet spot of the bat usually starts somewhere above and behind the hitter's back shoulder, it has to travel farther to reach a slider that is low or away than to reach an elevated fastball. If you swing at an inside pitch, you are more likely to meet the ball out in front, which means a longer swing. So a player who chases a lot of breaking balls is likely to have a longer swing, and so is a right-handed Astro who makes a living pulling balls into the Crawford Boxes. One of those is bad, and the other is part of the reason Jose Altuve and Alex Bregman are perennial All-Stars.
Here is an example of the struggle to find a representative sample. While smarter people were pondering the things I just told you, I was wondering about the strength of the relationship between swing length and a batter's height. After all, there's a reason we expect bigger players with longer levers to hit for more power. If you look at Baseball Savant's bat tracking leaderboard, you'll see that Oneil Cruz has a long swing, which isn't surprising since he's one of the tallest people in the game. However, when you drill down to find a more representative sample, things change.
Let's say you're only looking at competitive swings at medium fastballs that result in squared-up balls. We've shrunk our sample, but we've done our best to control for the type, speed, and location of the pitch, as well as the depth of contact. Focusing on these pitches, it turns out that when he's not chasing breaking balls, Cruz has a surprisingly short swing, one that is below the major league average in this split. However, this may not be the right way to look at things. Perhaps Cruz's numbers only look that way because we've discarded his many whiffs. Maybe we should look only at whiffs. After all, if we look at whiffs, there is no need to account for depth of contact, because there is no contact. That's one big variable removed. Looking at whiffs on medium fastballs, Cruz's swing length was no longer below average, although it was still short for such a tall player.
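To make the filtering concrete, here is a rough sketch of how a controlled split like this might be built from swing-level data. Everything in it is an assumption for illustration: the field names, the outcome labels, the choice of four-seamers only, the 93 to 95 mph window standing in for "medium" fastballs, and the two invented batters. It does not control for location, and it is not the query used for the results described above.

```python
from collections import defaultdict
from statistics import mean


def controlled_sample(swings, outcome, pitch_types=("FF",), speed_range=(93.0, 95.0)):
    """Keep competitive swings at the chosen pitch types and speeds with one outcome."""
    low, high = speed_range
    return [
        s for s in swings
        if s["competitive"]
        and s["pitch_type"] in pitch_types
        and low <= s["pitch_speed"] <= high
        and s["outcome"] == outcome
    ]


def swing_length_vs_sample_average(swings, outcome):
    """Each batter's mean swing length minus the mean of every swing in the split."""
    sample = controlled_sample(swings, outcome)
    if not sample:
        return {}
    sample_average = mean(s["swing_length"] for s in sample)
    by_batter = defaultdict(list)
    for s in sample:
        by_batter[s["batter"]].append(s["swing_length"])
    return {b: mean(lengths) - sample_average for b, lengths in by_batter.items()}


# Invented swings for two hypothetical batters; swing_length is in feet.
swings = [
    {"batter": "tall", "pitch_type": "FF", "pitch_speed": 94.0, "competitive": True, "outcome": "squared_up", "swing_length": 7.0},
    {"batter": "tall", "pitch_type": "FF", "pitch_speed": 94.5, "competitive": True, "outcome": "whiff", "swing_length": 7.6},
    {"batter": "tall", "pitch_type": "SL", "pitch_speed": 86.0, "competitive": True, "outcome": "whiff", "swing_length": 8.4},
    {"batter": "short", "pitch_type": "FF", "pitch_speed": 93.5, "competitive": True, "outcome": "squared_up", "swing_length": 7.2},
    {"batter": "short", "pitch_type": "FF", "pitch_speed": 94.2, "competitive": True, "outcome": "whiff", "swing_length": 7.4},
    {"batter": "short", "pitch_type": "FF", "pitch_speed": 98.0, "competitive": True, "outcome": "squared_up", "swing_length": 6.9},
    {"batter": "short", "pitch_type": "FF", "pitch_speed": 94.0, "competitive": False, "outcome": "whiff", "swing_length": 5.0},
]

for outcome in ("squared_up", "whiff"):
    diffs = swing_length_vs_sample_average(swings, outcome)
    for batter, diff in sorted(diffs.items()):
        print(f"{outcome:>10} {batter:>5}: {diff:+.2f} ft vs. sample average")
```

In this toy data the same batter lands below the sample average in one split and above it in the other, which is the kind of flip that makes choosing the "right" sample so hard.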
No matter how I sliced it, I usually found that height and swing length had a correlation coefficient between .24 and .35. However, as with most of my delving into bat tracking data, I'm not entirely sure how to make all the parts come together into a cohesive whole. In this example, it made a lot of sense to look only at whiffs, but at the same time, it seems ridiculous to judge a player's swing, which shows how much damage he can do when he makes contact, by throwing out all the swings where he actually makes contact!
I suspect that bat tracking will be put to use in one way very quickly. We've all read articles about teams telling their pitchers to trust a certain pitch because it's better than they realize. Now they will be able to point to a specific number. Let's say you're the Rays and you want Garrett Cleavinger to throw his four-seamer more often. He might be able to buy in if you tell him that hitters are swinging three ticks slower against it than against his cutter and sinker. Whiffs are great, but knowing that hitters can't even put a good swing on a pitch can be a powerful motivator.
As I said above, these are my first takeaways as I sift through the data and read what smart baseball analysts have written on the topic. I'll be back with more next week, and for now, I'll keep digging.