Know Thy Data
An Incredible RBI Season
Dante Bichette had 133 RBIs in 1999.
Bichette is a former major league baseball player. He spent time with five different franchises, but his longest tenure was with the Colorado Rockies, where he played outfield during the 1999 season.
RBI stands for “run batted in,” and is a baseball statistic intended to measure the number of runs an offensive player is responsible for producing. For instance, if there’s a player on third base, and the batter hits a single, allowing the player on third to score, then the batter gets one RBI.
In 1999, Bichette had 133. This is a large number. It was the eighth-highest RBI total in all of major league baseball that year. It’s more than Giancarlo Stanton had in 2017, when he led the majors with 132. In fact, 133 would have been enough to lead the league in each of the last 4 seasons, and 6 of the last 8. Bichette’s 1999 season is the 170th greatest individual RBI season since 1900, which might not seem like much, but consider that there have been over 15,000 individual seasons in that period of time.
The RBI tells us that Dante Bichette produced a lot of runs in the 1999 season, and, after all, the batter’s job is to produce runs. Given that information, it might be reasonable to think that Dante Bichette was a good, perhaps even great, baseball player in 1999.
Taking a Second Look at the Numbers
With all due respect to Mr. Bichette and the Colorado Rockies organization, Bichette was neither great nor even good in 1999. Bichette was a bad major league baseball player in 1999 and produced negative value for his team. The Rockies would have won more games in the 1999 season if they had replaced Bichette with an outfielder from their AAA Minor League affiliate (at the time, the Colorado Springs Sky Sox), or just about any other professional baseball player.
How is that possible? If Bichette’s RBI total, a number indicative of his offensive production, was elite in 1999, how was he an objectively bad professional outfielder?
It goes without saying that the threshold for playing major league baseball is incomprehensibly high, so calling Bichette a “bad” baseball player in any absolute sense would be untrue. To judge a player’s value in practical terms, baseball statisticians invented the concept of the “replacement-level player.” Such a player is readily available to any club that wants him and, in real terms, can be described as a player at the highest level of a franchise’s minor league organization. This player can be called up to the major league club to “replace” another player with little or no cost to the organization. If a player on the team is performing below this level, he shouldn’t be on the team, because he can be readily replaced at a lower cost.
Bichette was performing so far below the replacement level in 1999 that he cost the Colorado Rockies the equivalent of more than 2 wins over the course of the season.
There are many reasons for this, but let’s talk about the RBI first. The RBI is a statistic with good intentions, but those intentions cloud the truth that the metric is not particularly effective at measuring value. Teams win games by scoring runs, so it would make sense to identify not only the players who score those runs, but also the players whose performance at the plate allows them to score. The problem is that, unless the batter hits a home run, getting an RBI requires context that is outside of the batter’s control: someone has to be on base.
Bichette batted with a lot of players on base during the 1999 season. The on-base ability of his teammates allowed him to balloon his RBI totals, in spite of being roughly league average at the plate. Bichette’s on-base percentage was below league average. His home run total and slugging percentage were high, but he played his home games in Denver, where the thin air notoriously inflates power numbers.
However, teams also win games by preventing runs, something that Bichette did extraordinarily poorly. By defensive metrics, Bichette was the worst defender in all of major league baseball in 1999.
The end result is a complete picture of Bichette’s 1999 season, which stands in stark contrast to the picture painted by the RBI alone. The RBI total itself is accurate, but it’s a perfect example of why data requires quality and governance in addition to accuracy.
Do You Know All the Numbers?
Baseball is a game, but it’s also a business. They didn’t know it at the time, but Colorado paid Bichette a veteran’s salary for production that they perhaps could have gotten from someone making the league minimum. Think about your business. Think about all the roles, projects, programs, investments. Do you think that all of those entities are delivering positive value? Do you really know how much value every decision is delivering?
This was a challenge faced by one of our clients. The client had a project that was great in theory. It was a good idea, and good ideas deliver value, right? As demonstrated by the RBI above, knowing a number isn’t enough. Trusting a number requires an understanding of what it means and from where it originates. Otherwise, it’s just a number, and it can make you look foolish. In our client’s case, the number that they were using seemed foolproof.
The client had a Short Messaging Service (SMS) program, where a text message would go out to their customers, with the hope that the message would either prevent the customer from making a call that would cost the client money (e.g. the message would contain an appointment reminder) or generate a call that produces revenue (e.g. the customer calls to make a payment). This sounds like a no-brainer. The concept is solid enough that one could be forgiven for assuming it can’t miss, much like the RBI.
Kenway doesn’t believe in making such assumptions, so we performed a deep analysis of the data, merging SMS databases with Interactive Voice Response (IVR) call system databases, looking for call prevention and/or generation. Kenway evaluated a cost calculator that combined the labor and materials costs associated with the desired benefits of the SMS program, along with the efficacy of the initiative as a whole. In doing so, we could understand the data and assign precise cost-based values to every message sent from client to customer. Compare this to baseball, where you can combine context, home-field advantage, performance in other facets of the game, etc., to create a single metric that gives a complete view of a player’s overall value to his team, rather than using one number (e.g. RBI) that sounds good in theory.
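For readers who want to see the general shape of this kind of analysis, here is a minimal sketch in Python. It is not the calculator used in the engagement. Every field name, cost figure, and the 48-hour matching window is a hypothetical placeholder chosen for illustration. The sketch matches each message to any calls the same customer made shortly afterward, then assigns the message a net value in cents.

```python
from datetime import datetime, timedelta

# All figures below are hypothetical placeholders, in cents.
COST_PER_MESSAGE = 1            # cost to send one text
COST_PER_AVOIDED_CALL = 500     # labor cost of one inbound call that was avoided
REVENUE_PER_PAYMENT_CALL = 200  # value of one call that results in a payment
WINDOW = timedelta(hours=48)    # assumed window for linking a call to a message


def message_value(message, calls):
    """Return the net value (in cents) of one message, given the call log."""
    window_end = message["sent_at"] + WINDOW
    followups = [
        c for c in calls
        if c["customer_id"] == message["customer_id"]
        and message["sent_at"] <= c["called_at"] <= window_end
    ]
    value = -COST_PER_MESSAGE
    if message["kind"] == "reminder":
        # Naive assumption: no follow-up call means the call was prevented.
        if not followups:
            value += COST_PER_AVOIDED_CALL
    elif message["kind"] == "payment":
        if any(c["purpose"] == "payment" for c in followups):
            value += REVENUE_PER_PAYMENT_CALL
    return value


# Hypothetical sample records standing in for the SMS and IVR databases.
sent = datetime(2018, 3, 1, 9, 0)
messages = [
    {"customer_id": "A", "kind": "reminder", "sent_at": sent},
    {"customer_id": "B", "kind": "reminder", "sent_at": sent},
    {"customer_id": "C", "kind": "payment", "sent_at": sent},
]
calls = [
    {"customer_id": "B", "called_at": sent + timedelta(hours=3), "purpose": "appointment"},
    {"customer_id": "C", "called_at": sent + timedelta(hours=5), "purpose": "payment"},
]

values = [message_value(m, calls) for m in messages]
total_value = sum(values)
```

Treating the absence of a call as a prevented call almost certainly overstates the benefit, since many customers would never have called anyway. A real analysis would compare against a baseline, such as the call rate of similar customers who received no message.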
Kenway’s analysis confirmed some of the organization’s return-on-investment assumptions, but it also shed light on initiatives that may not have been providing the benefit needed to justify their continuation. Beyond performing this analysis, we also implemented a new process that allows the client to continue performing audits on its own for as long as the project continues. The end result is an organization that runs more efficiently and has more accurate and detailed insight into the projects that it is funding.
As for the Colorado Rockies, they lost 90 games and finished 12.5 games out of the playoffs. Short of replacing Dante Bichette with the ghost of Babe Ruth, no recommendation was going to save them.
If you have numbers, but you’re not sure about their meaning or effectiveness, we would like to hear from you at firstname.lastname@example.org.