Big Data Baseball


  As early as 2004, BIS sold data-based recommendations to major league teams to optimize defensive positioning. While the majority of the industry was curious, few of the recommendations were reaching the field. Why?

  “Because when it doesn’t work, it really looks bad,” Dewan said. “It goes against conventional wisdom of playing infielders where they have always played for so many years. It’s hard to break tradition. But by doing the analytics you can see that it has value. You can see the runs you save when the shift is on, the percent of ground balls that are fielded when the shift is on.”

  When a team placed three infielders on one side of second base and a hitter beat the shift, intentionally or unintentionally, to the lightly defended opposite side of the infield, it looked really bad and it upset pitchers. It made infielders question why the heck they were shifting, and it tested the conviction of coaching staffs.

  Still, here was a glaring inefficiency to be tapped into. In October 2012 teams were still only converting 30 percent of batted balls into outs, a rate relatively unchanged since the beginnings of the pro game. By 2012 the Pirates had only slightly increased their use of shifts and remained below average in defense. According to defensive runs saved, the Pirates were -77 in 2010, -29 in 2011, and -25 in 2012.

  The Pirates fielded mostly athletically challenged defenders in traditional alignments, and the numbers demonstrated a stunning inefficiency. They had shifted their infield defense only 84 times in 2010 and 87 times in 2011. As late as 2012, Hurdle was still not buying into wholesale shifting, but perhaps an accumulation of data and indisputable evidence would sway him.

  When Huntington first entered the limestone façade of the Pirates front office, which rests just beyond the left-field bleachers at PNC Park, there was a glaring absence. Four years after the publication of Moneyball, fourteen years after the release of the Mosaic Web browser, which popularized the Internet, and some thirty years after personal computers began to proliferate and Bill James began publishing his Baseball Abstracts, the Pirates were still in the digital dark ages, lacking an in-house analytics department and proprietary database. Not one employee or proprietary system was devoted to analyzing or processing data.

  The metrics used by the Oakland A’s in Moneyball were relatively rudimentary by today’s standards. The A’s took advantage of the industry’s not properly evaluating on-base percentage, a stat that was easily found on a player’s MLB.com profile page or a college prospect’s Division I athletic site. But now, because of new pitch-tracking technology, computing power, and detailed-data providers such as BIS, baseball was entering an entirely different age, with the amount of data growing exponentially. If a club fell behind in the data-analyzing game, it would be exponentially harder to catch up.

  With so much new data, from thousands upon thousands of batted balls to hundreds of thousands of pitch outcomes, locations, and speeds, the information could not be so easily quantified, processed, or made sense of. It required computer science experts and mathematical minds who could create algorithms and were fluent in database programming language. Analytics-savvy front offices wanted data as raw as possible, so they could spin it into their own proprietary measurements, values, and advantages.

  One of Huntington’s first priorities was to find a data architect to build a database. During his time with the Cleveland Indians the team had created DiamondView, one of the game’s first and most comprehensive databases. With a few mouse clicks the Indians’ baseball operations staff could retrieve statistical trends and projections, scouting reports, injury history, and contract status on thousands of professional players. To create such a system Huntington first had to find someone who could build it. He found Dan Fox.

  * * *

  You would not know Dan Fox was a formidable force in the Pirates front office if you brushed past him during lunch hour on Pittsburgh’s North Shore, where PNC Park is located on the banks of the Allegheny River. Wearing wire-rimmed Oakley glasses and dressed in nondescript khakis and golf shirts, he is mild-mannered and speaks clearly, concisely, and quietly. His most distinguishing characteristics are his height, his lanky frame rising well over six feet, and that he shaves his head completely bald.

  Fox was born in Davenport, Iowa, in 1968 and raised in Durant, Iowa, a small, idyllic Midwestern town that’s a square-mile grid of ten avenues by ten streets and surrounded by cornfields. While Fox’s mother tried to give her two sons, Dan and his older brother, David, a well-rounded education and a childhood rich with experience, from school plays to participating in the band, the boys were obsessed with baseball. They organized games at the neighborhood park, watched the Cubs on Sundays at their grandparents’ home, and, like Dewan, spent hours upon hours playing Strat-O-Matic. Like Dewan, Fox was influenced by Bill James’s writing and research on finding undervalued players and creative strategies.

  The first thing that intrigued Dan Fox was James’s examination of left-right splits, how a player performed against left-handers versus right-handers. You couldn’t find this information anywhere, it didn’t exist, and nobody published it. So James did it himself. He went through The Sporting News and calculated the left-right splits and some other splits, such as home-road and day-night, and published them in his Baseball Abstracts. Fox thought it was revolutionary and was astounded and fascinated that someone had identified a new statistical subgroup upon which important decisions could be made.

  Fox’s other great interest besides baseball was the personal computer, and he had early access to one. His father, a banker, had purchased an Osborne personal computer in the early 1980s. The Osborne, one of the first portable personal computers, weighed twenty-three pounds and was about the size of a suitcase. On that machine Dan and his brother learned how to write computer code and were soon using it to analyze Strat-O-Matic cards.

  Dan attended Iowa State University, where he majored in computer science. After graduating he worked at Chevron and then in the mid-1990s took a job with Quilogy, a Kansas City–based consulting company. He quickly became known not only for his sharp mind and smart presentations but also for his teaching ability. His being articulate and able to process and make sense of numbers gave him a uniquely valuable skill set. His brother, who runs the analytics department for AMC Theatres, attended several of Dan’s seminars on database technology. He was surprised at how simply yet effectively Fox explained complex ideas and applications to those who might otherwise be overwhelmed.

  “People sometimes see IT guys as really down deep in the numbers, but he has a way of getting a thousand-foot view,” David told the Pittsburgh Tribune-Review. “That’s what really helps you communicate. He can grasp the big picture as well as the details.”

  Those who have worked with and know Dan Fox say that he acted as a sort of translator for customers. He could take all the jargon and the complexity of the tech side and explain it in straightforward terms and relatable analogies. His peers simply did not see many people in the tech industry who had his communication ability.

  Fox understood what one of the first computer programmers, Grace Hopper, knew to be true. Said Hopper in an interview recounted in The Innovators, “It was no use trying to learn math unless [you] could communicate it with people.”

  Noting this innate ability and how Fox loved to play around with data and communicate his findings, a coworker suggested he try blogging about software development. Intrigued, Fox created a Web page in 2003 and wrote two blog entries on software development before authoring a piece on baseball. After that he never wrote another blog entry about software development.

  The blog turned into a baseball-first hobby and attracted attention. In 2003, the Web site TheHardballTimes.com, a hobbyist site, offered Fox a regular writing position. While it didn’t pay, it broadened his exposure. The platform allowed Fox to impress a fringe audience of smart baseball fanatics outside the game. By 2006, Fox, a devout Christian, had moved his family to Colorado, where he began working as a data architect for Compassion International, a child-advocacy ministry. Fox helped streamline disconnected systems, and the organization grew from $300 million to $800 million per year in donations during Fox’s tenure. Then BaseballProspectus.com writer Will Carroll approached Fox and offered him a paid position with the site, one of the finest and most influential sabermetrics Web sites.

  Fox titled his regular Baseball Prospectus column “Schrödinger’s Bat,” a nod to physicist Erwin Schrödinger’s “Schrödinger’s cat” thought experiment about the conflict between behavior of matter at the particle level versus the behavior of matter observable to the human eye. Fox explained the goal of his writing in an article titled “Wins and the Quantum.”

  “[Schrödinger] made people think deeply about what they knew or thought they knew about the nature of reality itself,” Fox wrote. “And while I’m not pretending that baseball has anything profound to say about such matters (it is, after all, just entertainment), I do hope that through this column, at least now and then, we can devise clever experiments that put to the test both conventional and sabermetric wisdom and help us think more deeply about our shared distraction.”

  Fox wrote a hundred articles for the Web site, focusing on measuring what up until that point had not been accurately quantified. “[Bill James] commented in one of the Baseball Abstracts once that everything we need to measure in baserunning is already known, it’s just that people haven’t gone and quantified it,” said Fox, who then attempted to quantify baserunning. “Then I wrote a defensive system to work with play-by-play data and wrote a bunch of articles about that.”

  BaseballProspectus.com still uses Fox’s formula to evaluate baserunning. He used pitch-tracking data to visualize the three-dimensional shape of Barry Zito’s and Rich Hill’s big-bending curveballs and Derek Lowe’s and Roy Halladay’s sinking fastballs.

  On January 10, 2008, from his Colorado home, Fox investigated defensive shifting theory in an article for BaseballProspectus.com titled “Getting Shifty.” For his project, Fox employed Retrosheet play-by-play data. University of Delaware biology professor David Smith began Retrosheet in 1989 as an effort to capture all historical play-by-play data, including data recorded prior to Project Scoresheet, which began tracking more detailed play-by-play data in the 1984 season. Fox examined batted-ball distribution for all left-handed major league hitters from 1956 to 1960. He found left-handed hitters were roughly twice as likely to pull the ball to right field (48 percent of batted balls) as they were to hit to center (24 percent) or left (28 percent). He found Yankees star Roger Maris was the most extreme pull-hitting lefty of the era, batting 82 percent of balls to center or right, yet Maris was never known to have been shifted upon.
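  The arithmetic behind a batted-ball distribution like this one is simple enough to sketch. The Python below is only an illustration, not Fox's code or Retrosheet's format: the function names, the "left"/"center"/"right" labels, and the hundred-ball sample are all assumptions, with the sample built by hand to match the league-wide split quoted above.

```python
from collections import Counter

FIELDS = ("left", "center", "right")  # assumed labels, not a Retrosheet encoding


def spray_distribution(fields):
    """Return the fraction of batted balls hit to each field."""
    counts = Counter(fields)
    total = sum(counts[f] for f in FIELDS)
    if total == 0:
        return {f: 0.0 for f in FIELDS}
    return {f: counts[f] / total for f in FIELDS}


def pull_share(fields, bats):
    """Fraction of balls hit to the pull side: right field for a
    left-handed batter ("L"), left field for a right-handed one ("R")."""
    pull_field = "right" if bats == "L" else "left"
    return spray_distribution(fields)[pull_field]


# Hypothetical sample: 100 balls arranged to mirror the 48/24/28 split
# reported for left-handed hitters in the text (not real play-by-play data).
sample = ["right"] * 48 + ["center"] * 24 + ["left"] * 28

dist = spray_distribution(sample)
lefty_pull = pull_share(sample, "L")

for field in FIELDS:
    print(f"{field:>6}: {dist[field]:.0%}")
```

  Run against a real play-by-play file, the same counting would be done hitter by hitter, which is how an extreme pull hitter would stand out from the league-wide split.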

  On January 24, 2008, Fox unveiled a defensive evaluation system he called Simple Fielding Runs Version 1.0, in which he created a defensive metric to measure individual player performance by spinning thousands of data points from Retrosheet play-by-plays into a single number to evaluate each major league player’s defensive performance. Wrote Fox of his defensive system, which he created in a few hours, “For the software developer in me, the interest in projects like these not only lies in the results (just how bad a fielder was Dick Allen?) but the process (actually the code) through which those results are generated.”

  Now those inside the industry were beginning to take notice. Three months later, Fox’s next job opportunity came from inside the game. The call came from the Pirates, who had been intrigued with Fox’s work at BaseballProspectus.com. Fox had spoken briefly with several teams, but this was his first serious phone call. Huntington wanted Fox as an analyst. He appreciated the way Fox thought about the game and how he quantified things that had previously never been measured. Huntington liked not only Fox’s mind but the way he spoke and communicated. Huntington wanted to marry objective thinking with subjective opinions from coaches and scouts. But the limited resources of the Pirates meant they could only hire one full-time data-related employee in 2008. Fox would have to take on both roles: that of data analyst and system architect. He would have to build the data-gathering and organizational structure he required to do the analysis and investigations that fascinated him. He agreed to take on the challenge, coming aboard early in 2008.

  Fox would have to play catch-up to build such a system from scratch. The first year, 90 percent of his time was dedicated to building the club’s database, which was called MITT, an acronym for Managing Information, Tools and Talent. He had to learn where all the data he wanted originated from, purchase the rights to use the data, write software, and integrate all of these elements into a system. Only then could he begin asking the interesting questions and enjoy the pursuit of answers.

  But Fox was not the only Pirates employee interested in asking questions and challenging tradition. In 2008, minor league infield instructor Perry Hill had begun his own experiment in altering defensive alignment. Based upon his experiences, Hill thought infielders should play hitters more aggressively toward their pull-side. He began positioning minor league shortstops and second basemen deeper in the hole, meaning shortstops played nearer to third base on the left side of the infield against right-handed hitters, and second basemen nearer to first base on the right side of the infield against left-handed hitters. Hill insisted the alignment be uniform across the entire minor league system of the Pirates, and to ensure this standard throughout, he pounded eight pegs made of sawed-off portions of PVC pipe into the ground in the infields of the Pirates’ minor league stadiums. The tops of the pipes were at ground level and were visible only if you knew where to look. These markers would guide the four infielders into the new positions against right- and left-handed batters, and these positions were not to be deviated from.

  Kyle Stark had granted Hill this freedom to explore. Never shy about considering an out-of-the-box idea, Stark found himself in the midst of controversy late in the 2012 season when he was criticized for his part in having minor league prospects engage in some Navy SEAL–style training exercises that left one of the club’s top prospects with a minor injury. Ownership condemned the military-style practices. Then on September 20, in the midst of the Pirates’ second straight last-half-of-the-season collapse, passions in the fan base were inflamed when an unorthodox, motivational e-mail from Stark to minor league coaches and development staff was leaked to the Pittsburgh Tribune-Review and later picked up by national media outlets. But Stark’s willingness to depart from traditional practices was key in allowing Hill to conduct his experiment. And its results were intriguing. The number of ground balls the Pirates’ minor league affiliates had converted to outs had slightly increased in 2008. Stark wanted to know why, so he sought out Dan Fox.

  Fox had just spent nearly all of 2008 creating the club’s information database. His first analysis assignments in 2008 were related to the amateur draft, which was the keystone of Huntington’s strategy for reconstructing the Pirates. Initially Fox didn’t have enough hours to dig into game strategy. But by the spring of 2009, his role was expanding, and while Hill and Stark were interested in tweaking defensive alignment, Fox was curious about a division rival, the Milwaukee Brewers. While Tampa Bay was in the vanguard of shifting in the American League, Milwaukee was the first team in the National League to consistently shift. Stark and Hill wanted to know where balls were most often put in play. Now, with the first version of the club’s proprietary software in place, and with his amateur-draft studies behind him, Fox turned his attention to shifts.

  “Our data was just not very good, not complete at the time,” Fox said. “Knowing Perry would be open to [shifting away from traditional defensive alignment] because of the idea of range, and positioning, and the idea of stakes … it fit that that would be an avenue to explore.”

  Defensive theory and batted-ball research became approved as a key project. Fox began acquiring batted-ball data from vendors and began analyzing every ball put in play dating back to 2004, when the data from companies such as BIS first became more comprehensive. He analyzed tens of thousands of batted balls. Like Dewan, Fox was interested in researching players’ batted-ball tendencies. But unlike Dewan, Fox was in the Pirates front office and had direct lines of communication with both the coaching staff and the developmental staff. He had more sophisticated tools and could analyze the data at a deeper level. He reached the same conclusion as when he’d researched cruder Retrosheet data for BaseballProspectus.com: shifts were dramatically more effective than the conventional alignment. Against certain hitters it did make sense to leave roughly half of the infield undefended and to move the infielders closer to the lines.

  After weeks of research, Fox brought compelling evidence to Stark, Hill, and Huntington and recommended the Pirates change the way they played defense. Not only did he suggest that the Pirates shift much more often, but he found that Hill’s theory was correct: base defensive alignments should also be changed. Infielders should be playing all hitters more toward their pull-side, meaning against right-handed hitters the shortstop and third baseman should play nearer the third-base line.

  The Pirates and major league baseball as a whole had played infield defense conventionally for decades, really since the professional game’s origins. What Fox was suggesting was that they throw out a hundred years of tradition and begin something new.

  “Getting a longtime, respected baseball guy [such as Hill] to say, ‘We might not be in the right spots,’ and have Dan Fox say, ‘Yeah, I have hundreds of thousands of balls in play that show you are not in the right spots’ … to have those two pieces in place was critical,” Huntington said. “To have our coaches talk about it, to see it work, to hear it work, to understand the difference it could make … it was critical.”

  The major league coaching staff was not ready to embrace these findings. Still, Stark wanted to ramp up this plan and thought that the minor leagues should be the first laboratory to test Fox’s theories.