How the cloud gives Major League Baseball a new world of stats
When an outfielder guns down a runner at third base, how fast was the ball thrown? When a batter smacks a fastball out of the ballpark, how far did it travel, and what was the arc of the ball? Where does that home run rank among others hit this year? Did the outfielder making that incredible catch take the most efficient route?
These are the types of advanced metrics MLB now provides thanks to the league’s Advanced Media division, a subsidiary of MLB. A team of more than 500 developers at Baseball Advanced Media (BAM) has worked over the past two years to create StatCast, an app that delivers this information to fans and broadcasters.
The app combines a complex system of game cameras with cloud-based computing to analyze plays in real time and give fans an experience they’ve never had before.
Developing the StatCast system has been a multi-year process, says Joseph Inzerillo, executive vice president and CTO at Major League Baseball Advanced Media.
The first challenge for BAM is capturing the data. To do so, it combines a radar system originally developed to track missiles with a series of other sensors installed at all 30 MLB ballparks.
“It’s a very small ball, and a very large field,” Inzerillo says.
When a big play happens in a game, a BAM official calls on the system to ingest data from around the stadium. Some sensors track the ball, while others monitor player movement on a two-dimensional plane. Raw coordinate data for the players, along with the pixels from the video replay, is compressed into a file that is sent to AWS’s cloud.
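As a rough illustration of that packaging step, the Python sketch below bundles a few player coordinate samples into one compressed payload. The field names, the JSON-plus-gzip format and the function names are assumptions made for this example; the actual StatCast file format is not described here, and the upload to AWS is left out.

```python
import gzip
import json


def pack_play(samples):
    """Serialize coordinate samples to JSON and gzip them into one payload.

    Each sample is a dict such as {"t": 0.0, "player": "CF", "x": 10.0, "y": 250.0}.
    These field names are hypothetical, chosen only for this sketch.
    """
    raw = json.dumps(samples, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def unpack_play(payload):
    """Reverse pack_play: decompress the payload and parse the JSON."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))


if __name__ == "__main__":
    play = [
        {"t": 0.0, "player": "CF", "x": 10.0, "y": 250.0},
        {"t": 0.5, "player": "CF", "x": 14.0, "y": 253.0},
    ]
    payload = pack_play(play)
    print(len(payload), "bytes compressed")
    print(unpack_play(payload) == play)
```

A production system would more likely use a compact binary format, particularly for the video pixels, but the shape of the step is the same: collect, compress, ship.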
BAM worked with the Polytechnic Institute at New York University to develop software called Baseball Metrics Engine (BME), which converts that raw data into the stats displayed to viewers.
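To give a sense of what turning raw coordinates into a stat can look like, here is a minimal Python sketch of two such metrics: a fielder's route efficiency and average speed over a tracked path. The formulas and function names are illustrative assumptions, not BME's actual algorithms, which are not detailed here.

```python
import math


def path_length(points):
    """Total distance covered along consecutive (x, y) samples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def route_efficiency(points):
    """Straight-line distance divided by distance actually covered.

    A value of 1.0 means the player ran a perfectly direct route.
    This definition is an assumption made for this sketch.
    """
    covered = path_length(points)
    if covered == 0:
        return 1.0  # the player never moved, so the route is trivially direct
    return math.dist(points[0], points[-1]) / covered


def average_speed(points, timestamps):
    """Distance covered divided by elapsed time, in the units of the inputs."""
    elapsed = timestamps[-1] - timestamps[0]
    if elapsed <= 0:
        raise ValueError("timestamps must span a positive interval")
    return path_length(points) / elapsed


if __name__ == "__main__":
    # A fielder runs 3 units across, then 4 units back: 7 covered, 5 direct.
    route = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
    print(round(route_efficiency(route), 3))
    print(average_speed(route, [0.0, 1.0, 2.0]))
```

In this example the fielder covers 7 units to end up 5 units from the starting point, so the route efficiency is 5/7, or about 71 percent.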
BME runs across various AWS services, including Elastic Compute Cloud (EC2) and Simple Storage Service (S3), as well as more advanced tools such as Lambda and Kinesis, AWS’s event-driven computing engine and real-time streaming service, respectively. The EC2 instances run various algorithms on the data, and BME returns statistics and graphic renderings. All of that is sent back to the ballpark that requested the information, where it is shared on television broadcasts or on MLB.com. The entire process takes about 12 seconds, finishing just before the next pitch is thrown.
Inzerillo says using the cloud for StatCast was a no-brainer, not just for offloading complex and unpredictable computing demand that arises only a couple of times per game, but also for developing the app.
When BAM started creating the app, the team didn’t know what it would turn into. Running it in the cloud allowed the 500 developers to test new concepts at low cost. “To do that and (not have to) build a bunch of on-premises systems that may or may not have been what we needed was great,” Inzerillo says. “The cloud gave us that ability, and specifically AWS gave us that capability, because we were able to knock up services much faster than if we were jumping up services ourselves.”
BAM has now inked a partnership with the NHL to explore how advanced stats and information can be used in hockey, too.