“It is very interesting to take a machine, teach it tennis and have it start to talk to us in real-time and be able to provide our fans with that, as it is happening, in context, in real-time.”
– Murray Swartzberg, ATP SVP IT and Digital Media.
This was a clear overarching goal for the ATP-Infosys program, but it meant first teaching the machine tennis and then having it generate the right content and insights at the right time, without being repetitive.
ATP has been collecting game data via the chair umpire console since 1991. So why reinvent stats and scores for a game that has changed little since the 1990s in terms of the data collected and the statistics generated during a match? It turns out that, with the proliferation of data from external sources and connected stadiums, it is now possible to provide a much more engaging experience to tennis fans all over the world, one no longer limited to numbers or win percentages. The consumption models for tennis fans have changed dramatically since the 1990s and 2000s, with fans now watching the game on mobile devices and sharing their opinions on social media. The consumption model has evolved for players and their coaches too! Players can now wear devices that track their service games and movement and provide a personalized coaching plan. There is a need to present contextual, real-time insights and to tell the tennis story with visuals and videos.
The first step was to get the brain behind the machine to learn the game of tennis, so the insights that would follow were not mere regurgitation of ingested data.
ATP already had 26 years’ worth of data collected from 1991 and enhanced statistics data from 2015. We started the journey by first taking 5 years of ball tracking data and 1 year of chair umpire data from the ATP World Tour Finals. Five years of ATP World Tour Finals data from Hawk-Eye were also studied for serve placement, winner ratios on forehands and backhands, and spins and speeds on serves and shots, and interesting insights were shared with players, coaches and fans. Working with this treasure trove of data, we were able to bring forth a range of new features for tennis fans and experts.
Infosys ATP Trends
Using the Infosys Nia platform, the top 3 pre-match insights – first serve return percentage, double faults and forehand placement – were shown at the ATP World Tour Finals 2015 in London. Parameters such as holding and breaking serve, and double faults at key moments, were studied, yielding point-by-point insights.
The graphic below was one of the first analyses done on ball spin for Top 8 players.
It compared Nadal's average shot spin of 2,597 RPM in 2013 with that of the other Top 8 players.
At the Sydney International 2016, trends were re-imagined using only chair-umpire data, and players from any tournament could be compared against the Top 8 on all data from January 2015 up to the week before the tournament began. We found, for example, that Troicki serves up an excellent aces-to-double-faults ratio, better than that of any of the Top 8 players!
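An aces-to-double-faults comparison of this kind can be sketched in a few lines of Python. This is only an illustration: the `ServeTotals` structure, the function names and the sample numbers are all assumptions made up for the example, not ATP data or the calculation Infosys Nia actually performs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ServeTotals:
    """Aggregated serve counts for one player (hypothetical structure)."""
    player: str
    aces: int
    double_faults: int


def aces_to_double_faults(totals: ServeTotals) -> float:
    """Return aces per double fault.

    With zero double faults the ratio is undefined, so this sketch returns
    infinity when the player has at least one ace and 0.0 otherwise.
    """
    if totals.double_faults == 0:
        return float("inf") if totals.aces > 0 else 0.0
    return totals.aces / totals.double_faults


def rank_by_ratio(players: List[ServeTotals]) -> List[Tuple[str, float]]:
    """Sort players from the best to the worst aces-to-double-faults ratio."""
    scored = [(p.player, aces_to_double_faults(p)) for p in players]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up numbers for illustration only; not real ATP statistics.
sample = [
    ServeTotals("Player A", aces=300, double_faults=100),
    ServeTotals("Player B", aces=450, double_faults=90),
    ServeTotals("Player C", aces=200, double_faults=125),
]
ranking = rank_by_ratio(sample)
```

Ranking a player from any tournament against the Top 8 then amounts to adding that player's totals to the list before sorting.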
In April 2016, ATP Leaderboards were created with a new ability to mine insights. Advanced statistics and indexes, similar to those in other sports leagues such as the NBA, NFL and MLB, were built from 25 years of data going back to 1991 and put into a consumable form to answer 3 simple questions: who is the best on serve, on return and under pressure? The leaderboards also let fans drill down to individual statistics such as 1st serve % won.
We had already deployed an Analytics portal for bloggers and commentators at the Sydney International 2016, providing a comprehensive ability to slice and dice the data. By the end of the ATP season in 2016 we were able to go further with a live commentary module built atop the Infosys Nia Data capability. It provided real-time commentary and insights during the Barclays ATP World Tour Finals 2016. Real-time data ingestion, statistics calculations, and textual and graphical insights generated by Infosys Nia proved to be game-changing capabilities. The module was hosted live, and fans were excited to see it for the first time in the game of tennis with just a quick trip to the official website! The Live Commentary feature is visible across all the pages and the scores section. This was a big step toward making the machine not only tell stories but tell the most relevant story at a given point in the game. The machine was now able to automatically generate text commentary along with basic insights. This was further enriched for the ATP World Tour Rome Masters 2017, where we deployed an extended message scores piece. This feature overlays the right graphics to supplement facts and provides a more engaging medium for fans to consume the information.
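To make the idea of automatic, non-repetitive text commentary concrete, here is a minimal Python sketch. It is not how Infosys Nia works internally; the `CommentaryGenerator` class, the event fields, the templates and the cooldown rule are all hypothetical choices for illustration. Each incoming point event is matched to a template, and an event type that produced a line very recently is skipped so the feed does not repeat itself.

```python
from typing import Dict, Optional


class CommentaryGenerator:
    """Turns point events into short text lines, skipping recently used event types."""

    # Hypothetical templates keyed by event type.
    TEMPLATES: Dict[str, str] = {
        "ace": "{server} fires an ace, number {count} of the match.",
        "double_fault": "{server} double faults, number {count} of the match.",
        "break": "{winner} breaks serve to lead {score}.",
    }

    def __init__(self, cooldown: int = 3) -> None:
        # An event type is suppressed if it last produced a line
        # fewer than `cooldown` events ago.
        self._cooldown = cooldown
        self._events_seen = 0
        self._last_emitted: Dict[str, int] = {}

    def comment(self, event: Dict[str, object]) -> Optional[str]:
        """Return a commentary line, or None if the event is unknown or too repetitive."""
        self._events_seen += 1
        kind = str(event.get("type"))
        template = self.TEMPLATES.get(kind)
        if template is None:
            return None
        last = self._last_emitted.get(kind)
        if last is not None and self._events_seen - last < self._cooldown:
            return None
        self._last_emitted[kind] = self._events_seen
        # Extra keys such as "type" are ignored by str.format.
        return template.format(**event)


# Usage with made-up events.
generator = CommentaryGenerator(cooldown=2)
first = generator.comment({"type": "ace", "server": "Player A", "count": 1})
second = generator.comment({"type": "ace", "server": "Player A", "count": 2})
third = generator.comment({"type": "ace", "server": "Player A", "count": 3})
```

In this sketch the second ace is suppressed because it arrives immediately after the first, while the third is reported again. A production system would presumably rank candidate insights by relevance rather than rely on a fixed cooldown.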
With all these new features enabled for tennis fans and experts, storytelling during and after a match had evolved on the ATP website toward a more visual approach, with engaging, consumable pictures that fans could easily understand and share on social media. New ways of looking at old statistics were yielding deep insights. It required work to train the machine to sift through the statistics and come up with the insights that would be most relevant for a fan. Infosys Nia had served up an ace.
With Infosys Nia, we had multiple options to acquire the data (both streaming and batch) and various options to store it. For most of the structured data, a number of relational stores are available out of the box, including MySQL and Postgres. For unstructured documents, news and social media data, there is provision to use a Hadoop file system, and for scenarios that call for efficient read performance, HBase and Cassandra are available. With this variety of data stores, incoming data is not restricted to a fixed schema, and there is no need for additional processes aimed at transactional integrity for all data. Querying is made extremely fast by leveraging Spark SQL and/or Cassandra-based queries.
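The storage choices described above can be summarized as a simple lookup, sketched here in Python. The store names come from the text; the `DataKind` categories, the `SOURCE_KIND` mapping of data sources to categories and the function name are assumptions for illustration, not part of the Infosys Nia API.

```python
from enum import Enum
from typing import Dict, List


class DataKind(Enum):
    """Broad categories of incoming data (hypothetical labels)."""
    STRUCTURED = "structured"
    UNSTRUCTURED = "unstructured"
    READ_HEAVY = "read_heavy"


# Store options per category, as listed in the text above.
STORE_OPTIONS: Dict[DataKind, List[str]] = {
    DataKind.STRUCTURED: ["MySQL", "Postgres"],
    DataKind.UNSTRUCTURED: ["Hadoop file system"],
    DataKind.READ_HEAVY: ["HBase", "Cassandra"],
}

# Assumed mapping from a data source to its category, for illustration only.
SOURCE_KIND: Dict[str, DataKind] = {
    "chair_umpire": DataKind.STRUCTURED,
    "news": DataKind.UNSTRUCTURED,
    "social_media": DataKind.UNSTRUCTURED,
}


def stores_for_source(source: str) -> List[str]:
    """Return the candidate stores for a named data source."""
    try:
        kind = SOURCE_KIND[source]
    except KeyError:
        raise ValueError(f"unknown data source: {source!r}") from None
    return list(STORE_OPTIONS[kind])


umpire_stores = stores_for_source("chair_umpire")
social_stores = stores_for_source("social_media")
```

Keeping the routing in data rather than in code paths reflects the point made above: new sources can be added without committing every feed to one fixed schema or one store.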
Key portions of Infosys Nia have been used for developing the technology behind advanced statistics and insights. The technology however is not one-size-fits-all. The stack provides inexpensive and redundant storage with fault tolerance and high availability. The real value provided was to mine through the information once it was collected, cleaned and pre-processed.
So what is the future of tennis scores and stats, and how can fans, bloggers, sponsors and coaches use this information next? The core Infosys Nia platform can be leveraged for acquiring, analyzing and presenting data. Infosys Nia is capable of working with advanced machine learning algorithms, which can be leveraged to predict the outcomes of various match scenarios. With a data platform and an adaptable engine at the core, it is now possible to engage the entire tennis ecosystem and to drive synergies in harnessing the data for more effective storytelling.
These capabilities position ATP World Tour to keep evolving the art of telling the story of tennis as it unfolds. In the words of Murray Swartzberg, “Our sport is a story and the better you tell the story, the better it is for the sport and the fans.”