Sports competition has changed a great deal in recent years, and it still has a long way to evolve. The global sports data analysis market was valued at 774.6 million dollars in 2018, and at that time it was expected to grow by 31.2% per year between 2019 and 2025. These calculations assume that big data in sports is used above all to gather intelligence on rivals, design specific strategies and attract talent.
On that basis, the analyst firm Grand View Research put the market at around 1 billion dollars for 2020. By 2025, the figure would rise to 4.589 billion. These are medium-term estimates, but collecting and processing data on sports performance has already become key. David R. Saez, CEO of Sports Data Campus, sums it up like this: “Big data and advanced analytics are revolutionizing the world of sports, and the data analyst with big data skills is becoming essential in many sports organizations, always with the aim of finding improvements and competitive advantages.”
Daniel Perez, a soccer analyst, agrees with this view. His personal story reflects the growing importance of big data in sports. An engineer by training, he worked as a data analyst for multinationals for more than a decade. He then left that path and made a 180-degree turn: he decided to apply everything he had learned from processing data to one of his passions, sports, and more specifically soccer. “What the use of big data does is optimize performance across the industry, in this case, sports. It gives you details and information that you were not able to control before. And if you are able to make good use of big data, you will have added value at your disposal.”
The sports that use big data the most
The movie Moneyball, released in 2011 and starring Brad Pitt, is widely credited with portraying the beginnings of big data in sports. The action takes place in 2001, at a team in the American baseball league. The analysis techniques used to build a balanced team on a tight budget were pioneering.
Hence, baseball is one of the sports where analysis techniques have penetrated the most. Deloitte has measured the adoption of sports big data in the United States, where this technology has been taken up the fastest. In Major League Baseball (MLB), 97% of teams use professional analytics and consultants. In the NBA, 80% of franchises do so, while in the NFL (National Football League) the figure is significantly lower: 56% of teams use data analysis techniques.
Sports big data in Spain
The sporting culture in Spain is very different, so it is not surprising that big data is used most in another sport. “In Spain, as in many parts of the world, soccer is the sport in which big data techniques and strategies are being applied the most, both for aspects related to what happens on the field of play and for those more related to management or marketing,” explains Saez.
Soccer's indisputable dominance as a sporting spectacle does not mean that other disciplines are absent. Saez mentions basketball, cycling, tennis, padel and rugby as rising stars in the use of big data.
And, of course, badminton. Telefonica is, in fact, the technology partner of the Olympic champion Carolina Marin. The Spanish athlete's coaching staff works closely with the telecom operator's data unit to apply advanced analytics to competition and training data.
What is big data used for in sport?
Applications of big data in sports can be found both on and off the pitch. Tactics and strategy are a fundamental part, but so are athlete care and signings.
Know yourself and know the rival
One aspect that has been key since the beginning of big data in sport is building a data-based strategy. “There are multiple options, essentially related to the analysis of one's own team and the opponent, commonly known as the competitive environment. Multiple metrics are used here, some more personalized than others, which serve to describe objectively game models, systems, occupation of space and, of course, squad and player characteristics,” says Saez.
The CEO of Sports Data Campus points out that video analysis is attracting more and more interest. Many reports rely on reviewing video footage to draw conclusions, and new image analysis technologies make it possible to extract more and more information from that footage.
Daniel Perez focuses on individual sports, which are normally easier to analyze. “In cycling, for example, [big data] is widely used to measure individual performance live. In this way they can try to predict the efforts that the cyclist needs to make at any given moment to achieve a certain performance. Performance in cycling is somewhat simpler than in soccer. For this reason it is much more effective today.”
Avoid player injuries
Athlete care is another important application of data analytics. “It is used not only to extend a player's sporting career, but also to minimize the risk of injury,” says Saez, referring to soccer in this case.
Here, the information comes from very diverse sources, all of which bear on the athlete's health. “Sports, biometric, physical, genetic and chemical data are gathered. All of it is used to design training models with personalized load management, with the aim of preventing injuries, especially those caused by muscle overload.”
Both in soccer and in other sports, the role of advanced analytics in transfers stands out. According to Perez, in soccer, the king of sports, recruiting talent is currently the main application.
“In soccer you can have a player with certain characteristics who has worked well in your team, that is, in a context where he interacts with 10 other players on the field,” says Perez, not without pointing out that the opponent also plays a role, which deserves separate analysis. “If you have to replace that player, you can try to find someone with similar characteristics to reduce the margin of error. Big data helps you find that right player by making use of the right variables,” he concludes.
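The search Perez describes can be pictured as a nearest-neighbor lookup over player metrics. The following is only a minimal sketch under stated assumptions: the player names, the three metrics and their values are invented for illustration, and a real scouting model would use far more variables and would account for team context.

```python
from math import sqrt

# Hypothetical per-match averages; names, metrics and values are invented.
players = {
    "Outgoing player": {"passes": 62.0, "duels_won": 5.1, "km_run": 10.8},
    "Candidate A": {"passes": 60.0, "duels_won": 4.9, "km_run": 10.9},
    "Candidate B": {"passes": 35.0, "duels_won": 8.2, "km_run": 9.4},
    "Candidate C": {"passes": 58.0, "duels_won": 6.5, "km_run": 11.5},
}


def standardize(table):
    """Rescale each metric to zero mean and unit variance so no unit dominates."""
    metrics = list(next(iter(table.values())))
    scaled = {name: {} for name in table}
    for metric in metrics:
        values = [row[metric] for row in table.values()]
        mean = sum(values) / len(values)
        std = sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
        for name, row in table.items():
            scaled[name][metric] = (row[metric] - mean) / std
    return scaled


def rank_similar(table, target):
    """Return the other players sorted by Euclidean distance to the target."""
    scaled = standardize(table)
    reference = scaled[target]
    distances = {
        name: sqrt(sum((row[m] - reference[m]) ** 2 for m in reference))
        for name, row in scaled.items()
        if name != target
    }
    return sorted(distances.items(), key=lambda item: item[1])


ranking = rank_similar(players, "Outgoing player")
for name, distance in ranking:
    print(f"{name}: {distance:.2f}")
```

Choosing which variables go into the table is the step Perez stresses: the ranking is only as meaningful as the metrics it compares.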
Data measurement tools and devices
Different techniques are used to collect all the necessary information about competitions and player performance. “The data can be obtained both internally, generated by the sports organization or club itself, and externally, through external analyses or media broadcasts,” says Saez.
The CEO of Sports Data Campus stresses that the information comes from IoT devices, sensors and also from plain observation. “The most widely used range from wearables, such as watches and bracelets, to vests that can integrate both GPS and biometric devices, which measure heart rate, blood pressure or other parameters. Other devices are built into shin guards and, in addition to carrying GPS, they usually include accelerometers that provide data on accelerations, striking power and striking leg,” explains Saez, always referring to soccer.
“During play, applications such as Mediacoach from LaLiga integrate tracking data plus eventing data. This combination is technically known as RAW Data and is an inexhaustible source of work for the sports data scientist.” Saez goes on to explain the difference between these two types of information, tracking and eventing. “In the case of Mediacoach, the tracking data is collected through the optical cameras with which all the LaLiga Santander and LaLiga Smartbank stadiums are equipped. The eventing data record the actions that take place with the ball, such as goals, passes or corner kicks, as well as all the distinctions that can be drawn from the events themselves.” With this last sentence, the expert refers to categories such as long passes, short passes, passes that break the pressure line and any other that may be of interest.
Although many sports share some measurement tools, there are also others specific to each discipline. “In cycling, bikes are real computers and, for example, they measure the watts of power the cyclist produces in competition,” says Perez. “They can predict quite accurately the energy expenditure and the length of time a given athlete can keep pedaling at a given power. With that information, you can help the cyclist with the race strategy.”
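To make the combination of tracking and eventing concrete, here is a rough sketch of how an event could be paired with the position sample closest to it in time. It does not reflect Mediacoach's actual data format, which the article does not describe: the tuple layouts, player ids and values are all assumptions made up for this example.

```python
from bisect import bisect_left

# Hypothetical tracking samples: (seconds from kickoff, player id, x, y) in meters.
tracking = [
    (10.0, 7, 30.0, 20.0),
    (10.5, 7, 32.0, 21.0),
    (11.0, 7, 34.5, 22.0),
]

# Hypothetical event records: (seconds from kickoff, player id, event type).
events = [
    (10.4, 7, "short pass"),
    (11.1, 7, "long pass"),
]


def attach_positions(events, tracking):
    """Pair each event with the same player's tracking sample closest in time."""
    enriched = []
    for t_event, player, kind in events:
        # Samples for this player, assumed to be already sorted by time.
        samples = [s for s in tracking if s[1] == player]
        if not samples:
            continue  # no tracking data for this player, so skip the event
        times = [s[0] for s in samples]
        i = bisect_left(times, t_event)
        # Only the neighbors on either side of the insertion point can be nearest.
        candidates = samples[max(i - 1, 0): i + 1]
        nearest = min(candidates, key=lambda s: abs(s[0] - t_event))
        enriched.append(
            {"time": t_event, "player": player, "event": kind,
             "x": nearest[2], "y": nearest[3]}
        )
    return enriched


combined = attach_positions(events, tracking)
for row in combined:
    print(row)
```

Once every event carries a position, distinctions such as long versus short passes can be derived from the coordinates rather than tagged by hand.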
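Perez does not say which model bike computers use. One common way to frame the question, assumed here purely for illustration, is the two-parameter critical power model from exercise physiology, in which the time a rider can hold a power P above their critical power CP is W' / (P - CP), where W' is a finite energy reserve in joules. The rider values below are hypothetical.

```python
import math


def time_to_exhaustion(power_w, critical_power_w, w_prime_j):
    """Seconds a rider can sustain power_w under the critical power model.

    At or below critical power the model predicts no exhaustion, so
    infinity is returned; real riders are of course still limited.
    """
    if power_w <= critical_power_w:
        return math.inf
    return w_prime_j / (power_w - critical_power_w)


# Hypothetical rider: critical power 300 W, energy reserve W' of 20,000 J.
CRITICAL_POWER_W = 300
W_PRIME_J = 20_000

for target in (280, 350, 400):
    seconds = time_to_exhaustion(target, CRITICAL_POWER_W, W_PRIME_J)
    print(f"{target} W -> {seconds} s")
```

With an estimate like this, a team could judge whether a planned attack on a climb is short enough for the rider to sustain.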
Once the information about the game and the players' performance has been collected, it must be processed. This is another crucial stage, as it defines what matters and to what degree. It is also the phase in which factors that are less decisive for the final result are given less weight.
“The cleaning and treatment of the data is usually done through programming languages such as R or Python. PySpark stands out for designing or applying machine learning models and algorithms, with a focus on analytical, predictive or artificial intelligence models,” says Saez.
Once the data has been processed, the results are presented in one way or another depending on who is going to work with them. “There are multiple tools for presenting reports or designing dashboards. The part known as ‘visualization’ is key for the process to be efficient,” says Saez.
This visualization is often aimed at people such as coaches, sports directors and physiotherapists: in short, people who have to make decisions about certain aspects of the competition and the athletes. They are not data specialists, so the information has to be conveyed in a way that is easy to grasp at a glance. “In my view, the two most widely used tools for working on visualizations in the world of sports are Tableau and Microsoft Power BI,” says Saez.
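As a rough idea of what this cleaning step can look like in Python, the sketch below tidies a small export from a wearable. The column names, the sample rows and the plausible heart-rate range are assumptions chosen for the example, not values taken from any real device or club.

```python
import csv
import io

# Hypothetical raw export from a wearable; columns and values are invented.
RAW = """player,heart_rate,speed_kmh
Player 1,152,24.3
Player 2,,19.8
Player 3,0,21.0
player 1 ,149,25.1
"""


def clean(raw_text, min_hr=30, max_hr=230):
    """Normalize names, drop unparsable rows and discard implausible heart rates."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        name = (row["player"] or "").strip().title()
        try:
            heart_rate = int(row["heart_rate"])
            speed = float(row["speed_kmh"])
        except (TypeError, ValueError):
            continue  # missing or non-numeric reading
        if not min_hr <= heart_rate <= max_hr:
            continue  # sensor glitch, such as a reading of 0
        rows.append({"player": name, "heart_rate": heart_rate, "speed_kmh": speed})
    return rows


cleaned = clean(RAW)
for row in cleaned:
    print(row)
```

Decisions such as which range counts as plausible are exactly the kind of judgment that benefits from input from the coaching and medical staff.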
Informed decision making
This is where the conclusions drawn from the entire process come into play. They become an increasingly relevant factor in decision making. “Coaches, scouts and sports directors are increasingly using big data tools in their decision-making processes,” says Saez. But data analysts do not work apart from these sports professionals. “It is essential that the data scientist who works in the sports environment has a clear understanding of the game, because the analysis will then have much more value from its very first stage.”
After all, we must remember that big data is used as a complement to the knowledge of sports professionals. Its use is increasing but is still uneven across disciplines. Perez distinguishes especially between individual and team sports. “In an individual sport you have fewer variables to analyze, and the environment is much more controlled. In a team sport the possibilities multiply. The key and the complexity lie in the interaction. Hence, it is more difficult to analyze,” he maintains.