Research Article

Published: 19 July 2018

Authors: Tom Decroos, Jan Van Haaren, Jesse Davis

https://research.ebsco.com/c/462pgp/search/details/gacfv5vnrb?limiters=None&q=Automatic Discovery of Tactics in Spatio-Temporal Soccer Match Data&searchMode=boolean

Not peer reviewed


Article Aim


The Dataset & Approach

The dataset covered English Premier League teams in the 2015/2016 season, with the data collected manually by human annotators using video footage and annotation software. It contained 652,907 events of 39 different types, with an average of 1,700 events per match. Each recorded event included its timestamp in the match, its location on the pitch as x and y coordinates, the type of event, the player and any additional information.
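To make the shape of one record concrete, here is a minimal sketch of how such an event could be represented. The class name, field names, units and sample values are all assumptions for illustration; the notes above only list which pieces of information each event carries.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Event:
    """One annotated match event (hypothetical field names)."""
    timestamp: float   # time within the match; seconds is an assumed unit
    x: float           # pitch location along the x axis (scale assumed)
    y: float           # pitch location along the y axis (scale assumed)
    event_type: str    # e.g. "pass" or "ball recovery"
    player: str        # player involved in the event
    extra: Dict[str, Any] = field(default_factory=dict)  # any additional information


# Made-up sample record, not taken from the dataset.
example = Event(
    timestamp=12.4,
    x=35.0,
    y=48.5,
    event_type="pass",
    player="Player A",
    extra={"subtype": "cross"},
)
```

Keeping the additional information in a free-form dictionary is one possible design choice, since the notes do not say which extra fields each event type has.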

The most common event types were pass (368,426), Ball (Out) (48,046) and ball recovery (41,448). These were then broken down further into subtypes; for example, pass has the subtypes cross and corner.

The approach taken to improve how the data was processed and analysed is as follows:

  1. Dividing event sequences into phases
    1. A phase is a stretch of play in which a specific team has possession of the ball or is building up play. A new phase begins when the other team regains possession or there is a pause in play.
  2. Phase Clustering
    1. Cluster phases by spatio‑temporal similarity to group similar attacking patterns. This reduces the space to be analysed further down the line.
  3. Cluster Ranking
    1. Rank the clusters by their relevance to the overall game and their importance to the coach moving forward.
  4. Pattern mining
    1. Identify which frequent patterns occur within each cluster using an algorithm.
  5. Pattern ranking & interpretation
    1. Rank these patterns by their relevance, their length and their importance for future games. The top patterns are then validated and potentially used going forward.
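
Steps 1 and 4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the article. The `Ev` record, the `STOPPAGES` labels and the `max_gap` threshold are hypothetical, and counting contiguous event-type sequences is a simple stand-in for whatever pattern mining algorithm is applied. Clustering and ranking (steps 2, 3 and 5) are not covered.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Tuple


class Ev(NamedTuple):
    timestamp: float   # assumed to be seconds
    team: str
    event_type: str


# Hypothetical labels treated as a pause in play.
STOPPAGES = {"ball out", "foul"}


def split_into_phases(events: List[Ev], max_gap: float = 10.0) -> List[List[Ev]]:
    """Start a new phase when possession changes or play pauses."""
    phases: List[List[Ev]] = []
    current: List[Ev] = []
    for ev in events:
        if current:
            prev = current[-1]
            changed_team = ev.team != prev.team
            paused = (
                ev.timestamp - prev.timestamp > max_gap
                or prev.event_type in STOPPAGES
            )
            if changed_team or paused:
                phases.append(current)
                current = []
        current.append(ev)
    if current:
        phases.append(current)
    return phases


def frequent_ngrams(
    phases: List[List[Ev]], n: int = 2, min_support: int = 2
) -> Dict[Tuple[str, ...], int]:
    """Count in how many phases each contiguous run of n event types occurs."""
    counts: Counter = Counter()
    for phase in phases:
        types = [ev.event_type for ev in phase]
        # A set ensures each pattern is counted once per phase.
        seen = {tuple(types[i:i + n]) for i in range(len(types) - n + 1)}
        counts.update(seen)
    return {pattern: c for pattern, c in counts.items() if c >= min_support}


# Made-up events for two teams, "A" and "B".
events = [
    Ev(0.0, "A", "pass"),
    Ev(2.0, "A", "pass"),
    Ev(4.0, "A", "shot"),
    Ev(6.0, "B", "ball recovery"),
    Ev(8.0, "B", "pass"),
    Ev(30.0, "A", "pass"),
    Ev(32.0, "A", "pass"),
    Ev(33.0, "A", "shot"),
]

phases = split_into_phases(events)
patterns = frequent_ngrams(phases)
```

With this toy input the events split into three phases, and the sequences pass, pass and pass, shot each appear in two of them. In a fuller pipeline the counting would be run separately inside each cluster of similar phases.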