How Scientific Computing is advancing the world of Football

Formal Metadata

Title
How Scientific Computing is advancing the world of Football
Title of Series
Number of Parts
115
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
In the ultra-competitive sporting world, the value of data and computing has been on the rise. When the difference between winning and losing, success and failure hinges on the smallest of margins, being able to calculate and implement optimal, timely decisions is literally game-changing, and potentially life-changing. Football, or Soccer to others, as the most popular sport in the world, perfectly illustrates the importance of computational data-empowered decisions, both on and off the pitch. This talk, accessible to beginner Pythonistas with intermediate football knowledge, provides the audience with an overview of popular applications of Python Scientific Computing & Data Science within the ecosystem of typically well-funded clubs, but emphasising the opportunities available to democratise such enabling tech more inclusively, and benefiting a more diverse base. Example takeaways are real-world use-cases involving spatio-temporal tracking & event data, Computer Vision/video AI, tactical analysis, and player insights. This talk ultimately supports the audience discovering this fast-growing domain and the role Python has and continues to play in advancing it globally.
Transcript: English(auto-generated)
Hello. Hi, Chin. Am I pronouncing your name right? Yeah, that's perfect. I'm sorry. Is it Vaibhav? You can just call me VB. Yeah. It's all right. All right.
Good morning to everyone who is joining us on the data science track. And I hope you all have a great day. And we have a jam-packed session of, you know, a lot of data science, machine learning, deep learning talks lined up throughout the day and throughout the week ahead. So I'm really psyched to see what you all learn.
And yeah, so now talking about the current talk. So the talk is How Scientific Computing is advancing the world of Football. And this will be delivered by Chiin-Rui Tan. She is an award-winning technologist from London
with 14 years of experience in computational data innovation. And she's delivered multi-million pound value across industries, government and the open source sector. She's also an open source community leader. And she founded R-Ladies London, the R club for gender minorities,
which is a 900-member meetup. So I'm personally really psyched to hear what you talk about here, Chin. Over to you. Just a quick note for everyone else. If you have any questions, feel free to put those on the Matrix channel of Parrot
so that I can collate those and then we can talk about those towards the end. All right. Over to you, Chin. OK. You can see my screen. Is that OK? OK, I'm going to see. That's fine. So hi, everyone. Welcome to EuroPython 2021. This talk is, of course, How Scientific Computing is advancing the world of Football.
My name is Chiin-Rui Tan. My pronouns are she or her. I am female presenting. But just a note, I am, in fact, non-binary. Just to make that clear. OK, so before we get started, a little caveat. A, it is Wednesday morning in London and they are collecting the rubbish, so there's a bit of noise. B, I had a last minute accident with my computer.
So I'm using a backup and the audio visual is a little bit shaky. So apologies for that in advance. So if you hate sport, then this has been a really bad summer for you. I'm sorry. If you love sports, then you're welcome.
OK, so TL;DR, this talk is basically going to give you actionable insight into the scientific computing that is possible, so basically in supply, and the scientific computing that is being adopted, so in demand, within the football ecosystem today.
So a lot of people, or people who do know me, would know me as a tech entrepreneur and data scientist. What people don't know is that I have actually been watching and following football since I was a teenager. And actually, it was only in 2016, in fact at this event, which was an R-Ladies London event I was running, that...
Oh, there's the rubbish collection. Real world here. So this is when I became aware of the STEM sport crossover, and STEM is an acronym for science, technology, engineering and math. And actually, my current interest, which is really the context of this talk, is...
Well, that's loud. It's about using that STEM sport crossover for social impact. So I'm going to take just two minutes to tell you about the social impact I think is possible, because I think this is actually really important. So I think there is a big opportunity to use sport as a gateway to STEM.
So this clip is from a Netflix show, Last Chance U, which is basically about disadvantaged youths whose last shot at turning their lives around is sport, or in this case, basketball. What I see is basically a bunch of young people who are really data savvy. So watch this and tell me later if you think I'm right.
I just need you to read your own cross right here. Tell me what you don't like. I think I'm feeling when I see 65 percent and 40 percent. That's wrong. That's kind of where my mind is like this. I don't know if this kid I know this kid. And that's what it kind of way. Yeah, that's what I'm feeling now.
If I watch you play. Do I think you can do. All right. So now what am I thinking? Like, what's the problem? The problem is the consistency. Right. We'll see for LJ. What do you what do you think you see here, LJ? Fifty two, forty one and then you. Sixty something. What's that? Tell us that you can shoot the ball with two weak mentally when you go to line.
OK, so that's one sort of massive opportunity, I think. And then the other is about using STEM to empower sports. So in particular, I'm thinking about women's football, which, and a lot of people don't know this, I didn't realize it until quite recently, was actually banned for a large part
of the 20th century in some really big countries, including the UK, where I am now. And in Spain, for example, doctors were telling women players that they would not be able to have children because they played football. And as you can see from this picture of Morgan, that is absolutely untrue.
So opportunities there. OK, so let's get into the scientific computing that is possible. So when we kind of turn around our thinking of football or in terms of matches anyway, from something like this into this, you can suddenly see we are in prime scientific computing territory.
So the pitch itself has specified dimensions. You can basically model it as a 2D space, a plane with X and Y axes, or a 3D space with X, Y and Z axes, where Z is of course the height. And then you've also got the time dimension. So football as a sort of
game is very much spatio-temporal in the data and obviously the analysis and so forth. So very easily, we can suddenly move into Python. And so this is a 2D UEFA pitch represented in Matplotlib, but using the mplsoccer open source Python package.
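The idea of the pitch as a coordinate space with a time dimension can be sketched in plain Python. This is only an illustration, not code from the talk: the `TrackingSample` class, its field names and the 105 x 68 metre pitch size are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed pitch size in metres (105 x 68 is a commonly used dimension).
PITCH_LENGTH = 105.0
PITCH_WIDTH = 68.0


@dataclass(frozen=True)
class TrackingSample:
    """One spatio-temporal observation: where something was, and when."""
    t: float        # seconds since kick-off
    x: float        # metres along the length of the pitch
    y: float        # metres across the width of the pitch
    z: float = 0.0  # height above the ground in metres (e.g. for the ball)

    def on_pitch(self) -> bool:
        """True if the (x, y) position lies inside the pitch rectangle."""
        return 0.0 <= self.x <= PITCH_LENGTH and 0.0 <= self.y <= PITCH_WIDTH


centre_spot = TrackingSample(t=0.0, x=PITCH_LENGTH / 2, y=PITCH_WIDTH / 2)
out_of_play = TrackingSample(t=12.4, x=106.2, y=30.0, z=1.5)
print(centre_spot.on_pitch(), out_of_play.on_pitch())
```

A plotting package such as mplsoccer would then take care of drawing the pitch markings around coordinates like these.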
So, OK, that's pitch. Great. How about some players? So if you look at the left hand graphic, so a team can actually be represented, I mean, really sort of naturally as a network, a graph network.
And on the right hand side, you can see so the teams in a match, so two teams in a match can be represented as a complex system. But actually, let's move from, you know, basically sort of purely graphics, graphic design into doing a graph theoretic Python powered pass network.
And so this is one I made myself just to show you again what's possible, even if you're not a big club organization. So I wanted to do a passing network of one of my favorite games, which was Chelsea Women's semi-final win over Bayern Munich in the UEFA Champions League. Really sorry, any Bayern Munich fans? Really quick heads up.
So, yeah, no change. Data is still a massive pain point if you're trying to do sports analytics. So obviously the popularity of sport and then its sort of commercialization means that really if you want to get access to data, you have to pay.
So just to give you a little bit of insight. It's a really big topic, so I can't cover it all. But just again, for background, as a lot of the people watching are data scientists. So these companies, they have their own proprietary data entry software. So people will basically kind of annotate matches based on the video footage, live or recorded.
These companies also have these GPS vests. So you can see that little small block device is the GPS device, which is obviously tracking the location data, but also monitoring other things, velocity, fitness, excuse me, things of the players, which then gets integrated together.
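As a rough illustration of how velocity can be derived from location tracking, the sketch below computes speed between consecutive position samples. The `(t, x, y)` tuple layout and the sample values are invented for the example; real vendor formats will differ.

```python
import math


def speeds(samples):
    """Return the speed in m/s between consecutive (t, x, y) samples."""
    result = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Distance covered divided by the time it took.
        result.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return result


# Hypothetical track: time in seconds, position in metres.
track = [(0.0, 10.0, 20.0), (1.0, 13.0, 24.0), (2.0, 13.0, 24.0), (2.5, 16.0, 28.0)]
player_speeds = speeds(track)
print(player_speeds)
```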
Now, who's actually doing this entry? A ton of analysts are being hired. So again, presumably, you know, they've got to make money. The data is expensive to collect. They do, however, have open source data. Some of these big companies have open-sourced some datasets.
But of course, they may not be the ones that you want. Even if you pay, they may not be the matches you want anyway. So my best advice here is that what is possible is if you can find a recording, hopefully of this kind of overhead tactical camera, then you can do it yourself. Although it is tiring.
Anyway, so that's what I did for the Chelsea match. And it was tiring. So I had collected my raw data. Then I, of course, read it into pandas like any good data scientist. So this is just an extract of the data so you can see the kind of columns that I chose to capture, knowing I was going to be wanting to make a graph.
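One way the captured rows might be turned into graph input is sketched below using only the standard library. The row layout `(passer, receiver, successful)` and the rows themselves are made up for illustration and are not the real match data; the talk itself used pandas and NetworkX.

```python
from collections import Counter

# Toy pass log, NOT the real match data: (passer, receiver, successful).
pass_log = [
    ("Eriksson", "Bright", True),
    ("Bright", "Eriksson", True),
    ("Eriksson", "Bright", True),
    ("Bright", "Ji", True),
    ("Eriksson", "Bright", False),
    ("Ji", "Eriksson", True),
]


def weighted_edges(rows):
    """Count successful passes per directed (passer, receiver) pair."""
    counts = Counter((src, dst) for src, dst, ok in rows if ok)
    return [(src, dst, n) for (src, dst), n in sorted(counts.items())]


edges = weighted_edges(pass_log)
print(edges)
```

A list of `(source, target, weight)` triples like this is the shape that NetworkX's `DiGraph.add_weighted_edges_from` accepts.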
So using NetworkX and feeding in these weighted edge lists, which I created. So here we go. This is full back to full back, Eriksson to Bright. So directed edges. And that 21 is representing the passes. And then I go back into our mplsoccer pitch,
because it's really slick, and then plotting some players as nodes. And this is based on a 4-3-3 formation, which is what I understood the Chelsea lineup to be. So four defenders, three midfielders and three forwards, plotting sort of X and Y coordinates.
They, of course, have names. They're people, so we can label them. We can plot just edges, no weights. So directed edges, of course. So you can really sort of start to see some patterns. But of course, the real insight is when you give them weights. So now you can really see lots of action between the fullbacks.
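The layout step described above, placing a 4-3-3 as nodes and letting edge thickness follow the pass counts, could be approximated as follows. Both helper functions and the even spacing rule are assumptions for this sketch, and apart from the 21 mentioned in the talk the weights are invented.

```python
def formation_positions(lines, length=105.0, width=68.0):
    """Spread each line of players evenly over the pitch.

    `lines` lists how many players sit in each line, from defence to attack,
    e.g. [4, 3, 3]. Returns one (x, y) tuple per outfield player.
    """
    positions = []
    for i, n_players in enumerate(lines, start=1):
        x = length * i / (len(lines) + 1)        # lines evenly spaced along the pitch
        for j in range(1, n_players + 1):
            y = width * j / (n_players + 1)      # players evenly spaced across it
            positions.append((x, y))
    return positions


def line_widths(weights, max_width=8.0):
    """Scale pass counts to drawing widths so the busiest edge is thickest."""
    heaviest = max(weights)
    return [max_width * w / heaviest for w in weights]


nodes = formation_positions([4, 3, 3])
widths = line_widths([21, 7, 3])
print(len(nodes), nodes[0], widths)
```

The resulting coordinates and widths can be handed to whatever plotting call draws the nodes and edges on the pitch.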
You can also see some really interesting distribution from this right full-back to the left midfielder there, Ji. And then if you just want to be a little bit more snazzy, again, totally possible. You don't need to have nodes. You can put images instead.
So obviously this is helpful if you know who the players are. But anyway, you can tell that once you've got that code to create a pass network, if you can bear collecting more data for other matches, then really quickly you can create all these visualizations.
But of course, because it's made in NetworkX, you can do all your applied network analysis stuff, centrality and so forth. OK, cool. But that's really just the tip of the iceberg. So you can just see here. I just took this little snapshot from recommendations in the Nature Portfolio.
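To give a flavour of the "centrality and so forth" step without any dependencies, the sketch below computes passes made, passes received and a simple neighbour-based degree centrality from a weighted edge list. The edge weights and the placeholder player "Forward" are invented, and the centrality definition here (distinct neighbours divided by n - 1) is a simplification that is not guaranteed to match what NetworkX reports for a directed graph.

```python
from collections import defaultdict

# Hypothetical weighted, directed pass edges: (passer, receiver, passes).
edges = [
    ("Eriksson", "Bright", 21),
    ("Bright", "Eriksson", 15),
    ("Bright", "Ji", 9),
    ("Ji", "Eriksson", 4),
    ("Ji", "Forward", 5),
]


def pass_centrality(weighted_edges):
    """Summarise each player's involvement in the pass network."""
    made = defaultdict(int)
    received = defaultdict(int)
    neighbours = defaultdict(set)
    for src, dst, w in weighted_edges:
        made[src] += w
        received[dst] += w
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    n = len(neighbours)
    return {
        player: {
            "made": made[player],
            "received": received[player],
            # Share of the other players this player exchanged passes with.
            "degree_centrality": len(linked) / (n - 1) if n > 1 else 0.0,
        }
        for player, linked in neighbours.items()
    }


stats = pass_centrality(edges)
most_connected = max(stats, key=lambda p: stats[p]["degree_centrality"])
print(most_connected, stats[most_connected])
```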
There's a lot of current Python powered research, often focusing on matches because that's obviously where all the action happens. Also, I guess the important action, most important. Anyway, but also there is a lot more. So you can go beyond match data. So, you know, we talked about the GPS devices which are tracking, you know, obviously speeds.
Things around the players, obviously biomechanics. So lots of stuff. OK, let's look at the scientific computing that is being adopted. So, of course, as we're all aware, our stakeholders don't always adopt everything that maybe would benefit them.
So just because something is possible and good doesn't mean they're going to actually use it. So it is really interesting to see what's actually in place. So I'm going to do this section from the point of view of key stakeholders in the football ecosystem and in the format of agile user stories, because that's the kind of geek I am.
So we're going to look at the perspective from fans, from coaches, which are therefore probably also representative of football clubs, and then also from kind of wider, greater football organisations. So that could be sort of governmental or non-governmental bodies.
It could be sports broadcasters like Eurosport, ESPN or kind of football federations. So as a football fan, I want to potentially do various things. So these are kind of user stories.
So I may want to watch matches. I may want to access topical football content. I may want to engage with my favourite players' and teams' social media activity. I may want to bet on football. Needless to say, obviously I'm not necessarily endorsing that. I'm just saying that it is not an unpopular activity.
So let's look at some scientific computing adoption that gives fans access to insightful content that furthers their football knowledge. I'm just going to take a quick drink because I'm going to get coffee. OK. So this is an example of sort of a popular football content blogger, content creator, Maram AlBaharna.
This is an example of a popular football-related tweet about Harry Maguire, his passes for Man U over the last season. So you can see this is kind of effectively, you'd say, fulfilling,
you know, a lot of football fan user stories. You can see actually Maram only joined 10 months ago and already has twenty-two thousand followers. Not that I'm jealous, but actually she is a Pythonista. So I know that from this YouTube video which I watched.
And so, in fact, everything that she is making, any tweets that you see, football content tweets, which are sort of beyond the capabilities of Excel and PowerPoint, are, in fact, Python. Because she actually started doing her visualization analytics in Excel and PowerPoint.
And then in January this year started learning Python and is doing amazing stuff. Again, makes me feel I'm put to shame here. So she's using Seaborn, and for the K-means clustering, I would expect, you know, I'd imagine something like SciPy or the like. Anyway, there we go. Python being adopted. Fans don't realize it, but Python is rocking the world.
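For readers unfamiliar with K-means, the idea of grouping passes by location can be sketched in a few lines of plain Python. This is not the creator's code, and which library she uses is only guessed at in the talk; the pass end points below are invented, and the naive "first k points" initialisation is chosen purely to keep the example deterministic.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    centroids = list(points[:k])  # naive deterministic start: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical pass end locations (x, y) in metres.
pass_ends = [(10, 10), (80, 50), (12, 14), (82, 54), (14, 12), (84, 52)]
centroids, clusters = kmeans(pass_ends, k=2)
print(centroids)
```

In practice a library implementation would be preferable, since it handles initialisation and convergence checks more carefully than this sketch.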
So as a football coach, I want to do various things. So I probably want to figure out the best tactics, lineup formation, player roles. I want to continuously monitor weaknesses in the squad. I want to plan and supervise optimal training sessions and so forth.
So let's give an example of scientific computing adoption that helps coaches decide on transfer targets and team selection. So, OK, this is a little bit of a content slide. But there is a really interesting piece of scientific computing enabled research
on player chemistry, or really it's kind of joint performance, I would say, by Lotte Bransen and Jan Van Haaren, which has achieved buy-in from FC Barcelona. So this research is referenced in Barcelona's, or Barça Innovation Hub's, latest football analytics guide, which you can see on the right hand side, and which promotes advances in the field.
So therefore, what they think are advances in the field. So given that basically Bransen and Van Haaren used quite a few open source Python packages, you know, I basically see this as Barcelona's endorsement of Python for this kind of player chemistry research,
which helps, as we said, with things like transfer targets and team selection. So specifically, there is an open source Python package, socceraction, which they used to do some data transformation. They used the open source CatBoost
package, which is a gradient boosting toolkit, for the training of the machine learning models. And then they also used the PuLP package to do effectively combinatorial optimisation. So basically around assembling the sort of best line-up from a squad.
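The line-up selection problem can be illustrated with a tiny exhaustive search. To be clear, this is not the formulation from the paper and it does not use PuLP: the player names, ratings, pairwise chemistry values and the additive scoring rule are all invented for the sketch, and brute force only works for very small squads.

```python
from itertools import combinations

# Hypothetical individual ratings and pairwise chemistry bonuses.
ratings = {"Ana": 7.0, "Bea": 6.5, "Cat": 8.0, "Dee": 6.0}
chemistry = {
    frozenset({"Ana", "Bea"}): 1.5,
    frozenset({"Ana", "Cat"}): -0.5,
    frozenset({"Bea", "Dee"}): 1.0,
}


def lineup_score(lineup):
    """Sum of individual ratings plus chemistry for every pair in the line-up."""
    individual = sum(ratings[p] for p in lineup)
    pairwise = sum(chemistry.get(frozenset(pair), 0.0) for pair in combinations(lineup, 2))
    return individual + pairwise


def best_lineup(squad, size):
    """Try every possible line-up of the given size and keep the best-scoring one."""
    return max(combinations(sorted(squad), size), key=lineup_score)


best = best_lineup(ratings, 3)
print(best, lineup_score(best))
```

With a full squad the number of combinations grows quickly, which is why a solver-based approach is the more realistic choice for this kind of problem.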
Anyway, I would recommend that you read the paper because that's it's really a lot richer than what I've just said today. But anyway, it gives you a flavour of, again, like I said, the scientific computing that is in demand. So then lastly, moving to the point of view from a football organisation. So as a football organisation, depending on who we are, we may want to predict player international performance.
We may want to automate officiating. We may want to use robotic systems for broadcasting matches. We may want to develop assistive technology that helps people who are blind or visually impaired watch and play football. And we also may want to run national media campaigns with
popular football role models, promoting participation in grassroots sport, amongst other things. So an example of scientific computing adoption that automates video coverage of matches is actually from this company, Pixellot.
I have no association with them and I'm not doing the marketing for them, but they do seem to be very popular. They have popular automated sports production solutions which are used around the world, both professionally and in amateur sport, not just football. And they have developed computer vision and deep learning based algorithms that track action on the pitch.
So, for example, to operate these unmanned cameras, an example here. And they use Python and C++ for their development, which I know because
they've kind of made that obvious in their current job specs for some of their algo developers. So they are using Keras, TensorFlow and PyTorch. So, yet again, Python sort of rocking the world of football organisations. Of course, we have to note it's not perfect.
There is room for development. So scientific computing football solutions have some imperfections. I'm going to show you one. So this is a Pixellot camera in operation.
This was a match last year in I think it was November. So in the Scottish Professional Football League. So football or bald head. Remember, their algos are supposed to be tracking action, presumably the football on the pitch.
So if you haven't seen this, it's quite funny. Not looking at the action. So hopefully you'll have seen that.
The camera thinks that basically this bald head is the football. Apparently Pixellot's excuse, or justification, was that the ball used in this match is yellow, and that the referee's head sort of looks like it's on the field from this angle.
So anyway, you can obviously see they need to train on yellow balls. But yeah, there's obviously scope for development. And then a slightly more somber note, but I think really important that hopefully in the future,
scientific computing can help us better understand athlete health. And, you know, hopefully the really horrible scenes maybe some of you saw in the Euros with Christian Eriksen, hopefully we won't see that again, because scientific computing will have helped us better predict athlete injury.
So in conclusion, hopefully you've seen that Python is powering real world tech solutions that deliver value for many key football stakeholders. And there is huge scope, I think, for sports in general to adopt scientific computing that fulfills a range of user stories.
And access is not entirely limited to those with big budgets, although clearly it does help. Free and open source resources that enable a range of DIY, do-it-yourself, applications of scientific computing in sports are available. And finally, please do get in touch regarding any interest in collaborating on STEM sport social impact initiatives.
And thank you very much. That is the end of my talk.
All right. That was a fantastic talk. I really loved it. So thank you so much again for your talk. I'm just going to put a couple of questions up, and however many we can answer, we will answer. If we're not able to answer those, then we'll basically take them over to the breakout room in Parrot.
So first things first, which is also a personal question of mine: what is the most challenging bit of this project, according to you? You mean the sort of doing football data science? Yeah. Yeah. If you're not a really big club who has lots of money, you know, it's very much the data.
It's always the data, because I mean, yeah, I think I was just trying to lead to. So it's like, you know, it's a really difficult like it's very difficult enough just to have access to watch games, you know, the popular games.
So, you know, you can see obviously because they're trying to get you to pay to watch games because they know how popular they know people will pay. So and then, of course, that means that even if you you know, you are willing, for example, to do the data collection yourself,
which I am willing to do again, the geek that I am. But, you know, if you if you can't find if someone doesn't make a recording, you know, publicly available, which of course, you know, if they think they can make money from it often, they won't make it publicly available.
Then that's kind of it. You know, like you really have no other option. So, I mean, I don't have a solution to it. But yeah, I think that's the thing. You know, the Python, you know, all the tools are there, but as, you know, a kind of independent researcher, for example, you just don't necessarily have the data.
So, you know, I mean, not a plug, but I'd really love to collaborate with some big clubs who basically own the data. For example, you know, if they're doing outreach and there are people who want to get involved, you can sort of help on the innovation side.
You know, then I think it really is better for everyone. So, but yeah, hopefully things are changing. I mean, some of these some of these organizations are data collecting vendors. I would call them data vendors. You know, some of them come from like research backgrounds.
So they're kind of more aware that the community can't innovate if they don't give them any access. So some of these companies have made open source data sets. But again, you know, they're not necessarily the ones you want. But anyway, it's in progress,
I think is the answer. Perfect. All right. So on to the next one. Do you have any recommendations for any analysis or papers which work specifically around the strategy building? But it's a strategy building. Oh, I see.
OK. So, yeah, like football strategy. Yeah, football strategy. I actually don't have any specific recommendations because there are actually tons. OK. So I wouldn't recommend any specific paper. I think what I'd recommend is sources, places where you can find tons of references. So the MIT Sloan Sports Analytics Conference, which is really popular, obviously, in that community.
So they have they basically have like a research paper competition and loads of those papers are around the strategy, the tactics, you know, sort of offensive strategy, defensive strategy for football.
But because it's a sports analytics conference, it's for, you know, a variety of sports, baseball, basketball. So, I mean, that may or may not be interesting to whoever asked the question. But, yeah, I'd say MIT Sloan Sports Analytics for strategy. And Barcelona, I'd say, actually; they seem to be the ones who are doing a lot in the data field.
So that football analytics guide, which is available online, the latest version definitely has, you know, large sections devoted to strategy because, of course, that's what the focus is.
Strategy to win matches: more matches won, more money. So that's, yeah, my best recommendation. Fantastic. So all right. So one last question. And I know there were a couple more, so we can probably take those in the breakout room next. So one last question is, how would you determine the weights for the graphs?
Wait, I should put that up. So how would you determine the weights for the graphs of the footballers and what would they mean? Oh, so, OK. Yeah. OK, so, in the example which I showed, the weights were the number of passes.
So, yes. So the meaning in my demo was passes. And in fact, they were successful passes because that sort of is more intuitive. But of course, you could do unsuccessful passes. But yeah, the weights in the graph.
So, in fact, the edges. So they could be any interactions between footballers. So it could be passes. It could be, you know, like a header. I mean, obviously it has to be from a player to a player. So the weights, how would you determine it? I'm just trying to answer the question. Or if you mean how you determine it in terms of from the data,
I'm afraid that is manual. Oh, maybe that's it. Sorry, I'm trying to reverse the question. OK, so because I just showed it. Yeah, it's more around what all you can use as the weights, and what would be your criteria to sort of gauge which one's better and which one's not,
when you're trying to sort of come up with a deterministic score for a footballer. But I know that this might be too complex to answer right now. So how about we move this over to the breakout.
And I'll just tag the link to the breakout rooms for everyone who is tuned in. And again, thank you so much for your time, Chin. It was a fantastic and a lovely talk. Thank you very much. See everyone in the breakout room. Bye bye.