Thursday, 12 June 2014

Beyond Big Chances - Shot Segmentation

This was initially going to be a much bigger piece but with the World Cup only hours away, I wanted to get something out before I get sucked into 3 games a day for the next couple of weeks.

There's a lot of work going on around Expected Goals. The idea is that in any single match you may lose despite having had the better quality chances, and if the match were replayed infinitely with the same chances for each side you'd win more than you lose.

The aim of this is to remove the 'luck' from individual games and focus more on longer term performance.  If on a regular basis, based on expected goals you 'deserved' to lose, then even if recent results have gone your way over the longer term things are likely to go sour.  "The only stat that matters is the score" is fine in a cup final but longer term it's more important to track the underlying numbers and not just results.
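The 'replay the match over and over' idea can be sketched as a small simulation. This is only an illustration: the chance probabilities are made up, every chance is treated as an independent coin flip, and `replay_match` is a hypothetical helper rather than anything used for the figures in this post.

```python
import random


def replay_match(our_chances, their_chances, n_replays=20_000, seed=1):
    """Replay one match many times; each chance is scored with its own probability.

    Returns the share of replays we win, draw and lose.
    """
    rng = random.Random(seed)
    wins = draws = losses = 0
    for _ in range(n_replays):
        our_goals = sum(rng.random() < p for p in our_chances)
        their_goals = sum(rng.random() < p for p in their_chances)
        if our_goals > their_goals:
            wins += 1
        elif our_goals == their_goals:
            draws += 1
        else:
            losses += 1
    return wins / n_replays, draws / n_replays, losses / n_replays


# Made-up example: we create 3 good chances, they have 10 speculative efforts.
win, draw, loss = replay_match([0.4] * 3, [0.05] * 10)
print(f"win {win:.1%}, draw {draw:.1%}, lose {loss:.1%}")
```

The side with the better chances still loses a fair share of the replays, which is the 'luck' in any single result.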

How 'long' is long term will depend: just as you could theoretically toss a coin 10 times and get heads each time, some players or clubs could be performing above expectation for reasons beyond their control rather than because of any real above-average ability.

Anyone interested in doing this based on data in the public domain is usually working with one hand tied behind their back as a lot of context around a shot (e.g., defensive pressure) is missing.  I'm not moaning that Opta don't go around giving away for free all the data they spend money collecting, but it does make things trickier.

The only stat that gives context to a chance (beyond location and body part) is Opta's Big Chance stat which they define as where there is a 'reasonable' expectation of a goal.  These chances convert at around 40% (excluding penalties) compared to around 5% for other shots.
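As a rough sketch of how far those two conversion rates alone get you, a team's expected goals can be estimated from nothing more than its count of big chances and other shots. The 40% and 5% figures are the approximate rates quoted above; the shot counts in the example are invented.

```python
# Approximate conversion rates quoted in the text (penalties excluded).
BIG_CHANCE_RATE = 0.40
OTHER_SHOT_RATE = 0.05


def two_bucket_expected_goals(big_chances, other_shots):
    """Expected goals using only the big chance / not big chance split."""
    return big_chances * BIG_CHANCE_RATE + other_shots * OTHER_SHOT_RATE


# Invented example: 3 big chances and 10 other shots.
print(round(two_bucket_expected_goals(3, 10), 2))
```

On these rates one big chance is worth about eight ordinary shots.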

This gives a good split between 'big' and 'not so big' chances, but having just two types of shot is a bit limiting when trying to differentiate between chances.  To get beyond this I've used Big Chance data along with shot type/location data to try and break chances into a few more meaningful segments.

There is more work behind this but for the sake of brevity I'll leave that for another time.  This is just a basic first pass where only distance to the nearest point on the goal line is looked at rather than angle.

The segments are a hierarchy, so group h, for example, is shots inside the box that are not included in groups a-g.
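A hierarchy like this boils down to 'first matching rule wins'. The sketch below shows the mechanism only: the rules and labels are placeholders invented for illustration and are not the actual groups a-i.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    big_chance: bool
    inside_box: bool
    header: bool
    penalty: bool = False


# Ordered (label, rule) pairs, checked top to bottom.
# Placeholder rules for illustration, NOT the real segments from this post.
RULES = [
    ("penalty", lambda s: s.penalty),
    ("big chance, foot", lambda s: s.big_chance and not s.header),
    ("big chance, header", lambda s: s.big_chance),
    ("other shot inside box", lambda s: s.inside_box),
]


def segment(shot):
    """Return the label of the first rule the shot satisfies."""
    for label, rule in RULES:
        if rule(shot):
            return label
    return "outside box"


print(segment(Shot(big_chance=False, inside_box=True, header=False)))
```

Because earlier rules take priority, a later group only ever contains shots that none of the groups before it claimed.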
Shot Type Summary - Even within Big Chances, there's the opportunity to split further although this does start to create small segments
With 9 different types of shot you're arguably spreading yourself too thin so I've grouped the chances based on the colour types above to give a grade to each shot:

There's plenty more that can be done with this given a bit more time (and a bit more data), but the above shows that all shots are far from equal and that a qualitative measure such as 'big chance' can give more understanding of chance quality than location alone.  If you were doing this for real, though, you'd do a lot more work grouping the chances based on video.

Thursday, 29 May 2014

Lawro - Smarter than Goldman Sachs?

Last year I had a look at how you would fare betting using Mark Lawrenson's predictions where I was surprised to see you'd end up in profit.  I also revisited it earlier in the season and found he was still in the black for the 2013/14 season.

Now the season's done, it's possible to look to see how he performed over 380 games (well, 371 as 9 games were rearranged so don't have a prediction - or at least one made just prior to the match).  Figures use Lawro's predictions collated by My Football Facts, and odds data for Pinnacle (usually the best odds provider) from
I feel a bit daft looking at this considering I did the analysis last season and he was comfortably in profit last season as well but I chose to ignore it.
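For anyone wanting to try the same sort of check, the calculation is roughly as follows, assuming a flat 1 unit stake on the result (home/draw/away) implied by each predicted score. The three example bets are invented and `flat_stake_profit` is a hypothetical helper; this is a sketch of the approach rather than the exact working behind the figures here.

```python
def flat_stake_profit(bets, stake=1.0):
    """Profit and return on investment from flat staking.

    Each bet is (predicted_result, actual_result, decimal_odds_for_prediction),
    with results given as "H", "D" or "A".
    """
    profit = 0.0
    for predicted, actual, odds in bets:
        if predicted == actual:
            profit += stake * (odds - 1)  # winnings on top of the returned stake
        else:
            profit -= stake  # losing bet costs the stake
    total_staked = stake * len(bets)
    roi = profit / total_staked if total_staked else 0.0
    return profit, roi


# Invented example: two correct predictions and one wrong one.
example_bets = [("H", "H", 1.5), ("D", "A", 3.4), ("A", "A", 2.8)]
profit, roi = flat_stake_profit(example_bets)
print(f"profit {profit:.2f} units, ROI {roi:.1%}")
```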

There's no fancy stats behind Lawro's predictions.  An example of this is that for 17 of Chelsea's 19 home games Lawro went for a 2-0 win (for 1 other, v Cardiff, he pushed the boat out and went for 3-0, and the only points Lawro expected them to drop at home were in a defeat against Man City).

This means that things such as Man Utd's expected position based on his predictions look a bit daft, but they were probably built on the assumption that things would 'eventually' come good for them, so he kept predicting them to win.  Similarly, in almost any individual game you would probably expect Man City to win, but that gives you a predicted points total far above what is really likely.
Lawro only predicted 1 win for Cardiff over the whole season
It's easy to mock Lawro but he's proved again that he's capable of turning a profit over an extended period of time.

With the World Cup only a few weeks away there has been a rash of pre-tournament predictions, including one from Goldman Sachs which gives Brazil a 48.5% chance of winning compared to 25% from the bookies (or even less in reality once the margin is taken into account - total probabilities from bookies' odds will be over 100%).
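To show what 'over 100%' means in practice, here is a minimal sketch that converts decimal odds to implied probabilities and then scales them back so they sum to 100%. Proportional scaling is just one simple way of stripping out the margin, and the three prices are invented rather than real World Cup odds.

```python
def remove_margin(decimal_odds):
    """Return (fair_probabilities, overround) for one market.

    Implied probability is 1 / decimal odds; the overround is the amount by
    which those implied probabilities sum to more than 1.
    """
    implied = [1 / odds for odds in decimal_odds]
    overround = sum(implied)
    fair = [p / overround for p in implied]
    return fair, overround


# Invented three-outcome market.
fair, overround = remove_margin([1.9, 3.5, 4.2])
print(f"book sums to {overround:.1%}")
print([f"{p:.1%}" for p in fair])
```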
Goldman Sachs comparison of their probability and bookie odds
The Goldman Sachs document goes into more detail about the level of modelling/simulation etc. that has taken place, but for me giving a nearly 50% chance is ridiculous in a 32 team tournament where a side needs to get out of their group and then be victorious in 4 sudden-death games.

Looking at their % likelihood of getting past each stage, Goldman Sachs have Brazil at 91% likelihood of getting to the semis assuming they get to the quarters and an 80% chance of winning a final if they were to make it that far.
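One rough way to see why nearly 50% looks high is to treat the tournament as 5 hurdles (the group plus 4 knockout games) and ask what success rate you'd need at every one of them to end up with a given overall chance. Treating all 5 stages as equally hard is my simplification for illustration and not how the Goldman Sachs model works.

```python
def per_stage_rate(overall_probability, stages=5):
    """Success rate needed at each stage if every stage were equally hard."""
    return overall_probability ** (1 / stages)


for label, overall in [("Goldman Sachs", 0.485), ("bookies (before margin)", 0.25)]:
    print(f"{label}: {per_stage_rate(overall):.1%} per stage")
```

A 48.5% overall chance needs an average of about 86.5% at every stage, including the semi-final and final.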

Ultimately, whether through modelling, gut feeling or a mix of the two, profitability comes from finding value that the market hasn't appreciated, but there's no point in a complicated model if what comes out at the end seems unreasonable.  That said, this paragraph towards the end made me laugh considering the amount of description put into explaining the workings behind the model.

Wednesday, 28 May 2014

Ashenomics - Some figures behind Williams' contract talks

Pretty much nobody wants Ash to leave and there's a reasonable chance he'd be happy to stay so it would seem to be a fairly simple case of agreeing terms and everyone's happy.

In reality, things are far more complicated than that.  Ash turns 30 at the start of next season, so his next deal (whether a 2 year extension at the Swans or a 3 year deal elsewhere) will take him up to his 33rd birthday and will probably be the last 'big' one for him.

Below are some real back of a fag packet figures, but they show the kind of things he (and his agent) are likely to be thinking about.

Based on little other than speculation and guesswork, I'd imagine Ash is on somewhere around £40k a week and if he had 2+ years left on his contract it'd take offers of at least £10m+ for us to consider selling and I'd consider this his 'market value' if he was tied in to a long term contract.

I've seen rumours of Arsenal bidding £3m but at that price there's absolutely no financial point in selling (I'd understand the argument of letting him go for 'cheap' as a thank you for his service if he were looking to drop a couple of divisions to join his boyhood club or something, but not to join one of the richest clubs in the world).

There's also been the likes of Napoli mentioned, where he could sign a pre-contract in January to start there in the 2015/16 season.  For me this is an even bigger worry for Pablo than Ash, as obviously it's far easier to imagine a situation where Pablo is happy to run his contract down and go back to Spain than it is to imagine the upheaval of Ash moving to Italy.

The new TV deal that kicked in the season just gone meant that Swansea got £74.2m this season vs £47.6m in 2012/13 from Premier League money despite finishing lower in the league.  If income has gone up by more than 50% it wouldn't be surprising for players to want a rise that reflected this (or to have already asked for it at the start of last season), although once you give one player a big rise it risks opening the floodgates and it's not just one player's wages you're increasing.

Even forgetting the lure of Champions League football, there is a serious amount of money at play: extend at Swansea on an improved deal and get around £7m+ over the next 3 years, or run your contract down and get a signing on fee and improved wages for the following 2 years at a Champions League club (e.g., £2m Swans wages, £2m signing on fee then £3m p.a. wages giving £10m over the next 3 years).
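The fag packet sums above, written out so the numbers can be swapped for other guesses. All figures are in £m and are the same speculative ones used in the text; `run_down_total` is just a name made up for this sketch.

```python
def run_down_total(final_year_wages, signing_on_fee, new_annual_wages, years_at_new_club):
    """Total earnings (in £m) from seeing out the contract and then moving."""
    return final_year_wages + signing_on_fee + new_annual_wages * years_at_new_club


stay_total = 7  # rough guess at an improved 3 year deal at Swansea
move_total = run_down_total(
    final_year_wages=2,
    signing_on_fee=2,
    new_annual_wages=3,
    years_at_new_club=2,
)
print(f"stay £{stay_total}m, run down and move £{move_total}m, "
      f"difference £{move_total - stay_total}m")
```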

These figures are nothing but educated guesswork but I'd be surprised if Ash wouldn't be several million pounds better off over the course of the next 3 years by running his contract down.

I get the 'if he doesn't want to play for us, he can bugger off' argument but I can see why at the very least he's considering his options.  Some may say Scott Sinclair faced a similar situation and took the money, but I'd argue that in his situation it was a big leap to see himself as a regular first team starter at Man City (although I'd imagine to succeed as a footballer you have to have a pretty strong belief in your own ability).  I'd expect Ash to challenge for a first team place even if not a guaranteed starter.

Situations like these often descend into a game of bluff with accusation and counter accusation with a mixture of rumour, fact and speculation but from a negotiation point of view, every week that goes past strengthens Williams' hand.  I hope I'm wrong but given the sums of money involved I don't see anything happening on this before August (if not January).

Monday, 12 May 2014

Sunderland 1 - Swansea 3 Stats and Chalkboards

After the drabness of the Southampton game, it was good to see out the season on a high with this game arguably finished after 15 minutes with Swans scoring from their first two shots.
Sunderland might have outshot Swansea 20-8 but the two early goals meant there was far less need for Swansea to try and force the game.
Monk mentioned that 2-0 was a dangerous score but presumably he hasn't read this by Prozone's Omar Chaudhuri which shows that even when a team goes two up with 75 minutes left, they still win 90% of the time.

It was good to see Jay Fulton get a start.  There's definitely a feeling that Monk is planning on being here for the long term and there's a reasonable chance that there will be even more integration with other areas and not just the first team.  It'd be unfair to overly criticise Laudrup's tenure, but if you're only going to be around for a couple of seasons then areas beyond the fringes of the first team are less likely to be of concern.
Fulton's Pass Map
If rumoured figures are correct then the estimated £200k for Fulton and £175k for Adam King may have cost less than the wages for David N'Gog, who played a total of 42 league minutes for Swansea.

In some respects this is like questioning why you bothered buying home insurance as your house hasn't burnt down.

With injury problems for Michu, if anything had happened to Bony and N'Gog hadn't been bought, then you'd be left with either Emnes up front (which given his form since rejoining wouldn't have been a disaster) or Vazquez, who despite being hugely disappointing only played a total of 412 minutes in the league (of which only 3 were under Monk) and 747 minutes in all competitions, which equates to just over 8 full matches.

The season may be over but that means a chance to fill the long few weeks before Bony takes the World Cup by storm with some analysis of the season as a whole without the worry that your stats end up out of date after a couple of days.

I'm also going to be talking about the season in general (for all teams not just Swansea) at a 'Football and Data' event in London on May 21st.  If you fancy coming along use the code JACK to get a ticket for a fiver.

Twitter: @we_r_pl
Match Stats: Created using Statszone, Whoscored and Squawka

Saturday, 26 April 2014

Swans 4 - Villa 1 Match Stats and Analysis

That's it, We Are Premier League for another season.  Norwich and Fulham can't catch us and as things stand only 1 of Sunderland or Cardiff could mathematically get as high as 39 points so another season in the top flight is guaranteed.

Obviously this game will be remembered for a long long time for that finish from Shelvey (the amount it dips at the end is incredible, everything about it is absolutely perfect).

He's certainly not afraid to have a pop from distance, last season Swansea only scored two league goals from further out than the D but Jonjo has done it 4 times himself this season (along with another from the edge of the box v Fulham).
Left: All Swans goals in 2012/13 with only two outside box: Michu's v QPR on opening day and De Guzman's free kick against Stoke.  Right: Shelvey's league goals this season
Given the goal and Premier League survival it seems a bit churlish to be over analytical of the performance, but the fact that Swansea scored from their first two shots (and only had 3 overall in the first half) suggests that it was one of those days where things go your way.
Shots by Minute: Villa with the majority of shots in first half (although most from distance) but offered relatively little in the 2nd half
Detailed analysis can wait, I'm going to just watch that Shelvey gif a few more times...

Other Posts
Away Support: How Clubs spend the £200k Away Fans Fund
Match Predictions: Are you Smarter than Lawro 


Sunday, 20 April 2014

Newcastle 1 - Swans 2 Match Stats and Analysis

It's fair to say like most, I went through a fair range of emotions watching this game.  After around 30 minutes I was thinking 'At least it'll still be in our own hands at the Sunderland game', after an hour it was more 'A point here's a good result, beat Villa and we'll be safe' then at the end after an ice-cool finish by Bony it was 'We're staying up!'

Over a whole season, performance is the most important thing: luck comes and goes, but if you're putting in the work you should get your rewards.  When there's only 4 games left, however, the concept of 'the long run' goes out the window.  This was a game where the performance was good but not incredible, but the difference that injury time penalty makes is enormous.
Newcastle scored with their first shot and Swansea with both of their efforts on target
A nice stat from Infostrada Sports was that the last player in the Premier League to score in injury time in both the first and second half (as Bony did today) was Bony in the game against Man City.

Bony has now had a goal or assist in 11 of his 19 games since the start of the year (10 goals, 3 assists).

In terms of control of the game, this match swung back and forth, with Swansea having the better of the match early on but then conceding.  Newcastle also had a fair amount of possession in parts of the second half.
Passes over time, no dominant team with control changing throughout the game
In terms of individual performances, Bony's goals will deservedly get the headlines but I thought Routledge put a shift in again, especially tracking back and defending.  Emnes again did a tidy job, although I'm glad he got hacked down as I didn't have the greatest confidence in him being able to finish and it looked as if the ball was going to get stuck under his feet.

I felt earlier in the week that 36 points would be enough, but if Sunderland can beat Chelsea there may still be the odd twist left, although I would expect any side suddenly rising from the relegation zone to do so at the expense of Norwich or West Brom (or possibly even Villa).  A win against Villa will almost certainly make us mathematically safe (bar a 30 goal swing in goal difference over the last 2 games).


Sunday, 13 April 2014

Swansea 0 - Chelsea 1 Match Analysis

Any analysis of this game is largely academic as it became a case of Attack vs Defence after the first quarter of an hour once Chico had been sent off.  Both offences in isolation are yellow card fouls, but usually the ref will give an 'any more and you're off' warning to someone, especially so early in the game.

The second foul was very similar to Chico's foul on Barkley against Everton that led to a penalty: it's not a hugely dangerous area and he just gets sucked in.  It's the kind of situation where the risk associated with a mistimed challenge greatly outweighs the reward if the attempt is successful.

It's a strange situation given his influence last season that pretty much everyone seemed to want Michu not to start this match, but until he's match fit the team is arguably better off with Pablo playing the attacking midfielder role, and there are insufficient games left for him to play his way to fitness.  I'm hoping his ankle didn't suddenly start feeling sore yesterday after the starting line-up was announced, as the thought of relying on N'Gog from the bench is a scary one (I would have preferred Emnes/De Guzman to come on in that situation).
Shots by Minute: Post the sending off, Chelsea had 23 shots to Swansea's 5 with Swansea having only 2 efforts in the 2nd half (Routledge's shot and Bony's header from subsequent corner).
Possession wise it was even more one-sided, especially after the break and there was a 15 minute period just before the Chelsea goal where Swansea were under huge pressure and attempted only 6 passes.  This makes it even more galling to concede from a situation with a throw from fairly deep in the Chelsea half.
After an even 1st half (pass wise), apart from a little flurry after the goal Swans were unable to string any passes together
You can still get odds as big as 20/1 on Swansea going down (most bookies are offering around 10/1).  There are 4 games left where there is a reasonable chance of getting points, and with our goal difference 36/37 points would probably be enough, but going into the final game against Sunderland needing a result doesn't bear thinking about, not least because I'm starting to get images of Fabio Borini relegating us (thank God Ki won't be eligible to play as that'd be even worse).
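For a sense of what those prices imply, fractional odds convert to a probability as stake / (stake + winnings).  A quick sketch, ignoring the bookies' margin, so the true implied chance is a little lower than these figures:

```python
def fractional_to_probability(numerator, denominator=1):
    """Implied probability of fractional odds such as 20/1 (margin not removed)."""
    return denominator / (numerator + denominator)


for price in (20, 10):
    print(f"{price}/1 implies about {fractional_to_probability(price):.1%}")
```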

Positives from the game were Wayne Routledge's performance along with some great touches and backheels from Bony (I'm going to be a big Ivory Coast fan in the summer after lumping on him being Top Scorer at 500/1).
