As a card-carrying member of the football and Celtic statistics nerd community since 2019, I have been able to lean on over 25 years of experience in using statistical models in other domains to assist with my learning curve.
That experience informs me that the old aphorism ‘all models are wrong but some are useful’ is generally correct. Part of learning how to make statistics and associated models useful is to understand how they are built, including their strengths and weaknesses, what questions they are built to try to answer and how good or bad they may be at providing answers.
My goal with this column, therefore, is to use a comparison of various statistics and models for Oh Hyeon-gyu and Kyogo Furuhashi to better explore these concepts. The first glaring issue with this comparison is the limited sample size for Oh, which is an important consideration in any analysis.
Here is an attack-oriented radar for the two strikers:
Expected goals (xG) is a model of which there are many variations depending upon the data vendor.
My own assessment is that StatsBomb’s is currently the most accurate available among public vendors as it takes more vital variables into account. Measuring xG is an attempt to model the relative probabilities of a shot being a goal or, as it is often referred to, the quality of a chance.
StatsBomb’s model incorporates things like the positioning of opposition defenders and goalkeepers and, for aerial crosses, the height of the ball when the attacker makes contact to shoot.
For example, here is their graphic of the moment of Kyogo’s headed shot just after the 30-minute mark versus Hibernian on March 18 2023:
We can see from the image that their model assigned an xG value of 0.12 while Wyscout, whose model incorporates fewer variables, assigned an xG value of 0.28.
With the SPFL Premiership having one of the highest rates of shots taken from aerial crosses in Europe, this sort of difference between models is vitally important to incorporate into one’s analysis.
However, xG is just one of many statistics and models. The first report above also includes open play xG assisted, or what other vendors often call expected assists. It credits the player who delivered the pass that immediately preceded a shot.
Again, with the caveat that Oh’s sample size is far too small to draw any material analytical conclusions, based solely upon his 0.16 xG assisted compared to Kyogo’s 0.07, some could reasonably see the disparity and state that Oh is the ‘more creative’ player.
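To make the crediting rule concrete, here is a minimal Python sketch of how open play xG assisted could be tallied from event data. The shot records, player labels and field names (`passer`, `open_play`) are invented for illustration and are not StatsBomb's actual data format.

```python
from collections import defaultdict

# Hypothetical shot events. "passer" is the team-mate whose pass immediately
# preceded the shot (None if there was no such pass). All values are made up.
shots = [
    {"shooter": "Striker A", "xg": 0.10, "passer": "Winger B", "open_play": True},
    {"shooter": "Striker A", "xg": 0.25, "passer": "Winger B", "open_play": False},  # set piece
    {"shooter": "Winger B", "xg": 0.30, "passer": "Striker A", "open_play": True},
    {"shooter": "Striker A", "xg": 0.05, "passer": None, "open_play": True},
    {"shooter": "Midfielder C", "xg": 0.08, "passer": "Winger B", "open_play": True},
]


def open_play_xg_assisted(shot_events):
    """Sum the xG of each open-play shot onto the player who made the final pass."""
    totals = defaultdict(float)
    for shot in shot_events:
        if shot["open_play"] and shot["passer"] is not None:
            totals[shot["passer"]] += shot["xg"]
    return dict(totals)


totals = open_play_xg_assisted(shots)
for player, value in sorted(totals.items()):
    print(f"{player}: {value:.2f}")
```

Note that the passer is credited with the quality of the chance regardless of whether the shot is scored, which is why the metric says something about creation rather than finishing.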
StatsBomb’s data is event-driven, meaning it does not track things like the sprint speeds of players or GPS-captured positioning of players through time. Rather, as shown in the graphic of Kyogo’s shot, we can think of StatsBomb’s data as more like a series of still photos than a fluid motion picture.
Comparing the two players across the various event statistics offers additional information. For example, Oh’s volume of passes in the opponent’s final third and touches in the box have been nearly double the rate of Kyogo’s but he also turns the ball over more and has been dispossessed at a dramatically higher rate.
So Oh has been on the ball far more, creating more chances for team-mates, but those actions have come at a ‘cost’ of sorts. How could one throw all this into a mixer to try to figure out whether it’s all been ‘worth it’? Enter on-ball value (OBV):
The goal of the OBV model is to assign value to every single event during a game through the currency of xG. To put it simply, how much does each player event increase or decrease the probabilities of either creating a chance or conceding a chance?
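As a rough illustration of that idea, and emphatically not StatsBomb's actual model, the sketch below values an event as the change it causes in a team's chance of scoring minus the change in its chance of conceding. The `GameState` class, the function names and every probability are assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    """Hypothetical snapshot of a possession, from one team's point of view."""
    p_score: float    # assumed probability that this team scores next
    p_concede: float  # assumed probability that this team concedes next


def state_value(state):
    """Net value of a state in xG terms: chance of scoring minus chance of conceding."""
    return state.p_score - state.p_concede


def event_value(before, after):
    """Value of an event: how much it moved the net value of the state."""
    return state_value(after) - state_value(before)


# A pass that moves the ball into the box: scoring chance rises, risk unchanged.
good_pass = event_value(GameState(0.02, 0.01), GameState(0.06, 0.01))

# A turnover in the final third: scoring chance vanishes, risk of conceding rises.
turnover = event_value(GameState(0.04, 0.01), GameState(0.00, 0.03))

print(f"pass: {good_pass:+.2f}, turnover: {turnover:+.2f}")
```

Under this toy framing, a player who attempts many ambitious actions can still end up negative overall if the cost of the failed ones outweighs the gain from the successful ones.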
I believe this framework is a very useful one, if still flawed like all models. We can see from three of Oh’s OBV components (pass, defensive, and dribble & carry) that this high-action playing style as compared to Kyogo has not paid off per the OBV model.
For example, despite his higher degree of chance creation for team-mates, the negative aspects of turning the ball over in those efforts have worked out to a -0.09 pass OBV. Similarly, his higher-action defending, reflected in a much higher tackling and interception rate than Kyogo, is offset by more than double the rate of fouls. The picture is similar with his higher volume of dribbles and carries being offset by being dispossessed at more than five times Kyogo’s rate.
Of course, none of this is intended as some sort of declaration on Oh – the sample size is way too small. This comparison is intended to try to display the potential value of using the OBV model as part of a broader and robust analytical process.
Prior to the launch of OBV by StatsBomb, one way in which I attempted to back into similar conclusions was to look at various ratios to try to deduce things like player IQ.
For example, a relatively high average xG per shot taken, average xG assisted per key pass and ratio of key passes to turnovers could all be indicative of a player who makes smart decisions.
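Those ratios are simple to compute once the season totals are in hand. The Python sketch below shows one way to do it; the function names and the input figures are hypothetical and do not belong to either player.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def decision_ratios(xg, shots, xg_assisted, key_passes, turnovers):
    """Three rough proxies for decision-making built from season totals."""
    return {
        "xg_per_shot": safe_ratio(xg, shots),
        "xg_assisted_per_key_pass": safe_ratio(xg_assisted, key_passes),
        "key_passes_per_turnover": safe_ratio(key_passes, turnovers),
    }


# Invented totals for an imaginary player.
example = decision_ratios(xg=3.0, shots=20, xg_assisted=1.2, key_passes=10, turnovers=25)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Returning `None` for a zero denominator is a deliberate choice here: with small samples a player may have no key passes or turnovers at all, and a missing value is more honest than a misleading zero.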
The OBV model and its components provide an additional set of tools in order to attempt to measure – and potentially answer – that same question. For example, the shot OBV component measures two things:
1. What was the difference between the xG of a shot and its post-shot xG?
2. Did the decision to take the shot increase the chances of scoring relative to what the available passes to team-mates would have offered?
As a hobbyist with limited access to StatsBomb data, I do not have the ability to see how each of these two contributes to the metric, so I am left trying to deduce it. Once again using Oh’s data solely to explain the analytical process, his shot OBV of -0.13 is relative to having a total xG of 2.02 and post-shot xG of 2.26 on his grand total of 12 shots so far.
On those 12 shots (again, a ridiculously small sample) he has marginally improved the overall chance of scoring via his finishing, which should be a modest positive relative to shot OBV.
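The arithmetic behind that statement can be laid out in a few lines of Python using only the totals quoted above. The variable names are my own, and the per-shot averages are a simple division rather than anything the vendor reports.

```python
# Totals quoted in the column for Oh's 12 shots.
total_xg = 2.02
total_post_shot_xg = 2.26
shots = 12

# Positive means the shots, as struck, were worth more than the chances themselves.
finishing_delta = total_post_shot_xg - total_xg

# Simple per-shot averages.
xg_per_shot = total_xg / shots
delta_per_shot = finishing_delta / shots

print(f"finishing delta: {finishing_delta:+.2f}")
print(f"xG per shot: {xg_per_shot:.2f}")
print(f"delta per shot: {delta_per_shot:+.2f}")
```

The finishing gain works out to roughly 0.24 xG in total, or about 0.02 per shot, which is why it can only be a modest positive input to shot OBV.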
However, given that the metric is severely negative at -0.13, I think it is a reasonable deduction that his decision-making has not been very good and is worthy of further inquiry. To use two extreme examples in his ridiculously small number of shots, here was one of his 12 versus Hearts on March 8:
With an xG of 0.02, with two defenders directly in front of him and a keeper fairly well positioned, almost any pass to a team-mate would likely have been a better decision here.
Using a different angled image, here was a shot off a throw-in late in the April 2 match at Ross County:
Oh did a terrific job creating space for himself to receive the throw-in by shielding off a defender but, as he turned, he was confronted with another and the result was an aimless deflected shot into the arms of the keeper.
As we can see from the graphic, Sead Haksabanovic was wide open in space in a dangerous central location and available to receive a pass. While an xG of 0.12 is not a poor quality chance, the probability that Haksabanovic would have had a far higher quality one, had Oh chosen to pass, is likely to have made a negative contribution to his shot OBV metric.
The post-shot xG of 0.62 actually highlights one weakness within the model used to generate that probability, which is to say it does not account for any shot characteristics (e.g. whether the shot was hit solidly or mis-hit). Seeing that value after watching the video of the shot could understandably lead a person to dismiss the model altogether.
StatsBomb is in the process of further enhancing the model to account for various shot characteristics, though, and I expect this one to be an example that is revised much lower as a result.
While no one should draw any analytical conclusions about Oh as a player from this exercise, it offers a window into the framework that I will deploy to monitor his progress over the coming weeks, months and, hopefully, years.
A vital part of that framework is to figure out how to make all the flawed tools available useful.