Correlation and Causality: where are your real areas for improvement?


Ask any statistician and they will tell you that one of the biggest hurdles in data analysis is confusing correlation with causality. Mixing up these two ideas can lead you to focus on the wrong thing, wasting practice time without yielding any improvement. Let's get our definitions in order first.

Correlation - a mutual relationship between two stats; that is, they tend to vary together.

Causality - one stat directly affects a second stat; that is, a change in the first causes the change in the second.

Let's get a little less abstract and look at two stats to put this into perspective: Score to Par and GIR Percent. These two stats tend to move together, which makes sense. If you have a high GIR Percent, your Score to Par should be pretty good as well. The opposite is also true: if you have a low GIR Percent, your Score to Par will tend to be higher. There is a correlation between these two stats, but is there a causal relationship? While a player's Score to Par may change because of their GIR performance, that isn't always the case. A good up and down can make up for a missed green in regulation and still save par. Looking only at Score to Par and GIR Percent, you could assume that a player should work on their GIR shots to improve Score to Par, but that may not be the whole story.

The first place I would check is the Performance DNA (or strokes gained, if your stat program has it) to see where strokes are being gained and lost. It may turn out that the player's driving performance is the real issue, causing both the GIR Percent and the Score to Par to be less than desirable. If that is true, it would show that driving performance has a causal relationship (causality) with both GIR Percent and Score to Par, while GIR Percent and Score to Par are merely correlated with each other.
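To make the driving scenario concrete, here is a minimal sketch in Python using made-up numbers rather than real player data. It assumes a hypothetical "driving quality" value per round and builds GIR Percent and Score to Par so that each depends only on driving plus random noise. That is a deliberately extreme version of the scenario above, and every coefficient in it is an assumption chosen for illustration. The sketch then compares the plain correlation between GIR Percent and Score to Par with the correlation that remains once the effect of driving is removed from both.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(42)  # fixed seed so the simulation is repeatable
rounds = 5000

# Hypothetical driving quality per round (0 = average, higher = better).
driving = [rng.gauss(0, 1) for _ in range(rounds)]

# Assumption: in this toy model both stats depend ONLY on driving plus noise.
gir_percent = [60 + 8 * d + rng.gauss(0, 5) for d in driving]
score_to_par = [2 - 1.5 * d + rng.gauss(0, 1.5) for d in driving]

# Plain correlation: the two stats move together (high GIR, low score).
raw = pearson(gir_percent, score_to_par)

# Correlation after removing what driving explains from each stat.
controlled = pearson(residuals(gir_percent, driving),
                     residuals(score_to_par, driving))

print(f"GIR vs Score to Par:                  {raw:.2f}")
print(f"GIR vs Score to Par, driving removed: {controlled:.2f}")
```

In this toy model the first number comes out clearly negative, while the second sits close to zero: the two stats are correlated only because driving moves both of them. Real rounds are messier, since GIR likely has some direct effect on scoring as well, but the same comparison is one way to test which scenario the numbers support.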

When looking to improve skills, it's important to look deep enough to see what is causing the performance gaps and what is merely correlated. This means digging into the stats far enough to see the whole picture, and using the numbers to test different possible scenarios to see which is most likely. Always look for the cause instead of assuming it; a stat may turn out to be correlated and not causal.
