Home Court Advantage
Who has the best home court advantage in college basketball? Duke? Kansas? Gonzaga? This can make for a fun bar conversation, mostly because you can make an argument for so many of the top teams. These arguments are typically qualitative - “Duke’s student section is the best in the country” or “Allen Fieldhouse is insanely loud. I mean seriously, insanely loud.” Since I have the box score data from every regular season college basketball game in the past seven years, I’m going to take a more statistical approach.
To start, let’s try to quantify how many points playing at home is actually worth. As usual, we start with the standard imports.
Next we define a quick helper function to make it easier to import the data, which is stored in a separate file for each year. We then pull each year’s data into a separate frame, and finally concatenate them all.
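The notebook's loading code isn't reproduced on this page, so here is only a minimal sketch of the pattern described above. The helper names `load_season` and `load_seasons` and the file-name template `box_scores_{year}.csv` are hypothetical placeholders, not the real paths.

```python
import pandas as pd


def load_season(year, template="box_scores_{year}.csv"):
    """Read one season's file and tag every row with its season.

    The file-name template is a placeholder, not the notebook's real path.
    """
    frame = pd.read_csv(template.format(year=year))
    frame["year"] = year
    return frame


def load_seasons(years, template="box_scores_{year}.csv"):
    """Load each season into its own frame, then stack them into one."""
    frames = [load_season(year, template) for year in years]
    return pd.concat(frames, ignore_index=True)
```

Tagging each row with its season before concatenating keeps the per-year grouping available later on.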
Now for some actual analysis: for each Division I team, let’s go through and find the difference in average points per game (ppg) between home and away games. The raw data does contain some non-Division I teams, so we have to be careful to exclude them or we’ll skew our results. Lower-tier teams are much more likely to play away from home, and much more likely to get blown out; taken together, these two facts could make away games seem worse than they actually are. We’ll generate a list of every team that shows up in the data, and then filter out those that only show up a few times.
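A rough sketch of that filter-then-compare step might look like the following. It assumes one row per team per game with hypothetical columns `team`, `location` (`'home'` or `'away'`) and `points`; the `min_games` cutoff of 10 is likewise an illustrative guess, not the threshold used in the notebook.

```python
import pandas as pd


def home_away_ppg_diff(games, min_games=10):
    """Return (home ppg - away ppg) per team, skipping rarely seen teams.

    Assumes one row per team per game with hypothetical columns
    'team', 'location' ('home' or 'away') and 'points'.
    """
    # Teams that appear only a handful of times are likely non-Division I.
    counts = games["team"].value_counts()
    regulars = counts[counts >= min_games].index
    kept = games[games["team"].isin(regulars)]

    # Average points per team, split by location, one column per location.
    ppg = kept.groupby(["team", "location"])["points"].mean().unstack("location")
    return (ppg["home"] - ppg["away"]).dropna()
```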
Let’s take a quick look at the distribution of (home points per game) - (away points per game).
Mean : 5.8327450862
StdDev: 4.52692408694
At first glance it looks like home court advantage is worth about 5.8 \(\pm\) 4.5 points. However, we’re actually ‘double counting’ the advantage. If we make the simplifying assumption that home court advantage is the same for all teams and arenas (which it certainly isn’t, but it should work to zeroth order), then we can say \[ \begin{aligned} \text{home score} &= \text{neutral score} + \Delta \\ \text{away score} &= \text{neutral score} - \Delta \end{aligned} \] so that \[ \text{home score} - \text{away score} = 2 \Delta \]
The real situation is obviously more complicated than this, but even with our simple analysis, halving the measured difference gives a home court advantage of 2.9 \(\pm\) 2.3 points, which is pretty close to the consensus of ‘around four points’ (link, link, link).
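As a quick check of the halving step, here is the arithmetic applied to the mean and standard deviation printed earlier. Halving the spread along with the mean follows from the same simplifying assumption that the measured difference is exactly twice the advantage.

```python
# Figures printed above: mean and standard deviation of (home ppg - away ppg).
mean_diff = 5.8327450862
std_diff = 4.52692408694

# home - away = 2 * delta, so both the mean and the spread are halved.
delta = mean_diff / 2
delta_spread = std_diff / 2

print(f"{delta:.1f} +/- {delta_spread:.1f}")  # prints "2.9 +/- 2.3"
```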
This is a pretty neat aggregate result, but we haven’t said anything about the advantage conferred by any individual arena yet. One possible approach is to see which teams have the largest spread between home and away ppg, and attribute that to the advantage of the home court. However, there are a lot of factors besides the advantage of the arena itself that can contribute to a team having better stats at home than on the road. For example, high caliber programs often play a lot of lower-tier teams at the beginning of the season to tune things up a bit. These types of games are almost always played at the higher-tier team’s home court, which inflates the home statistics of some teams and deflates the away statistics of other teams.
Let’s check if measuring which teams score more points at home than away is a good proxy for the advantage conferred by the team’s home arena. If it is, then an ordered list of teams, ranked by how much better they are at home than away, should be somewhat consistent year to year. If the rankings aren’t particularly correlated from year to year then this is probably not a good way to measure home court advantage.
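One way to build such a year-by-year ranking table is sketched below. It assumes a long-format frame with hypothetical columns `year`, `team` and `diff` (home ppg minus away ppg for that season); tied teams share an average rank, which is why fractional ranks can appear.

```python
import pandas as pd


def yearly_rankings(diffs):
    """Rank teams within each season by home-minus-away ppg (1 = largest gap).

    Assumes hypothetical columns 'year', 'team' and 'diff'.
    """
    ranked = diffs.copy()
    # Rank inside each season; ties receive the average of their positions.
    ranked["rank"] = ranked.groupby("year")["diff"].rank(ascending=False)
    # One row per team, one column per season.
    table = ranked.pivot(index="team", columns="year", values="rank")
    table["mean"] = table.mean(axis=1)
    return table.sort_values("mean")
```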
year              2009  2010  2011  2012  2013  2014   2015       mean
team
Quinnipiac          30   180     6     2     8     3   10.0  34.142857
St Francis (BKN)     3    20   117    74    19    12   57.0  43.142857
Rider               43    14    23   172     3    25   40.5  45.785714
LBSU                47   151    33     6    41    28   58.0  52.000000
Manhattan            7    50    80    17    92    14  118.0  54.000000
Not looking too good so far - when I think ‘intimidating home court’ the first name that springs to mind is definitely not Quinnipiac. Let’s go ahead and plot a heatmap showing the Spearman’s rank correlation coefficient between the rankings for each year. We’ll also plot a scatterplot with the rankings from two different years so we can get a more visual sense of the correlation.
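For reference, the correlation matrix behind such a heatmap can be computed along these lines. This is a sketch rather than the notebook's code: it uses the fact that Spearman's coefficient is the Pearson correlation of the ranks, and it assumes a table with one row per team and one column per season. Teams with a missing season are dropped first so every pair of years is compared on the same set of teams.

```python
import pandas as pd


def spearman_matrix(rank_table):
    """Spearman correlation between every pair of columns.

    Spearman's coefficient is the Pearson correlation of the ranks, so we
    rank each column (ties share an average rank) and correlate the result.
    """
    complete = rank_table.dropna()
    return complete.rank().corr(method="pearson")
```

The resulting square frame can be handed to any heatmap plotting routine.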
Definitely not very correlated. It looks like measuring which teams perform better at home than away does not give very consistent results year to year, so it’s probably not a good proxy for the advantage conferred by individual arenas.
Time for a new plan: let’s compare how visiting teams do at each arena to how they did on average in away games. If, at a given court, the visiting teams score 10 points fewer than they do on average in away games, that might be a better indicator of the actual advantage of the home court. We can do this for both offense and defense, i.e. we can compare visiting points per game and visiting points allowed per game to away averages.
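A possible implementation of this comparison is sketched here, assuming one row per team per game with hypothetical columns `team`, `opponent`, `location`, `points` and `opp_points`. Note that in this simple version each visitor's away average includes the game at the arena being evaluated.

```python
import pandas as pd


def visitor_effect(games):
    """Per host: how visitors' scoring there compares with their away average.

    Assumes one row per team per game with hypothetical columns 'team',
    'opponent', 'location' ('home' or 'away'), 'points' and 'opp_points'.
    Negative 'ppg_vs_avg' means visitors score less than usual at that arena;
    positive 'papg_vs_avg' means they allow more than usual.
    """
    away = games[games["location"] == "away"].copy()

    # Each visiting team's averages across all of its away games.
    avg = away.groupby("team")[["points", "opp_points"]].transform("mean")

    away["ppg_vs_avg"] = away["points"] - avg["points"]
    away["papg_vs_avg"] = away["opp_points"] - avg["opp_points"]

    # The host of an away game is the visitor's opponent.
    return away.groupby("opponent")[["ppg_vs_avg", "papg_vs_avg"]].mean()
```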
As before, we’ll plot a correlation matrix comparing all the years, and a scatter plot comparing two of the years. We’ll do this for both visiting team points per game and points allowed per game.
Alright, that’s quite a bit better than the first attempt! However, home team quality is a pretty big confounding variable here - take the 2015 Kentucky team for example. Kentucky was unanimously considered the most talented team in the country in 2015, so it stands to reason that visiting teams both scored fewer points and allowed more points than their average when visiting Kentucky, just because they were playing Kentucky. This doesn’t really speak to Kentucky’s home court advantage, just the quality of the team.
To mitigate the effect of this confounding variable, let’s combine the information from each factor we’ve considered so far (PPG and PAPG vs away averages, for both the home and visiting teams). Dominant teams shouldn’t be too affected by their opponents’ play, so differences in PPG / PAPG between their home and away games are more likely to reflect a home court advantage. Likewise, weaker teams shouldn’t affect their opponents’ play too much, so differences in visiting team PPG / PAPG are also more likely to reflect a real home court advantage.
We’ll combine the rankings in each category by taking a simple average.
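The averaging step itself is short. The sketch below assumes a frame for a single season, indexed by team, with one column per metric, where each column has already been oriented so that a larger value means a bigger home advantage; that orientation is an assumption of this example.

```python
import pandas as pd


def combined_ranking(metrics):
    """Average the per-metric ranks for one season (1 = strongest home court).

    Assumes 'metrics' is indexed by team with one column per metric, each
    already oriented so that a larger value means a bigger home advantage.
    """
    ranks = metrics.rank(ascending=False)
    return ranks.mean(axis=1).sort_values()
```

Averaging ranks rather than raw values keeps any one metric with a wide numeric range from dominating the result.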
Not too bad! It’s especially encouraging that the correlation seems to reach a somewhat steady state after a few years. I’m inclined to take this as evidence that we’re actually measuring a (mostly) static property of the arena, and not the temporary effect of a particularly good team one year.
To recap, we calculated, for each team, four different metrics that might reflect home court advantage:
- the difference between the team’s average PPG at home and on the road
- the difference between the team’s average PAPG at home and on the road
- the difference between visiting teams’ average away PPG and their PPG at this team’s home court
- the difference between visiting teams’ average away PAPG and their PAPG at this team’s home court
For each season, we ranked each team according to each of these metrics and then averaged the four rankings. This gave us a ranking of each school’s home court advantage. Finally, we looked at the Spearman’s rank correlation coefficient between the ranked lists for each pair of seasons, which is what is plotted above. The rankings are moderately correlated year to year, even after six years, which implies that we’re seeing something that persists on that time scale. Team quality fluctuates quite a bit over six years, so I’d say we captured at least some information about the real home court advantage.
No analysis of home court advantage would be complete without listing the Top-N, so here are the 15 best home courts according to our model:
VCU 1
Louisville 2
Michigan State 3
Georgetown 4
Gonzaga 5
Villanova 6
Wichita State 7
Temple 8
Syracuse 9
Kansas 10
Kentucky 11
Pitt 12
BYU 13
UNLV 14
Wisconsin 15
And in case you were wondering, Cornell comes in at 161 (no joke).
You can find the IPython notebook for this page here, the box score data here, and the game score data here.