I find the fan fascination with recruiting fascinating. While you’ll never hear me argue against recruiting’s importance – after all, the coaches put so much emphasis on it and they are the true experts – I also don’t subscribe to the theory that it is all about the Jimmy’s and Joe’s and not the X’s and O’s. I think, based on every detailed analysis I and others have done on recruiting, that coaching is the key factor in winning. Not the only factor, but the number one key.
The purpose of this analysis is not to explain every single variable that contributes to winning (SOS, Coaching, home field, randomness, etc.). The point is to isolate the discussion on recruiting across several dimensions. It is often helpful to isolate a variable in order to understand how it is part of a bigger system.
That being said, recruiting is strongly correlated with winning percentage. I analyzed the direct linear relationship between recruiting and winning percentage for 57 Power 5 teams from 2005 through 2017. I tallied up each year’s recruiting data. Then, I parsed each year for each team out along these dimensions: number of players in the year’s class, 3-star players in class, 4-star players in class, 5-star players in class, Blue Chip percentage (calculated by taking the percentage of 4- and 5-star players relative to all of the players recruited in a class), and the average rating of those players. Next, I averaged each team’s scores across each of those dimensions over the time span. First up, Blue Chip percentage (BCp):
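As a quick aside on the arithmetic, here is a minimal Python sketch of how a team's average BCp can be computed from yearly class counts. The field names and the two classes are made up for illustration; they are not rows from the dataset behind this article.

```python
from statistics import mean

# Hypothetical recruiting classes for one team (illustrative numbers only).
classes = [
    {"year": 2005, "size": 25, "four_star": 8, "five_star": 1},
    {"year": 2006, "size": 20, "four_star": 10, "five_star": 0},
]

def blue_chip_pct(cls):
    """Share of 4- and 5-star players among everyone in the class."""
    if cls["size"] == 0:
        return 0.0
    return (cls["four_star"] + cls["five_star"]) / cls["size"]

def average_bcp(team_classes):
    """Average the yearly BCp values over the whole time span."""
    return mean(blue_chip_pct(c) for c in team_classes)

print(round(average_bcp(classes), 2))  # 0.43 (9/25 = 0.36 and 10/20 = 0.50, averaged)
```

Note that this averages the yearly percentages, which matches the "average each dimension over the time span" approach described above. Pooling all players across years first would give a slightly different number.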
The scatterplot above shows the winning percentage for each team on the vertical (y) axis and the BCp on the horizontal (x) axis. A quick visual of this chart indicates that higher BCp is associated with more winning at the P5 level. It looks as if there is a strong positive linear relationship. Next, I added a fit line to the graph:
In this second chart, the line confirms the initial suspicion: as BCp goes up, winning tends to go up as well. The regression equation here shows that if you were to have, say, a BCp of 79%, the model would predict you to win about 78% of your games (y = 0.44 + 0.43 × 0.79 = 0.7797). Beyond that, the model was statistically significant (p < .001, α = 0.05, R = .699, R2 = .488). For the non-stats crowd, these numbers basically mean that there is less than a 1% chance of seeing a relationship this strong through random chance alone, and that about 49% of the variation in winning percentage in this sample is explained by BCp, with other unknown factors accounting for the other 51%. So, we have a strong positive relationship, and we know how much of the variation in winning BCp accounts for. So far, so good.
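For anyone who wants to plug in their own team's number, the fit line can be written as a tiny Python function. The intercept and slope are the rounded values quoted above; the function names are just for illustration.

```python
# Rounded coefficients from the regression equation quoted in the text.
INTERCEPT = 0.44
SLOPE = 0.43

def predict_win_pct(bcp):
    """Predicted winning percentage for a BCp given as a fraction (0.79 = 79%)."""
    return INTERCEPT + SLOPE * bcp

def variance_explained(r):
    """For a one-variable linear model, R squared is simply R times R."""
    return r ** 2

print(round(predict_win_pct(0.79), 4))     # 0.7797
print(round(variance_explained(0.699), 3)) # 0.489
```

Squaring the rounded R of .699 lands on roughly .489, which agrees with the reported .488 up to rounding.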
But there was something about this chart (look at the first one, without the line) that immediately caught my eye: there is an obvious curve in the lower quadrant. This lets us know that BCp’s relationship with winning is different for different teams. It looks to me like the strongest correlation occurs when a team is above 50% BCp or so. When we apply smoothing (LOESS), we can see this visually:
Things get loose in the 30-40% range. They look chaotic to me when BCp drops below 30%:
When BCp gets low, it only accounts for 15% of the variation in winning percentage (in this sample, which is 34 team averages over a 13-year period). Intuitively, this makes sense. How can blue-chip players help you win if you don’t have any? That doesn’t mean you can’t win:
That little guy way up there is Wisconsin. They’ve won 76% of their games with an average BCp of 17%. Props, Badgers. There’s a flip side to that as well… UCLA has had an average BCp of 50% while winning only 54% of their games on average. I’m sure things will get better with Chip running the show…
A Better Recruiting Metric
While BCp has a clear and strong relationship to winning percentage, the individual recruit rating (RR) using the 247 Composite is even better (R = .722, R2 = .522, p < .001, α = 0.05). Where the BCp model accounted for 48.8% of the variance and correlated with winning percentage at 69.9%, average rating accounts for 52.2% of the variance and is positively correlated with winning percentage at 72.2%. Here is that chart with a LOESS curve applied.
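For readers curious what a LOESS curve is actually doing, here is a minimal Python sketch. It is a simplified version (a weighted straight-line fit around each point using tricube weights, with no robustness passes), not the routine from the stats package that produced these charts. The function names, the `frac` setting, and the data points are all assumptions made up for illustration.

```python
def tricube(u):
    """Tricube kernel: weight 1 at the center, falling smoothly to 0 at |u| = 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0

def loess(xs, ys, frac=0.5):
    """Return a smoothed y for each x using local weighted linear fits."""
    n = len(xs)
    k = min(n, max(2, int(round(frac * n))))  # points considered per local fit
    smoothed = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        w = [tricube((x - x0) / h) for x in xs]
        # Weighted least-squares line through the neighborhood.
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed

# Hypothetical team averages: flat and noisy at the low end, steeper up top.
ratings = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80]
win_pct = [0.45, 0.52, 0.41, 0.50, 0.49, 0.55, 0.60, 0.66, 0.72, 0.80]
print([round(v, 3) for v in loess(ratings, win_pct)])
```

Because each point gets its own little regression, the curve is free to bend where the data bend, which is why it can show a relationship that is weak at the low end and stronger at the high end.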
An Even Better Model
Having looked at recruiting’s relationship to winning percentage along these two dimensions (Blue Chip percentage and recruit rating), I wanted to look at the variables that comprise these two dimensions. In this attempt, I used multiple linear regression (MLR), with winning percentage as the dependent variable. The independent variables used are the range averages of number of recruits in the class, 3-stars in class, 4-stars in class, and 5-stars in class. What I found was even better than the previous two simple linear models (all assumptions of the MLR were met).
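For the curious, a multiple linear regression of this kind can be sketched in plain Python. This is an ordinary least squares fit via the normal equations, not the output of the stats package used for this analysis. The helper names and the eight rows of data are hypothetical, chosen only to show the mechanics and the usual adjusted R2 formula.

```python
def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def adjusted_r2(r2, n, p):
    """Penalize R squared for using p predictors on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def fit_mlr(rows, y):
    """Least-squares fit. Returns (coefs, r2, adj_r2); coefs[0] is the intercept."""
    X = [[1.0] + list(r) for r in rows]
    n, cols = len(X), len(X[0])
    xtx = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(cols)]
           for i in range(cols)]
    xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(cols)]
    coefs = solve(xtx, xty)
    preds = [sum(c * v for c, v in zip(coefs, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return coefs, r2, adjusted_r2(r2, n, cols - 1)

# Hypothetical team averages: (class size, 3-stars, 4-stars, 5-stars).
teams = [(25, 18, 5, 0), (24, 12, 10, 1), (22, 6, 13, 3), (26, 20, 3, 0),
         (23, 10, 11, 2), (25, 15, 8, 1), (21, 4, 14, 3), (24, 16, 6, 0)]
win_pct = [0.45, 0.62, 0.78, 0.40, 0.66, 0.55, 0.81, 0.48]

coefs, r2, adj = fit_mlr(teams, win_pct)
print("R2:", round(r2, 3), "adjusted R2:", round(adj, 3))
```

Adjusted R2 is the fairer number to quote when a model has several predictors, since plain R2 can only go up as variables are added.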
The multiple correlation is R = .755, or 75.5% positive, and the model accounts for 54.6% of the variance in winning percentage (adj. R2). The table below shows how each variable scored:
All 57 Teams
Here is how all of the teams included stacked up.
Teams that were at or near the line generally performed as one would expect given their average RR. Since that chart is a bit cluttered, here are all the teams in list format:
|Team||Avg Rating||Average W%|