The recruitment of highly rated high school football players by college programs is big business. Major college football programs spend considerable sums in their efforts to recruit the most talented high school players. Blue-chip players, those rated by scouting services as 4- or 5-star prospects, are relatively rare. As of October 10th, 2019, there were 347 Blue-Chip prospects according to the Composite 247 Ratings, the established industry standard for prospect scouting. There are approximately 1,006,013 high school football players in the United States (https://www.statista.com/statistics/267955/participation-in-us-high-school-football/).
Some basic math lets us know that Blue-Chip players make up far less than 1% of all players (347 out of 1,006,013 is roughly 0.03%). While validation for the rating methodology appears to be scant, the fact that so many big-time programs are recruiting these players offers logical support for the conclusion that these are indeed the better prospects. Coaches obviously want the best players, as that makes winning easier. The goal of this paper is to explore the degree to which recruiting success plays into winning on the field.
Among college football fans, it is generally presumed that there is a direct (i.e. linear) relationship between recruiting Blue-Chip players and winning. In popular college football blogs and multiple articles, there are countless “analyses” of how “it is all about the Jimmys and Joes and not the X’s and O’s”. Many die-hard fans appear to treat their team’s recruiting ranking as a sure-fire indicator of whether the team is good or bad, or even whether the coach should be fired. The reality is that recruiting does not have a linear relationship to winning. It does, however, usually have a significant impact on winning. This analysis found that in 3 of the 5 Power 5 conferences (the most powerful conferences in college football), on-roster talent was not statistically significantly correlated with winning percentage.
To conduct this study, roster talent ratings were obtained from the Composite 247 website for the 50 most talented teams in each season from 2016 to 2019 (as of October 25th, 2019). Each team’s corresponding win percentage was then recorded. Each team was coded according to year and talent ranking. For example, the highest-rated team from 2016 was assigned a code of 2016_1, the 30th most talented team in 2018 had a code of 2018_30, and so on. Descriptive statistics and exploratory data analysis showed some interesting things at the conference level.
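The coding scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names `team_code` and `code_season` and the ratings shown are made up, not taken from the study's data.

```python
def team_code(year: int, talent_rank: int) -> str:
    """Build a code such as '2016_1' from a season and a talent rank."""
    return f"{year}_{talent_rank}"


def code_season(year, ratings):
    """Rank teams by rating (highest first) and attach a code to each.

    `ratings` maps team name -> roster talent rating for one season.
    Returns a dict mapping code -> team name.
    """
    ordered = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
    return {
        team_code(year, rank): team
        for rank, (team, _rating) in enumerate(ordered, start=1)
    }


# Made-up ratings, for illustration only.
example_ratings = {"Team A": 91.2, "Team B": 95.6, "Team C": 88.0}
codes = code_season(2016, example_ratings)
print(codes)
```

Keeping the year in the code means the same program can appear up to four times in the sample, once per season, without collisions.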
The table above shows the number of times each conference’s teams were represented and their mean (average) roster talent score. Over the last four years, the ACC, with 14 teams, has 56 possible representatives in the data set (top 50 most talented teams over 4 years). It is represented 41 times (73%). The Big 12, with 10 teams (what does the ’12’ stand for again?), has 26 out of 40 possible representatives (65%). The Big 10 (14 teams) has 39 out of 56 (70%), and the PAC 12 (which actually has 12 teams, crazy) has 35 out of 48 (73%). The SEC has 55 out of 56 (98%). So, in terms of sheer quantity, the SEC has the most talented teams.
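The representation percentages above are simple ratios, and the arithmetic can be checked with a short script. The team counts and appearance counts come from the text (the SEC's 14 teams are implied by its 56 possible slots); the variable and function names are illustrative.

```python
SEASONS = 4  # 2016 through 2019

# conference -> (number of teams, appearances in the top-50 lists)
conferences = {
    "ACC": (14, 41),
    "Big 12": (10, 26),
    "Big 10": (14, 39),
    "PAC 12": (12, 35),
    "SEC": (14, 55),
}


def representation_rate(teams, appearances, seasons=SEASONS):
    """Share of possible team-seasons that landed in the top 50."""
    return appearances / (teams * seasons)


rates = {}
for name, (teams, appearances) in conferences.items():
    rates[name] = representation_rate(teams, appearances)
    possible = teams * SEASONS
    print(f"{name}: {appearances}/{possible} = {rates[name]:.0%}")
```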
The mean roster talent ratings are next. The Big 12 was the least talented group at 86.2, but virtually in a 3-way tie with the ACC and Big 10. Again, the SEC leads the way. Knowing there are some very talented teams in the other conferences (non-SEC), I wanted to look at how that talent is distributed among the conferences.
The above graph is a histogram for each conference. Not surprisingly, all of the other conferences are multi-modal whereas the SEC is approximately normally distributed. What this means is that the other conferences have more than one peak and valley, whereas the SEC is generally more bell-shaped (though quite a bit flatter here; low kurtosis but decent skewness). While there are certainly the ‘haves’ (Alabama, Georgia, LSU) and ‘have nots’ (Vanderbilt, Arkansas) in the SEC, the separation between the ‘haves’ and ‘have nots’ appears to be greater in the other conferences.
A regression analysis was then conducted for the entire sample. A simple linear regression found a statistically significant correlation between roster talent and win percentage (p < 0.001, R² = 0.178).
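For readers who want to see what sits behind an R² figure like the one above, here is a minimal sketch of a simple linear regression written with the Python standard library. The talent and win-percentage values are made up for illustration, so the output will not match the study's R² of 0.178, and the function names are not the author's. The sketch does not compute a p-value; a statistics library such as SciPy (`scipy.stats.linregress`) reports one alongside the slope and intercept.

```python
from statistics import mean


def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(y, predicted):
    """Share of the variance in y that the predictions account for."""
    y_bar = mean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up roster talent ratings and win percentages, for illustration only.
talent = [82.0, 84.5, 86.0, 88.5, 90.0, 92.5, 95.0]
win_pct = [0.42, 0.58, 0.45, 0.50, 0.75, 0.62, 0.85]

slope, intercept = linear_fit(talent, win_pct)
predicted = [intercept + slope * t for t in talent]
r2 = r_squared(win_pct, predicted)
print(f"slope={slope:.4f} intercept={intercept:.4f} R2={r2:.3f}")
```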
While the statistical significance indicates the correlation between roster talent and winning is unlikely to be a result of chance, the low R-squared value (a measure of effect size) indicates that roster talent explains only a small share of the variation in winning percentage. It also suggests that the relationship may not be exactly linear. This was suspected to be the case, as roster talent may not have the same effect in each conference. Therefore, a polynomial regression analysis of the same data was conducted and found to be an improvement over the linear regression (p < 0.001, R² = 0.206).
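The linear versus polynomial comparison can be sketched as follows. The article does not state the degree of the polynomial, so degree 2 is an assumption here, as are the data values and the helper names. The fit solves the least-squares normal equations directly so that the example needs only the standard library; in practice a routine such as NumPy's `polyfit` would do the same job.

```python
from statistics import mean


def solve_linear_system(a, b):
    """Solve a small square system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (m[row][n] - tail) / m[row][row]
    return solution


def poly_fit(x, y, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + ..."""
    size = degree + 1
    a = [[sum(xi ** (i + j) for xi in x) for j in range(size)] for i in range(size)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(size)]
    return solve_linear_system(a, b)


def predict(coeffs, xi):
    return sum(c * xi ** k for k, c in enumerate(coeffs))


def r_squared(y, predicted):
    y_bar = mean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up values, for illustration only.
talent = [82.0, 84.5, 86.0, 88.5, 90.0, 92.5, 95.0]
win_pct = [0.42, 0.58, 0.45, 0.50, 0.75, 0.62, 0.85]

# Centering the predictor keeps the normal equations well conditioned.
center = mean(talent)
x = [t - center for t in talent]

scores = {}
for degree in (1, 2):  # degree 2 is an assumed choice
    coeffs = poly_fit(x, win_pct, degree)
    fitted = [predict(coeffs, xi) for xi in x]
    scores[degree] = r_squared(win_pct, fitted)
    print(f"degree {degree}: R2 = {scores[degree]:.3f}")
```

One caveat worth keeping in mind: on the same data, a higher-degree polynomial can never produce a lower in-sample R² than the straight line, because the line is a special case of the polynomial. A modest gain in R² is therefore expected and is not, by itself, strong evidence of curvature.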
The polynomial regression was an improvement over the linear regression; however, the findings still indicate that roster talent has a weak correlation to winning on the field. At this point, it was necessary to look at the conferences individually to explore the relationship more granularly.
Conference by Conference
The scatterplots below show the same relationship as the graph above, but filtered to one conference at a time. The statistical output (significance and effect size) for each is listed below its graph:
(ACC: p = 0.08, R² = 0.164)
(Big 12: p = 0.308, R² = 0.147)
(Big 10: p = 0.017, R² = 0.247)
(PAC 12: p = 0.386, R² = 0.092)
(SEC: p < 0.001, R² = 0.508)
Out of the 5 conferences, roster talent was statistically significant (at the standard threshold of 0.05) only in the Big 10 (p = 0.017) and the SEC (p < 0.001). The effect size for the Big 10 is 0.247, or 24.7%. For the SEC, it is 0.508, or 50.8%. This means that on-roster talent accounts for 24.7% of the variation in win percentage in the Big 10 and 50.8% in the SEC from 2016 to present (October 25th, 2019). The key takeaway is that SEC fans are far more justified in fretting over recruiting success than fans of the other conferences.
A Deeper Look at the Southeastern Conference
These findings made me want to look deeper into what is going on in the SEC. The plot below is the same as the SEC plot above but drilled down to the team level:
What this chart shows is that SEC teams generally perform to the relative level of their talent, except for Tennessee and Florida. Tennessee generally underperforms, while Florida generally overperforms except for 2017, when the Gators went 4-7. Kentucky’s 2018 season was a high achievement. An analysis was then performed to explore how each team performed relative to its roster talent level in each of the years.
When summing the overall achievement for each team over the study period, Florida is the most over-achieving team in the sample.
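The article does not spell out how achievement relative to talent was calculated. One plausible approach, sketched below purely as an assumption, is to fit a regression line of win percentage on roster talent, treat each team-season's residual (actual minus expected win percentage) as its over- or under-achievement, and sum the residuals by team. The team names, values, and function names are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# (team, season, roster talent rating, win percentage), made-up values
seasons = [
    ("Team A", 2016, 94.0, 0.86),
    ("Team A", 2017, 95.0, 0.92),
    ("Team B", 2016, 88.0, 0.69),
    ("Team B", 2017, 87.5, 0.36),
    ("Team C", 2016, 90.0, 0.38),
    ("Team C", 2017, 89.0, 0.33),
]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def achievement_by_team(rows):
    """Sum each team's residuals (actual minus expected win percentage)."""
    slope, intercept = fit_line([r[2] for r in rows], [r[3] for r in rows])
    totals = defaultdict(float)
    for team, _season, rating, actual in rows:
        expected = intercept + slope * rating
        totals[team] += actual - expected
    return dict(totals)


totals = achievement_by_team(seasons)
# Most over-achieving team first.
for team, total in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{team}: {total:+.3f}")
```

Because residuals from a least-squares line with an intercept sum to zero, this measure is strictly relative: one team's over-achievement is balanced by under-achievement elsewhere in the sample.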
If there is any doubt about Florida Head Coach Dan Mullen’s ability to maximize the talent on hand, it should be erased. The Gators are winning the most relative to roster talent even though they were 4-7 in 2017. Furthermore, Mississippi State is third. A quick look at the SEC scatterplot shows us that MSU was overperforming when Mullen was the head coach there but has since dropped off. Florida has done the opposite.
There is no doubt that the best place to be is where Alabama is: maxed out on talent and wins, so there is no real chance of overachieving. After all, winning is what it is all about. And, in the SEC at least, recruiting success plays a big role in that. If you can’t recruit at an elite level, you’d better have a coach like Dan Mullen to give you a chance.
As requested by Reddit user stevejust, I have added the team name and year to each of the conference scatterplots:
The purpose of this study was to examine the linearity, or lack thereof, between talent rating and winning percentage. The purpose was not to find the best possible model to predict winning; that is a worthy and separate endeavor! The primary curiosity was to explore whether college football fans overstate, understate, or state just perfectly their concerns about the recruiting performance of their favorite teams. The interesting finding concerned the general lack of linearity between quantified talent level and winning. I have received considerable feedback (virtually all of it positive, which is awesome) and several suggestions regarding other potential variables to consider, as well as models and methods that may be superior to, or an improvement over, the polynomial regression. Each of these that I have read has been, to varying degrees, perfectly legitimate and well thought out. However, it is important to emphasize that this study was not about the model of best fit; it was about the supposed linearity between talent ratings and winning. Thank you to all who have commented and provided excellent ideas for future approaches. I am learning from you and gathering great ideas. Cheers!