The Composite Rating for high school football players as college prospects is generally considered the gold standard of recruiting evaluations; virtually every peer-reviewed research paper on the topic cites it. But a key part of evaluating any metric is how well it predicts future success. As noted at https://www.cougcenter.com/wsu-football-recruiting/2013/2/5/3956800/rivals-scout-espn-247-star-rating-system-national-signing-day, the NFL draft is a commonly used standard of success for the nation's top high school recruits: the 5-stars and the top 150 players in each class.
I decided to look at how each of the services and the Composite have done in predicting which players will get drafted.
I used the ratings from each of the services mentioned above, taking the top 150 players from each service for the classes of 2012 through 2015. This time frame was selected because it is recent and nearly all of its players have been draft-eligible (of note, a few 2015 recruits, such as Gators receiver Van Jefferson, only become draft-eligible this year, but that number is very low and won't impact this study).
Each service's top 150 players were logged and reviewed to see whether they were drafted by an NFL franchise and, if so, in which round and with which overall pick. I applied the final probabilities to the 2020 class here: https://thefaircatch.com/2020/02/03/probability-of-elite-2020-class-recruits-to-be-drafted-by-an-nfl-franchise/
Players listed among the top 150 recruits by all four services (ESPN, Rivals, 247, and the Composite) were given 4 "votes"; players listed by 3 services were given 3 votes, and so on. From there, the analysis began. Of note, the blue arrow line next to each table indicates the direction in which the heatmapping flows. So in Figure 1, where the arrow is vertical, the color scale is interpreted per column; for tables with a horizontal arrow bar, the color scale is applied per row.
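The vote tally itself is simple to sketch in code. The player names and lists below are placeholders, not the study's data; the author's analysis was done in R, so this Python version is purely illustrative:

```python
# Hypothetical sketch of the "vote" tally described above: a player earns one
# vote for each service whose top-150 list includes him.

def tally_votes(top150_by_service):
    """Map each player to the number of services that ranked him top 150."""
    votes = {}
    for service, players in top150_by_service.items():
        for player in players:
            votes[player] = votes.get(player, 0) + 1
    return votes

# Placeholder rankings, not real data.
rankings = {
    "ESPN":      {"Player A", "Player B", "Player C"},
    "Rivals":    {"Player A", "Player B"},
    "247":       {"Player A", "Player D"},
    "Composite": {"Player A", "Player B", "Player C"},
}

votes = tally_votes(rankings)
print(votes["Player A"])  # 4 -> a consensus top-150 player
print(votes["Player D"])  # 1 -> ranked by only one service
```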
Breaking Down Outcomes by Recruiting Service
Figure 1 shows how many players in each position group were ranked by each individual service in the data set. The math adds up: 150 players x 4 classes = 600 entries per service, and 4 services x 600 = 2,400 entries. Many of those entries, though, are the same player appearing on multiple services' lists (more on that later), which left me with 908 unique players overall. Figure 2 shows how many players from each position group and service were drafted. There were 816 drafted entries in the overall data set, for a group accuracy of 34%. Figure 3 below depicts the accuracy percentages.
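The group-level figures quoted above can be reproduced with a few lines of arithmetic:

```python
# Reproducing the data-set counts: 150 players per class, classes 2012-2015,
# four services, and 816 drafted entries reported in the study.
entries_per_service = 150 * 4                 # top 150 for each of 4 classes
total_entries = entries_per_service * 4       # four services
drafted_entries = 816

print(total_entries)                                  # 2400
print(round(drafted_entries / total_entries * 100))   # 34 (percent)
```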
As we can see in Figure 3, ESPN has the lowest overall accuracy rate and 247 has the highest. They are all fairly close, but ESPN is behind here. To explore this further, I created a quick chart mapping out the differences in the mean (average) draft rate of each service's top 150:
Put to scale, we can see that ESPN has done a relatively poor job of including players in its top 150 who would go on to get drafted. ESPN did tie with 247 for the best accuracy in predicting drafted DBs, so gotta give them that.
Breaking Down Outcomes by “Votes”
I was curious about the variance among the services in putting different players in their top 150, so I created a "vote" count by simply tallying which players were included in which service's ranking. A player in the top 150 for all 4 services got 4 votes, a player in 3 of the 4 got 3 votes, and so on. Upon charting the data, it became very clear that consensus top-150 players (those with 4 votes) got drafted at a much higher rate.
Figure 5 shows that if a player had 4 votes, they were drafted 45% of the time, up significantly from the group average of 34%. The more votes a player received, the more likely they were to be drafted. Interestingly enough, out of the 908 unique players in the study, 272 (30%) received only 1 vote. So, there is definitely some variance in how each service evaluates prospects.
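The quick arithmetic behind those figures, using only the numbers quoted above:

```python
# Share of the 908 unique players who appeared on only one service's list.
unique_players = 908
one_vote_players = 272
print(f"{one_vote_players / unique_players:.0%}")  # 30%

# How much more often consensus (4-vote) players were drafted than the group.
group_rate = 0.34      # draft rate across all top-150 entries
four_vote_rate = 0.45  # draft rate for 4-vote players
print(round(four_vote_rate / group_rate, 2))       # ~1.32x the group rate
```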
Figures 6 and 6a above show the draft-count data for each position group by number of votes. In terms of pure numbers, defensive linemen were the most drafted position group with 163, and tight ends had the fewest with 29. Of note, the red arrow line on the right indicates how to interpret the color scale for the Total column on the right side of the graph only.
Figure 7 above shows the relative proportions of drafted players by the number of votes and positions. It is clear that having 4 votes was a much stronger indicator of draft potential for each position group.
Charting Vote Count Accuracy
When I initially started the analysis, I charted the data and used a classification tree to see whether there was any delineation in the data that would bring any significant groups (clusters) to light. An initial scatter plot was pretty busy but did show some clustering in the lower left quadrant:
This scatterplot showed me that there was probably an association between being highly ranked and being drafted early.
Once it was clear something was going on in terms of the vote count, as set forth above, I looked a little more into this. A scatterplot of the data as in Figure 8, but parsed out by vote count, shows just how impactful this metric is:
It is easy to see in Figure 9 that those players with 4 votes bunch up toward the higher draft picks and have overall higher counts, as we’ve seen. The lower the vote count, the further to the right the data drifts.
The horizontal ("x") axis, labeled 'Avg Rank', is the recruit's rank averaged across all of the services that ranked him within their top 150. The vertical ("y") axis, labeled 'Pick', is where that recruit was ultimately drafted overall (e.g., 35th pick in the draft). The different colored numbers show how many votes the recruit received coming out of high school.
The Florida Gators
Of the current UF commits in any service's top 150, here is how many votes each has, and from which services:
Figure 10 shows that Gervon Dexter, Xzavier Henderson, and Jahari Rogers are each ranked in the top 150 by all four services. Derek Wingo has 3 votes. Ethan Pouncey and Issiah Walker have 2 votes and Antwaun Powell and Jaquavion Fraziars each have one vote.
One Step Further
I have continued to play around with the data. I built a machine-learning model to see whether the data is helpful in predicting which players will get drafted and which will not, using a naive Bayes classifier on a binary outcome (UD = 'undrafted', Drafted = well, drafted). The model was impressive at predicting who doesn't get drafted but not especially helpful at predicting who will. It was still accurate overall at 71%, which is pretty cool.
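The post doesn't include the model code (the author worked in R), but the idea can be sketched with a minimal from-scratch Gaussian naive Bayes in Python. The features here, (votes, average rank), and the toy data are assumptions for illustration only:

```python
# Minimal Gaussian naive Bayes: per-class feature means/variances plus class
# priors, with prediction by highest log-posterior.
import math
from collections import defaultdict

def fit(X, y):
    """Estimate class priors and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(cols, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior for this row."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data, not the study's: (votes, average rank) -> outcome.
X = [(4, 20), (4, 35), (3, 60), (1, 140), (2, 120), (1, 130)]
y = ["Drafted", "Drafted", "Drafted", "UD", "UD", "UD"]
model = fit(X, y)

print(predict(model, (4, 25)))   # Drafted -> a consensus, highly ranked player
print(predict(model, (1, 145)))  # UD -> a single-vote, low-ranked player
```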
The resulting confusion matrix was:
A confusion matrix is read along the diagonal: correct predictions sit on it, errors off it. The model predicted 'Drafted' correctly 22 times and incorrectly 38 times; it predicted 'UD' correctly 113 times and was wrong 18 times. This makes sense when you look at the corresponding graphs: there is a lot of overlap among the variables contributing to 'Drafted' status.
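As a sanity check, the 71% figure follows directly from those counts: accuracy is the diagonal (correct) total divided by all predictions.

```python
# Counts from the confusion matrix reported above.
drafted_right, drafted_wrong = 22, 38
ud_right, ud_wrong = 113, 18

total = drafted_right + drafted_wrong + ud_right + ud_wrong  # 191 players
accuracy = (drafted_right + ud_right) / total
print(f"{accuracy:.0%}")  # 71%

# Per-class right/wrong rates show the asymmetry the model exhibits.
print(f"{drafted_right / (drafted_right + drafted_wrong):.0%}")  # 37% on the 'Drafted' side
print(f"{ud_right / (ud_right + ud_wrong):.0%}")                 # 86% on the 'UD' side
```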
In Figure 12 above, it is easy to see that the probability of getting drafted is much higher once 4 votes are obtained; the drafted and undrafted lines really start to separate at about 3.5 or so. A large chunk of drafted players still had fewer than 3 votes, but the disparity at the tails is marked.
If you are interested in a copy of the R Markdown, go to https://rpubs.com/SORT14/servicesreviewbyVote. For the data set to play with, just contact me and I'll send it to you.