Analytics Posts

Recruiting services evaluation: Rivals

I recently looked at how each of the recruiting services has performed in terms of NFL draft status as a measure of success. That analysis is here:

Now, I will look at each of them in turn in terms of how predictive they are of on-field success. As I have already found, the relationship between recruiting and winning is generally non-linear, especially once you start including teams without much of a blue-chip percentage.

In this first look, I used Rivals recruiting class ranking data from 2017 through 2019. I took the average ranking for the top 100 teams for those 3 years. Teams that averaged outside of the top 25 were dropped. 23 teams remained. The SEC was heavily represented by 9 teams. Here is how each of the conferences counted up:

Rivals performance by conf

Using a simple linear regression to determine how much of 2019 win % was predicted by average Rivals recruiting ranking, I found the correlation was fairly high at 63% (inverted, because ‘higher’ rank numbers are actually worse). The graph below shows that the lower (i.e., better) the ranking, the higher the winning percentage, in a fairly linear way.


Drilling down a little further to the team level and creating quadrants on average score allowed for some evaluation. In the graph below, the teams in quadrant 1 (upper left) had a) winning percentage above average in 2019 and b) recruiting classes above average as ranked by Rivals from 2017 through 2019. This is the good quadrant. Quadrant 4, directly below quadrant 1, is the bad one. These teams had good recruiting but bad winning %.

performance vs expected.rivals

All of the teams in Q4 have negative distance scores. The percentage next to each team’s logo is how far away they were from the expected (theoretical) win % as predicted by Rivals class rankings. Florida’s 2019 win % was 12% over expected. FSU was the worst at -27.8%. For the stats people, the linear equation is:

Win perc 19 = -0.021 * Avg Rank + 0.961

So, to take UF as an example, the Gators’ expected 2019 win percentage would be

-0.021 * 11.33 + 0.961

which works out to roughly 0.723 with the rounded coefficients shown (the unrounded model gives 0.726). Florida’s actual win percentage was 0.846, which is how we get 12% over expected. Below is a table of each team’s data.
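Plugging a team's average rank into the fitted line is a one-liner. A quick sketch using the rounded coefficients from the post (note the small rounding gap versus the 0.726 quoted above, which comes from the unrounded model):

```python
def expected_win_pct(avg_rank, slope=-0.021, intercept=0.961):
    """Expected 2019 win % from a team's average Rivals class rank (rounded coefficients)."""
    return slope * avg_rank + intercept

# Florida: average rank 11.33, actual 2019 win % 0.846
uf_expected = expected_win_pct(11.33)  # ~0.723 with these rounded coefficients
uf_over = 0.846 - uf_expected          # ~0.12, i.e. about 12% over expected
```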


Ultimately, Rivals class rankings are not very predictive of success (the margin of error is 31%). As a descriptive measure, however, they are interesting to look at. The overall regression model was significant (p < 0.001) with a medium effect size (0.406). So all in all, I would say Rivals did a pretty good job; I just wouldn’t fret too much about next year’s win percentage going by Rivals’ rankings alone.

All of the Other Services plus Aggregate Measure

Instead of creating new posts for each service, I decided to just put everything here. The main reason is that there isn’t that much of a difference between the services. But there are some interesting points, so here is each service’s data and outcome, with an aggregate summary.

Aggregate Summary

When looking at each service’s rankings and how they relate to on-field production, they are all pretty equal. Analysis of variance indicated no statistically significant difference among the four services. As such, I wouldn’t infer that any single service is better than another (because the data don’t indicate that), but instead look at this analysis as a report card for the 2019 season. It doesn’t tell us how good (or bad) a service is overall, just how it did this year. That being said, here is how things turned out:

service table

The above table shows the overall accuracy of each service. This statistic is the correlation between each service’s average 3-year class ranking and that team’s on-field win % in 2019. Combining all 4 services into one aggregate “service” yields an accuracy of 74.3%, still not as good as the Composite. Other statistics such as effect size, standard error, and confidence intervals were all better for the Composite than the aggregate, but by very slim and insignificant margins.

Relative to some other metrics, however, these did pretty well. SB Nation’s blue-chip ratio was not very strong at predicting 2019 outcomes. The correlation for BCR was 44%, with only 19.5% of the variance in outcomes attributable to BCR. BCR was a quasi-significant predictor of outcome (p = 0.08). To be fair to BCR, only a sample (the data from the above-linked post) was used, so with a larger n it may do better.

The final scatter was a bit more thorough in placing teams in their respective quadrant. Ultimately, the Seminoles were the furthest off of expectations overall.

all services expected vs win

Below is a table that outlines how each team did in the aggregate.


The above table is the most interesting to me. The right-hand column displays how many games above or below expected each team won, according to the combined rankings of all 4 services. LSU, VT, UNC, Clemson, and Oregon each won 2 more games (rounded) than expected. ND, Florida, Penn State, and Ohio State each won 1 more game than expected.

Texas, Nebraska, Texas A&M, Miami, and USC each lost 1 more game than would be expected, while South Carolina and blue-ribbon loser FSU lost 2 more games than they theoretically should have.

Below, the data are broken down by each service provider for posterity:



247 table

performance vs expected.247




performance vs expected.espn


comp_class by conference

composite table

performance vs expected.composite



Reviewing SEC 2019 QB Performances: Burrow and Tua were great. Watch out for Kyle Trask next year.

In this analysis, I took the opportunity to look at how the SEC QBs performed in 2019 when controlling for the disparity in games played. As is evident in the table below, among SEC QBs with a minimum of 250 passes, there is a fair amount of disparity in the number of games played.

sec2019rawperformance

Joe Burrow obviously had a fantastic year. But to get a sense of each QB’s year on a common scale, I took the average number of games played for these 10 qualifying QBs (12.1) and projected each of their statistical lines over that number of games. For my analysis, I only used the categories in gray in the above table. This avoids redundancy, as completions and attempts are already captured by “Pct” (completion percentage). Y/Comp uses the completion data, so I kept that.
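The projection step is simple rate scaling. A small sketch of the idea (the names here are mine, not from the original R code):

```python
AVG_GAMES = 12.1  # average games played by the 10 qualifying SEC QBs

def project(season_total, games_played, avg_games=AVG_GAMES):
    """Scale a counting stat to the group-average number of games."""
    return season_total / games_played * avg_games

# e.g., a hypothetical 3,000-yard season over 10 games projects to 3,630 yards
projected_yards = project(3000, 10)
```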

Below is how each of these QBs’ projected numbers would look:


I then standardized each of the statistical categories except completions, as I no longer needed those. Of note, standardization worked here because the data in each category were approximately normally distributed. The only categories I was interested in were completion percentage, yards, TDs, interceptions, and yards per completion. The standardized scores, with color-scaling, are below:


To see how each of the QBs did relative to their peers, I simply summed each of the standardized scores to achieve an aggregate score. I then graphed each of these to give a sense of proportion to each performance. Burrow and Tua were on a completely different level overall:

* = standardized performance
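The standardize-and-sum step looks roughly like this (a sketch with hypothetical numbers; the original was done in R, and I’m assuming population standard deviation here):

```python
import statistics

def standardize(values):
    """z-scores for one statistical category across all QBs."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population SD; sample SD also works
    return [(v - mean) / sd for v in values]

# Hypothetical projected yards and TDs for three QBs (not real data)
yards = [4000, 3200, 2800]
tds = [40, 25, 20]

z_by_category = [standardize(yards), standardize(tds)]
# aggregate score per QB = sum of his z-scores across categories
aggregate = [sum(cat[i] for cat in z_by_category) for i in range(3)]
```

Since z-scores in each category sum to zero across the group, a positive aggregate means a QB was above the group average overall.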

Other than Tua and Burrow, only Kyle Trask and Jake Fromm had a net positive rating. Kudos to both. Below are the rankings and aggregate score for each QB:


This analysis serves to highlight the magnitude of the year Burrow had and Tua would likely have had. Furthermore, it shows that Kyle Trask, who started the year as a backup, really did have an outstanding season. To play that well with such limited experience indicates to me that he will potentially have a great year next season. Trask should be considered the SEC’s leading QB going into 2020 in my opinion. Of course, no QB performs in a vacuum, but looking at the 2019 performances from a statistical standpoint is certainly encouraging for Florida fans and possibly the Cincinnati Bengals (Burrow) and Miami Dolphins (Tua).

A Peek at Things on a National Scale:

I also decided to apply the same process to the top ~100 QBs nationally. Here, I included only the top 25 Power 5 QBs:


This shows us again how impressive Trask was in 2019. He ranked 16th nationally among P5 QBs. Of note, the aggregate score changed because it is based upon the relative national scores instead of the relative SEC scores as in the previous section. Furthermore, this table shows how Tua and Burrow were both dominant at the national level as well. Other takeaways for me were Sam Howell of North Carolina performing so well as a true freshman and Trevor Lawrence being *only* at number 10.

As always, let me know if there are any errors. Go Gators.

Probability of Elite 2020 Class Recruits to be Drafted by an NFL Franchise

Using a naive Bayes machine learning model that I constructed on historical data with 71% accuracy, I put together predictions for the top 150 ranked players by recruiting service (ESPN, Rivals, 247, and the Composite). The table below shows the model’s probability that each player will be drafted. Of note, I’ll update missing/incorrect college teams once NSD is over and everyone is settled in.

Player Pos. Group College P(Drafted)
Bryan Bresee DL Clemson 90.76%
Myles Murphy DL Clemson 88.01%
Jordan Burch DL South Carolina 87.35%
Gervon Dexter DL Florida 86.28%
Kelee Ringo DB Georgia 85.05%
Bijan Robinson RB Texas 82.73%
Jalen Carter DL Georgia 82.47%
Chris Braswell DL Alabama 82.31%
Zachary Evans RB TBD 82.29%
Will Anderson DL Alabama 81.66%
Bryce Young QB Alabama 80.75%
Demarkcus Bowman RB Clemson 79.64%
Elias Ricks DB LSU 78.77%
Paris Johnson Jr. OL Ohio State 78.20%
Broderick Jones OL Georgia 77.39%
Justin Flowe LB Oregon 76.53%
Julian Fleming WR Ohio State 76.50%
Jaylon Jones DB Texas A&M 74.53%
DJ Uiagalelei QB Clemson 73.36%
Arik Gilbert TE LSU 71.62%
Noah Sewell LB Oregon 71.27%
Demonte Capehart DL Clemson 71.19%
Jaquelin Roy DL LSU 66.68%
Tank Bigsby RB Auburn 66.30%
Dontae Manning DB Oregon 64.96%
Desmond Evans DL North Carolina 63.27%
Demorie Tate DB Florida State 61.71%
Marshawn Lloyd RB South Carolina 59.48%
Drew Sanders WR Alabama 59.44%
Kayshon Boutte WR LSU 58.46%
Demond Demas WR Texas A&M 57.07%
Jase McClellan RB Alabama 55.73%
Trenton Simpson LB Clemson 55.54%
Darnell Washington TE Georgia 54.98%
Donell Harris DL Texas A&M 54.70%
Rakim Jarrett WR Maryland 54.41%
Sav’ell Smalls LB Washington 54.15%
Tate Ratledge OL Georgia 53.50%
Justin Rogers DL Kentucky 53.44%
Mekhail Sherman LB Georgia 52.21%
Timothy Smith DL Alabama 51.68%
Avantae Williams DB Florida 51.66%
Jaxon Smith-Njigba WR Ohio State 50.69%
Kendall Milton RB Georgia 50.30%
Curtis Jacobs LB Penn State 49.55%
Alfred Collins DL Texas 49.20%
McKinnley Jackson DL Alabama 48.67%
Demouy Kennedy LB Alabama 47.67%
Jordan Johnson WR Notre Dame 46.68%
Clark Phillips III DB Utah 46.05%
Fred Davis II DB Clemson 45.86%
Zykeivous Walker DL Auburn 43.85%
Chris Tyree RB Notre Dame 43.69%
Michael Mayer TE Notre Dame 43.54%
Chantz Williams DL Miami 43.50%
BJ Ojulari DL LSU 43.37%
Omari Thomas DL Tennessee 43.11%
Roydell Williams RB Alabama 41.18%
Turner Corcoran OL Nebraska 40.88%
CJ Stroud QB Ohio State 40.59%
Brian Branch DB Alabama 39.95%
Walker Parks OL Clemson 39.61%
Myles Hinton OL Stanford 38.88%
Jaylan Knighton RB Miami 38.04%
Gary Bryant Jr. WR USC 35.41%
Keshawn Lawrence DB Tennessee 35.23%
Jordan Toles DB LSU 35.09%
Jahari Rogers DB Florida 34.99%
Sedrick Van Pran OL Georgia 34.73%
Antwaun Powell DL Florida 33.87%
Phillip Webb LB LSU 32.93%
Antonio Johnson DB Texas A&M 32.91%
Jahmyr Gibbs RB Georgia Tech 32.77%
Hudson Card QB Texas 32.23%
Marcus Rosemy WR Georgia 31.36%
Tre Williams DL Clemson 31.28%
Derek Wingo LB Florida 30.76%
Arian Smith WR Georgia 29.72%
Trey Wedig OL Wisconsin 29.16%
Andrew Gentry OL Virginia 28.91%
Quandarrius Robinson LB Alabama 28.54%
Paul Tchio OL Clemson 27.11%
Jacobian Guillory DL LSU 26.71%
Garrett Hayes OL TCU 26.42%
Xzavier Henderson WR Florida 26.16%
Ja’Quinden Jackson QB Texas 26.12%
Ej Smith RB Stanford 25.69%
Cody Simon LB Ohio State 25.42%
Jalen McMillan WR Washington 25.33%
Luke Doty QB South Carolina 25.14%
Jay Hardy DL Auburn 24.97%
Antonio Doyle LB Texas A&M 24.87%
Quentin Johnston WR TCU 24.51%
Vernon Broughton DL Texas 24.41%
E.J. Williams WR Clemson 24.33%
Don Chaney Jr. RB Miami 23.81%
Tyler Baron DL Tennessee 23.05%
Jermaine Burton WR Georgia 23.04%
Gee Scott Jr. WR Ohio State 22.60%
Daniyel Ngata RB Arizona State 21.76%
Andrew Raym OL Oklahoma 21.72%
A.J. Henning WR Michigan 21.66%
Devon Achane RB Texas A&M 21.36%
Blake Corum RB Michigan 21.15%
Jalen Berger RB Wisconsin 20.88%
Theo Johnson TE Penn State 20.59%
Kedrick Bingley-Jones DL North Carolina 20.38%
Myles Murphy DL North Carolina 20.38%
Jalen Kimber DB Georgia 19.56%
Jacolbe Cowan DL Ohio State 18.74%
Darrion Henry DL Ohio State 18.04%
Nate Anderson OL Oklahoma 17.96%
Jo’Quavious Marks RB Mississippi State 17.19%
Kevontre Bradford RB LSU 16.79%
Seth McGowan RB Oklahoma 16.76%
Evan Prater QB Cincinnati 15.78%
Wesley Steiner LB Auburn 15.74%
Ja’Qurious Conley DB North Carolina 15.60%
Jack Nelson OL Wisconsin 15.09%
Chris Morris OL Texas A&M 15.04%
Cole Brevard DL Penn State 15.04%
Marcus Dumervil OL LSU 15.00%
Peter Skoronski OL Northwestern 14.99%
Tosh Baker OL Notre Dame 14.99%
Luke Wypler OL Ohio State 14.97%
Mitchell Mayes OL Clemson 14.94%
Harrison Bailey QB Tennessee 14.79%
Jalen Rivers OL Miami 14.26%
Enzo Jennings DB Penn State 14.09%
Myles Murao OL Washington 14.05%
Chad Lindberg OL Georgia 13.94%
Xavion Alford DB Texas 13.83%
Antoine Sampah LB LSU 13.55%
Ze’Vian Capers ATH Auburn 13.03%
Haynes King QB Texas A&M 12.71%
Davin Vann DL NC State 12.68%
Reggie Grimes LB Oklahoma 12.58%
Jordan Botelho LB Notre Dame 11.81%
Ethan Garbers QB Washington 11.78%
Mookie Cooper WR Ohio State 11.71%
Josh Downs WR North Carolina 11.64%
Kevin Pyne OL Boston College 11.64%
Michael Carmody OL Notre Dame 11.57%
Braiden McGregor WR Michigan 11.48%
Zavier Betts WR Nebraska 11.48%
Jah-Marien Latham DL Alabama 11.47%
Malachi Wideman WR Florida State 11.38%
Johnny Wilson WR Arizona State 11.36%
John Humphreys WR Stanford 11.27%
Josh White LB LSU 11.09%
Thaiu Jones-Bell WR Alabama 11.08%
Keyshawn Greene LB Nebraska 10.98%
Kobe Hudson WR Auburn 10.90%
Chris Thompson Jr. DB Auburn 10.86%
Jay Butterfield QB Oregon 10.70%
Logan Jones DL Iowa 10.49%
Major Burns DB Georgia 10.38%
Nick Herbig LB Wisconsin 10.08%
Van Fillinger DL Utah 9.99%
Prince Dorbah LB Texas 9.90%
Dominic Bailey DL Tennessee 9.87%
Kevin Swint DL Clemson 9.78%
Patrick Jenkins DL TCU 9.78%
Morven Joseph DL Tennessee 9.75%
Rylie Mills DL Notre Dame 9.73%
Aaryn Parks OL Oklahoma 9.73%
Henry Parrish RB Ole Miss 9.38%
Cameron Roseman-Sinclair DB North Carolina 9.25%
Tirek Murphy RB Purdue 9.22%
Kitan Crawford WR Texas 9.12%
Daijun Edwards RB Georgia 8.97%
Jalen Harrell DB Miami 8.81%
Lawrance Toafili RB Florida State 8.74%
Ethan Pouncey DB Florida 8.53%
Jacobe Covington DB Washington 8.43%
Ayden Hector DB Stanford 8.33%
Elijhah Badger WR Arizona State 8.27%
Myles Slusher DB Arkansas 8.23%
Ozzy Trapilo OL Boston College 8.06%
Jordan Morant DB Michigan 8.05%
Darion Green-Warren DB Michigan 7.96%
Dwight Mcglothern DB LSU 7.80%
Lathan Ransom DB Ohio State 7.75%
Jahdae Barron DB Baylor 7.66%
Miles Brooks DB Georgia Tech 7.66%
Ryan Watts DB Ohio State 7.56%
Emmanuel Forbes DB Mississippi State 7.43%
Keshawn Washington DB Miami 7.32%
Roger Rosengarten OL Washington 7.16%
Jerrin Thompson DB Texas 7.07%
Andre Seldon DB Michigan 7.05%
Ladarius Tennison DB Auburn 7.05%
Akinola Ogunbiyi OL Texas A&M 6.94%
Issiah Walker Jr. OL Florida 6.29%
Jeff Sims QB Georgia Tech 6.15%
Kourt Williams LB Ohio State 6.06%
Kalel Mullings LB Michigan 5.99%
Malik Hornsby QB Arkansas 5.98%
Mohamed Kaba LB South Carolina 5.92%
Jake Majors OL Texas 5.91%
Tyler Van Dyke QB Miami 5.87%
Jalin Conyers TE Oklahoma 5.86%
Edgerrin Cooper LB Texas A&M 5.64%
Shane Illingworth QB Oklahoma State 5.63%
Damian Sellers LB UCLA 5.58%
Michael Alaimo QB Purdue 5.56%
LV Bunkley-Shelton WR Arizona State 5.55%
Bryn Tucker OL Clemson 5.53%
Drew Pyne QB Notre Dame 5.50%
Zak Zinter OL Michigan 5.49%
Kris Hutson WR Oregon 5.32%
Jaquavion Fraziars WR Florida 5.32%
Loic Fouonji WR Texas Tech 5.32%
Eric Shaw WR South Carolina 5.30%
Javon Baker WR Alabama 5.25%
Jalin Hyatt WR Tennessee 5.10%
Chubba Purdy QB Florida State 4.97%
Max Johnson QB LSU 4.84%
Darin Turner WR Arkansas 4.83%
Sergio Allen LB Clemson 4.82%
Michael Redding III WR Miami 4.71%
Kaden Johnson LB Wisconsin 4.54%
Jackson Bratton LB Alabama 4.47%
Len’Neth Whitehead LB Tennessee 4.47%
Bryan Robinson WR Florida State 4.34%
Jordan Addison WR Pittsburgh 4.27%
Koy Moore WR LSU 4.27%
Michael Henderson WR Oklahoma 4.20%
Rome Odunze WR Washington 4.14%
Jimmy Calloway WR Tennessee 4.05%
J.J. Evans WR Auburn 4.01%
Maliq Carr TE Purdue 3.74%
Luke Lachey TE Iowa 3.71%
Kevin Bauman TE Notre Dame 3.61%
Matthew Hibner TE Michigan 3.50%

Reviewing the Recruiting Services… How do they Stack Up?

The Composite Rating for high school football players as prospects for college football is generally considered the gold standard of recruiting evaluations. Virtually every peer-reviewed research paper on the topic cites this metric. But one of the key aspects of evaluating a metric is how predictive it is of future success. The NFL draft is a commonly-used standard of success for the top high school recruits in the nation: the 5-stars and top 150 players in each class.

I decided to look at how each of the services and the Composite have done in predicting which players will get drafted.


I used the ratings for each of the services previously mentioned. I took the top 150 players from each service between 2012 and 2015. This time frame was selected because it is recent and includes all draft-eligible players (of note, a few 2015 recruits, such as Gators receiver Van Jefferson, are draft-eligible this year, but that number is likely to be very low and won’t impact this study).

Each service’s top 150 players were logged and reviewed to see whether they were drafted by an NFL franchise and, if so, in which round and with which overall pick. I applied the final probabilities to the 2020 class here:


Players listed among the top 150 recruits for all of the services (ESPN, Rivals, 247, and Composite) were given 4 “votes”. Players in 3 of the 4 were given 3 votes, and so on. After that, the analytics began. Of note, the blue arrow line next to each table indicates the direction in which the heatmapping flows. So, in Figure 1, the color scale is interpreted per column, as the arrow is vertical. For tables with a horizontal arrow bar, the color scale is applied to rows.
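The vote tally is just counting list membership. A minimal sketch (the service lists here are made up):

```python
from collections import Counter

def vote_counts(service_top150s):
    """Count how many services' top-150 lists each player appears in."""
    votes = Counter()
    for top150 in service_top150s.values():
        votes.update(set(top150))  # set() guards against duplicates within one list
    return votes

# Hypothetical example
lists = {
    "ESPN": ["Player A", "Player B"],
    "Rivals": ["Player A", "Player C"],
    "247": ["Player A"],
    "Composite": ["Player A", "Player B"],
}
votes = vote_counts(lists)  # Player A: 4 votes, Player B: 2, Player C: 1
```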

Breaking Down Outcomes by Recruiting Service

Figure 1 Top 150 Players by Position and Service
Figure 2 Drafted Players by Position and Service

Figure 1 shows how many players in each position group were ranked by the individual services in the data set. The math adds up (150 players x 4 years = 600 per service; 4 services x 600 = 2,400). Though there were 2,400 data points, many of these players appeared in multiple services’ lists (more on that later), which left 908 unique players overall. Figure 2 shows how many players in each position group, by service, were drafted. There were 816 drafted players in the overall data set, for a group accuracy of 34%. Figure 3 below depicts the accuracy percentage.

accuracy by position groups
Figure 3 Accuracy of Draft Picks by Position Group and Service.

As we can see in Figure 3, ESPN has the lowest overall accuracy rate and 247 has the highest. They are all fairly close, but ESPN trails here. To explore this further, I created a quick chart mapping the mean draft rate for each service’s top 150:

Figure 4 Mean Plot of Drafted Player Count by Service.

Put to scale, we can see that ESPN has done a relatively poor job in including players in their top 150 who would go on to get drafted. ESPN did tie with 247 for the most accuracy in predicting DBs drafted, so gotta give them that.

Breaking Down Outcomes by “Votes”

I was curious as to the variance among the services in putting different players in their top 150. I created a ‘vote’ count by simply tallying up which players were included in which service ranking. A player that was in the top 150 for all 4 services got 4 votes, a player in 3 of the 4 got 3 votes and so on. It became very clear to me upon charting the data that players that were consensus top 150 players (those with 4 votes) got drafted at a much higher rate.

successbyvote count
Figure 5 Overall Draft Count by Votes.

Figure 5 shows that if a player had 4 votes, they were drafted 45% of the time, up significantly from the group average of 34%. The more votes a player received, the more likely they were to be drafted. Interestingly enough, out of the 908 unique players in the study, 272 (30%) received only 1 vote. So, there is definitely some variance in how each service evaluates prospects.

Figure 6 Drafted Players by Vote Count and Position.
Figure 6a Proportional distribution of drafted vs undrafted by position group.

Figures 6 and 6a above show the draft count data for each position group by number of votes. In terms of raw numbers, defensive linemen were the most drafted position with 163, and tight ends had the fewest with 29. Of note, the red arrow line on the right indicates how to interpret the color scale for the total column on the right side of the graph only.

Figure 7 Relative Proportions of Players Drafted by Number of Votes and Position.

Figure 7 above shows the relative proportions of drafted players by the number of votes and positions. It is clear that having 4 votes was a much stronger indicator of draft potential for each position group.

Charting Vote Count Accuracy

When I initially started the analysis, I charted the data and utilized a classification tree to try to see if there was any delineation in the data that would bring to light any significant groups (clusters). An initial scatter plot was pretty busy but did show some clustering in the lower left quadrant:

initial scatter
Figure 8 Initial scatter plot of data by Draft Pick and Avg Service Rank.

This scatterplot showed me that there was probably something going on with the association between being highly ranked and drafted early.

Once it was clear something was going on with the vote count, as set forth above, I looked into it a little more. A scatterplot of the data as in Figure 8, but parsed out by vote count, shows just how impactful this metric is:

Figure 9 Scatterplot depicting Draft Pick by Avg Service Rank parsed by Vote count.

It is easy to see in Figure 9 that those players with 4 votes bunch up toward the higher draft picks and have overall higher counts, as we’ve seen. The lower the vote count, the further to the right the data drifts.

The horizontal (“x”) axis, labeled ‘Avg Rank’, is the recruit’s rank averaged across all of the services in which he was ranked within the top 150. The vertical (“y”) axis, labeled ‘Pick’, is where that recruit was ultimately drafted overall (e.g., 35th pick in the draft). The colored numbers show how many votes each recruit received coming out of high school.

The Florida Gators

Of the current UF commits in any service’s top 150, here is how many votes they have and from which services:

Figure 10 Florida Gator Commits by Service.

Figure 10 shows that Gervon Dexter, Xzavier Henderson, and Jahari Rogers are each ranked in the top 150 by all four services. Derek Wingo has 3 votes. Ethan Pouncey and Issiah Walker have 2 votes and Antwaun Powell and Jaquavion Fraziars each have one vote.

One Step Further

I have continued to play around with the data. I built a machine-learning algorithm to see if the data is helpful in predicting which players will get drafted and which players will not. I used a naive Bayes classifier on binary outcomes (UD = ‘Undrafted’, Drafted = well, drafted). The model was impressive at predicting who doesn’t get drafted but was not necessarily helpful in predicting who will get drafted. However, it was an overall accurate model at 71%, which is pretty cool.
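The original model was built in R; here is a hedged, single-feature Gaussian naive Bayes sketch in Python just to illustrate the mechanics (the feature and training data below are hypothetical, not the study’s):

```python
import math
import statistics

def fit_gnb(xs, labels):
    """Per-class mean, SD, and prior for a one-feature Gaussian naive Bayes."""
    params = {}
    for cls in set(labels):
        vals = [x for x, lab in zip(xs, labels) if lab == cls]
        sd = statistics.pstdev(vals) or 1e-9  # avoid divide-by-zero for constant classes
        params[cls] = (statistics.mean(vals), sd, len(vals) / len(labels))
    return params

def predict(params, x):
    """Pick the class with the highest log posterior (up to a shared constant)."""
    def log_post(cls):
        mu, sd, prior = params[cls]
        return math.log(prior) - math.log(sd) - (x - mu) ** 2 / (2 * sd ** 2)
    return max(params, key=log_post)

# Hypothetical training data: 'votes' as the lone feature
votes = [1, 1, 2, 4, 4, 3]
drafted = ["UD", "UD", "UD", "Drafted", "Drafted", "Drafted"]
model = fit_gnb(votes, drafted)
```

The real model presumably used more features than votes alone; this is just the shape of the classifier.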

The resulting confusion matrix was:

Figure 11 Confusion Matrix Output from Naive Bayes classifier.

A confusion matrix is read along the diagonal, where the correct predictions sit. The model predicted ‘drafted’ correctly 22 times and incorrectly 38 times. It predicted ‘UD’ correctly 113 times and was wrong 18 times. This makes sense when you look at the corresponding graphs: there is a lot of overlap among the variables contributing to ‘drafted’ status.
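The 71% figure falls straight out of those four cells; a quick check (cell values taken from Figure 11):

```python
# Confusion matrix cells from Figure 11
drafted_correct, drafted_wrong = 22, 38
ud_correct, ud_wrong = 113, 18

total = drafted_correct + drafted_wrong + ud_correct + ud_wrong  # 191 players
accuracy = (drafted_correct + ud_correct) / total                # ~0.707, the ~71% quoted
```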

Figure 12 Density plot of Draft Status by Votes

In Figure 12 above, it is easy to see that the probability of getting drafted is much higher when 4 votes are obtained. The drafted vs undrafted lines really start to separate at about 3.5 or so. However, a large chunk of drafted players are still under 3 votes, but the disparity at the tails is marked.

If you are interested in getting a copy of the R Markdown or the data set to play with, just contact me and I’ll send it to you.



Blue-Chip Recruit Migration 2020 and THE Florida Gators.

With the 2020 recruiting cycle almost in the books, I reflected upon the migratory patterns of the elusive Blue-Chip Recruit (BCR) with a focus on how it relates to the state of Florida and more specifically, the University of Florida.

First up, I wanted to see how the committed BCRs are distributed; this excludes recruits still uncommitted as of January 10, 2020, using rankings as of Christmas 2019. Here is the geographic breakdown according to the Composite ratings:

overall BC count 2020

Ok, cool. Florida is doing its thing here. Next, I wanted to see how many of these commits were staying in-state or migrating elsewhere:

exporte bcrs

So, out of the 57 BCRs from the state of Florida, 31 of them have been exported (54%). How well does that compare to the other states? Let’s look:

perc exports

The above map shows the percentage of BCRs exported from each state. (Of note, states with no information didn’t produce any BCRs, whereas states showing 0.00 percent produced at least one BCR who didn’t leave the state.) Here is a table of how each state that exported a BCR breaks down:

all state export


Now, I wanted to look at how each state is importing BCRs, as just looking at exporting doesn’t provide a good understanding of the overall migratory patterns. The map below shows how many BCRs were imported by state:

total imports 2020

And the net difference between imports and exports:

net bcrs

Florida had the largest number of BCRs migrate away from the state. South Carolina, Alabama, and Ohio had the largest influx of BCRs (not coincidentally the homes of Clemson, Alabama, Auburn, and Ohio State).

It is plain to see that the state of Florida is getting poached heavily by schools in other states. But does this mean UF is not handling its business, or is it simply that FSU and Miami suck, so once Florida fills its class, the remaining quality BCRs avoid the Seminoles and ‘Canes? I took a quick look to see if UF is taking care of business in the state:

in state keeps

Out of the 26 BCRs that remained in the state, UF had the most and the highest rated on average, as the chart above shows. This is good. Now, I wanted to see if UF was holding its own against other states when it comes to recruiting the state of Florida.

florida bcrs for all states

All in all, UF is doing pretty well here. They are retaining a lot of highly-rated recruits. The above table shows, however, that Alabama, LSU, Clemson, Georgia, and surprisingly Nebraska are all having some success picking up good players from Florida (thanks, Bowman).

UF is also importing players at a good clip. Here is a comparison of the big 3:

state of florida imports

FL imports breakdown

Ultimately, it looks as if UF recruiting is going fine. The state of Florida produces a significant amount of football talent, as is well-documented. Schools from all over are going to get in on that. However, UF is doing its fair share to retain talent and they are also picking up good talent from out of state.

As always, if you see any errors, just let me know. I’ve also broken all of this down by position and ratings, but that will have to wait for another post. Until then, Go Gators.

2019 SEC Offensive and Defensive Performance vs Expectations

I recently looked at how teams did relative to their overall roster talent in terms of winning percentage. You can check that out here:

In this analysis, I drilled down a level and looked at offensive points for (PF) and defensive points against (PA) and compared the season average for each SEC team relative to their offensive and defensive talent rating. The findings kinda confirm what could be seen in watching the games play out. But there were some surprising finds as well.

Offensive PF and Roster Talent (Offense Only):

off talent vs pf

This linear regression model was statistically significant and met all assumptions. However, the goal wasn’t to build a predictive model here, but to see where each team’s performance fell relative to their peers and talent level. From the chart above, we can see that LSU far surpassed expectations. Georgia had the biggest negative disparity between expected points (the top number next to each team logo) and actual PPG (bottom number). That doesn’t mean they had the worst offense (that was Vandy, at 16.5 PPG); it just means they fell further below expectations than any other team. Florida and Auburn performed right at expected levels.

Defensive PA and Roster Talent (Defense Only):

def tal vs pa

In the above graph, we have the inverse of the offense: here, a good performance is below the line. For example, Florida was expected to allow 22.59 PPG but allowed only 15.46. LSU and Alabama, the blue-ribbon winners on offense, allowed a few more points than one would expect given their defensive talent ratings. Arkansas just had a bad season overall. Interestingly, Missouri outperformed expectations in both metrics.


A few caveats: the small sample size is highly subject to variance. PF and PA as stand-alone metrics are not likely sufficient to determine the overall quality of an offense or defense. And overall roster talent lets players who didn’t play (redshirts, transfers, injured, etc.) influence the expectations but not the performance.

Some Details

Both regression models were fairly strong. The correlation between defensive talent and points allowed was 57%, with 32% of the variance in points against attributable to the model. Offensive talent correlated with points scored at 69%, with 47% of the variance in points scored attributable to the model. In general terms, relative to this sample, 68% of what goes into points allowed comes from variables other than overall defensive roster talent, and 53% of what goes into points scored comes from variables other than overall offensive roster talent.
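Those variance figures are just the squared correlations; a one-line sanity check (this holds for simple one-predictor regression, where R² = r²):

```python
# r -> R^2 for a simple (one-predictor) linear regression
def variance_explained(r):
    return r ** 2

defense_r2 = variance_explained(0.57)  # ~0.32, matching the 32% above
offense_r2 = variance_explained(0.69)  # ~0.48, close to the 47% above (rounding in r)
```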