There is less than a month to go before Selection Monday, the day that 33 at-large teams are selected to join 31 conference winners to make up the 64-team field for the NCAA Tournament. When the Selection Committee meets it will pore over all kinds of data, but at the center of its deliberations will be the Rating Percentage Index (RPI), an arcane system the NCAA uses for ranking teams in various sports.
The RPI is a simple formula. A team’s RPI is a percentage based on three factors: the team’s winning percentage (25%), the winning percentage of its opponents (50%), and the winning percentage of its opponents’ opponents (25%). It does not consider margin of victory, only wins and losses. In recent years the RPI has come under scrutiny and as a result some modifications have been made, primarily for men’s basketball, to account for home and road wins and losses.
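The weighting described above can be written out as a minimal sketch. The function name and the sample percentages are hypothetical, chosen only to illustrate the arithmetic, and the sketch ignores the home and road adjustments mentioned for men’s basketball.

```python
def rpi(win_pct, opp_win_pct, opp_opp_win_pct):
    """Basic RPI: 25% own record, 50% opponents' record,
    25% opponents' opponents' record. All inputs are fractions in [0, 1]."""
    return 0.25 * win_pct + 0.50 * opp_win_pct + 0.25 * opp_opp_win_pct


# Hypothetical team: wins 80% of its games, its opponents win 55%,
# and its opponents' opponents win 52%.
example = rpi(0.800, 0.550, 0.520)
print(f"{example:.4f}")
```

Note that three quarters of the result comes from the two schedule terms, so a team’s own record moves its rating less than the records of the teams around it.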
The system is supposed to be a jumping-off point for discussions, a supposedly neutral system that makes an initial ranking. This ranking is then used to come up with what is called a “nitty-gritty” report, breaking down a team’s records against the Top 25 teams, teams ranked 26-50, 51-100, 101-200 and under 200. The report also includes information breaking down conference and non-conference records, home-road performance, and other data.
So how is the RPI doing this year? There is no question that the number one team in the nation is Connecticut. Not only are the Huskies the unanimous Number One in both polls; they are also first in the three major computer ranking systems (Sagarin, Massey and Dolphin). And in the RPI they are … Number Two, behind an Oklahoma team that they beat by 28. Oops!
Or take the example of Rutgers and California. California is 20-3, ranked sixth in the nation, and beat Rutgers by 14 after leading by as many as 28 in the second half. Cal has a 3-2 record against the RPI Top 25 and is 5-3 against the RPI Top 50. Rutgers is 1-8 against the RPI Top 25 and 5-8 against the RPI Top 50. And against their only common opponents, South Florida and Stanford, Cal is 2-1 while Rutgers is 0-2. But until Saturday, when Rutgers beat a 9-17 Providence squad while Cal beat Oregon State (14-10), Rutgers was rated higher than Cal. Cal is now 19 while Rutgers is 22.
These are just two examples of where the RPI has (or had) it wrong, and neither one will have an impact on the selection process or the seeding. UConn will be the overall Number One, Cal will probably be a two seed depending on how it finishes the season, and Rutgers will also be in, probably as a 7-10 seed depending on how it finishes.
But if the RPI can’t get these things right, how can you rely on it to properly rate the teams between 40 and 70, the bubble teams? And what are the reasons why what is seemingly obvious to any fan is not obvious to the RPI?
While the RPI formula is simplistic, it initially seems to be fair. With 75% of the formula based on scheduling it would appear to favor teams that play better teams. Unfortunately those initial impressions are not entirely correct.
The formula would be fair if scheduling were random. But because teams play in conferences, the quality of the teams in each conference dramatically impacts each team’s RPI. And because there are six major conferences (ACC, Big 10, Big 12, Big East, Pac 10 and SEC) and two other conferences (Atlantic 10 and Mountain West) that are well above the rest, the scheduling dynamics are perverted.
It is actually better for the top conferences to schedule non-conference games against the weaker conferences. This is because as the major conferences pile up wins against the weaker conferences, the advantage received is far greater than just one more win. Each win by Seton Hall, for example, helps every other Big East team because the victory increases the opponents’ winning percentage of each Big East team that faces Seton Hall. And it thereby also helps the winning percentage of opponents’ opponents across the league, because every team in the Big East plays every other.
To look at how this works more closely, suppose you have an eight-team league. Assume every team plays each other four times and plays no nonconference games. For any team within such a league, the winning percentage of the team’s opponents would be 50 percent. Why? Because in such a closed universe, every time a game is played, one of the team’s future opponents will win and another will lose. Regardless of the win-loss record of each individual team, the cumulative winning percentage of the league as a whole (and therefore of every team’s universe of opponents) must be 50 percent. And because every team in such a closed league is also the opponents’ opponent of every other team, the winning percentage of opponents’ opponents is also identical for every team—50 percent. As a result, the three quarters of the RPI formula based on strength of schedule would be the same for every team in such a hypothetical league.
This means that if every team in the league wins half its games and loses the other half, the RPI of each team in the league would be .5000. Even were one team to win all 28 of its games, its RPI would be only .6250, i.e., RPI = (.25 * 1.00) + (.50 * .50) + (.25 * .50).
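The closed-league example can be checked with a short sketch. It assumes one team sweeps all 28 of its games while every other pairing splits its four meetings 2-2, and it assumes the usual RPI convention of leaving out a team’s own games when computing its opponents’ records; neither detail is spelled out above, and the function names are illustrative only.

```python
from itertools import combinations

TEAMS = range(8)   # eight-team league, no nonconference games
MEETINGS = 4       # every pair of teams meets four times

# Each game is stored as (winner, loser).
games = []
for a, b in combinations(TEAMS, 2):
    if a == 0:
        games += [(a, b)] * MEETINGS            # team 0 wins every game
    else:
        games += [(a, b)] * 2 + [(b, a)] * 2    # all other pairs split 2-2


def win_pct(team, exclude=None):
    """Winning percentage of `team`, ignoring its games against `exclude`."""
    wins = sum(1 for w, l in games if w == team and l != exclude)
    losses = sum(1 for w, l in games if l == team and w != exclude)
    return wins / (wins + losses)


def opponents(team):
    """One entry per game played, so repeat opponents are counted each time."""
    return [l if w == team else w for w, l in games if team in (w, l)]


def owp(team):
    """Opponents' winning percentage, excluding games against `team`."""
    opps = opponents(team)
    return sum(win_pct(o, exclude=team) for o in opps) / len(opps)


def oowp(team):
    """Opponents' opponents' winning percentage."""
    opps = opponents(team)
    return sum(owp(o) for o in opps) / len(opps)


def rpi(team):
    return 0.25 * win_pct(team) + 0.50 * owp(team) + 0.25 * oowp(team)


print(f"{rpi(0):.4f}")  # undefeated team: prints 0.6250
```

Under these assumptions both schedule terms come out to .500 for every team in the league, so the undefeated team and a 12-16 team differ only in the 25 percent of the formula that reflects their own records.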
Now suppose that there are two eight-team leagues and that the teams now play everyone in their own conference three times while playing the remaining eight games against the other conference. Because the first conference is far better, it wins 90% of the games between the conferences. Under this scenario, the average RPI of the first conference jumps up to .5666, and the RPI of an undefeated team would be .663.
Finally, suppose now that there are three leagues and that the teams play everyone in their own league twice while playing the remaining games against the other two leagues equally. The first conference wins 90% of its games against each of the other two leagues, while the other two leagues win 50% against each other. In this case, the average RPI of the first conference goes up to .586. An undefeated team from the first conference would boost its RPI to .666.
The point of these examples is that just by playing teams from a poorer conference the average RPI of the teams in the stronger conference goes up. (And, if you do the math, the RPIs in the weaker conferences go down accordingly.)
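The direction of this effect can be reproduced with a rough sketch of the two-league case. It treats every team in a conference as identical, lets teams split conference games evenly with fractional wins, and does not exclude head-to-head games from opponents’ records. Those simplifications are assumptions of the sketch, so the figures it prints will not necessarily match the ones quoted above, which depend on calculation details the examples do not state.

```python
def average_rpis(conf_games, nonconf_games, strong_win_rate):
    """Return (strong, weak) average RPI for two leagues of identical teams.

    strong_win_rate is the share of inter-league games won by the
    stronger league. Fractional wins are allowed.
    """
    total = conf_games + nonconf_games

    def blend(own, other):
        # Schedule-weighted average over conference and nonconference opponents.
        return (conf_games * own + nonconf_games * other) / total

    # Own winning percentage: .500 in conference, lopsided out of conference.
    wp_strong = (0.5 * conf_games + strong_win_rate * nonconf_games) / total
    wp_weak = (0.5 * conf_games + (1 - strong_win_rate) * nonconf_games) / total

    # Opponents' and opponents' opponents' winning percentages.
    owp_strong, owp_weak = blend(wp_strong, wp_weak), blend(wp_weak, wp_strong)
    oowp_strong = blend(owp_strong, owp_weak)
    oowp_weak = blend(owp_weak, owp_strong)

    rpi_strong = 0.25 * wp_strong + 0.50 * owp_strong + 0.25 * oowp_strong
    rpi_weak = 0.25 * wp_weak + 0.50 * owp_weak + 0.25 * oowp_weak
    return rpi_strong, rpi_weak


# 21 conference games (seven opponents, three times each) plus eight
# games against the other league, with the stronger league winning 90%.
strong, weak = average_rpis(conf_games=21, nonconf_games=8, strong_win_rate=0.9)
print(f"strong league: {strong:.4f}  weak league: {weak:.4f}")
```

In this simplified model the two averages always sum to 1.000, so whatever the stronger league gains by winning inter-league games comes directly out of the weaker league’s ratings.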
I have used a 90% winning percentage because that is the winning percentage of the top six conferences in all games played this season against all other conferences, excluding the Atlantic 10 and the Mountain West. The percentages range from a high of 94.7% for the Big 12 to a low of 83.3% for the Pac 10. The example shows that the major conferences can improve their RPI by beating up on weaker conferences. (Applying a 90% rate to the example schedules results in fractional victories, but that does not affect the overall conclusion.)
Among the top eight conferences, the Big 10 and Pac 10 each play 18 games within their conference, the Big East, Big 12 and the MWC each play 16, and the ACC, SEC and A10 each play 14. The Big 10 also plays nearly 50 percent of its nonconference games against the other seven listed conferences, meaning that 80 percent of all its games are against top conference competition.
And because of this, the Big 10 is the most disadvantaged conference in terms of the RPI. According to College RPI.com (a wonderful site run by Jerry Palm, which is the source used for the current RPI rankings throughout the article), the Big 10 was the number two conference in nonconference play, but based on the full schedule, the conference now ranks fifth in RPI. The conference average has dropped from .5995 to .5714. A team with a .5995 rating would be ranked 35, while a team with a .5714 rating would be ranked 63. This becomes even more significant because the conference averages for the ACC, Big East, Big 12 and SEC have all gone up.
Geographical location also plays a role, because teams often schedule nonconference games against regional opponents. A team in the east can play teams from several minor conferences. The Patriot, Ivy, Northeast, Metro Atlantic, MEAC and America East are all located in close proximity to the teams in the Atlantic 10 and Big East. As a result, the larger conference teams can cherry-pick the better teams in the weaker conferences and get wins without playing sub-.500 teams. The RPI actually values a win over a team such as Lehigh (ranked 102) or North Carolina A&T (124) more highly than a win over Nebraska (66) or North Carolina State (84).
In the west there are fewer low-rated conferences, and as a result the teams in the Mountain West and Pac 10 can’t cherry-pick the better teams. This means that they wind up playing a broader cross-section of good and bad teams, and their opponents’ winning percentage suffers.
There is also an aspect of luck involved. This year the most absurdly low rating belongs to Utah. The Utes are clearly deserving of an at-large bid based on their body of work, but are only rated 70 at this time. They are leading the Mountain West by two games. While they had an inexcusable loss to Weber State, that is their only loss to a team rated lower than 53. They have wins over Marquette, TCU, San Diego State, New Mexico and BYU. But they had the misfortune of playing Santa Clara, which is 2-24 this year after winning 20 games last year. In RPI terms, a win over Santa Clara this season is worse than a loss to Santa Clara last season. The Air Force Academy has also hurt the rating of every team in the Mountain West. Air Force is usually bad, but not 2-21 bad, and since Utah has played the Falcons twice, this hurts the Utes’ rating dramatically. Add in a win over winless Norfolk State and that wipes out all the high quality play that the Utes have shown.
On the flip side of the coin is Temple. The Owls are rated 36, even though they haven’t beaten a team in the RPI Top 40. They are 5-7 against teams ranked in the RPI Top 100 and also have a loss to Massachusetts, ranked outside the Top 200. But Temple started the year with Bowling Green and Auburn, who have a combined record of 48-4. And even though Fordham is every bit as bad as Air Force, the Rams scheduled some major patsies and as a result won seven games outside of the Atlantic 10. The bottom four teams in the Atlantic 10 each scheduled at least five nonconference opponents from outside the Top 250, so their overall records did not drag down the RPI of the better teams in the conference.
If the Selection Committee wants to make its job easier (and its selections more sound), it should throw out all the RPI data. Instead it should use one of the computer ratings, or a blend of all of them, to develop an initial ranking, then calculate the “nitty-gritty” report using that initial ranking so that the data is more useful. When the Committee is trying to sift through mountains of data in a short period, it is imperative that the data is meaningful. But when one team beats three teams that are rated between 51 and 60 while another beats three teams that are between 41 and 50, the difference is significant only if you can have confidence in the ratings themselves.
Between now and the start of next year the NCAA should get some mathematicians from its institutions of higher learning to develop a rating that is more accurate. That new rating should not penalize conferences for playing more conference games or more high-profile intersectional games. The rating should never penalize a team for winning a game and should never make a loss to a good team better than a win over a bad team, unless margin of victory is considered.
There will always be disagreements over the last few teams that make the tournament field, but if the Selection Committee has better data, better decisions will be made.