The Pairwise Guide

The PairWise Rankings (PWR) are college hockey's answer to the BCS. I know, I know, with an endorsement like that, what could possibly go wrong? While most of the sports world will focus its attention on the decisions of a select few men in a smoke-filled room in Indianapolis when it comes to selecting the final at-large spot in the NCAA basketball tournament, the NCAA hockey tournament committee takes all the surprise out of selection day by relying solely on a mathematical system to determine who will fill out the field of 16 that will compete for the NCAA hockey championship.

 

How Does It Work?

We'll start with the very basics of how the system works, since that's probably what the unfamiliar are most interested in, and work our way down to bitching about it later on.

The first step in this process is to calculate the Rating Percentage Index (RPI) of every team in college hockey. College basketball fans are probably familiar with the RPI. It takes a team's winning percentage, its opponents' winning percentage, and its opponents' opponents' winning percentage, and combines them as a weighted sum. The only difference between the two sports is the weighting of each category. College basketball still uses 25%-50%-25%, while college hockey weights its RPI 25%-21%-54%. I'm not sure why. I guess they felt it gave a more accurate result, but the numbers seem to change every year. College hockey's RPI also factors out the occasional game against a really bad team where winning actually hurts a team's RPI.
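As a rough sketch of that weighted sum (not the NCAA's actual code), the calculation might look like this in Python. The function names, the sample team, and the choice to count a tie as half a win are assumptions made for illustration; only the weights come from the text above. The sketch also leaves out the adjustment that drops wins against really bad teams.

```python
# Weights as quoted in the text: (own, opponents', opponents' opponents').
HOCKEY_WEIGHTS = (0.25, 0.21, 0.54)
BASKETBALL_WEIGHTS = (0.25, 0.50, 0.25)


def win_pct(wins, losses, ties=0):
    """Winning percentage, counting a tie as half a win (an assumption)."""
    games = wins + losses + ties
    return (wins + 0.5 * ties) / games if games else 0.0


def rpi(own_pct, opp_pct, opp_opp_pct, weights=HOCKEY_WEIGHTS):
    """Weighted sum of the three winning percentages."""
    w_own, w_opp, w_opp_opp = weights
    return w_own * own_pct + w_opp * opp_pct + w_opp_opp * opp_opp_pct


# Hypothetical team: 20-10-4, opponents at .520, opponents' opponents at .505.
own = win_pct(20, 10, 4)
print(round(rpi(own, 0.520, 0.505), 4))                      # hockey weights
print(round(rpi(own, 0.520, 0.505, BASKETBALL_WEIGHTS), 4))  # basketball weights
```

Notice how little the team's own record moves the hockey number: more than half the weight sits on the opponents' opponents.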

Once every team is ranked 1-58 based on RPI, the top 25 teams are separated out into what are known as Teams Under Consideration (or TUCs). Each TUC is compared against every other TUC on the basis of four categories. The team that wins the most comparisons against other TUCs is ranked number one, and the rest are ranked on down the line by comparisons won.

The four categories used in each individual comparison are RPI (worth one point), record against TUCs (worth one point), record against common opponents (worth one point), and head-to-head record (each win is worth one point). If the points are tied, the team with the higher RPI wins the comparison.

For example, let's look at the comparison between Boston University (currently #1 in the RPI) and Notre Dame (currently #2 in the RPI).

RPI: BU .5976, Notre Dame .5819. One point for BU.

Record vs. TUCs: BU 14-3-3, Notre Dame 6-5-0. One point for BU.

Record vs. Common Opponents: BU 7-1-1, Notre Dame 5-2-0. One point for BU.

Head-to-Head: BU and Notre Dame didn't play each other, so nobody gets a point.

BU wins the comparison over Notre Dame 3-0.
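The BU-Notre Dame comparison above can be reproduced with a short Python sketch. The `Team` class and `compare` function are hypothetical names, and comparing records by winning percentage with ties counted as half a win is an assumption. Note that the common-opponent record and head-to-head wins are specific to the pair being compared, so each `Team` here only holds the numbers for this one matchup.

```python
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    rpi: float
    tuc: tuple          # (wins, losses, ties) against TUCs
    common: tuple       # (wins, losses, ties) against common opponents
    h2h_wins: int = 0   # head-to-head wins over the other team


def pct(record):
    """Winning percentage with ties as half a win; None if no games."""
    wins, losses, ties = record
    games = wins + losses + ties
    return (wins + 0.5 * ties) / games if games else None


def compare(a, b):
    """Return (winner_name, points_a, points_b) for one comparison."""
    pts_a = pts_b = 0
    # RPI, TUC record, and common-opponent record are worth one point each.
    for val_a, val_b in ((a.rpi, b.rpi),
                         (pct(a.tuc), pct(b.tuc)),
                         (pct(a.common), pct(b.common))):
        if val_a is None or val_b is None or val_a == val_b:
            continue  # nobody gets the point
        if val_a > val_b:
            pts_a += 1
        else:
            pts_b += 1
    # Each head-to-head win is worth one point.
    pts_a += a.h2h_wins
    pts_b += b.h2h_wins
    # A tie on points goes to the team with the higher RPI.
    if pts_a != pts_b:
        winner = a if pts_a > pts_b else b
    else:
        winner = a if a.rpi >= b.rpi else b
    return winner.name, pts_a, pts_b


bu = Team("Boston University", 0.5976, (14, 3, 3), (7, 1, 1))
nd = Team("Notre Dame", 0.5819, (6, 5, 0), (5, 2, 0))
print(compare(bu, nd))  # ('Boston University', 3, 0)
```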

As it stands now, BU would win the comparison against all 24 other teams under consideration, and thus, they are ranked number one with 24 comparison wins. Notre Dame is second with 23 comparison wins. Michigan is third with 22, and so on down the line.
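Turning comparison wins into a ranking is just a round-robin tally. Here is a minimal sketch, assuming a hypothetical `winner_of(a, b)` function that returns whichever team wins a given comparison. The four-team field and its RPI values are made up, and RPI alone stands in for the full four-category comparison to keep the example short. How the real system separates teams with the same number of comparison wins isn't covered here; this sketch simply leaves tied teams in their input order.

```python
from itertools import combinations


def rank_by_comparisons(teams, winner_of):
    """Rank teams by how many pairwise comparisons each one wins."""
    wins = {team: 0 for team in teams}
    for a, b in combinations(teams, 2):
        wins[winner_of(a, b)] += 1
    ranked = sorted(teams, key=lambda team: wins[team], reverse=True)
    return ranked, wins


# Hypothetical four-team field; RPI alone stands in for the full comparison.
toy_rpi = {"A": 0.60, "B": 0.58, "C": 0.56, "D": 0.54}
ranked, wins = rank_by_comparisons(
    list(toy_rpi), lambda a, b: a if toy_rpi[a] > toy_rpi[b] else b
)
print(ranked)  # ['A', 'B', 'C', 'D']
print(wins)    # {'A': 3, 'B': 2, 'C': 1, 'D': 0}
```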

Why This System?

A long time ago, college hockey used to go with the "smoke-filled room" style approach like the NCAA basketball tournament. And like with the basketball tournament seemingly every year, there was the occasional controversial decision made between two teams on the bubble. College hockey is a smaller community than college basketball, and those controversies had a more profound effect.

The tournament committee began going with a more mathematical approach in the '90s. They didn't use this exact system, but they used some sort of rough grid that compared teams at or near the bubble with other teams near the bubble. Eventually a group of people got together, analyzed the committee's decisions, and created the PWR system to mimic the tournament committee's selection process. (This is what happens when schools like Cornell and Harvard are major players in your sport.) The PWR system correctly predicted every at-large berth for a number of years, and fans loved being able to see exactly where things stood with regard to the NCAA tournament without having to worry about any surprises on selection day.

Eventually, the PWR was codified in the rule book as the only thing the selection committee is allowed to look at when determining the field.

So Why Does It Suck?

And now for the bitching. It takes a whole lot of numbers to come up with the 10 at-large bids for the NCAA tournament. The layperson looks at the PWR and thinks, "Boy, that's a lot of numbers, it must be pretty accurate." The problem is that most people well-versed in math will tell you that it sucks and, in some cases, doesn't make a lot of sense. Personally, I have two main problems with the system.

1. The TUC Cliff

The PWR draws an arbitrary line at the top 25 teams in the RPI. Anyone on the right side is anointed a "good team", anyone on the other side a "bad team". The thing is, there are only 58 teams in college hockey, meaning the top 25 encompasses 43% of all teams.

In the TUC category of a comparison, a win against the #1 team counts the same as a win against the #25 team, even though there is a huge gap between the two.

This can create some wild fluctuations in the calculations on a week-to-week or even game-to-game basis. If a team sweeps a season series and goes 4-0-0 against a team near the TUC cliff, having those four games on their record could be enough to flip a lot of comparisons depending on which side of the cliff that team ends up on. Proponents of this system argue there is no cliff because the system is designed to be looked at only once, at the end of the season, and thus there are no fluctuations. Regardless, teams still gain a disproportionate benefit if a team they beat ends up 25th rather than 26th.
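To see the cliff in action, here is a small Python sketch with a made-up schedule. "Bubble U" and "Top U" are hypothetical opponents and `tuc_record` is a hypothetical helper; the only thing that changes between the two calls is whether Bubble U finishes inside the top 25.

```python
def tuc_record(results, tucs):
    """Tally (wins, losses, ties) counting only games against TUCs."""
    wins = losses = ties = 0
    for opponent, outcome in results:
        if opponent not in tucs:
            continue  # games against non-TUCs don't count in this category
        if outcome == "W":
            wins += 1
        elif outcome == "L":
            losses += 1
        else:
            ties += 1
    return wins, losses, ties


# Hypothetical schedule: a 4-0-0 sweep of "Bubble U" plus 3-4-0 against "Top U".
results = [("Bubble U", "W")] * 4 + [("Top U", "W")] * 3 + [("Top U", "L")] * 4

print(tuc_record(results, {"Top U", "Bubble U"}))  # Bubble U 25th: (7, 4, 0)
print(tuc_record(results, {"Top U"}))              # Bubble U 26th: (3, 4, 0)
```

One spot in the RPI for a third party turns a winning TUC record into a losing one, without this team playing a single game.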

One of the most interesting TUC cliff examples came last season. Notre Dame was on the outside looking in at the NCAA tournament late in the season until Northern Michigan won a couple games and moved into the top 25. Adding their games against Northern Michigan was enough to push Notre Dame back on the right side of the NCAA tournament bubble.

It just so happened that Notre Dame met Northern Michigan in the third-place game of the CCHA tournament. Northern Michigan needed to win the game in order to remain a TUC. A loss or tie would have knocked them out of the top 25, but a Notre Dame win would have been enough to get Notre Dame into the tournament regardless of Northern Michigan. So the situation was simple: either a win or a loss by Notre Dame got them into the tournament, but a tie would knock them out. Northern Michigan tied the game at one apiece in the third period, then took the lead two minutes later.

To their credit, Notre Dame gave an honest effort to try to tie and hopefully win the game, and a potential game-tying shot rang off the post late in the third period, but they ended up losing their way into the tournament, where they went on a nice run to the NCAA championship game. It brought up interesting hypothetical questions, however, about whether it would be worth shoveling a puck into your own net in overtime if it meant securing an NCAA bid.

2. Outliers

The reason the NCAA doesn't solely use the RPI is that they don't really believe it's a perfect measure of a team's season. So they created a system that uses the RPI heavily, but would also modify it a slight bit in certain instances.

The problem is that sometimes those unweighted modifiers produce some goofy, or just plain illogical, results that completely override a season's worth of work measured by the RPI, rather than simply modifying it a little bit.

Take this year, for example. Currently, Vermont has the 5th best RPI in the country at .5606, while Minnesota has the 16th best RPI in the country at .5308. That's a pretty substantial gap, and you'd be hard-pressed to find any person that would say Minnesota has been better than Vermont this season. But the computer says Minnesota wins the comparison against Vermont.

Vermont has the huge advantage in RPI, so they get that point. But Minnesota has a one-game advantage in TUC record (Vermont is .500, Minnesota is one game over .500), so they get that point, although Minnesota has played a ton of games against teams right on the TUC cliff, so that could fluctuate a lot. And Minnesota has the advantage in common opponents thanks to their 2-0-1 record against common opponents with Vermont. (Vermont has six common-opponent games with Minnesota, and most people would tell you it's much tougher to go 4-0-2 against a set of teams than it is to go 2-0-1.) They never played head-to-head, so there's no point there. So basically, a huge advantage over the course of a 34-game season gets completely wiped out by about four games.

What's a Better Solution?

A lot of mathy people endorse the KRACH rating system. Some of you might know it better as the "Bradley-Terry" system, which gets brought out a lot when debating the merits of the BCS' selections.

The math may be better, but the results it spits out don't draw many fans. Because of college hockey's insular schedule and high proportion of league games to non-conference games, the teams in the strongest conference seem to get a huge boost in the KRACH system. There have been instances where teams with a losing record would have been in line for an at-large NCAA tournament bid under the KRACH system, which is something that wouldn't go over well. (The NCAA has banned teams with losing records from getting at-large tourney bids in hockey, though Wisconsin got in last year with a losing record before that rule took effect.)
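For the curious, the Bradley-Terry idea can be sketched in a few lines of Python. This is a generic fit by repeated re-estimation, not KRACH's actual implementation: the function name, the toy results, and the iteration count are assumptions, and ties are ignored for simplicity. Each pass sets a team's rating to its win total divided by the sum, over every game it played, of 1 / (its rating + the opponent's rating). The sketch assumes every team has at least one win and one loss.

```python
def bradley_terry(games, iterations=200):
    """Fit Bradley-Terry ratings from a list of (winner, loser) results."""
    teams = sorted({team for game in games for team in game})
    wins = {team: 0 for team in teams}
    for winner, _ in games:
        wins[winner] += 1

    ratings = {team: 1.0 for team in teams}
    for _ in range(iterations):
        updated = {}
        for team in teams:
            # Sum 1 / (own rating + opponent rating) over every game played.
            denom = sum(
                1.0 / (ratings[a] + ratings[b])
                for a, b in games
                if team in (a, b)
            )
            updated[team] = wins[team] / denom
        # Rescale so the ratings average 1.0; only the ratios matter.
        scale = len(teams) / sum(updated.values())
        ratings = {team: value * scale for team, value in updated.items()}
    return ratings


# Hypothetical results: A is 3-2, B is 3-3, C is 2-3.
games = [("A", "B"), ("A", "B"), ("B", "A"),
         ("B", "C"), ("B", "C"), ("C", "B"),
         ("A", "C"), ("C", "A")]
for team, rating in sorted(bradley_terry(games).items()):
    print(team, round(rating, 3))
```

Under this model, the chance that one team beats another is its rating divided by the sum of the two ratings, which is why a win over a highly rated opponent is worth more than a win over a weak one.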

Overall, I think they've got the right idea by starting with the RPI and trying to modify it.  The issue is finding a more accurate way to modify the RPI numbers. The TUC and Common Opponent categories need to have some way to weight for number of games played and the strength of the opponents.

And if there has to be a cliff, it should be in a more logical spot. NCAA tournament fates shouldn't be decided on whether a team finishes in the 43rd or 44th percentile of college hockey. They should go back to the old way where the teams under consideration were actually teams under consideration for an at-large spot.

This all seems pretty complicated, perhaps overly so, but there's also no denying its importance. The NCAA hockey tournament isn't like the NCAA basketball tournament, where the odds are heavily stacked against a double-digit seed. With just 16 teams in a single-elimination format, every game is pretty much a coin flip determined by which team gets better goaltending and a lucky bounce or two. Each team in the tournament has a legitimate chance to win it all, so getting in is huge.

For further reference, College Hockey News does a good job of explaining the PWR.


Comments


Good analysis, though I’ll just add one caveat to the post, which actually has little to do with the driving force behind your argument.

It was not some genius at Harvard or Cornell who developed the PWR, but the creators of U.S. College Hockey Online who created the PairWise. As a former college beat writer for a WCHA school, I had several chances in college to attend the Final Five with the benefit of a press credential, and it was always interesting watching these guys (Scott Brown was one), hold court during press conferences after nearly every game to explain how the latest result affected the PWR.

by Lafavs on Mar 9, 2009 1:34 PM PDT

People may not like the answers that KRACH gives, but there isn’t any way to modify RPI that deals with its flaws. The problem of limited out-of-conference games damages RPI far more than it does KRACH. By definition, all teams in a conference will average a .500 record in conference play, regardless of how strong the conference is. Since RPI only looks at raw Won/Loss totals (the team’s, the average of the team’s opponents, and the average of their opponents’ opponents), it can’t wash out the effects of differing schedule strengths. You can adjust it such that, under ideal conditions, it goes asymptotically close to zero, but “ideal conditions” vary depending upon who is playing who. It can’t really be predicted.

You also touch on a problem with TUC, aside from the cliff, but don’t really examine it. I’ll illustrate it using common opponents, which has the same problem. The system assumes that, by using the same opponents, you artificially create schedules of equal strength. This is a false assumption. As an extreme, let’s take the common opponent schedule of hypothetical teams in Hockey East and the WCHA. Their common opponents are Michigan Tech and BU. The WCHA team played Tech four times, winning three, and lost one game to BU in a holiday tournament; this gives them a COp record of 3-2. The Hockey East team went 1-2 against BU, and beat Tech in one game in a holiday tournament, giving them a COp record of 2-2. Clearly, these two synthetic schedules aren’t remotely comparable in strength, but PWR treats them as if they are. Now, the common opponents problem has an easy fix (the TUC problem doesn’t), which is that you could normalize the synthetic schedule to one game against each. In the above example, the WCHA team went 0.75-0.25 against Tech, and 0-1 against BU, for a .375 COp winning percentage, while the Hockey East team went 1-0 against Tech, and 0.33-0.67 against BU, for a .667 COp winning percentage.
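The normalization described above, one synthetic game against each common opponent, can be sketched in Python as follows. The function names are hypothetical and the records are the ones from the example, with no ties. Note that a 3-1 mark against Tech normalizes to 0.75-0.25 rather than a full 1-0, so this sketch gives the WCHA team .375, against .667 for the Hockey East team; the raw percentages are .600 and .500.

```python
def raw_pct(records):
    """Winning percentage over all common-opponent games lumped together."""
    wins = sum(w for w, _ in records.values())
    games = sum(w + l for w, l in records.values())
    return wins / games


def normalized_pct(records):
    """Average of per-opponent winning percentages (one 'game' per opponent)."""
    per_opponent = [w / (w + l) for w, l in records.values()]
    return sum(per_opponent) / len(per_opponent)


# (wins, losses) against each common opponent, from the example above.
wcha_team = {"Michigan Tech": (3, 1), "BU": (0, 1)}
hockey_east_team = {"Michigan Tech": (1, 0), "BU": (1, 2)}

for name, rec in (("WCHA", wcha_team), ("Hockey East", hockey_east_team)):
    print(name, round(raw_pct(rec), 3), round(normalized_pct(rec), 3))
```

The raw numbers favor the WCHA team, while the normalized numbers favor the Hockey East team, which is the whole point of the example.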

RPI is broken, and it’s not really fixable within its basic concepts. KRACH gives a much better idea of who the best teams are. I’m not up in arms to change things, because I think that there are considerations for selection to the NCAA Tournament beyond getting the 16 “best” teams, or even the 14 “best” teams after automatic bids. Spreading it out some is fine by me. My gripe comes about since I have a degree in statistics, and am a math nerd; I don’t mind the particular numbers that people are using, but I really wish the folks at the NCAA took the time to understand the numbers they are using.

by J. Michael Neal on Mar 9, 2009 2:00 PM PDT

RPI change

The RPI was changed from the traditional 25-50-25 to the hockey-specific 25-21-54 for no other reason than to reduce the number of games that would have a negative effect on a team’s RPI… there was no other reason for making that change, which shows the flaws in the RPI to begin with…

by Shirtless Guy on Mar 10, 2009 9:53 AM PDT

Mathematically, this is particularly silly, because as you move away from the first-order winning percentage of opponents (i.e. to the winning percentage of opponents’ opponents, or opponents’ opponents’ opponents), the number either converges very rapidly to .500 (if you don’t remove games from the calculation that have been used for one of the prior calculations) or has too few pieces of data to be at all meaningful (if you do remove them). It’s just a worthless calculation.

by J. Michael Neal on Mar 10, 2009 4:03 PM PDT

As one of the “mathy” people, for lack of a better term, I see the whole thing as kind of like the quick argument for democracy: it’s the worst method except for all the others.

The reality is there’s no perfect way to shoe-horn an entire season’s worth of work against the backdrop of a massively unbalanced schedule. This goes for any of the college sports sponsored by the NCAA. I think for what it is, the Pairwise system should be a model of what you want college athletics to use for all their sports.

I don’t say this to say “look at how awesome the PWR is”… of course not, it has many substantial flaws. That being said, in general it’s tough to gauge teams even with experienced eyes… and who is to say whose experienced eyes are more deserving than others to stand in judgment. This is why I favor the PWR or other numeric methods. Worst comes to worst, it stands there and we know exactly what it is.

Some people talk about infusing the system with the Bradley-Terry method (KRACH)… I think Bradley-Terry has its own issues and it is not in itself a perfect system. Is it better than the RPI? Definitely. But it seems to have the “Tennessee Effect” of overweighting SOS… granted, the Bradley-Terry model is the strongest just from a heuristic standpoint. Last year if we used it the NCAAs would have admitted 7 or 8 WCHA teams. That in itself is probably against the best interest of the sport… but then again who is to say whether or not it’s deserving.

Shirtless… in principle I agree with you on the RPI re-weight… but I have to imagine, with such an odd weighting, that maybe there was some reason behind it (maybe an internal study done by somebody with a math/stat background?)… it’s an odd breakdown and I’d have to think they came to this after some sort of analysis.

IMO, I have a hard time understanding why the RPI is different for each sport. There is no real fundamental understanding to be had in the RPI… it’s just another measure.

As for the TUC cliff… I think there are some ways to adjust that, but I don’t think any proposal I have in mind would be “nice”.

—Patrick

by Patman on Mar 10, 2009 8:57 PM PDT
