At the simplest level, it means a lower-seeded team beating a higher-seeded team. This can happen for two reasons. First, the committee may have "blown" the seedings -- as they arguably did with Texas / Cincinnati and Purdue / St. Mary's this year, two games that most of the machine predictors thought would be upsets. Second, an upset can happen when the weaker team plays well and/or the better team plays poorly. College basketball teams don't play at their mean performance every game. Some games are better and some are worse, and this can lead to an unexpected result. This understanding suggests that upsets may be more likely when two inconsistent ("volatile") teams meet.

Imagine two hypothetical teams that played the same schedule. Team A averaged 84 points per game and scored between 81 and 88 points every game. Team B also averaged 84 points per game, but scored between 28 and 96 points. Now both of these teams play Team C, which averaged 70 points per game against the same competition. Which team is Team C more likely to beat? It seems reasonable to guess Team B.
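To put rough numbers on that intuition, here is a minimal sketch that treats each team's score as an independent normal variable. The means come from the example above; the standard deviations (2 for Team A, 17 for Team B, 8 for Team C) are assumptions chosen only to match the spirit of the example, and real scores are neither normal nor independent of the opponent.

```python
from math import erf, sqrt

def win_probability(mean_x, sd_x, mean_y, sd_y):
    """P(X outscores Y), assuming independent normally distributed scores."""
    diff_mean = mean_x - mean_y
    diff_sd = sqrt(sd_x ** 2 + sd_y ** 2)
    # Standard normal CDF evaluated at diff_mean / diff_sd.
    return 0.5 * (1.0 + erf(diff_mean / (diff_sd * sqrt(2.0))))

# Means are from the example; every standard deviation is an assumption.
TEAM_A = (84.0, 2.0)   # consistent team
TEAM_B = (84.0, 17.0)  # volatile team
TEAM_C = (70.0, 8.0)   # weaker team

p_c_beats_a = win_probability(*TEAM_C, *TEAM_A)
p_c_beats_b = win_probability(*TEAM_C, *TEAM_B)

print(f"P(C beats A) = {p_c_beats_a:.3f}")
print(f"P(C beats B) = {p_c_beats_b:.3f}")
```

Under these assumed spreads, Team C beats the consistent Team A only about 4% of the time but beats the volatile Team B about 23% of the time, even though A and B have the same average.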

So how can we identify these "volatile" teams? The obvious method is to measure something like the standard deviation of a team's performance over the course of the season. But we have to be careful in how we do this. For example, measuring the standard deviation of points scored might be very misleading because of pace: a team's point total rises and falls with the number of possessions in a game, regardless of how well the team actually played.
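The pace problem is easy to see with a toy example. The sketch below uses an invented game log (the numbers are hypothetical, not real box scores) for a team that scores exactly one point per possession in every game but plays at very different tempos. Possessions per game is used here as the stand-in for pace, which is an assumption about what a pace adjustment would look like.

```python
from statistics import pstdev

# Hypothetical game log: (points scored, possessions). Invented numbers.
game_log = [(60, 60), (70, 70), (80, 80), (65, 65)]

raw_points = [points for points, _ in game_log]
# Points per 100 possessions removes the effect of tempo.
efficiency = [100 * points / possessions for points, possessions in game_log]

raw_sd = pstdev(raw_points)
efficiency_sd = pstdev(efficiency)

print(f"std dev of raw points:           {raw_sd:.2f}")
print(f"std dev of points per 100 poss.: {efficiency_sd:.2f}")
```

Measured by raw points this team looks volatile, but measured per possession it is perfectly consistent, which is why a pace-independent measure is the safer basis for a volatility estimate.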

Fortunately for me, I already have a good measure of team performance that includes standard deviation: TrueSkill. This probably isn't a perfect proxy for measuring a team's consistency, but it's certainly good enough for a quick investigation into the merits of predicting upsets by looking at consistency. (It's easier to think of this measure as volatility rather than consistency, so that the higher values mean more volatility.)

I took all of this year's first round games and ranked them according to the combined volatility of the two teams involved and then identified the most volatile game at each seed differential to see how well this predicted upsets:

Seeding | Most Volatile Game by Seed Differential | Upset? |
---|---|---|
8-9 | Kansas St. - Southern Miss | N |
7-10 | St. Mary's - Purdue | Y |
6-11 | Murray St. - CSU | N |
5-12 | Vanderbilt - Harvard | N |
4-13 | Wisconsin - Montana | N |
3-14 | Marquette - Iona | N |
2-15 | Missouri - Norfolk St. | Y |
1-16 | Syracuse - NC Asheville | N |
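The selection step behind a table like this can be sketched in a few lines. This is only an illustration: the team names and sigma values below are invented, and "combined volatility" is assumed to be the simple sum of the two teams' TrueSkill standard deviations.

```python
from collections import namedtuple

# Hypothetical first-round games. The sigma values are invented for
# illustration and are not actual TrueSkill ratings.
Game = namedtuple("Game", ["seeding", "matchup", "sigma_high", "sigma_low"])

GAMES = [
    Game("8-9", "Team A - Team B", 2.1, 2.4),
    Game("8-9", "Team C - Team D", 2.9, 2.6),
    Game("5-12", "Team E - Team F", 1.8, 3.0),
    Game("5-12", "Team G - Team H", 2.2, 2.0),
]

def combined_volatility(game):
    """Assumed definition: the sum of the two teams' standard deviations."""
    return game.sigma_high + game.sigma_low

def most_volatile_by_seeding(games):
    """Return {seeding: game with the largest combined volatility}."""
    best = {}
    for game in games:
        current = best.get(game.seeding)
        if current is None or combined_volatility(game) > combined_volatility(current):
            best[game.seeding] = game
    return best

picks = most_volatile_by_seeding(GAMES)
for seeding, game in picks.items():
    print(seeding, game.matchup, round(combined_volatility(game), 2))
```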

One problem with this approach is that seeding is a rather broad measure of team strength. For example, Duke was by far the weakest of the #2 seeds. It might be productive to use a more accurate measure of the strength difference between the teams. We can use the mean TrueSkill measure for each team to do that, and rank games according to the sum of the two teams' standard deviations divided by the difference of their means. That results in this table:

Seeding | Most Volatile Game by Strength Differential | Upset? |
---|---|---|
8-9 | Creighton - Alabama | N* |
7-10 | St. Mary's - Purdue | Y |
6-11 | SDSU - NC State | Y |
5-12 | Temple - USF | Y |
4-13 | Michigan - Ohio | Y |
3-14 | Georgetown - Belmont | N |
2-15 | Duke - Lehigh | Y |
1-16 | North Carolina - Lamar | N |

\* One-point win for Creighton
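The strength-adjusted metric is the sum of the two standard deviations divided by the gap between the two means. Here is a minimal sketch of the ranking, with the caveat that the mu and sigma values are invented rather than real ratings, and that returning infinity for two teams with identical means is a choice made for this example, not something the method specifies.

```python
from math import inf

def upset_score(mu_fav, sigma_fav, mu_dog, sigma_dog):
    """(sigma_fav + sigma_dog) / |mu_fav - mu_dog|; higher means more upset-prone."""
    gap = abs(mu_fav - mu_dog)
    if gap == 0:
        # Assumed convention: evenly matched teams get the maximum score.
        return inf
    return (sigma_fav + sigma_dog) / gap

# Hypothetical games: (label, mu_fav, sigma_fav, mu_dog, sigma_dog).
games = [
    ("Game 1", 30.0, 2.0, 22.0, 2.5),
    ("Game 2", 26.0, 3.0, 24.0, 3.0),
    ("Game 3", 35.0, 1.5, 15.0, 1.5),
]

# Most upset-prone games first.
ranked = sorted(games, key=lambda g: upset_score(*g[1:]), reverse=True)
for label, *ratings in ranked:
    print(label, round(upset_score(*ratings), 3))
```

A close, volatile matchup like the hypothetical Game 2 rises to the top, while a lopsided matchup between two steady teams like Game 3 falls to the bottom.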

This works remarkably well for this year's first round -- especially considering that there were no upsets in the 3-14 or 1-16 matchups. Of course, identifying the most likely upset at a particular seeding isn't quite the same as identifying the most likely upsets across the whole bracket, so let's look at the top 8 upsets predicted by this metric across the entire first round:

Seeding | Most Volatile Games Overall | Upset? |
---|---|---|
5-12 | Temple - USF | Y |
6-11 | SDSU - NC State | Y |
7-10 | Notre Dame - Xavier | Y |
7-10 | St. Mary's - Purdue | Y |
8-9 | Creighton - Alabama | N* |
7-10 | Florida - Virginia | N |
6-11 | Cincinnati - Texas | N |
8-9 | Memphis - St. Louis | Y |

\* One-point win for Creighton

Again, this is pretty good performance -- going by the table, all four of the first four picks were upsets, as were five of the first eight.

To a certain extent, a good predictor is going to capture some of this anyway (the Pain Machine identified three of the upsets among the first four picks), but the volatility of team performance may be useful additional information when predicting tournament upsets.