|
Post by karplets on Nov 29, 2011 11:04:32 GMT -5
I'm going to attempt a thread based on an Elo-like (or relatively Elo-like) rating, the Massey ratings, and how they relate to the tournament bracket.
While I don't think it's as good a prediction tool as some other ratings (to wit: Pablo. Or even the second rating that Massey publishes, the "Power" rating), I think Elo ratings are a useful part of the discussion for a few reasons. Number one is that there are plausible reasons why an Elo rating could be acceptable for use by the NCAA in the future where Pablo may not be. Another is that, once you get used to it, an Elo rating can be easier to use, grasp and make quick estimates with than a system like Pablo -- it may even serve as an intermediate step in grasping a system like Pablo.
I'm going to start off by listing the ratings of the teams in the tournament, but keep in mind I've converted Massey's numbers in a crude, unscientific way to a ratings scale I'm familiar with, the old Albyn Jones scale that was developed for women's soccer. The relative rankings between the teams aren't changed by doing this (#1 is still #1, #2 is still #2 and so on) but the meaning or significance of the rating difference may not accurately reflect Massey's numbers. It's a bit of guesswork on my part which has worked rather well for soccer, but I'm not so sure about it yet for volleyball.
The absolute numbers themselves don't have meaning; what matters is the difference between one team's rating and another's. In the Albyn Jones scale, a difference of 100 points between Team A and Team B, for example, means that the higher rated team, Team A, has an expected win probability of (.667) over Team B (ignoring homefield or homecourt). If Team A is rated 200 points higher than B, then the expected win probability is (.800).
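The two anchor points above (100 points = .667, 200 points = .800) are consistent with the favorite's odds doubling every 100 rating points. Here is a minimal Python sketch under that assumption. The function name `win_probability` and the closed form are illustrative guesses, not a formula taken from Albyn Jones or Massey, and values from the actual scale may differ slightly between the anchor points.

```python
def win_probability(rating_diff: float) -> float:
    """Expected win probability for the higher-rated team.

    Assumption: the odds double for every 100 rating points, which
    matches the quoted anchor points (100 -> .667, 200 -> .800).
    """
    odds = 2.0 ** (rating_diff / 100.0)
    return odds / (1.0 + odds)


for diff in (0, 100, 200):
    print(diff, round(win_probability(diff), 3))
```

Only the rating difference goes in, which is the point made above: adding the same constant to every team's rating changes nothing.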
Naturally, later on, I plan on making some comments about the bracket and how balanced it is and things of that sort.
|
|
|
Post by karplets on Nov 29, 2011 11:08:59 GMT -5
In order of Massey ratings from highest to lowest (numbers crudely converted to the Albyn Jones scale). Sorry that my formatting is crummy... (Seed # is listed first)
7 USC 1979
3 Illinois 1955
1 Texas 1954
6 No. Iowa 1950
5 Purdue 1946
2 Nebraska 1938
10 Hawaii 1936
4 Iowa St 1908
8 Penn St 1907
9 UCLA 1904
14 Tennessee 1895
California 1868
11 Stanford 1863
Florida 1857
15 Pepperdine 1849
Washington 1841
12 Florida St 1829
Miami 1824
13 Minnesota 1815
Oregon 1815
San Diego 1814
Kentucky 1813
16 Texas A&M 1802
Western Ky 1787
Long Beach St 1774
No. Illinois 1770
Tulsa 1764
Michigan 1757
Colorado St 1745
Cincinnati 1736
Michigan St 1732
Arizona 1730
No. Carolina 1727
Kansas St 1720
Dayton 1717
Ohio St 1707
Duke 1698
Oklahoma 1697
Milwaukee 1686
Ball St 1684
No Dakota St 1683
Louisville 1678
Missouri 1674
Wichita St 1671
Missouri St 1663
West Michigan 1647
Marquette 1645
Baylor 1624
Middle Tenn 1602
Texas St 1585
No. Colorado 1581
Samford 1575
Central Mich 1574
Morehead St 1525
Albany (NY) 1523
Delaware 1511
Sacred Heart 1504
Yale 1474
Lipscomb 1471
American 1429
Niagara 1409
Liberty 1365
Md East Shore 1289
Jackson St 1217
In the Albyn Jones scale, a team rated 100 points higher than another would have an expected winning percentage of .667 (ignoring homecourt/homefield).
100 pt differential: .667 expected win pct (2 to 1 win ratio)
200 pt differential: .800 expected win pct (4 to 1 win ratio)
300 pt differential: .889 expected win pct (8 to 1 win ratio)
400 pt differential: .941 expected win pct (16 to 1 win ratio)
Homefield advantage in soccer was usually calculated to be between 50 and 60 rating points.
|
|
|
Post by Ye Olde Dawg on Nov 29, 2011 12:09:04 GMT -5
Thank you!
I hope some VT'er--I don't have the energy at the moment--can put together expected bracket outcomes according to RPI, seeding (a gimme), AVCA and maybe RichKern, and Pablo. After all this whining about the brackets, some comparison with actual results would be good.
|
|
|
Post by karplets on Nov 29, 2011 12:14:35 GMT -5
Ye Olde Dawg wrote:
"Thank you! I hope some VT'er--I don't have the energy at the moment--can put together expected bracket outcomes according to RPI, seeding (a gimme), AVCA and maybe RichKern, and Pablo. After all this whining about the brackets, some comparison with actual results would be good."

Yes, let's hope someone does that. It's interesting to see and could be revealing although I'm also one of those who caution that prediction isn't everything as far as the ranking systems go, although that may seem counter-intuitive. I'm sure to harp on that a little later...
|
|
|
Post by karplets on Nov 29, 2011 12:23:59 GMT -5
Here are the expected win probabilities using the converted Massey ratings for the first round.
favored team is listed first followed by their rating / site (H) = 1st team has homecourt, (N) = neutral site / win probability / underdog followed by their rating
The list is sorted by rating differential - from lowest to highest.
Homecourt advantage = 60 points
Michigan St 1732 / (N) / 0.500 / Arizona 1730
Ball St 1684 / (N) / 0.509 / Louisville 1678
San Diego 1814 / (N) / 0.569 / Long Beach St 1774
Kansas St 1720 / (N) / 0.577 / Wichita St 1671
Miami 1824 / (N) / 0.586 / No. Illinois 1770
Tulsa 1764 / (N) / 0.611 / Oklahoma 1697
Oregon 1815 / (N) / 0.619 / Colorado St 1745
Delaware 1511 / (N) / 0.635 / American 1429
Kentucky 1813 / (N) / 0.659 / Dayton 1717
Ohio St 1707 / (N) / 0.674 / Middle Tenn 1602
Michigan 1757 / (N) / 0.711 / Baylor 1624
California 1868 / (N) / 0.725 / No. Carolina 1727
Western Ky 1787 / (N) / 0.725 / Marquette 1645
Cincinnati 1736 / (N) / 0.752 / Samford 1575
Florida 1857 / (N) / 0.777 / Missouri 1674
Minnesota 1815 / (H) / 0.789 / No Dakota St 1683
Washington 1841 / (N) / 0.789 / West Michigan 1647
Pepperdine 1849 / (H) / 0.845 / Missouri St 1663
Tennessee 1895 / (H) / 0.854 / Duke 1698
Iowa St 1908 / (H) / 0.874 / Milwaukee 1686
Florida St 1829 / (H) / 0.926 / Albany (NY) 1523
Texas A&M 1802 / (H) / 0.937 / Lipscomb 1471
Hawaii 1936 / (H) / 0.947 / No. Colorado 1581
Stanford 1863 / (H) / 0.947 / Sacred Heart 1504
Texas 1954 / (H) / 0.950 / Texas St 1585
Illinois 1955 / (H) / 0.955 / Central Mich 1574
Purdue 1946 / (H) / 0.965 / Morehead St 1525
USC 1979 / (H) / 0.980 / Yale 1474
UNI (No. Iowa) 1950 / (H) / 0.985 / Niagara 1409
Penn St 1907 / (H) / 0.985 / Liberty 1365
UCLA 1904 / (H) / 0.989 / Md East Shore 1289
Nebraska 1938 / (H) / 0.995 / Jackson St 1217
Gosh darn my formatting stinks. It's worse here than on BigSoccer. Sorry!
The average expected win percentage for the favored side is (.792) That corresponds to an expected result of 25 wins out of 32 games for the higher-rated teams.
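The arithmetic behind that figure is just the sum of the favorites' win probabilities. A short Python sketch, with the 32 probabilities copied from the first-round list above (the variable names are my own):

```python
# Favorite's win probability for each of the 32 first-round matches,
# copied from the first-round list in this post.
FIRST_ROUND = [
    0.500, 0.509, 0.569, 0.577, 0.586, 0.611, 0.619, 0.635,
    0.659, 0.674, 0.711, 0.725, 0.725, 0.752, 0.777, 0.789,
    0.789, 0.845, 0.854, 0.874, 0.926, 0.937, 0.947, 0.947,
    0.950, 0.955, 0.965, 0.980, 0.985, 0.985, 0.989, 0.995,
]

# The expected number of favorite wins is the sum of the probabilities.
expected_wins = sum(FIRST_ROUND)
average = expected_wins / len(FIRST_ROUND)

print(f"average {average:.3f}, expected wins {expected_wins:.1f} of {len(FIRST_ROUND)}")
```

The sum comes to a little over 25, which is where the "25 wins out of 32" figure comes from.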
|
|
|
Post by The Bofa on the Sofa on Nov 29, 2011 12:29:03 GMT -5
Ye Olde Dawg wrote:
"Thank you! I hope some VT'er--I don't have the energy at the moment--can put together expected bracket outcomes according to RPI, seeding (a gimme), AVCA and maybe RichKern, and Pablo. After all this whining about the brackets, some comparison with actual results would be good."

karplets wrote:
"Yes, let's hope someone does that. It's interesting to see and could be revealing although I'm also one of those who caution that prediction isn't everything as far as the ranking systems go, although that may seem counter-intuitive. I'm sure to harp on that a little later..."

I would add that it is only 63 matches, half of which are so lopsided that they are easily picked by any ranking system, and half of what is left is sufficiently lopsided that everyone picks them the same. As such, you are maybe down to about 15 meaningful matches, where the best method might get 10 right and the worst 7. Considering that a standard deviation is ± 3 or so, that doesn't tell us anything.

Now, do this over the course of 5 or 10 tournaments and you might discover a pattern. The difference between Pablo and RPI over the last 5 tournaments is about 10 matches total, so 2 per year. It's a real difference, especially considering my description above (there are very few meaningful matches in any one year). However, there was at least one season where RPI outperformed Pablo. That doesn't make RPI better than Pablo overall, just that there aren't enough matches in a single tournament to compare them.

Moreover, a single badly ranked team can account for a lot of the losses (when I do prediction assessment in the regular season, I am usually considering 300 - 400 matches involving 300 teams so if a team is off, it has a small effect; of course, this works in the opposite direction - if one method has a team nailed like Pablo did with Penn St last year, and everyone else has it wrong, it can inflate the results. This can massively distort a single season, so that's why you need to look over multiple seasons)
|
|
|
Post by karplets on Nov 30, 2011 12:35:49 GMT -5
Bofa on the Sofa's comments above about 63 matches not being much of a sample are, of course, apt and well worth reading.
On another note, I wanted to compare the "strength-of-schedule" of the different teams in the bracket using the Massey ratings. This is similar to what Bofa did on another thread using Pablo. I like Bofa's method but I'm going to stick with another one simply because I've done it this way before -- and also because it shows the results in terms of the probability of advancing through, say, the 2nd round or 3rd round and so on. I think it helps illustrate the very real difference that a favorable (or unfavorable as the case may be) draw can make.
So what I've done is assess the difficulty of each seed's path through the tournament based on the ratings of their most likely opponents - that is, essentially, the highest rated opponents in their side of the bracket. (The exception to that has to do with potential homecourt advantage in the regionals.) And what I've done is to calculate the probability of a hypothetical team with a rating of 1900 advancing through the likely "hardest path" of each seed. (1900 is roughly the median of the top 16 teams, in theory the median of the seeded teams in the tournament if the Massey ratings were used to determine that.)
Through the 2nd round, here's the table. (1st number is the team's seeding) Probability of a 1900 team advancing playing the projected opponents of the teams listed:
8 Penn St 0.941
2 Nebraska 0.834
5 Purdue 0.830
12 Florida St 0.782
1 Texas 0.769
11 Stanford 0.767
7 USC 0.767
14 Tennessee 0.729
9 UCLA 0.724
3 Illinois 0.715
16 Texas A&M 0.707
10 Hawaii 0.681
6 UNI (No Iowa) 0.652
4 Iowa St 0.623
13 Minnesota 0.600
15 Pepperdine 0.576
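To illustrate how a path probability like these could be computed, here is a rough Python sketch that multiplies the round-by-round win probabilities of a hypothetical 1900-rated team. Several things in it are assumptions rather than the exact method used for the table: the odds-doubling formula implied by the .667/.800 anchor points, the 60-point homecourt bonus applied in both rounds (on the idea that the seed hosts), and the choice of Liberty (1365) and Delaware (1511) as the opponents in Penn St's slot.

```python
from math import prod

HOMECOURT = 60  # rating points, the value used in this thread


def win_probability(rating_diff):
    """Assumed form: odds double every 100 rating points."""
    odds = 2.0 ** (rating_diff / 100.0)
    return odds / (1.0 + odds)


def path_probability(team_rating, opponents):
    """Probability of winning every match on a path.

    opponents: list of (opponent_rating, team_is_home) pairs, one per round.
    """
    return prod(
        win_probability(team_rating + (HOMECOURT if home else 0) - opp)
        for opp, home in opponents
    )


# Hypothetical 1900 team placed in Penn St's slot for the first two rounds.
penn_st_path = [(1365, True), (1511, True)]
print(round(path_probability(1900, penn_st_path), 3))
```

Under those assumptions the result lands close to the .941 listed for Penn St's slot; other rows may not match as closely, since the conversion used for the table is not fully specified.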
To absolutely no one's surprise, Penn St has the easiest path to the 3rd round. I should be careful about treading on that subject but I think it's worth pointing out that these probabilities show that the advantage of a favorable draw is a real effect, even in the 1st two rounds. There's a big difference between a (.941) probability and (.767) or (.729), something in the low-to-mid .700s or below.
At (.941), or roughly 94%, there's only about a 1-in-17 chance of getting upset in the 1st or 2nd round. That's not happening often.
But when you're talking about the .750 range, that's a 1-in-4 chance, a 25% probability of getting knocked out. It's an upset but that's the kind of upset odds that happen. You have around 4 teams with that kind of probability - the likelihood (based on their being a 1900 team) is that one of them is getting knocked out.
Now of course, volleyball being a non-revenue sport, a lot of this has to do with regional pairings and travel restrictions so I don't want to get overly agitated about this but I want to emphasize these things do matter, at least from my perspective looking at things from a probability-based rating standpoint.
Through 3 rounds:
5 Purdue 0.514
8 Penn St 0.462
2 Nebraska 0.461
1 Texas 0.418
6 UNI (No Iowa) 0.365
3 Illinois 0.364
13 Minnesota 0.352
9 UCLA 0.350
4 Iowa St 0.338
12 Florida St 0.324
11 Stanford 0.318
10 Hawaii 0.317
14 Tennessee 0.296
16 Texas A&M 0.287
7 USC 0.255
15 Pepperdine 0.248
One thing to note here is USC's projected strength-of-schedule, considering that many people feel they deserved the #1 seed. Instead they've been handed a very difficult path to the Regional Finals.
Through the Regional Finals:
5 Purdue 0.248
1 Texas 0.202
8 Penn St 0.188
13 Minnesota 0.182
10 Hawaii 0.170
12 Florida St 0.156
2 Nebraska 0.153
3 Illinois 0.151
6 UNI (No Iowa) 0.148
9 UCLA 0.142
4 Iowa St 0.140
16 Texas A&M 0.139
11 Stanford 0.129
14 Tennessee 0.123
7 USC 0.110
15 Pepperdine 0.083
|
|
|
Post by The Bofa on the Sofa on Nov 30, 2011 12:43:54 GMT -5
The main difference with what I did is that I didn't use the rating of the most likely opponent, I weighted it by their probability of being the opponent. It's not a huge effect, especially at the extremes - if the previous match is very lopsided, then you are almost assuredly to face the favorite. OTOH, if the teams are exactly equal, then it doesn't matter who wins because they are the same strength. The biggest difference is in-between, where I take into account the possibility of an upset in earlier rounds.
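A small Python sketch of that weighting idea, for readers who want to see the arithmetic. It is only an illustration: the ratings are made up, everything is treated as neutral-site, and the odds-doubling win formula is an assumption borrowed from the Albyn Jones anchor points quoted earlier in the thread, not Pablo's actual model.

```python
def win_probability(rating_diff):
    """Assumed form: odds double every 100 rating points."""
    odds = 2.0 ** (rating_diff / 100.0)
    return odds / (1.0 + odds)


def advance_probability(team, opp_a, opp_b):
    """Chance `team` wins its next match when the opponent is the
    winner of opp_a vs opp_b, weighting each possible opponent by
    its probability of getting there."""
    p_a_reaches = win_probability(opp_a - opp_b)
    p_b_reaches = 1.0 - p_a_reaches
    return (p_a_reaches * win_probability(team - opp_a)
            + p_b_reaches * win_probability(team - opp_b))


# Made-up ratings: a 1900 team awaiting the winner of 1800 vs 1700.
weighted = advance_probability(1900, 1800, 1700)
favorite_only = win_probability(1900 - 1800)
print(round(weighted, 3), round(favorite_only, 3))
```

With these numbers the weighted figure is a few points higher than the favorite-only figure, because a third of the time the weaker team comes through. When the two possible opponents are equal, the two figures coincide.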
|
|
|
Post by karplets on Nov 30, 2011 13:07:17 GMT -5
The Bofa on the Sofa wrote:
"The main difference with what I did is that I didn't use the rating of the most likely opponent, I weighted it by their probability of being the opponent. It's not a huge effect, especially at the extremes - if the previous match is very lopsided, then you are almost assuredly to face the favorite. OTOH, if the teams are exactly equal, then it doesn't matter who wins because they are the same strength. The biggest difference is in-between, where I take into account the possibility of an upset in earlier rounds."

Not that I think it makes a huge difference but I took a "hardest path" approach. What that also meant was I factored in an upset for a lower-rated potential opponent if it meant A) the opponent would have homecourt for the 3rd or 4th round, and B) by doing so their homecourt-adjusted rating would be higher than the alternative opponent. This cropped up a couple times. Again, I don't think it makes a huge difference.

What both of our approaches do is isolate the difficulty of a team's projected path through the tournament from its own ratings.
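The "hardest path" selection rule can be sketched in a few lines of Python. The team names and ratings are hypothetical, and `hardest_opponent` is just an illustrative helper, not the spreadsheet logic actually used:

```python
HOMECOURT = 60  # rating points, the value used in this thread


def hardest_opponent(candidates):
    """Pick the potential opponent with the highest homecourt-adjusted rating.

    candidates: list of (name, rating, would_host) tuples.
    """
    return max(candidates, key=lambda c: c[1] + (HOMECOURT if c[2] else 0))


# Hypothetical case: Team B is rated lower but would be playing at home.
candidates = [("Team A", 1950, False), ("Team B", 1910, True)]
print(hardest_opponent(candidates)[0])  # prints "Team B" (1970 adjusted vs 1950)
```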
|
|
|
Post by karplets on Dec 3, 2011 13:11:18 GMT -5
Here are the expected win probabilities using the converted Massey ratings for the second round.
favored team is listed first followed by their rating & site (H) = 1st team has homecourt, (N) = neutral site / underdog followed by their rating / rating differential (adjusted for homecourt) / favored team's win probability
The list is sorted by rating differential - from lowest to highest.
Homecourt advantage = 60 points
5 games were already played yesterday; the winner is shown in bold.
13 Minnesota (H) 1815 / Washington 1841 / 34 / 0.552
16 Texas A&M (H) 1802 / Kentucky 1813 / 49 / 0.577
4 Iowa St (H) 1908 / Miami 1824 / 144 / 0.725
9 UCLA (H) 1904 / San Diego 1814 / 150 / 0.739
12 Florida St (H) 1829 / Cincinnati 1736 / 153 / 0.739
6 UNI (No. Iowa) (H) 1950 / Florida 1857 / 153 / 0.739
11 Stanford (H) 1863 / Michigan 1757 / 166 / 0.758
15 Pepperdine (H) 1849 / No. Carolina 1727 / 182 / 0.777
14 Tennessee (H) 1895 / Ohio St 1707 / 248 / 0.845
10 Hawaii (H) 1936 / Colorado St 1745 / 251 / 0.850
7 USC (H) 1979 / Tulsa 1764 / 275 / 0.871
2 Nebraska (H) 1938 / Kansas St 1720 / 278 / 0.871
1 Texas (H) 1954 / Michigan St 1732 / 282 / 0.874
5 Purdue (H) 1946 / Louisville 1678 / 328 / 0.905
3 Illinois (H) 1955 / Marquette 1645 / 370 / 0.929
8 Penn St (H) 1907 / Delaware 1511 / 456 / 0.959
In the Albyn Jones scale, a team rated 100 points higher than another would have an expected winning percentage of .667 (ignoring homecourt/homefield).
100 pt differential: .667 expected win pct (2 to 1 win ratio)
200 pt differential: .800 expected win pct (4 to 1 win ratio)
300 pt differential: .889 expected win pct (8 to 1 win ratio)
400 pt differential: .941 expected win pct (16 to 1 win ratio)
|
|
|
Post by pogoball on Dec 3, 2011 17:15:42 GMT -5
Wow, surprised that UT has such a high percentage against MSU
|
|
|
Post by karplets on Dec 3, 2011 17:42:43 GMT -5
In attempting to convert Massey's ratings into the Albyn Jones scale, I'm likely introducing some degree of error although my conversion formula has worked out pretty well in women's soccer the last couple years.
But I have some reason to think that, from around a 150 point differential to around a 300 point differential, my conversion may overstate the expected win percentage for the higher-rated team.
I noticed that the ranking of the two teams is pretty similar in Pablo vs Massey. Texas is #2 in Pablo and #3 in Massey. Michigan St is #26 in Pablo and #31 in Massey. So Massey's ranking is close to Pablo here.
If my (admittedly crude) conversion is overstating the probability for Texas, my guess is that they still have an expected win percentage over (.800) for this match.
I'm a soccer guy more than a VB guy so I don't know the teams well at all - does (.810) to (.830) sound more reasonable?
** then again, Texas is one of the elite teams. Teams like that don't seem to lose but once in a blue moon to teams other than very good teams. Especially with homecourt advantage. So an (.830) to (.850) expected win pct doesn't seem unreasonably high although (.874) may be on the high side.
.800 = upset 1 out of 5 games (200 pt differential)
.833 = upset 1 out of 6 games (230 approximately)
.857 = upset 1 out of 7 games (260 approximately)
.875 = upset 1 out of 8 games (280 approximately)
.889 = upset 1 out of 9 games (300 pt differential)
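Those approximate differentials can be checked by going the other way, from an upset rate back to a rating gap. A Python sketch, assuming (as the 100/200/300/400 anchor points suggest) that the favorite's odds double every 100 points; `diff_for_upset_rate` is a name made up for this example:

```python
from math import log2


def diff_for_upset_rate(one_in_n):
    """Rating differential at which the favorite is upset once in n games,
    assuming the favorite's odds double every 100 rating points."""
    odds = one_in_n - 1  # favorite wins n-1 games for each loss
    return 100.0 * log2(odds)


for n in range(5, 10):
    win_pct = (n - 1) / n
    print(f"1 in {n}: win pct {win_pct:.3f}, differential ~{diff_for_upset_rate(n):.0f}")
```

The 1-in-5 and 1-in-9 cases come out at exactly 200 and 300 points, and the ones in between fall within a few points of the rounded figures above.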
|
|
|
Post by karplets on Dec 3, 2011 18:01:05 GMT -5
By the way, in the first round, the Massey/AJ system "predicted" 28 out of 32, which sounds good but I'm not really happy with that- not enough upsets! There should've been more upsets!
This may seem paradoxical to some of you (or maybe not) - but in one very important sense, if an Elo-type of rating "correctly" predicts every game, then something is wrong.
It's a probability-based system. In the scale I'm using (the Albyn Jones scale for women's soccer), a team rated 100 points higher than its opponent is expected to win about 67% of the time. Conversely, it should lose 33% of the time. (Nothing magical about the scale, it just happens to be the one I use. I find it convenient. There are other scales possible and Massey uses one. I just don't know what scale he's using and exactly what the correlation is between rating point differential and expected win percentage.)
Against opponents 200 rating points lower, a team should win about 80% of the time and lose about 20%. The bigger the rating difference the higher the probability of winning - that makes sense, doesn't it?
Well, if every "prediction" is correct, if every time the higher-rated team wins over a team rated, say, 100 points below it, then something's wrong with the ratings -- either the team is underrated or the opponents are overrated because it should be losing 1/3 of the time.
In the 1st round, the average expected win percentage was (.792). Over 32 games, that's an expected result of 25 wins for the higher-rated teams and 7 losses.
So in that sense, the ratings were off a bit even though they appeared to be "right" 28 out of 32 times.
But it looks like the upsets got off to a fast start in the 2nd round...
|
|
|
Post by karplets on Dec 4, 2011 3:13:02 GMT -5
Pablo 40-8
RPI 38-10
RichKern 36-9
AVCA 37-9
(from the PTW thread at the top of the Forum)
Massey (modified/Albyn Jones scale): 39-9
Not doing too badly considering how the Pac 12 appears to be trying to live down to the RPI's expectations (or expectorations, if you prefer)
Through the first two rounds, the average expected win percentage for the higher-rated teams was (.793), corresponding to an expected result of 38 wins and 10 "upsets".
The actual result so far is 39 wins and the average deviance per game is (.020) or 2.0%
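For anyone checking the numbers, here is one way that per-game figure could be computed in Python. I'm assuming "average deviance per game" means the gap between actual and expected wins divided by the number of games; that reading is a guess, and the function name is made up for the example.

```python
def deviance_per_game(actual_wins, avg_expected_pct, games):
    """(actual wins - expected wins) / games, where expected wins is the
    average expected win percentage times the number of games."""
    expected_wins = avg_expected_pct * games
    return (actual_wins - expected_wins) / games


# 48 games through two rounds (32 + 16), favorites 39-9, average .793.
result = deviance_per_game(39, 0.793, 48)
print(f"{result:.4f}")
```

That gives a value just under .020, in line with the figure quoted above.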
|
|
|
Post by karplets on Dec 4, 2011 5:14:33 GMT -5
Here are the win probabilities for the 3rd round using the converted Massey ratings. Homecourt is a significant factor in this round. Without it, Hawaii would be the underdog to USC. Also, playing at Minnesota and Kentucky significantly cuts into the rating advantage for Iowa St and Texas.
(H) = homecourt for the favored team; (N) = neutral; (A) = favored team plays away
Next to last number is the rating differential adjusted for homecourt. The win probability is the last number and is in bold.
Penn St (N) 1907 / UCLA 1904 / 3 / 0.500
Hawaii (H) 1936 / USC 1979 / 17 / 0.526
Iowa St (A) 1908 / Minnesota 1815 / 33 / 0.552
Texas (A) 1954 / Kentucky 1813 / 81 / 0.635
Purdue (N) 1946 / Florida St 1829 / 117 / 0.689
Pepperdine (N) 1849 / Kansas St 1729 / 120 / 0.697
Florida (H) 1857 / Michigan 1757 / 160 / 0.752
Illinois (N) 1955 / Ohio St 1707 / 248 / 0.845
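The homecourt adjustment behind the differentials in this list is simple enough to show in a few lines of Python. The 60-point value is the one used throughout this thread; the function and dictionary names are only for illustration.

```python
HOMECOURT = 60  # rating points

# Adjustment applied to the first-listed team's rating by site code.
SITE_ADJUSTMENT = {"H": HOMECOURT, "N": 0, "A": -HOMECOURT}


def adjusted_diff(first_team, opponent, site):
    """Rating differential for the first-listed team after the site adjustment."""
    return first_team + SITE_ADJUSTMENT[site] - opponent


print(adjusted_diff(1936, 1979, "H"))  # Hawaii hosting USC -> 17
print(adjusted_diff(1908, 1815, "A"))  # Iowa St at Minnesota -> 33
print(adjusted_diff(1946, 1829, "N"))  # Purdue vs Florida St, neutral -> 117
```

The Hawaii line shows the flip mentioned above: on raw ratings USC is 43 points ahead, but the 60-point homecourt bonus puts Hawaii 17 points in front.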
|
|