bluepenquin
Hall of Fame
4-Time VolleyTalk Poster of the Year (2019, 2018, 2017, 2016), All-VolleyTalk 1st Team (2021, 2020, 2019, 2018, 2017, 2016)
Posts: 12,423
|
Post by bluepenquin on Oct 28, 2014 18:44:19 GMT -5
That's a really interesting assertion, but personally, I think it is just as much bull%*$# as the claim that good teams have the "ability to win the close ones" But I do give you credit for being contrarian. I also think 'ability to win the close ones' is BullXXXX. Of course my assertion has nothing to do with ability to win close ones or for good teams more likely to lose close ones. I am just asserting that bad teams rarely lose matches in which they outscore their opponent - and it is way more common for good teams. But that is just a guess w/o seeing the data.
|
|
|
Post by The Bofa on the Sofa on Oct 28, 2014 18:53:22 GMT -5
That's a really interesting assertion, but personally, I think it is just as much bull%*$# as the claim that good teams have the "ability to win the close ones" But I do give you credit for being contrarian. I also think 'ability to win the close ones' is BullXXXX. Of course my assertion has nothing to do with ability to win close ones or for good teams more likely to lose close ones. I am just asserting that bad teams rarely lose matches in which they outscore their opponent - and it is way more common for good teams. But that is just a guess w/o seeing the data. So you are asserting that good teams are more likely to lose close matches, which includes matches where they get outscored. Why would they do that if not for having an ability to do so?
|
|
|
Post by mikegarrison on Oct 28, 2014 19:00:52 GMT -5
Ranking = "ordered list, based on rating" Rating = "numerical score" Teams can be ranked according to anything you want. You can rank them by average shoe size if you want. If I tried that, I think I would have a hard time explaining why I was sneaking into their locker rooms.
|
|
|
Post by The Bofa on the Sofa on Oct 28, 2014 19:11:42 GMT -5
Teams can be ranked according to anything you want. You can rank them by average shoe size if you want. If I tried that, I think I would have a hard time explaining why I was sneaking into their locker rooms. Never underestimate the level of detail in the media guides these days... (when I was in high school, our basketball coach told another, "They are going to have you scouted so much they are going to know everything right down to your jock size")
|
|
|
Post by bluepenquin on Oct 28, 2014 21:44:34 GMT -5
I also think 'ability to win the close ones' is BullXXXX. Of course my assertion has nothing to do with ability to win close ones or for good teams more likely to lose close ones. I am just asserting that bad teams rarely lose matches in which they outscore their opponent - and it is way more common for good teams. But that is just a guess w/o seeing the data. So you are asserting that good teams are more likely to lose close matches, which includes matches where they get outscored. Why would they do that if not for having an ability to do so? Just a hunch - not necessarily well thought out. I know that Penn State has lost several matches over the past 3 years when outscoring their opponent and not ever seeing the reverse. So I was just thinking w/o looking at the data. Not that PSU plays worse in close matches, but that the only time they can possibly lose is during a close match.
Just for fun I looked through my limited database of game scores this season. I have 44 conference games this year where a team won in 5 sets while scoring fewer total points. I then looked at where their Pablo ratings currently rank to determine who was favored. It split exactly 22 and 22. The average rating of the teams that won was worse than that of the teams that lost. I then looked only at cases where the home team won while scoring fewer points - and 11 of the 16 had a worse Pablo rating. Small sample sizes and not really much of a correlation. Like I said, not well thought out - just a quick speculation on a reason.
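To put a number on "small sample sizes", here is a minimal sketch of an exact two-sided binomial check. It assumes each of these matches is an independent 50/50 coin flip with respect to which team has the better rating; that is an illustrative assumption, not something taken from the database described above, and the function name is made up for this example.

```python
from math import comb


def two_sided_binomial_p(successes: int, trials: int) -> float:
    """Exact two-sided p-value against a fair 50/50 coin (symmetric case)."""
    # How far the observed count sits from the expected count under p = 0.5.
    distance = abs(successes - trials / 2)
    # Count every outcome at least that far from the middle, in either direction.
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= distance
    )
    return extreme / 2 ** trials


p_home = two_sided_binomial_p(11, 16)  # home winners that had the worse rating
p_all = two_sided_binomial_p(22, 44)   # the overall 22-22 split
print(f"11 of 16: p = {p_home:.3f}")
print(f"22 of 44: p = {p_all:.3f}")
```

Under that coin-flip assumption, a split at least as lopsided as 11 of 16 turns up roughly one time in five by chance alone, which is consistent with the "not really much of a correlation" reading.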
|
|
|
Post by FTLOG on Oct 28, 2014 23:46:47 GMT -5
So you are asserting that good teams are more likely to lose close matches, which includes matches where they get outscored. Why would they do that if not for having an ability to do so? Just a hunch - not necessarily well thought out. I know that Penn State has lost several matches over the past 3 years when outscoring their opponent and not ever seeing the reverse. So I was just thinking w/o looking at the data. Not that PSU plays worse in close matches, but that the only time they can possibly lose is during a close match.
Just for fun I looked through my limited database of game scores this season. I have 44 conference games this year where a team won in 5 sets while scoring fewer total points. I then looked at where their Pablo ratings currently rank to determine who was favored. It split exactly 22 and 22. The average rating of the teams that won was worse than that of the teams that lost. I then looked only at cases where the home team won while scoring fewer points - and 11 of the 16 had a worse Pablo rating. Small sample sizes and not really much of a correlation. Like I said, not well thought out - just a quick speculation on a reason.
I think they were outscored by Stanford last year but still won.
|
|
|
Post by The Bofa on the Sofa on Oct 29, 2014 6:35:21 GMT -5
So you are asserting that good teams are more likely to lose close matches, which includes matches where they get outscored. Why would they do that if not for having an ability to do so? Just a hunch - not necessarily well thought out. I know that Penn State has lost several matches over the past 3 years when outscoring their opponent and not ever seeing the reverse. So I was just thinking w/o looking at the data. Not that PSU plays worse in close matches, but that the only time they can possibly lose is during a close match.
Just for fun I looked through my limited database of game scores this season. I have 44 conference games this year where a team won in 5 sets while scoring fewer total points. I then looked at where their Pablo ratings currently rank to determine who was favored. It split exactly 22 and 22. The average rating of the teams that won was worse than that of the teams that lost. I then looked only at cases where the home team won while scoring fewer points - and 11 of the 16 had a worse Pablo rating. Small sample sizes and not really much of a correlation. Like I said, not well thought out - just a quick speculation on a reason.
I agree you hadn't thought about it, because you didn't recognize the implication that I pointed out: you are proposing an "inability to win close games." It's the flip side of that ability to win the close one claim that we both think is @$$%*!*.
|
|
|
Post by The Bofa on the Sofa on Oct 29, 2014 8:42:12 GMT -5
Of course, I suppose one could argue that what is true for most teams might not be true for some teams. It's possible, but there is no reason to say that is the case here. In fact, this type of outcome is not really that uncommon in the grand scheme of things. For example, in this match, after 4 sets, Penn St had scored 53% of the points. Looking over the last 5 years, we can see that 70% of the teams that have outscored the opponent by that much in 4 sets have actually won the match in 4 sets (actually, 0.2% of those teams lost in 4). But it also means that even when outscoring your opponent 53% - 47%, 30% of the time it goes to a 5th set. In the 5th set, the team that scored 53% of the points through set 4 wins 60% of the time. That means that teams who score 53% of the points through the first 4 sets win 88% of the time. But it also means they still go on to lose 12% of the time. Is it really "something about those teams"? Or is it just that sometimes, %*$# happens for no good reason at all? We could compare this to simulation to see, but I am going to suggest that, no, this isn't anything meaningful. There are teams that will end up going 5 sets despite outscoring the opponents 53%-47% through 4. And when they get to set 5, it's almost anyone's ballgame, and that's where upsets occur. The best way to get an upset is to get the better team to a 5th set.
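The arithmetic in that post can be laid out as a short sketch. The three input percentages are the ones quoted above (70% win in 4, 0.2% lose in 4, 60% win the 5th set); the function and variable names are only for illustration.

```python
def match_win_probability(p_win_in_4: float, p_lose_in_4: float,
                          p_win_fifth: float) -> float:
    """Chance of winning the match, given the outcomes possible after 4 sets."""
    # Whatever is not decided in 4 sets goes to a 5th set.
    p_fifth_set = 1.0 - p_win_in_4 - p_lose_in_4
    # Win outright in 4, or reach the 5th set and win it.
    return p_win_in_4 + p_fifth_set * p_win_fifth


p_win = match_win_probability(p_win_in_4=0.70, p_lose_in_4=0.002,
                              p_win_fifth=0.60)
p_lose = 1.0 - p_win
print(f"win:  {p_win:.1%}")
print(f"lose: {p_lose:.1%}")
```

This lands on about 88% wins and 12% losses, matching the figures in the post: roughly one match in eight goes to the team that was outscored 53%-47% through four sets.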
|
|
|
Post by alpacaone on Oct 29, 2014 8:46:45 GMT -5
Back to "Rankings" big R quotations, and many who almost view it as gospel, and the other "Rankings" which are the ultimate evils. No doubt Pablo is very good at predictions, it's basically tied for second place PTW, very admirable, but 179-51 (.743) won't win all that many titles on the court, but that is beside the point. As mike pointed out, we can rank anything, no doubt, and true, a title should have capitals. However, in the context of National Rankings there is an air of certification. Do any coaches vote their top 25 down Pablo lines? As far as I know set wins and total points mean absolutely nothing in NCAA volleyball. Coaches don't play as if losing a set by 2 or 12 points has an impact on their end goal. Coaches rest their entire lineup instead of battling a near hopeless set; most search their depth chart in a very literal win-win situation without risk of losing some valuable points for a ranking. The same is true on the victor's side. If big "R" NCAA volleyball Rankings were determined by how lopsided a victory is, certainly every coach wanting a tournament seed or berth would play a very different game. Coaches don't seem to penalize a lost set or even two, or total points in their ranking assessments, because maybe they have very little importance to their approach to the college game. Don't get me wrong, I do find a rating of how strong teams can play very interesting as well as confirming, I just feel the need to advocate against its total worthiness as a Ranking System based on the fact that it's not a system college coaches need to base their season's success on.
|
|
|
Post by The Bofa on the Sofa on Oct 29, 2014 9:05:42 GMT -5
As far as I know set wins and total points mean absolutely nothing in NCAA volleyball. This is a meaningless statement. What do you mean "in NCAA volleyball"? In the won/loss standings, sets and points mean nothing. But that is by definition. However, if you want to know how good teams are, then sets and, more importantly points, tell you a lot more than wins. I don't give a %*$# what coaches think (although I think you are wrong about coaches, see below), or NCAA officials think, or anyone thinks. It's empirically true. Looking at points is a better indicator of how teams will do than looking at who won. The data have been presented and are indisputable. The NCAA uses whatever criteria the NCAA uses, and if you want to talk about things like who will make the NCAA tournament, then focus on the things that the NCAA cares about. You don't use Pablo to predict who is going to make the NCAA tournament. However, if you want to know how good teams are, then that's a different question. So whether points or sets "matters" in NCAA volleyball is irrelevant to the question of whether it matters in other ways, and to the question of how well Pablo does in assessing that. Note that there was a strong push _among volleyball coaches_ a few years back to get the NCAA committee to use Pablo rankings instead of RPI. It failed, but if that had succeeded, it would have meant that points WOULD matter. And this is what coaches far and wide want. I know, because they have told me so. So be careful when you speak for coaches. It may be true that their objective is to win, but they also recognize very clearly that not all wins are equal, and when it comes to evaluating teams, it's important to take that into account, because when you don't, you get things like, RPI. Which sucks.
|
|
|
Post by bluepenquin on Oct 29, 2014 9:28:33 GMT -5
Who's better, Penn St or Stanford? There will be those who will point to the fact that Stanford beat Penn St, and that settles it. But it doesn't, not for me, and not objectively. Because there's more to it. Not all wins are equal. So take Stanford's win over Penn St, in Maples, where Penn St scored 51.5% of the points. What would happen if they played again? We can't know, of course, but we can look to see what has happened in other circumstances like this, where the home team has won despite scoring only 48.5% of the points. I have the database to do it. A team who wins at home despite scoring 48 - 49% of the points wins 19% of the time if the two teams play again on the other team's home court. Shoot, even if they play again on their home court, they only win 25% of the time. I was actually shocked at how lopsided this is. Admittedly, there aren't a ton of examples to base it upon (50 to 80 examples, depending on the circumstances), but still, that is really, really lopsided. Now, if the road team can pull off the 5 set win, they are more likely than not to win if they play again. Interestingly, it's been about 55% whether they turn around and play at home or even if they play again on the road. Could be some distortion due to small numbers, but the trend is consistent. Even if the two matches are both on neutral sites, 40% of the time, the team that lost the first match wins the second. Very clearly, this is what Pablo is seeing when it comes to Penn St. I keep telling you all this, because it's true. If you want to know how many wins and losses a team has, look at their wins and losses. However, if you want to know how good a team is, look at the number of points they are scoring, because that's where that information is. Keep in mind, this is not just a Pablo bias, and I'm not just making this up. Sure, I started out Pablo with this type of idea as the underlying model, but as the data above show, it's borne out by what happens on the court.
See my comment yesterday in the 3210 thread. If you know how many points were scored, then knowing who won tells you very little about who is better (it's not completely worthless - there is a tiny, tiny premium in winning, to the tune of increasing the expected winning percentage by maybe 2% in the region close to zero - what this means is that if two teams score the exact same number of points, the winning team has like a 52% chance of winning a second match; the tipping point looks to be about 49.5% of the points). <<In the example you used above of Penn St and Stanford, you appear to base much of your prediction on their prior meeting, and the fact that Penn St outscored Stanford. Do you not consider opponents in common? Interestingly enough, Penn St also outscored both Nebraska (+3) and Illinois (+2) in those losses. Stanford was (-2) against Illinois, and (+22) against Nebraska. Does that get factored in?>> I don't think this has been answered. In my understanding, Pablo doesn't use just one match to project win probabilities in future matches. The Stanford/Penn State match in September is just one of many data points used in calculating the Pablo Rating. Common opponent matches and non-common opponent matches are used - and none carry any more weight by virtue of being H-H or Common Opponent. As I understand the current win probability for a future Stanford/Penn State match - Penn State would have approx. a 54% chance of winning on a neutral court and Stanford would be favored if played on their home court. Bofa could provide better clarity if I am wrong in my understanding.
|
|
|
Post by alpacaone on Oct 29, 2014 9:28:46 GMT -5
I very much understand how much you value this rating; you put at least as much time into this as I do my animals which range from cnidarian to camelid, my other is a veterinarian. I respect your work! But, unless it is a system recognized and played by the entire league the system is somewhat incomplete as a seeding or ranking. What I do see Pablo as is a very good measure of the under- and overachievers in the league. Certainly, right now if Stanford falls in the final four according to Pablo they achieved their potential, while if Penn State does what they have done this year and falls to a lesser rated team in a regional semi, they then could be labelled underachievers by this analysis, and finally should North Carolina make it to OK they then would have that special ability to overachieve that every coach dreams of having no matter how skilled their individual parts are.
|
|
|
Post by The Bofa on the Sofa on Oct 29, 2014 10:14:46 GMT -5
I very much understand how much you value this rating; you put at least as much time into this as I do my animals which range from cnidarian to camelid, my other is a veterinarian. I respect your work! But, unless it is a system recognized and played by the entire league the system is somewhat incomplete as a seeding or ranking. Nonsense. It's not "incomplete", it does exactly what it does, completely. How anyone else, including the league, chooses to recognize or use it is immaterial to that. I've heard the objections that all coaches care about is winning, and therefore the premise of Pablo that point differentials reflect quality doesn't hold up, but empirically, it doesn't work that way. The success of Pablo in predicting outcomes as a function of rating difference is undeniable. It is empirically true that the outcomes for matches between teams that are separated by a very small amount turn out to be pretty much 50/50, exactly what Pablo says they should be, whereas outcomes in matches between very different teams end up being strongly in favor of the favored team, and by about the right percentage. And the analytical relationship between rating difference and winning percentage is much better than anyone had any right to expect, including me. So that tells us that the Pablo underlying model holds up. But it goes even further than that. Irrespective of Pablo, I have looked at the significance of matches played in terms of telling us about things to come. I've mentioned a few of these in this thread. When two teams play twice in the same season, the team that wins the first match wins the second 76.15% of the time. However, the team that scores more points in the first match, regardless of whether they win or lose, wins 76.6% of the time in the second. You can say that coaches are only motivated by wins and losses, but in terms of seeing what teams will do, the points they score matter more.
If all that matters is winning, why is there such a strong dependence on the second match outcome on sets?
First Match Sets | How Often the team that won the first wins the second
3 | 85%
4 | 71%
5 | 59%
Clearly, regardless of what coaches state as their motivation, in assessing the outcome and what it means for the relative abilities of the teams involved, winning in 3, 4 or 5 sets matters. Teams that go 5 sets are more evenly matched, as reflected by the outcome in a second meeting. We can see it in even more detail in point percentages. Here are the winning percentages in the second match for the winning team in the first, by the points they score (this is based on both matches played on neutral sites, so it is a little different from what I posted yesterday):
Point Pct by winning team in Match 1 | Winning Pct in Match 2
.485 | 42%
.495 | 50%
.505 | 54%
.515 | 64%
.525 | 68%
.545 | 76%
.565 | 81%
.585 | 83%
.605 | 88%
Coaches may say they don't care about points, and I believe them, but in looking at how teams do in subsequent matches, the point percentage of the first match is very telling. Teams that score the same number of points are very similar, and this is apparent when they play again (interestingly, you can see the value of winning in this table - the "teams are equal" point percentage by the winning team is actually 49.5%, and not 50%; that's the correction I apply in Pablo these days). Of course, to tie the two arguments together, the functional relationship between these data fits the Pablo model with an r^2 value of about .96 (there is some breakdown at the high end in that Pablo overestimates the difference between teams in very lopsided matches; a lot of this is corrected by capping scores of individual sets) And I'm not arguing this because I "value" Pablo, so you can drop that strawman. The reason is because I have studied the crap out of it, and the more I investigate, the more it becomes clear that Pablo is doing it right. It doesn't have to be that way. Remember in the old days, I used a Pablo approach that put more emphasis on winning, but I've dropped that now because the analyses I've done have shown very clearly that the premium on winning was overstated. So I am perfectly willing to change Pablo where we can find issues with it. But to find those issues, we have to assess it in the light of what it is and does.
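For anyone who wants to play with the point-percentage table above, here is a rough sketch that fits a plain logistic curve to those nine rows. To be clear about the assumptions: this is NOT the Pablo model (its functional form isn't given in this thread), the straight-line fit on the log-odds scale is a stand-in chosen for illustration, and the r^2 it produces is not the .96 figure quoted above. The helper names are made up for this example.

```python
from math import log

# Rows from the table: point pct by the Match 1 winner, winning pct in Match 2.
point_pct = [0.485, 0.495, 0.505, 0.515, 0.525, 0.545, 0.565, 0.585, 0.605]
win_pct = [0.42, 0.50, 0.54, 0.64, 0.68, 0.76, 0.81, 0.83, 0.88]


def logit(p: float) -> float:
    """Log-odds of a probability; a logistic curve is a straight line here."""
    return log(p / (1.0 - p))


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, plus r^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(point_pct, [logit(w) for w in win_pct])
# Point percentage at which the fitted curve crosses a 50% rematch chance.
even_point = -intercept / slope
print(f"slope = {slope:.2f}, r^2 = {r_squared:.3f}, even point = {even_point:.4f}")
```

The `even_point` value is this stand-in curve's version of the "teams are equal" point percentage; with only nine rows it is a rough indication, not a replacement for reading the 49.5% directly off the table.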
|
|
|
Post by The Bofa on the Sofa on Oct 29, 2014 10:32:55 GMT -5
<<In the example you used above of Penn St and Stanford, you appear to base much of your prediction on their prior meeting, and the fact that Penn St outscored Stanford. Do you not consider opponents in common? Interestingly enough, Penn St also outscored both Nebraska (+3) and Illinois (+2) in those losses. Stanford was (-2) against Illinois, and (+22) against Nebraska. Does that get factored in?>> I don't think this has been answered. In my understanding, Pablo doesn't use just one match to project win probabilities in future matches. The Stanford/Penn State match in September is just one of many data points used in calculating the Pablo Rating. Common opponent matches and non-common opponent matches are used - and none carry any more weight by virtue of being H-H or Common Opponent. As I understand the current win probability for a future Stanford/Penn State match - Penn State would have approx. a 54% chance of winning on a neutral court and Stanford would be favored if played on their home court. Bofa could provide better clarity if I am wrong in my understanding. Yeah, it looks like Stanford is favored over Penn St at home yet. And yes, while this talks about individual matches, a teams rating is determined by (basically) the average of all their matches, and Penn St vs Stanford is just one point of the mix. Before I started doing the mixed set at the beginning of the season, there used to be a lot of comments about how Pablo doesn't work well for a small number of matches. However, that is not really true. Pablo works really well for a small number of matches. The problem is in the teams, and that their small number of matches are not very representative of their overall quality. A team has one great match, and that has a disproportionate effect on their assessment.
|
|
|
Post by volleyguy on Oct 29, 2014 10:40:15 GMT -5
Thanks for a very clear analysis, BOFA.
Many of the issues that arise over ranking or comparing teams have their roots in a fundamental misunderstanding of the differences between RPI, Pablo, and RPI futures. In order to understand the differences, one must know what it is they actually do. I'm curious how you would describe Pablo. Is it a predictor, or a probabilistic estimator (of something)? I understand the RPI future to be an estimate of what the final RPI will be. How would you describe the function/purpose of RPI?
|
|