Post by hebrooks87 on Nov 24, 2014 10:28:55 GMT -5
If you "ignore" UNLV's win against CSU, then what's the point of this exercise? I've said all along that this type of approach gives results that look odd. However, I'd actually like to hear a discussion of the actual methodology and foundations, beyond "it don't look right."
Quoting an earlier post: "Unfortunately I wouldn't have a clue how to properly discuss your methodology on this, any more than I could the Pablo Rating (this is way over my head). I can only trust the results - this system is (much) better at 'predicting' who won matches already played, and Pablo is very successful at predicting who will win matches going forward. This one is based only on wins and losses (like RPI), while Pablo uses points in its calculation. So maybe through examples I could better understand. A team like Oklahoma is significantly worse here than in RPI. A thought would be that the win over Texas would do some considerable good in a system like this - whereas RPI doesn't care at all where the wins come from. Colorado comes in much better vs. RPI - and with more improvement than other Pac-12 teams. LSU is a little worse. I bring those 3 teams up to better understand - Oklahoma and Colorado have what seems like a huge upset win on their resumes, yet the two teams go in different directions from RPI. LSU has a huge upset loss - but it doesn't appear to have much if any negative impact compared to RPI, especially considering that just about every SEC team does worse in this system. So I'm curious - does this system better reward the high-and-low teams (Oklahoma, LSU, Colorado), or does it reward the more consistent team (very few upset wins and losses)? I am guessing that the answer is 'it depends'? Also - at this point in the season there is just very little movement in RPI and Pablo, especially at the top. Would this system act similarly?"

The real issue with Oklahoma isn't "ignoring" the win over Texas, because it's also ignoring the loss at TCU. The critical match is the loss to SMU. The optimization has used the HCA very nicely with OU, putting them close enough to Kansas, CSUN, and UALR to have the home wins over them and the road loss at SMU correctly classified. Unless I've missed something, there are two OU misclassifications (the win at Texas and the loss at TCU). The closeness in the overall rankings of Kansas, CSUN, UALR, and SMU pretty much means OU is locked there. If OU moves up at all relative to SMU, it adds a misclassification. Colorado gets the Washington and Oregon results right with the HCA, but CU has several misclassifications (4, I think).

In general, minimizing misclassification costs involves assumptions about the cost of different kinds of errors. In my professional life, the trade-off between false alarms and missed detections of tornadoes in the warning process is implicitly taken into account by assuming the cost of a missed detection is much larger than the cost of a false alarm (most medical tests work the same way). Here, the assumption is that the costs are the same for all kinds of errors, so that minimizing cost is identical to minimizing the number of errors. (The exception comes in the Harvard example given earlier by the sofa-sitting Bofa, which I think is equivalent to assuming different costs for at least some errors.) That's not a bad default assumption in this case. I can imagine developing a way to weight errors differently, but it would likely require a lot of pseudo-objective inputs.

To me, one of the points of this kind of exercise is that it highlights the differences inherent in different ranking processes. Ranking processes are designed, implicitly or explicitly, with assumptions about what's important, and they provide some sort of view of an unknown truth. For the most part, all remotely reasonable systems agree in broad brush. The departures from that agreement tell us something about the impacts of those assumptions.
RPI, in women's volleyball, unintentionally rewards major-conference teams in the east compared to the west. (It's my understanding that our friends in women's soccer have similar issues with RPI.) Using a win/loss-only system, compared to a points-based system in the Pablo structure, moves a small number of teams a large distance.

Bofa, as you know, I am a lazy theoretician for the most part. It would be interesting to see a list of the number of misclassifications of each sign for each team, and a comparison of the ratings from the full system and the W/L-only system. I know I could put the tables together myself, but you probably already have them sitting around in some form. :-) Want to send them to me?
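Since the equal-costs assumption is what makes "minimize cost" collapse into "minimize errors," here is a minimal Python sketch of that point. This is my own toy, not code from any of the systems discussed; the teams, ratings, and cost function are invented.

```python
# Toy cost-weighted misclassification count. A match is "misclassified"
# when the team rated lower won it.

def misclassification_cost(matches, ratings, cost_fn=None):
    """matches: list of (winner, loser); ratings: dict name -> rating.

    With cost_fn=None every error costs 1, so minimizing cost is the
    same as minimizing the error count. A cost_fn mimics the
    tornado-warning trade-off, where some errors are charged more.
    """
    total = 0.0
    for winner, loser in matches:
        if ratings[winner] < ratings[loser]:
            total += 1.0 if cost_fn is None else cost_fn(winner, loser)
    return total

# A rock-paper-scissors cycle, so at least one error is unavoidable.
matches = [("A", "B"), ("B", "C"), ("C", "A")]
ratings = {"A": 3, "B": 2, "C": 1}

print(misclassification_cost(matches, ratings))  # 1.0 with equal costs
# Charging errors that involve team A double (cf. the Harvard example):
print(misclassification_cost(matches, ratings,
                             lambda w, l: 2.0 if "A" in (w, l) else 1.0))  # 2.0
```

With equal costs the ordering above is as good as any; once errors involving team A cost double, an optimizer would prefer the ordering C > A > B, whose one unavoidable error involves only B and C.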
Post by mikegarrison on Nov 24, 2014 10:39:56 GMT -5
hebrooks87 wrote: "In general, minimizing misclassification costs involves assumptions about the cost of different kinds of errors. [...] Here, the assumption is that the costs are the same for all kinds of errors, so that minimizing cost is identical to minimizing the number of errors."

Another way to put that is that you can minimize the number of errors, or you can minimize the overall magnitude of error (usually RMS), or you can minimize the number of extremely large errors, or.... You have to make choices with this kind of thing.

And minimizing errors doesn't mean eliminating them. ANY ranking system is going to have individual teams that you can pick out and call an error. But so what? If you get 300 teams right, 20 teams pretty close, and 10 teams wrong, that's damn good.

I like Pablo better than this, though. I like judging a team on its strength at scoring points better than on its W/L record, because I think it tells us more useful information. But even with Pablo, there will be some number of teams that are probably misranked 1:1 because it was necessary to minimize the total error across all 330+ teams.
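mikegarrison's three options can be checked on a toy set of numbers (invented): the same errors score differently under an error count, an RMS, and a count of extremely large errors, so which ratings come out "best" depends on which objective you minimize.

```python
import math

# Signed rating errors for a handful of matches (numbers invented).
errors = [0, 50, 400, 400, 800]

n_wrong = sum(1 for e in errors if e != 0)                 # number of errors
rms = math.sqrt(sum(e * e for e in errors) / len(errors))  # overall magnitude
n_huge = sum(1 for e in errors if abs(e) > 500)            # extremely large errors

print(n_wrong, round(rms, 1), n_huge)  # 4 438.7 1
```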
Post by hebrooks87 on Nov 24, 2014 10:57:50 GMT -5
mikegarrison wrote: "I like judging a team on its strength at scoring points better than on its W/L record, because I think it tells us more useful information. But even with Pablo, there will be some number of teams that are probably misranked 1:1 because it was necessary to minimize the total error across all 330+ teams."

As I think about this more, one of the issues is structural. Minimizing errors with a dichotomous identification of dichotomous events (win/lose) is a more unstable process than minimizing errors with a probabilistic forecast (or identification) of a dichotomous event. The points-based predictor (which I also prefer) allows for smoother predictions and results: a team can almost win (or lose) and get some credit (or fault) for that result.
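The structural point can be sketched in a few lines (my framing, not hebrooks87's): score a win/loss outcome with a hard yes/no call versus a probability. The hard call jumps a full unit as the forecast crosses 50%, while a probabilistic (Brier-type) score moves smoothly, which is what lets an almost-win earn partial credit.

```python
# Hard dichotomous call vs. probabilistic score for a win/loss outcome.

def zero_one_loss(prob_win, won):
    call = prob_win >= 0.5  # dichotomous identification
    return 0.0 if call == won else 1.0

def brier(prob_win, won):
    return (prob_win - (1.0 if won else 0.0)) ** 2

for p in (0.45, 0.55, 0.95):
    print(p, zero_one_loss(p, won=True), brier(p, won=True))
# 0.45 -> loss 1.0, Brier ~0.30; 0.55 -> loss 0.0, Brier ~0.20;
# 0.95 -> loss 0.0, Brier ~0.0025
```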
Post by The Bofa on the Sofa on Nov 24, 2014 13:34:50 GMT -5
hebrooks87 wrote: "To me, one of the points of this kind of exercise is that it highlights the differences inherent in different ranking processes. [...] The departures from that agreement tell us something about the impacts of those assumptions."

To me, the point of this kind of exercise is to demonstrate the folly of the "the rankings should at least reflect who won or lost" comments I hear, either as an objection to Pablo or as a justification for RPI (at least the concept of it). We can do that. We can create rankings that do an amazing job of reflecting who won or lost - at least, much better than the ones we use. And for good reason: THIS is what you get when you focus on reflecting simple wins and losses as well as possible. You get a lot of crazy results.

I can send you the set of rankings that I used to create that assessment table - that's all I have. Figuring out the number of misclassifications for each team is a little bit harder.

I'm doing an HCA-free method now. It's a lot harder than including an HCA (success right now is something like 87.1%; it's a difference of fewer than 10 matches, though).
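For concreteness, here is a rough sketch of the success metric being maximized, with and without a single home-court constant. This is my reconstruction, not Bofa's code: the HCA value of 159 and the OU results come from the thread, while SMU's rating and the rest of the scaffolding are invented.

```python
# Fraction of matches where the higher "effective" rating won; the home
# team gets a constant bump. An HCA-free method just sets hca=0.

def fraction_correct(matches, ratings, hca=0.0):
    """matches: list of (home, away, home_won) triples."""
    correct = 0
    for home, away, home_won in matches:
        predicted_home_win = ratings[home] + hca > ratings[away]
        correct += predicted_home_win == home_won
    return correct / len(matches)

matches = [("OU", "Kansas", True),   # OU's home win over Kansas
           ("SMU", "OU", True),      # OU's road loss at SMU
           ("Texas", "OU", False)]   # OU's road win at Texas
ratings = {"OU": 6170, "Kansas": 6165, "Texas": 6940, "SMU": 6100}

print(fraction_correct(matches, ratings, hca=0))    # ~0.33
print(fraction_correct(matches, ratings, hca=159))  # ~0.67: the SMU loss now
# classifies correctly, while the win at Texas remains an error
```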
Post by The Bofa on the Sofa on Nov 24, 2014 14:40:19 GMT -5
mikegarrison wrote: "Another way to put that is that you can minimize the number of errors, or you can minimize the overall magnitude of error (usually RMS), or you can minimize the number of extremely large errors, or...."

This is why something like Massey (or the W/L version of Pablo) is a much better way to do this. Under RMS, you would rather have two matches with errors of 400 points each than one match with an error of 0 and another with an error of 800. In this approach, if you have a win over #20 and a loss to #140, it's better to be ranked either above #20 or below #140. In the Pablo W/L approach, you'd put them at about #80, even if you don't account for points. In the full Pablo approach, you'd put them between the two, closer to whichever team they had the closer match against.

When I did the old W/L approach (and it's not hard, I just have to do another run, which takes time), the number of correct results was maybe 2% higher than full Pablo, so up to something like 84% instead of 82%. Definitely better, but not up to the Ultimate Ranking System, which got up to 88.5%.
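Bofa's RMS numbers check out directly:

```python
import math

# Under RMS, two 400-point misses beat one perfect match plus one
# 800-point miss.
def rms(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

print(rms([400, 400]))  # 400.0
print(rms([0, 800]))    # ~565.7, so RMS prefers the two medium misses
```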
Post by pogoball on Nov 25, 2014 0:40:51 GMT -5
The Bofa on the Sofa wrote: "And for good reason: THIS is what you get when you focus on reflecting simple wins and losses as well as possible. You get a lot of crazy results."

To make sure I understand: when you say crazy results, are you saying that when the URS is wrong, it can be more wrong than, say, Pablo? That is, it will outperform Pablo in accuracy (88 to 82), but it may have a winner ranked considerably lower than a loser when it is wrong, whereas Pablo may have fewer accurate predictions, but when it is wrong, it is predictably wrong (the teams will be ranked closer together)?
Post by The Bofa on the Sofa on Nov 25, 2014 8:57:37 GMT -5
pogoball wrote: "...it will outperform Pablo in accuracy (88 to 82), but it may have a winner ranked considerably lower than a loser when it is wrong, whereas Pablo may have fewer accurate predictions, but when it is wrong, it is predictably wrong (the teams will be ranked closer together)?"

For Pablo, let's focus on the W/L version (since full Pablo is a completely different concept and not really comparable). The answer to the above is basically yes, but it goes both ways: you will also be ranked closer than you should be to teams you beat. A team that beats #20 and loses to #120 is going to be put just ahead of #20, the team it beat, but way ahead of #120, the team it lost to (or just behind #120 and way behind #20). However, if you look at those two matches and what they tell you, you'd think the team belongs somewhere between #20 and #120. But that's not what the URS is going to do. That's why I say it's going to give us crazy results. The problem is that once it becomes apparent to the system that the team needs to be ranked above #120, that outcome doesn't matter any more.

Or look at it this way: given two extreme results, one at each end, instead of taking the average like Massey or Pablo, the URS is going to pick one extreme, treat it as legit, and ignore the other. The reason this is legitimate is that there is a 50/50 chance of whether it goes with the extreme high or the extreme low, so across the overall mix half go up, half go down, and the superposition lands somewhere in between - but for an individual team, you don't get the shades of gray.
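A toy count (mine, with ranks standing in for ratings) shows why the optimizer heads for an extreme in the beat-#20, lost-to-#120 case: every rank in the middle leaves two misclassified results, while either extreme leaves only one.

```python
# Team X beat the #20 team and lost to the #120 team (smaller rank =
# better). Count X's misclassified results at each candidate rank.

def n_errors(rank_x, beat_rank=20, lost_rank=120):
    e = 0
    if rank_x > beat_rank:   # ranked below a team it beat
        e += 1
    if rank_x < lost_rank:   # ranked above a team it lost to
        e += 1
    return e

for r in (10, 80, 130):
    print(r, n_errors(r))  # 10 -> 1, 80 -> 2, 130 -> 1
```

The Massey/Pablo-style middle answer (around #80) is the worst spot by raw error count, and which extreme the optimizer lands on is essentially a coin flip.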
Post by The Bofa on the Sofa on Nov 25, 2014 15:00:11 GMT -5
Variations on the theme, Part 1: No HCA

 1 Stanford 7360
 2 Texas 6940
 3 Washington 6895
 4 Wisconsin 6890
 5 Illinois 6865
 5 North Carolina 6865
 7 Florida State 6765
 8 Nebraska 6745
 9 Penn State 6740
10 Oregon 6555
10 Colorado 6555
10 Arizona 6555
13 Florida 6540
14 Colorado State 6465
15 BYU 6460
15 Loyola Marymount 6460
17 UCLA 6455
18 Kentucky 6370
19 Pittsburgh 6365
19 Santa Clara 6365
19 Duke 6365
22 Miami-FL 6310
23 Iowa State 6270
24 Virginia 6245
24 Louisville 6245
24 Oregon State 6245
24 Texas A&M 6245
24 Arizona State 6245
29 USC 6240
30 Kansas State 6215
31 Oklahoma 6170
32 Kansas 6165
32 Utah 6165
34 Purdue 6135
34 Ohio 6135
34 Minnesota 6135
34 Ohio State 6135
38 Long Beach State 6125
39 West Virginia 6095
40 Seton Hall 6090
40 Creighton 6090
42 Alabama 6040
43 Marquette 5980
44 Cal State Northridge 5950
44 San Diego 5950
44 LSU 5950
44 Hawaii 5950
48 UCF 5945
49 LIU Brooklyn 5930
50 Michigan State 5925
50 Pacific 5925
Post by The Bofa on the Sofa on Nov 25, 2014 15:01:01 GMT -5
Variations on the theme, Part 2: With HCA = 159, but with more recent matches weighted more heavily

 1 Stanford (stan) 7255
 2 Wisconsin (wi) 6885
 3 Texas (tx) 6835
 4 Washington (wa) 6800
 5 Illinois (il) 6720
 6 North Carolina (nc) 6710
 7 Colorado (co) 6640
 8 Florida State (fs) 6605
 9 Penn State (ps) 6560
10 Duke (duke) 6450
10 Oregon (or) 6450
12 Florida (fl) 6445
13 Arizona (az) 6420
14 Nebraska (nebr) 6405
15 Loyola Marymount (lm) 6355
16 Colorado State (csu) 6285
17 BYU (byu) 6280
18 UCLA (ucla) 6255
19 Kansas State (ks) 6235
20 Santa Clara (sanc) 6195
21 Iowa State (is) 6190
22 Oregon State (ors) 6185
23 USC (usc) 6160
24 Kentucky (ky) 6155
25 Texas A&M (txam) 6130
25 UNLV (nvlv) 6130
27 Arizona State (asu) 6105
28 Utah (utah) 6100
28 Miami-FL (miam) 6100
30 Long Beach State (lbs) 6085
31 Kansas (k) 6080
32 Creighton (crei) 6070
33 San Diego (sd) 6030
34 Oklahoma (ok) 6025
35 LIU Brooklyn (li) 5975
36 Alabama (al) 5970
37 Ohio (ou) 5965
38 Minnesota (mn) 5960
39 Seton Hall (seth) 5955
40 Washington State (was) 5945
41 LSU (lsu) 5920
42 Purdue (purd) 5915
43 Marquette (marq) 5910
43 Pittsburgh (pitt) 5910
45 Hawaii (hi) 5895
46 Illinois State (ils) 5890
46 Ohio State (osu) 5890
48 Pacific (pac) 5870
48 Northern Illinois (nil) 5870
50 Cal State Northridge (csn) 5860
50 Arkansas-Little Rock (alr) 5860
Post by The Bofa on the Sofa on Nov 25, 2014 17:07:54 GMT -5
Assessment (model, correct, fraction correct):

RPI Raw      3728   0.823139766
RPI HCA      3761   0.830426143
Full Pablo   3734   0.824464562
UPR          3971   0.876793994
UPR No HCA   3826   0.844778097
UPR Time     3957   0.873702804
So, not surprisingly, we sacrifice a bit when emphasizing the later part of the season, but the performance is still good. You can get a lot better with an HCA.
Post by bluepenquin on Nov 25, 2014 17:25:50 GMT -5
The Bofa on the Sofa wrote: "So, not surprisingly, we sacrifice a bit when emphasizing the later part of the season, but the performance is still good. You can get a lot better with an HCA."

Also not surprised - I think time is usually overstated before one does the analysis. I've never liked "who is hot" or "what was their record in the last 10 matches." Wonder how much RPI Raw would improve with an HCA (3.1%)?
Post by The Bofa on the Sofa on Nov 25, 2014 17:36:56 GMT -5
bluepenquin wrote: "Wonder how much RPI Raw would improve with an HCA (3.1%)?"

The RPI Raw is RKPI straight from the horse's mouth. RPI HCA is RKPI with an empirical HCA added in to maximize the success (.0291).

As for the time aspect: I've looked at this in terms of predictability of outcomes. There is very little time dependence through the season until basically November, at which point it ramps up significantly. The time function I use in Pablo is something like a half-life of 200 days from the beginning of the season to November (a match played 200 days ago counts half of what one played now does), but a half-life of 56 days starting Nov 1. This is based on analysis of past years' results. Note that even the 56-day half-life starting in November is still longer than what I used to use, which was 48 days for the whole season. But an analysis of results shows that teams don't change nearly that much during the year.

In the NFL, however, they do. The half-life for an NFL outcome is something like 4 weeks. That means that if two teams play at the beginning of the season and again at the end, who won the first meeting tells you almost nothing by the time they play again. What matters is the home field.
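The time weighting Bofa describes can be written down directly. The two half-lives (200 days, then 56 days starting Nov 1) are from his post; exactly how the two regimes combine at the breakpoint is my assumption.

```python
# Weight of a past match under a piecewise half-life.

def match_weight(days_ago, days_since_nov1):
    fast = min(days_ago, max(days_since_nov1, 0))  # days aged at the 56-day rate
    slow = days_ago - fast                         # days aged at the 200-day rate
    return 0.5 ** (fast / 56.0) * 0.5 ** (slow / 200.0)

print(match_weight(200, 0))  # 0.5 -- a 200-day-old match counts half
print(match_weight(56, 56))  # 0.5 -- after Nov 1 the discounting speeds up
```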
Post by mikegarrison on Nov 25, 2014 17:37:29 GMT -5
bluepenquin wrote: "Also not surprised - I think time is usually overstated before one does the analysis. [...]"

Well, I'm not surprised either, but not for the same reason you are. If you design the ranking system to reflect a certain input (time-weighted wins) and then measure it against something else (raw wins), then of course it probably won't score as high as something that used the same input you are measuring against. It's like wanting to rank-order everybody in a room by height when all I have is a scale: I can weigh them all and figure the taller people usually weigh more, but that's not going to give me as good a ranking by height as using actual height as my input.

What's interesting is the HCA. Bofa, when you say that with HCA you get a better fit to the input than without it, does that mean you match 87% of the raw wins/losses with your HCA-adjusted methodology, or does it mean you match 87% of the HCA-adjusted wins/losses with your HCA-adjusted methodology?
Post by gogophers on Nov 25, 2014 17:52:49 GMT -5
I won't pretend to understand any of this, but maybe this is a good time to ask a question about HCA. Pablo assumes, if I understand it correctly, that the HCA is the same for all teams playing at home. But I would think that certain teams, by virtue of location (e.g., mountain schools) or attendance, would have markedly better HCAs than other schools. I can understand the difficulty of crafting an HCA specific to a team. But - and here, finally, is the question - does using an HCA that is, out of necessity, the same for everyone create wrong predictions to a significant degree?
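One hypothetical way to pose gogophers' question in code (this is not how Pablo works, as far as the thread says; everything here is invented): give each team its own home bump, falling back to a single global constant.

```python
# Per-team home-court advantage, defaulting to a global value.

def predict_home_win(ratings, home, away, hca_by_team=None, global_hca=159.0):
    bump = (hca_by_team or {}).get(home, global_hca)
    return ratings[home] + bump > ratings[away]
```

The usual catch is sample size: a team plays only around 15 home matches a season, so per-team estimates would be noisy, and in practice you would shrink them toward the global value rather than trust them outright.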
Post by bluepenquin on Nov 25, 2014 17:59:02 GMT -5
mikegarrison wrote: "If you design the ranking system to reflect a certain input (time-weighted wins) and then measure it against something else (raw wins), then of course it probably won't score as high as something that used the same input you are measuring against. [...]"

Kind of realized my error once I posted - I was thinking in terms of time and predicting forward, not fitting backwards. But then, after thinking about it, I don't really understand how the formula could work for time going backwards. If it's done the way you describe, then the time weighting would not be necessary, and I'm then surprised that the two are that close.