Posted
6 hours ago, ugarles said:

have you updated your rankings with the complete data set somewhere? i don't see it.

Not yet.  Still working on last week, looking at the outcomes for this week.  I do have all the conference tournament scores from last weekend, though.

Hey, if anyone wants to help, can you dig up all the outcomes for matches with seeded wrestlers?  One thing I want to do is to compare Pablo predictions with the seed predictions to see if I'm in the ballpark. If you have the ability and time and can do a tournament, drop a result here.   What I'd be looking for is

Conference

Weight class(es): # of matches the higher seed won - # of matches the lower seed won

Look at all matches, including consolations (that's where it gets trickier).  If one wrestler has a seed and the other doesn't, the seeded wrestler is expected to win.  If neither wrestler is seeded, ignore it.  This will help me a lot in terms of knowing if this is worthwhile.
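
For anyone tallying by hand, the counting rules above can be sketched in a few lines of Python. This is only an illustration: the function names `seed_pick` and `tally`, and the `(winner_seed, loser_seed)` input format with `None` for an unseeded wrestler, are assumptions made for the example, not anything Pablo actually uses.

```python
from typing import Iterable, Optional, Tuple

def seed_pick(seed_a: Optional[int], seed_b: Optional[int]) -> Optional[str]:
    """Return 'a' or 'b' for the wrestler the seeds favor, or None to skip."""
    if seed_a is None and seed_b is None:
        return None                     # neither wrestler seeded: ignore the match
    if seed_b is None:
        return "a"                      # seeded beats unseeded on paper
    if seed_a is None:
        return "b"
    if seed_a == seed_b:
        return None                     # should not happen inside one bracket
    return "a" if seed_a < seed_b else "b"   # lower number = higher seed

def tally(matches: Iterable[Tuple[Optional[int], Optional[int]]]) -> Tuple[int, int]:
    """matches holds (winner_seed, loser_seed) pairs, consolations included.

    Returns (# matches the higher seed won, # matches the lower seed won).
    """
    higher = lower = 0
    for winner_seed, loser_seed in matches:
        pick = seed_pick(winner_seed, loser_seed)
        if pick is None:
            continue
        if pick == "a":                 # the favored wrestler was the winner
            higher += 1
        else:
            lower += 1
    return higher, lower

# Made-up example bracket, not real results
sample = [(1, 4), (5, 2), (3, None), (None, 6), (None, None)]
print(tally(sample))
```

The two numbers returned are exactly the "higher seed won - lower seed won" pair requested for each weight class.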

I will have final rankings available by probably the middle of next week. 

 

Posted

OK, this will probably be my last update of results on this thread.  New Pablo content will be posted in a new thread when I get a chance.  But here's what I have learned.

Using the bigger dataset as described above, I now have updated rankings for matches before the conference tournaments weekend.  These were calculated using an updated valuation model that uses the rematch outcomes to assess the predictive ability of events.

I won't post the full rankings, just some assessment data.  As I mentioned earlier in the thread, my starting benchmark is the conference tournament seeds.  What we can do is compare how well seeds do at predicting outcomes vs Pablo rankings.  I didn't give you enough time to help me out with the data, but I have gone through all the conference tournaments and calculated how often the matches went according to seeds.  I didn't do anything really rigorous; I just went through the brackets on Track, and if they didn't have the seeds there, I looked them up and kept them for reference.  I can't guarantee the results are perfect, but they are probably pretty close.

I have no historical context to assess this, so I will just provide the results. 

For the conference tournaments last weekend, the higher seeded wrestler (better seeded?) won 75.4% of the time (in almost 1300 matches).  I didn't include matches that were injury defaults or medical forfeits, only if they wrestled out.  I also didn't include matchups of non-seeded wrestlers.  The one place where I know there could be some issues is with the handful of wrestlers who changed weight classes, wrestling in weight classes they've not been in all season.  I'm pretty sure that they wouldn't get seeded in the tournament, or would be seeded low, but that doesn't reflect how good they are.  But I also think Pablo is going to call their matches as losses, so I don't think it affects the comparison.

OK, so the seeds, which are determined by the magic of the seeding people with their brilliant understanding of wrestling and their having watched matches and stuff got 75.4% correct.

I am not able to look at just the seeded matches in Pablo, but I did look at all the matches in the conference tournaments (the difference is that Pablo also considered matches between unseeded wrestlers where they exist).  Pablo picked the winner correctly in 77.0% of them.  Therefore, this version of Pablo rankings did better than the seeds.

Now, it's not by much.  If you take the matches involving seeded wrestlers and use Pablo rankings instead of seeds to pick the winners, you would get about 20 more matches right (out of almost 1300).  But that's a difference.
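
As a rough back-of-the-envelope check on the size of that gap, the arithmetic can be written out. This sketch assumes exactly 1300 matches (the post only says "almost 1300") and applies both percentages to that same count, even though the 77.0% figure was computed over a slightly larger set of matches. The standard error uses the simple binomial formula, which treats matches as independent; that is an approximation.

```python
import math

n_matches = 1300      # assumed; the post says "almost 1300"
seed_rate = 0.754     # share of matches the seeds called correctly
pablo_rate = 0.770    # share of matches Pablo called correctly

# Extra correct picks implied by the gap between the two rates
extra_correct = (pablo_rate - seed_rate) * n_matches

# Binomial standard error of a single rate measured on n_matches
std_err = math.sqrt(seed_rate * (1 - seed_rate) / n_matches)

print(f"extra correct picks: {extra_correct:.1f}")
print(f"standard error of the seed rate: {std_err:.4f}")
```

Under these assumptions the gap works out to roughly 21 matches, and the standard error on either rate is around 1.2 percentage points, so the 1.6-point edge is encouraging but small enough that one weekend of data can't settle it.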

This is very promising.  If nothing else, it shows that Pablo, who knows nothing about wrestling and only knows who wrestled, who won, and by what score, is at least as smart at seeding wrestlers as the current approaches.  It will be interesting to see what happens with nationals.  That's the next test.

There is another way to think about this.  Of those matches in the conference tournaments, about 600 of them were rematches from earlier in the season, so I can use them in my match-pair dataset. I've done that, and while it has made some small tweaks to the model, the conclusions are the same as what I posted above.  However, one thing I also did was to look at the new additions, the rematches that took place last weekend.  What I found is that, if the two wrestlers had met up previously in the season and then met up again in the tournament, the wrestler who won in the regular season won .... 76.6% of the time in the tournament.
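
The "who won before?" baseline is simple enough to sketch. The function name `repeat_winner_rate` and the input format (one pair of names per rematch) are hypothetical, and the sketch assumes a single earlier meeting per pair; if two wrestlers met more than once in the regular season, you would have to decide which result counts.

```python
from typing import Iterable, Optional, Tuple

def repeat_winner_rate(rematches: Iterable[Tuple[str, str]]) -> Optional[float]:
    """Fraction of rematches won by the wrestler who won the earlier meeting.

    Each item is (regular_season_winner, tournament_winner).
    Returns None when there are no rematches to score.
    """
    pairs = list(rematches)
    if not pairs:
        return None
    repeats = sum(1 for earlier, later in pairs if earlier == later)
    return repeats / len(pairs)

# Made-up names, not real results
sample_rematches = [
    ("Adams", "Adams"),
    ("Baker", "Clark"),   # the regular-season loser won the rematch
    ("Davis", "Davis"),
    ("Evans", "Evans"),
]
print(repeat_winner_rate(sample_rematches))
```

Any ranking that uses scores as well as results should beat this number if the scores carry extra information, which is why it makes a useful floor to compare against.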

In one respect, that's satisfying because it says Pablo is basically reflecting the case of, well, who won before?  Let's expect them to win again.  However, I'm not as happy, because I'd like to hope that by using scores/outcomes, I can learn more about the wrestlers than just who won.

There are some things I need to do to improve the fitting algorithm, and maybe that will improve things a bit.  I'll keep working on it in the background, but the good news is that we are at a good starting place.

The TL;DR Summary:  Pablo would have seeded the conference tournaments better than they were. 
