HWBOT Community Forums

Adjustment for Global Points - Work in progress



Interesting, I like the changes. A big change to hw-points; not sure what impact that will have, but it is also necessary because hw-points will only become more and more problematic if left as they are now.

 

But if the points are dependent on the number of participants, each submitted score will require a full recalculation anyway, won't it?

 

Maybe you could lighten the server load by having scheduled recalculations? Once a day would be enough. Could of course also be split up by category, benchmark etc if a full recalculation is too heavy.
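A minimal sketch of what scheduled recalculation could look like, assuming the server can track which rankings received new submissions. The class and method names are hypothetical and not taken from HWBOT's code.

```python
class RecalcQueue:
    """Collects rankings touched by new submissions and recalculates them in one batch."""

    def __init__(self):
        self.dirty = set()        # ranking ids with pending changes
        self.recalculations = 0   # number of ranking recalculations actually run

    def submit(self, ranking_id):
        # A submission only marks its ranking as dirty; no points are recomputed yet.
        self.dirty.add(ranking_id)

    def run_scheduled(self):
        # Called once per schedule tick (for example daily): every dirty ranking
        # is recalculated exactly once, however many submissions it received.
        processed = sorted(self.dirty)
        self.recalculations += len(processed)
        self.dirty.clear()
        return processed


queue = RecalcQueue()
for ranking in ["xtu_2xcpu", "xtu_2xcpu", "fse_1xgpu", "xtu_2xcpu"]:
    queue.submit(ranking)
print(queue.run_scheduled())  # two rankings recalculated for four submissions
```

Splitting by category or benchmark would amount to keeping one such queue per group and running them at different times.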

Link to comment
Share on other sites

Guest george.kokovinis

Can someone (Pieter) please advise us when the reversal is done?

Thank you.

 

I see my points reversed, but not those of the other top-ten enthusiast members.

Link to comment
Share on other sites

Guest george.kokovinis

Is there an actual rollback/reversal happening, is an alternative algorithm being applied, or has the server simply gone bananas??? :banana::nana::banana:

Link to comment
Share on other sites

Guest george.kokovinis
Will be a proper rollback, will just take a while to get back to normal is all

 

THANK YOU !!!:)

 

P.S.

 

Best wishes to you and yours my good friend.

Link to comment
Share on other sites

  • Crew

I like the last try as well; it is promising.

 

While reading this thread I was thinking: why not keep the base point distribution and apply a factor when reaching certain goals, like top 100 / 50 / 30 / 10 / 5 / 3 / #1? In the end this would be similar to that last try.

 

Note: this is straight out of my poorly hydrated brain. I haven't thought it through much yet...

Link to comment
Share on other sites

Everything in life is about owning your decisions. So if you like to spend money on hardware, LN2, etc., it's your call. BUT it's the same when you choose to post on HWBOT for points. You have to accept that the rankings may vary, change or whatever. So before complaining about changes to a website (that is not yours), just chill and think about it twice. Some people are working hard to offer this FREE website just for you to post some scores.

 

So we have to talk solutions. A quick idea that just came to mind: what about points that start from 0.1 (for the lowest score) and go up to X (the number of participants divided by 10)? So #1 will get 100 points if 1000 users post scores in that benchmark. Would that solve the problem with benchmark popularity as well?

Let me know your thoughts about this!
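One way to read this idea, sketched under the assumption that points drop by a fixed 0.1 per position from first to last place. The function name is made up for illustration; only the two endpoints (0.1 for last, participants / 10 for first) come from the suggestion itself.

```python
def linear_points(rank, participants):
    """Points for a 1-based rank: 0.1 for last place, participants / 10 for first."""
    if not 1 <= rank <= participants:
        raise ValueError("rank must be between 1 and participants")
    return (participants - rank + 1) / 10


print(linear_points(1, 1000))     # first of 1000 participants
print(linear_points(1000, 1000))  # last of 1000 participants
```

With this reading the value of #1 scales directly with the popularity of the benchmark, and every submission still shifts the points of all results ranked below it.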

Link to comment
Share on other sites

This thread seems to have gone from looking for a solution to balance 2d/3d to a complete reworking of the bot!

 

Back to OP....all imo...

 

1. Lower threshold on 3d to better balance with 2d. That right there makes it a fair playing field for everyone to bench 2d/3d whatever takes their fancy.

 

2. Improved scaling of points. All you top guys need to chip in on this one as well imo as it might affect your group more than others.

 

3. Joe's suggestion for core grouping in 3d sounds good if the aim is to get more people benching a wider variety of 3d. I know just in my team alone many won't bench vantage etc as 5960 is out of reach for a lot of guys. Only downside I can think of with that is ending up with way too many benches if all current ones are kept then split into per core...?

 

4. Pros don't need a separate league. There's already distinction between extreme/elite etc. Ok, you might not get 1st place ranking in a bench but you still compete in your own league. Enthusiast was highly competitive when I was active a couple of years back and I don't think we cared one iota what the xoc guys were doing. We were still fighting for top h2o honours.

 

5. If point 1 is sorted then there shouldn't be a need to create separate 2d/3d leagues. Unless you're suggesting separate for each then an overall ranking? Danger with that one is it may end up pushing guys more in one direction than they are already. I.e., only concentrating on one league rather than utilising both 2d/3d to gain points.

 

I definitely think some sort of a poll should be set up as not all will want to speak publicly in this thread I'm guessing. At the end of the day, you can't please everyone and that'll be the biggest hurdle. Some will take it for what it is, hopefully with some middle ground for all. Others will just moan and whine regardless as they won't recognise any middle ground and won't be able to take it on the chin if they don't agree 100% with any changes.

Link to comment
Share on other sites

This thread seems to have gone from looking for a solution to balance 2d/3d to a complete reworking of the bot!

 

Back to OP....all imo...

 

1. Lower threshold on 3d to better balance with 2d. That right there makes it a fair playing field for everyone to bench 2d/3d whatever takes their fancy.

Agree!

 

2. Improved scaling of points...

Agree, think we came up with 100, 85, 75% a few pages back and no one disagreed...

 

3. Joe's suggestion for core grouping in 3d sounds good if the aim is to get more people benching a wider variety of 3d. I know just in my team alone many won't bench vantage etc as 5960 is out of reach for a lot of guys. Only downside I can think of with that is ending up with way too many benches if all current ones are kept then split into per core...?

I think this might be good, but only do one unrestricted cpu ranking and one 4c (unlimited threads) ranking. Might make 3d a little more popular.

 

4. Pros don't need a separate league. There's already distinction between extreme/elite etc. Ok, you might not get 1st place ranking in a bench but you still compete in your own league. Enthusiast was highly competitive when I was active a couple of years back and I don't think we cared one iota what the xoc guys were doing. We were still fighting for top h2o honours.

Agree, you're only fooling yourself if you think you're better than you are or that the pros get their results without effort or for free.

 

5. If point 1 is sorted then there shouldn't be a need to create separate 2d/3d leagues. Unless you're suggesting separate for each then an overall ranking? Danger with that one is it may end up pushing guys more in one direction than they are already. I.e., only concentrating on one league rather than utilising both 2d/3d to gain points.

Agree, a 3d ranking can be a curiosity (like the 3d king thread) but should not be separate rankings.

 

I definitely think some sort of a poll should be set up as not all will want to speak publicly in this thread I'm guessing. At the end of the day, you can't please everyone and that'll be the biggest hurdle. Some will take it for what it is, hopefully with some middle ground for all. Others will just moan and whine regardless as they won't recognise any middle ground and won't be able to take it on the chin if they don't agree 100% with any changes.

Don't know if a poll will provide any useful results unfortunately. The way threads like this go there will never be a majority that will support any change, because all proposals are garbage...

 

In real life authorities propose a change, get the public and expert opinions, and then make the changes they want anyway :)

In the end I think hwbot has the best grasp of what will be the best course.

Link to comment
Share on other sites

Rollback done, points will return to normal very soon.

 

-----------------------------------------------------------------------------------

 

Important update on the Adjustment

 

We discussed the adjustment for the points internally and one of the key problems with the current calculation method is that it's based mainly on the position of a result within a ranking. The consequence of this method is that a new, high score in the ranking will affect many other results and trigger the chain of updater and notification algorithms. For example, if you submit a #5 result in a ranking with 2000 participants, our server needs to:

  • update the rank position for 1995 results
  • recalculate the points for 1995 results
  • update the league totals for 1995 users
  • update the team league totals for ? teams
  • store historical points information for 1995 results and users
  • create notifications for the result, user and team

In short: a ton of work.
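To put a number on it, here is a small sketch of how many existing results a rank-based system has to touch for a single submission. The helper name is hypothetical, and `participants` is assumed to count the ranking after the new result has been added.

```python
def results_displaced(participants, new_rank):
    """Existing results pushed down one place when a new result lands at new_rank."""
    if not 1 <= new_rank <= participants:
        raise ValueError("new_rank must be between 1 and participants")
    return participants - new_rank


print(results_displaced(2000, 5))     # the example above: a #5 result among 2000
print(results_displaced(2000, 1900))  # a low-ranked submission touches far fewer rows
```

The better the new result, the more rows need a new rank, new points, new league totals and new history entries.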

 

In the long run, this is not sustainable. This year we breached the barrier of 1.5 million benchmark results as 25,000 overclockers submitted about 350K results. That's a ton of recalculations, hence we're having recalculation slowdowns once in a while. Since we're trying to improve the points anyway, why not give it a shot.

 

I spent this beautiful afternoon working on a new algorithm that takes into account two parameters: participants and a submission's score in relation to the top score. The algorithm I am playing with right now has the following characteristics:

  • Same algorithm for Global and Hardware rankings
  • GL Min Pts = 1
  • HW Min Pts = 0.1
  • Threshold for Global ranking is ~ 1000 participants
  • Threshold for Hardware is ~ 55 participants
  • GL reaches Min Pts at ~ 65-70% of top score, depending on popularity
  • GL Points at 1000 participants for #1, #2, #3 are 150, 135 (90%), 127.5 (85%)
  • HW Points at 55 participants for #1, #2, #3 are 50, 45 (90%), 42.5 (85%)
  • Max points for #1 when 1, 10, 20, 30 participants is respectively: 3.9, 16.4, 26.7, 34.8 points.

This is a pretty radical paradigm shift for HWBOT points. I want to emphasize that this is something that will be mandatory in the near future. No other way. These are some of the effects:

  • Improving your score will always yield more points (except if you're #1)
  • The only way to reduce your opponent's points is by improving the #1 score
  • Equal score = equal points (ie. XTU 2xCPU -> top50 has 100pts)
  • Less steep slope, so more points
  • Top-3 always has an extra bonus
  • A lot more points in HW rankings (easy 20pts), even for uncompetitive rankings
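The first three effects can be illustrated with a toy model. This is only a sketch: it assumes a straight line between the minimum and maximum points, a fixed lower cutoff of 70% of the top score, a maxed-out Global ranking, and no top-3 bonus. The actual curve is not spelled out here, so the function below is a stand-in and not the real algorithm.

```python
def sketch_points(score, top_score, max_pts=150.0, min_pts=1.0, lower=0.7):
    """Toy performance-based points: depends only on score / top_score."""
    ratio = score / top_score
    if ratio <= lower:
        return min_pts
    # Linear ramp from min_pts at the cutoff to max_pts at the top score (assumed shape).
    return min_pts + (max_pts - min_pts) * (ratio - lower) / (1.0 - lower)


# Equal score = equal points, regardless of who submitted first.
print(sketch_points(1800, 2000) == sketch_points(1800, 2000))
# Improving your own score yields more points.
print(sketch_points(1900, 2000) > sketch_points(1800, 2000))
# Someone else's points only drop when the #1 score goes up.
print(sketch_points(1800, 2100) < sketch_points(1800, 2000))
```

In this model a new result anywhere below #1 leaves the points of every other result untouched, as long as the participant count stays above the threshold.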

As an example, I've taken three rankings from the database and plotted the points in Excel. The rankings are Fire Strike 1xGPU (Global), XTU 2xCPU (Global) and Fire Strike Extreme 1xGPU GTX 970 (Hardware). You can see the points plotted as position in the ranking for the three rankings below.

 

CAVEAT: this is the first time I'm trying out this new algorithm, so please understand that I am aware that the scaling is far from perfect yet. There are three things I want to look into. First, use fixed multipliers for top-10 instead of top-3 to give a higher reward for pushing into the top-10. Second, make a steeper slope for the categories with low participation so that being 1st out of 30 gives maybe 20 pts rather than 35. Third, find a way to ensure poorly scaling benchmarks (like XTU) don't generate so many points for everyone. (*)

 

Anyway, that's it for this Sunday. See you all tomorrow! :celebration:

 

[Attachment: points plotted by ranking position for Fire Strike 1xGPU (Global), XTU 2xCPU (Global) and Fire Strike Extreme 1xGPU GTX 970 (Hardware)]

 

(*): and of course find a way to include the parameter 'skill' somehow.

 

Thanks for the dedication to the community.

Regardless of the result of the changes, it is good to know HWBOT is working to get better.

 

P.S. Maybe you can take this opportunity to include a poll about the Top 10 OC-eSports points. Most people (at least, the ones I asked) don't understand why these points should contribute to the ranking, since some competitions are not open to everyone. I don't think they belong together with the ranking points, but maybe people would prefer to decide in a poll.

 

Thanks

Link to comment
Share on other sites

Thanks for the dedication to the community.

Regardless of the result of the changes, it is good to know HWBOT is working to get better.

 

P.S. Maybe you can take this opportunity to include a poll about the Top 10 OC-eSports points. Most people (at least, the ones I asked) don't understand why these points should contribute to the ranking, since some competitions are not open to everyone. I don't think they belong together with the ranking points, but maybe people would prefer to decide in a poll.

 

Thanks

 

Nice gear change.

When you cannot get what you want added, you try to take from everybody else. :rolleyes:

 

Competition points were added to entice more people to bench in comps. That's basically the only reason.

Take the comp points away then. Nobody will care......but then nobody will go out of the way to bench them either.

Cause and effect, man. For every action, there is a reaction.

Link to comment
Share on other sites

Guest george.kokovinis
Nice gear change.

When you cannot get what you want added, you try to take from everybody else. :rolleyes:

 

Competition points were added to entice more people to bench in comps. That's basically the only reason.

Take the comp points away then. Nobody will care......but then nobody will go out of the way to bench them either.

Cause and effect, man. For every action, there is a reaction.

 

 

Never crossed my mind, Scotty, that a kindergarten for big boys could evolve into such a complicated issue.

Seems that either I am too naïve or the financial interests carefully hidden by some members are much bigger than I initially thought.

Link to comment
Share on other sites

Continued working on the algorithms today. What I worked on is the following:

  • Moved completely from a ranking-based to a performance-based algorithm (explained here why this is necessary)
  • Worked out the specific parameters for Hardware, Global and WR points
  • Tested the theory on 5 different hardware items

The characteristics of the rankings are:

 

[table=head]Type | Max User Threshold | Lower Pts Threshold | Max Pts | Min Pts | #1 MP | #2 MP | #3 MP | #4 MP | #5 MP
Hardware | 55 | 0.75 | 50.2 | 0.1 | 1.15 | 1.1 | 1.05 | |
Global | 1000 | 0.5 | 149.6 | 1 | 1.3 | 1.2 | 1.15 | 1.1 | 1.05
WR | 5000 | 0.9 | 50 | 0 | | | | |
[/table]

 

Here are some notes:

  • To help understand, let's take the Global rankings. The maximum is 149.6 pts when the number of users is equal to or higher than 1000. To receive more than the minimum of 1 point, you need to score at least 50% of the result of the leader in the global ranking. Being #1 in the global ranking will always give you a 30% increase over 6th place, #2 will get 20% more than 6th place, etc.
     
  • I included the WR Algorithm for now to address a specific remark made by @BenchBrothers\.de in the general discussion thread.
    Yesterday 8 Pack broke the global world record in 3DMark05. And this global WR comes in at 5th position on the frontpage, beaten by an XTU score from Bullshooter of 740 points. In the hwbot database there are more than 23600 faster results in XTU than the score from Bullshooter. In addition, there are six other overclockers who reached the same score earlier than him in the same hardware category. And yet he beats a global world record on the hwbot frontpage (leading people to believe it's a bigger achievement).
    Expressing the value of each score in relation to the overall World Record of a benchmark will help us avoid this kind of situation. I'm not sure yet how to include the WR algorithm in the overall equation (part of GL or separate), as today I just wanted to lay the groundwork.
  • The Max User Threshold determines when a ranking is maxed out. The user thresholds were discussed earlier in the thread.
  • The Lower Pts Threshold determines when a user starts scoring points in the ranking. This is the biggest change in the algorithm as this replaces the rank parameter. I will adopt a more scientific writing in the future, but for now it suffices to say that this is the relation between your score and the top score. The ratio is a number between 0 and 1. The closer the threshold is set to 1, the closer you need to be to the top score in order to start scoring points.
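For readers who prefer code to tables, the parameters above can be written down as plain data. This sketch only encodes the table and the three unambiguous checks (ranking maxed out, score above the lower threshold, rank multiplier); it deliberately leaves out the points curve itself. The class and method names are hypothetical, and treating "MP" as a per-rank multiplier follows the note on the Global rankings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankingType:
    max_user_threshold: int       # participants at which the ranking is maxed out
    lower_pts_threshold: float    # score / top score needed to rise above min_pts
    max_pts: float
    min_pts: float
    rank_multipliers: tuple = ()  # bonus for rank 1, 2, 3, ... as listed in the table

    def is_maxed_out(self, participants):
        return participants >= self.max_user_threshold

    def above_lower_threshold(self, score, top_score):
        return score / top_score > self.lower_pts_threshold

    def multiplier(self, rank):
        """Bonus multiplier for a 1-based rank; 1.0 outside the bonus ranks."""
        if 1 <= rank <= len(self.rank_multipliers):
            return self.rank_multipliers[rank - 1]
        return 1.0


HARDWARE = RankingType(55, 0.75, 50.2, 0.1, (1.15, 1.1, 1.05))
GLOBAL = RankingType(1000, 0.5, 149.6, 1.0, (1.3, 1.2, 1.15, 1.1, 1.05))
WR = RankingType(5000, 0.9, 50.0, 0.0)

# A score at 60% of the leader clears the Global threshold but not the Hardware one.
print(GLOBAL.above_lower_threshold(60, 100), HARDWARE.above_lower_threshold(60, 100))
print(GLOBAL.multiplier(1), GLOBAL.multiplier(6), HARDWARE.multiplier(4))
```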

In the graphs below you can find the distribution of points for different levels of participants.

 

[Attachments: three graphs showing the distribution of points for different levels of participation]

 

 

Testing Theory with i3 6320, i7 6700K, GTX 970, GTX 980 and GTX 980 Ti

 

To test the parameters for the different algorithms, I tested the following hardware and benchmarks:

  • XTU: Core i3 6320 and Core i7 6700K
  • Fire Strike Extreme: GTX 970, GTX 980 and GTX 980 Ti

There are two versions of the hardware and global point chart. The first version shows the points awarded expressed as rank, the second shows the points expressed as performance. The first version is the most practical as it reflects how the results will be displayed on the site. The second version reflects the overall distribution of the points within a ranking.

 

Some notes:

  • The XTU 2xCPU and Core i3 6320 ranking is a real freak ranking. I included it in the test, but I really shouldn't have because it does not reflect the natural behavior of rankings. A freak ranking where the top score is capped artificially because the ranking is dominated by locked CPUs is as much a pain to award points to as a ranking with only one world-class result.
  • The above is the reason why so many Core i3 results score high points. It's high performance (close to top score) and highly popular
  • Hardware / Real Rank: the i7 6700K has the most natural spread of points. This is mainly because of the quantity of results. The quick drop-off in the VGA categories can be explained by a lack of top competitive results.
  • Hardware / Performance: not much to say, you can see the spread diverges when participation hits threshold
  • Global / Real Rank: the threshold of 1000 participants makes Fire Strike Extreme as valuable as the XTUs. The quick drop-off is mostly related to the low amount of super-competitive scores in those categories. You can also see that a 0.5 threshold isn't enough for the GTX 970 to get any global points. The top score in FSE 1xGPU is 14432 marks (2200/2170) and the top score for the GTX 970 is 7113 marks (1962/2142) (*)
  • Global / Performance: you can see the effect of threshold (lines diverging) and level of competition (#2 in XTU 4xCPU is closer to the top than the #2 in FSE 1xGPU)

 

(*): it's been brought up before by @xxbassplayerxx and I think this adds to his opinion that "[...] your argument here should be with Futuremark". The best GeForce GTX 970, overclocked to 2G on LN2, scores not even 0.5 of the best FSE 1xGPU. We can try as much as we can to create an environment that allows for more competition, fact of the matter is that 3D benching equals a $700 GPU and anything else is not even remotely competitive. FYI, the best Core i5 6600K XTU 4xCPU scores 1913 points which is 0.89 compared to the top score. In the test algorithm, the i5 would score 92 Global Points compared to the 150 of the top score.

 

[Attachments: hardware and global point charts, each plotted by rank and by performance]

 

 

Moving Forward

 

Some stuff on the to-do list:

  • Check with dev if adding bonus for specific ranks (1-5) will affect the server load. If so, how much.
  • Check with dev workload of new algorithm implementation
  • Re-verify the distribution of participation in the Hardware, Global and WR rankings (like here)
  • Discuss: thresholds for max participation, minimum performance
  • Try parameters on more rankings. Specifically look at WR+GL integration.
  • Eventually deploy on UAT
  • Revisit the balance of global, hardware and oc-esports points for Overclockers League

 

Have a good week everyone! :celebration:

Link to comment
Share on other sites

If I understand this properly... in short this proposed change would mean all submissions scoring less than 75%/50% of the best result in HW/GL category are basically worthless in terms of getting points?

 

For example Core 2 Duo E6600 PCMark 05 HW ranking:

http://hwbot.org/benchmark/pcmark_2005/rankings?hardwareTypeId=processor_873&cores=2#start=0#interval=20

http://hwbot.org/submission/2363595_havli_pcmark_2005_core_2_e6600_%282.4ghz%29_20228_marks

 

The top score is 36998 marks, my score is 20228 marks. At the moment my submission is ranked 27th out of 227 and getting 17.3 points. However in the new system it would mean 0.1 points... since is only 54.67% of the top score?

 

 

This new system seems to work very similarly to the current one in rankings with a uniform score distribution. However, rankings with a few good scores and many average ones don't look very good.

Example - R11.5 32cores global http://hwbot.org/benchmark/cinebench_-_r11.5/rankings?cores=32#start=0#interval=20

 

Top score = 43.24... the lower threshold is 43.24*0.5 = 21.62. So it seems anything at current position 8 and below will get just 1 point.
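Both checks can be reproduced in a few lines. This sketch assumes nothing beyond the two lower thresholds from the table (0.75 for HW, 0.5 for GL); the helper names are made up for illustration.

```python
HW_LOWER_THRESHOLD = 0.75
GL_LOWER_THRESHOLD = 0.5


def share_of_top(score, top_score):
    """Score expressed as a fraction of the top score in the ranking."""
    return score / top_score


def cutoff_score(top_score, lower_threshold):
    """Score at which a result stops earning only the minimum points."""
    return top_score * lower_threshold


e6600 = share_of_top(20228, 36998)
print(f"E6600 PCMark05: {e6600:.2%} of top, above HW cutoff: {e6600 > HW_LOWER_THRESHOLD}")
print(f"E6600 PCMark05 HW cutoff: {cutoff_score(36998, HW_LOWER_THRESHOLD):.1f} marks")
print(f"R11.5 32-core GL cutoff: {cutoff_score(43.24, GL_LOWER_THRESHOLD):.2f}")
```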

Link to comment
Share on other sites

In my opinion, the new distribution for hardware points is a really bad idea because the rewards will no longer be representative of the actual ability to bench legacy hardware.

 

Just some examples:

- Benching a totally obscure (or made-up) CPU/VGA at stock will now yield 10-20 hardware points.

- A semi-obscure category, like Pentium 4 630 SuperPI 32M will now be a 50-pointer. An argument "the categories will become more difficult once people discover and fill them up with scores" is not valid here since there will be thousands of 50-pointer categories and, realistically, there will not be enough active overclockers to fill them all up.

- Moving up a few spots in a really challenging category (like GeForce GTX780Ti Catzilla 720p) will yield zero-point-something points, unless you're talking about top 3 (getting there in the first place is an achievement of its own) in which case improving your score will give you a "generous" reward of 2-3 points.

 

This means that getting a decent coverage of hardware points (say, 700+) that count towards a league rank will reduce from "getting 2nd or better in 20 categories with 250+ participants each" to "getting 20 decent scores in non-obscure hardware categories", the latter of which can be achieved by skill-free scatter benching.

 

Also, for the HW masters league it will mean that two garbage scores will outweigh one "proper" score. So, the only way of becoming the HW king will now be by going through warehouse-like amounts of old and useless (non-resellable) hardware, benching which at any level (let alone properly) might take a decade of 100% free-time commitment. I can't see many people liking this since it is currently possible to become the HW king by benching family/wife-compatible amounts of relatively modern (and resellable) hardware and/or not benching 2D or 3D at all.

Edited by TaPaKaH
Link to comment
Share on other sites
