HWBOT Community Forums

2D Benchmark proposal list for 2021


Leeghoofd


Err, wprime is not generally "comparable" on old hardware. There are loads of scores with crazy efficiency and sometimes large gaps in the rankings - maybe even because people used the DLL tweak in the past without disclosing it.

Also, merry bunny extracting Christmas everyone.

Edited by Noxinite

10 hours ago, Noxinite said:

Err, wprime is not generally "comparable" on old hardware. There are loads of scores with crazy efficiency and sometimes large gaps in the rankings - maybe even because people used the DLL tweak in the past without disclosing it.

Also, merry bunny extracting Christmas everyone.

You could say that for just about every benchmark out there. There are numerous tweaks for PiMod, for example. Should people disclose the use of TinyXP? At what point do we stop beating this dead horse with the same age-old conversation?

I'd wager the popularity of the benchmark would have it left in its place, but I've already had my vote for this year.

 

  • Thanks 1

  • Crew

Wprime Globals have been canned because of the Software Elite DLL OCers and their mindset that this is a tweak, when it should not even remotely be considered one... Those who can't let go of software tinkering have once more shot themselves, and everyone else, in the foot...

It is uncontrollable now, unless run with BenchMate... I don't play the game the way my predecessors did, for whom a "disclose it, sharing is caring" policy was good enough. It makes my OC heart bleed that this is continuously labelled a tweak... Logical reasoning tells me a tweak doesn't give you this kind of performance boost, letting you thrash a 2D ranking with 400MHz lower processor core clocks... Especially for S775, Wprime v2020 has to be considered a new benchmark version.

So what comes next for Wprime: use another DLL that gives only a minor boost, or run it with the latest super DLL, then clock up the CPU speed after the run and falsify the screenie so it looks more legit? You would be surprised what people are willing to do for some virtual boints...

We have learnt from the past that this sort of tweak acceptance doesn't work, as a minority keeps doing what they consider innovating, while in fact they are just continuously hacking the vulnerabilities of the benchmark in software, thus ruining it for everybody else.
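
The "400MHz less" point can be made concrete with a rough, clock-normalized comparison. The following is only a minimal sketch under stated assumptions: the helper names, the 10% tolerance and the three runs are made up for illustration and are not an HWBot moderation tool or real submissions. It multiplies run time by core clock and flags runs that needed far fewer clock cycles than the median for the same CPU.

```python
from statistics import median


def clock_normalized_work(time_s: float, clock_mhz: float) -> float:
    """Seconds multiplied by MHz: a rough count of mega-cycles spent on the run.

    Lower means the run needed fewer clock cycles for the same workload.
    """
    return time_s * clock_mhz


def flag_outliers(runs, tolerance=0.10):
    """Return the names of runs whose cycle count is more than `tolerance`
    below the median of all runs passed in (assumed to share one CPU model)."""
    work = {name: clock_normalized_work(t, mhz) for name, (t, mhz) in runs.items()}
    typical = median(work.values())
    return [name for name, w in work.items() if w < typical * (1 - tolerance)]


# Made-up numbers for illustration only: (time in seconds, core clock in MHz).
runs = {
    "run_a": (10.00, 6000.0),
    "run_b": (10.20, 5900.0),
    "run_c": (9.50, 5600.0),  # faster time at a 400MHz lower clock
}

print(flag_outliers(runs))  # ['run_c']
```

A check like this cannot tell a legitimate memory or OS tweak from a swapped DLL; it only shows which results deserve a closer look.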

  • Like 3
  • Thanks 2

3 hours ago, Leeghoofd said:

as a minority keeps doing what they consider innovating, while in fact they are just continuously hacking the vulnerabilities of the benchmark in software, thus ruining it for everybody else.

I'm fine with whatever changes are needed to stop the minority of cheaters we let stay here.

But from my angle, the second one of our team members gets caught for something like that, he will be released from our team immediately and exposed as a cheat.

 

 

  • Like 1

9 hours ago, Leeghoofd said:

Wprime Globals have been canned because of the Software Elite DLL OCers and their mindset that this is a tweak, when it should not even remotely be considered one... Those who can't let go of software tinkering have once more shot themselves, and everyone else, in the foot...

It is uncontrollable now, unless run with BenchMate... I don't play the game the way my predecessors did, for whom a "disclose it, sharing is caring" policy was good enough. It makes my OC heart bleed that this is continuously labelled a tweak... Logical reasoning tells me a tweak doesn't give you this kind of performance boost, letting you thrash a 2D ranking with 400MHz lower processor core clocks... Especially for S775, Wprime v2020 has to be considered a new benchmark version.

So what comes next for Wprime: use another DLL that gives only a minor boost, or run it with the latest super DLL, then clock up the CPU speed after the run and falsify the screenie so it looks more legit? You would be surprised what people are willing to do for some virtual boints...

We have learnt from the past that this sort of tweak acceptance doesn't work, as a minority keeps doing what they consider innovating, while in fact they are just continuously hacking the vulnerabilities of the benchmark in software, thus ruining it for everybody else.

The fundamental flaw in this argument is that socket 775 isn't fast enough for globals. Are we now at the stage where we ban things by association?


  • Crew

The only flaw I see is that another benchmark's integrity has been compromised and is now fully exposed. S775 is the perfect example of how badly this can turn out. No matter how many rules we impose, we will no longer be able to fully cover this software OC behavior on Wprime, as we will continuously be trailing these software guys.

Life could be so simple if one just pressed Run Benchmark... 


3 minutes ago, Leeghoofd said:

 

Life could be so simple if one just pressed Run Benchmark... 

You know damn well competitive benchmarking did not start with "pressing start" and getting a score. That's for the gamers at UserBenchmark. And perhaps for modern benching, it's the more proper approach to comparing systems. I think HWBot is much more elaborate than this. And then much more difficult to modernize and implement.

Would it be nice to have just a HWBot benchmark that differs from the rest? Push start, get a score!! It seems that using a bunch of 3rd party benchmarks may be a lose-lose situation in these modern times.

 

  • Like 1

On 12/26/2020 at 3:47 AM, Noxinite said:

Err, wprime is not generally "comparable" on old hardware.

It is. Show me an all-out CB11.5/15 run on AM2 or 939 for example. They don't exist because the bench wasn't out back then, but wPrime has 40+ HWPoints on some CPUs. Anyway, it's no big deal. If the return of Pifast requires the "sacrifice" of wPrime then I'm o.k. with that. The bigger issue is the loss of both 05 and 06 on the 3D side :(


  • Crew

@ShrimpBrime

Of course you can tune/tweak your setup to get better performance. But in this case, if we are on a continuous quest, spending hours/days to figure out how the benchmark code ticks and which software it addresses in Windows, then we have a problem. The flaw is one of the reasons Matt didn't include Wprime at first in his BenchMate benchmark suite. It might be fun for programmers and such to analyze that, but that is where it should stop...

For me, tweaking is: how does it perform with faster memory / tighter timings / dual or 4 sticks / scaling with uncore / affinity / priority / diagnostic mode /... For Wprime there are so many things you can do to get better performance; even changing the installation drive has its impact... All this without the need to fiddle around in the system folder because you can't match a score from 2010...

And after we figured out all that, then we just press Run benchmark.

It's too little and too late (again), the damage is done...

  • Like 1

Wprime is as sucky as a benchmark can be. Good riddance. But I do not agree with the reason for doing it. When I started benching, 3dmark01 was the latest 3d benchmark out there and the one that everyone was playing. I fell in love with tweaking it, both clocks and OS, to get a higher score. I could not afford a hard drive just for benching, and I was using the family's shared computer. So every time I benched I had to do a fresh install of win2000, tweak it and run. Then reinstall winxp, Word, antivirus etc so that my parents could use the computer again.

That was a magnificent time for overclocking. It required skill for both pushing clocks and tuning the OS, drivers etc. It's so rewarding to see scores go higher from small tweaks and not just higher clocks. I guess that's why 32M is still so popular to play. You can tune the OS and also the tiniest subtiming. If you can shave off 0.1s you consider it a huge gain.

Now hwbot says I should have just used my family style winxp and pressed "run benchmark". You know what, I think I would have done that two or three times and then never again.

I can understand being fed up with finding exploits for benchmarks. But tweaks will always be there and always be invented. Should we ban 32M because copy waza is not fair to older scores? The gain is huge... 

  • Like 1
  • Thanks 1

  • Crew

How about a "tweak" that shaves off 120 seconds in SuperPi, and for which you almost don't need to do anything besides being a hardcore software guru... You highlight exactly what real tweaking is, Tobias, and it's not what happened lately... Don't take the "press Run benchmark" too literally.


3 hours ago, Leeghoofd said:

How about a "tweak" that shaves off 120 seconds in SuperPi, and for which you almost don't need to do anything besides being a hardcore software guru... You highlight exactly what real tweaking is, Tobias, and it's not what happened lately... Don't take the "press Run benchmark" too literally.

Take off 120 seconds? That's easy: get rid of W10 and run XP :P

Please get rid of the hackers and cheaters, just as you will get rid of certain benchmarks as a result. That's all I ask for. :)


  • Crew
On 12/26/2020 at 10:50 PM, Leeghoofd said:

Those who can't let go of software tinkering have once more shot themselves, and everyone else, in the foot...

I don't agree with your position but I will respect it.

 

On 12/27/2020 at 10:30 AM, Mr.Scott said:

Have to draw the line somewhere........and stick to it. 

Exactly. And I don't see it now. Before, it was pretty simple. Take PCM05 for example: you're not allowed to change the benchmark's DLLs (I tried updating the sound encoding with a newer version and it gave crazy boosts) because they're part of the test bundle. Pretty much like the textures are part of the 3DMark 2001 bundle, if anyone is old enough to remember that cheat. But you are allowed to change the driver version, strip down the OS, upgrade or downgrade DirectX, and use maxmem, LSC and copywaza, all of which make the software part of the system faster (I want to stress this: actually making it faster, not making it believe it runs faster). And now there's a forbidden subsystem we can't touch.
If tomorrow a new driver or DirectX comes along that gains 5% and makes old rankings obsolete, will we ban it too? Or will this be a kickstart for a re-bench? Oh, and how about OpenCL drivers, will we ban the fastest ones so everybody stays in line with the slowest guys?
Two days ago I spent about 6 hours disassembling an old BIOS to give it hell, maybe I should stop software tinkering?
I really wanna see a line.


  • Crew

This will be an eternal debate and to draw a line is impossible as you can't define it all in a rule set. Let alone control it.

Take, e.g., your PCMark05 benchmark. It was designed in a specific period for a specific set of available hardware and software.

The issue today with many of the 2D legacy benchmarks is that they do not bundle all the required software. For PCMark05 you were required to install the Media Encoder separately. So, looking at the rules as laid down and your software OC mindset: you can't change a DLL or such inside the benchmark folder, but anywhere else it is allowed. So if an OCer installs Windows 7 ten years later and spends hours/days looking for upgraded media encoders that support, e.g., more than 6 threads, you are fine with it. Or if someone finds a tool that can remove the 6 core thread limit in the intended Media Encoder and raise it to, e.g., 12... You can call that overclocking; I failed to see it back then, and even more so now.

So what happened? Christian approved & shared these "tweaks", and YoungPro was officially nominated as super mod specifically for PCMark05. Software gurus like Glucovio and such found other ways to boost e.g. the txt file test, browser tests,... turning PCMark05 into a geek software battle. It had nothing to do anymore with pushing hardware clocks, only with analyzing and "optimizing" the benchmark. Yes, you can cover your actions with: I don't make the benchmark believe it is running faster, it actually is running faster. But that was not how the benchmark was developed, was it?

Ask yourself this simple question: is it because Futuremark didn't make it secure enough, or thanks to, let's call it, the "creativity of the OCer", that PCMark05 is now labelled a crappy, buggy benchmark? And that the effort of all those people who enjoyed benching it out of the box has become bointless?

It is similar to what happened now: LN2ed scores beaten by setups with far lower clocks, because a few have an edge thanks to a programming background while staying within the never-perfect rule set.

The bottom line is you can't define or draw a perfect line within the rules. The easiest, simplest line is the ethics and morals of the overclocker. Ask yourself: is it hard to just install the operating system, tune it by removing redundant software you don't need, install the benchmark and see how it reacts to core/uncore/mem speed and timings? What you do in the BIOS and such is what requires skill/expertise, and I know that is the thing, especially for the legacy stuff.

Nowadays the knowledge/expertise of the OCers, the abundant BIOS settings, the cooling gear and such are at a far higher level than ever before. Therefore I fail to see why we need to install, e.g., 2015 software into a 2001 OS and then claim: hey mum, look how good I am, I thrashed the global first with 500MHz less. Does that feel right to you? If it does, you will never ever grasp my point of view and we can debate endlessly.

You have to think: if we continuously allow this to happen and repeat the decisions of the past, we might end up in 2022 with a truly secured benchmark suite, enforced by tools like BenchMate. Is that what everyone here wants? No more XP support? Limited to 12 benchmarks? Some that might not even run on older hardware? Many look at this as being about one specific benchmark; you have to look at it from a wider angle. Don't we all want to battle with the same tools and be competitive? Or do you prefer being Indiana Jones and just shooting the guy with the swords?

For me, the only way to discourage the software tinkering and keep the database sort of intact is to remove the overkill scores and remove the global points from Wprime.

As stated so many times before, rules are rules. Rules will never be perfect; people will bend them, break them and, in the long run, spoil the fun for others. Again, our own acts will lead to an overgrown and probably incomprehensible rule set, making it very hard for newcomers to grasp it all. Some already give up because they have to open 2 CPUZ tabs. Imagine they have to distinguish which DLL can or can't be used...

The concept of the upcoming HWBot version is to simplify things so it will be fun again and won't require an engineering degree before you start pushing your hardware. It is up to you as an individual to make that work or not...

  • Like 5

@Leeghoofd it sounds like more of a job than a hobby for you. I do not envy your position at this company.

I just want to run benchmarks. Copywaza is even too much for my old ass to bother using to eke out 0.0010 seconds.

Anyhow. The second it becomes a job for overclockers, like making submissions and filling out forms that take longer than the benchmarks, I guess I just enjoy the hobby less at that point.

  • Like 1

Just to clarify: I didn't add wPrime to BenchMate at first because it is not a good benchmark. Visual Basic 6 is not the right tool and therefore it was a lot of effort to support it. But I did it anyway because benchers define what is worthwhile (as in fun) and what's not. And I'm more than okay with that as long as BenchMate's security and reliability standard can be upheld. That is not the case for the x265 benchmark btw (or any other bench using Java or other interpreter languages).

The above was kind of true for wPrime as I stated in my Facebook post, because the DLL changed a lot of functionality in the benchmark's workload and made the results less comparable for rankings. I personally don't consider using different DLLs a cheat. It happens all the time, even without anyone's knowledge. Just using another OS version or even installing another piece of software might have an impact on a benchmark's performance, IF the benchmark's workload relies heavily on that 3rd party code (which can also be a driver or the OS in general). That's true for most benchmarks out there, especially for 3DMark and other 3D benchmarks (DirectX, display driver), PCMark, GPUPI (OpenCL, CUDA) and so on.

But there are other benchmarks that are not influenced by other software. For example SuperPi, pifast and some code paths of y-cruncher I guess.

In the case of wPrime we are talking about a non-bundled system DLL that has significant changes between different OS versions. It cannot be dismissed as a cheat in general, because this is just the way software works. But I do consider it a tweak of course, and the bot can choose to allow it or not, for the sanctity of their database and the process of score moderation.
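
To illustrate what disclosing a non-bundled DLL could look like in practice, here is a minimal sketch. It is purely hypothetical: nothing here is an existing HWBot or BenchMate feature, the function names are made up, and the DLL path in the usage comment is a placeholder, since the actual file is not named here. It fingerprints a file with SHA-256 so that two submissions can state exactly which build of a library was loaded.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def disclosure_line(path):
    """Build a one-line summary (name, hash, size) that could go in a submission note."""
    p = Path(path)
    return f"{p.name} sha256={file_fingerprint(p)} size={p.stat().st_size}"


# Hypothetical usage; replace the placeholder with the DLL actually in use:
# print(disclosure_line(r"C:\path\to\some_runtime.dll"))
```

A hash only identifies the file on disk. It says nothing about whether using that build should be allowed, which remains a moderation decision.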

Bottom line is the bot could have either allowed it or removed boints. Alternatives are the bench itself gets updated (not possible for unmaintained legacy benches) or a wrapper becomes mandatory (BenchMate).

In my opinion it was the right choice. wPrime does not scale above 64 cores and, even worse than the DLL, it doesn't have real threading. It spawns processes through COM, which puts even more dependencies on the OS and is therefore even more uncontrollable. Good riddance.

Edited by _mat_
  • Like 3
  • Thanks 1

10 hours ago, Leeghoofd said:

Just finished a meeting with the programmer, new benchmark point system will be set up this weekend, first on the test site uat.hwbot.org, afterwards on the main site.

Awesome! Looking forward to the changes. Thank you for all the hard work and patience dealing with people like me! :D

Happy New Year btw. Hope it's a better one for everybody!

Edited by ShrimpBrime
  • Like 1

  • Crew

Dear mr Matthew,

1. Roman has some ideas for the points, the programmer has to translate it into code. If something is up on UAT I'll post it here so you guys can try it and evaluate it.

2. Change to GPUPI 3.3 has been requested in the 3D thread, no change was mentioned for the 3.2 for CPU... Anyway I talked to Matt and GPUPI 4.0 will not be very different in output than v3.3, so we can merge those probably in the future.

 

Yes, the WR points column will be removed. Only the WR rank will remain when viewing the sub, and it will change a tiny bit, like this:

wrrank.png

  • Thanks 1
