HWBOT Community Forums

The Official Team CUP 2018 DDR3 stage thread:


Leeghoofd

Recommended Posts

4 hours ago, Leeghoofd said:

Are you using the SDK 2.9.1 too, Mickulty, or the one from the VGA drivers? Which version is that?

Johannes did some testing too and said it is 99.9% related to the AGESA code of the initial GB BIOS. I tried his OS and SDK drivers and can't replicate it on the Asus nor on the cheap-ass MSI board...

Using the one from the 6450 drivers. 2.9.1 would make sense, but the best info is probably from the GPUPI detection.

Which exact boards and which exact BIOS versions are you using? OS and drivers *shouldn't* make a difference if the theory is correct.
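For pinning down which runtime is actually in play, one quick sanity check is to enumerate the installed OpenCL platforms and driver versions yourself. The sketch below assumes Python with the pyopencl package installed; it is just a stand-in for whatever detection GPUPI does internally, not anything the benchmark ships:

    # Dump every OpenCL platform/device the ICD loader exposes, so you can see
    # whether the AMD APP SDK runtime (e.g. 2.9.x) or a driver-bundled runtime
    # is the one a benchmark would end up using.
    import pyopencl as cl

    for platform in cl.get_platforms():
        print(f"Platform: {platform.name} | {platform.vendor} | {platform.version}")
        for device in platform.get_devices():
            print(f"  Device: {device.name}")
            print(f"    Driver version: {device.driver_version}")
            print(f"    OpenCL version: {device.version}")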

5 hours ago, _mat_ said:

I saw your stream multiple times and measured your final run too. I know it's realtime. But I was actually talking about the slower runs that mickulty has on other mainboards.

Looks to me like the slower runs are real too.  Which brings us to...

12 hours ago, _mat_ said:

If the slower runs are really timed correctly then there is only one explanation: It's actually faster.

Regarding the comparison to other available hardware: an early GB FM1 board should be easier to find than three Venice A64s, or two Vega Ms, or three different 9800 variants, or three different PCIe 6800 variants, or three different HD 2900 variants, etc. Once the scores are externally (to /r/oc) confirmed as well, it's hardly unfair to allow them. And besides, it's not as if we're keeping this supertweak quiet; anyone can use it and I'm more than happy for my results to be spread far and wide. Also, it's really easy to flash the BIOS backwards with Q-Flash, so there's no worry about trying to get the right BIOS on the board, unlike with stages that need non-K Skylake OC to score well.

  • Like 2

4 hours ago, Leeghoofd said:

 

Looking at both your testing, it seems to be related to the BIOS, which interacts with the GPUPI benchmark... sadly, if this is the case I cannot approve the scores, as the efficiency is way off versus other available hardware.

It goes deeper than that.

It makes the bench almost un-moderatable. Much the same as UCBench was. 

  • Confused 1

1 minute ago, Mr.Scott said:

If the mods are already questioning the validity, that means there is a problem with the bench.

Can we please stop repeating this over and over? It's not the bench, it's the hardware!

Jeeze. Whatever the moderation outcome - please, please don't let this turn into GPUPI becoming a tainted benchmark. I'd rather have FM1 disallowed completely.


3 minutes ago, unityofsaints said:

Can we please stop repeating this over and over? It's not the bench, it's the hardware!

Jeeze. Whatever the moderation outcome - please, please don't let this turn into GPUPI becoming a tainted benchmark. I'd rather have FM1 disallowed completely.

If you can't run the hardware on the bench reliably, you can't moderate or give points for the bench. Period.

It's inter-related, man. Not my call, but I can see this being a problem already.

An exploit is an exploit, no matter how it happens.


Just now, Mr.Scott said:

If you can't run the hardware on the bench reliably, you can't moderate or give points for the bench. Period.

It's inter-related, man. Not my call, but I can see this being a problem already.

An exploit is an exploit, no matter how it happens.

It's not an exploit.  Stop trolling.


  • Crew

I will verify tonight, but I thought it was on one of the early BIOSes. However, like I mentioned before, these runs with the old AGESA are in the neighbourhood of the improved 3.3 (looking at the efficiency) and in the league of 4500 MHz Ivy CPUs; that can't be right. Imagine running 3.3 on the old AGESA and you would be in Kaby Lake vicinity :p

  • Thanks 1

2 minutes ago, Leeghoofd said:

I will verify tonight, but I thought it was on one of the early BIOSes. However, like I mentioned before, these runs with the old AGESA are in the neighbourhood of the improved 3.3 (looking at the efficiency) and in the league of 4500 MHz Ivy CPUs; that can't be right. Imagine running 3.3 on the old AGESA and you would be in Kaby Lake vicinity :p

The BIOS is the whole thing, man; we need to be crystal clear about that. BTW I put up 3.3 results as well: https://imgur.com/a/lUQee6G

If _mat_ says the work is really being done and the timing is really correct, Ivy doesn't matter; you can't say this CPU isn't allowed to be faster than that other CPU if it is actually doing the work.

To me this is like calling 3DMark01 on a 1080 Ti cheated because a score with an LN2 7700K is so much higher than a score with an FX-8350.


7 minutes ago, mickulty said:

 

To me this is like calling 3DMark01 on a 1080 Ti cheated because a score with an LN2 7700K is so much higher than a score with an FX-8350.

You're comparing apples and oranges here. It's not the same scenario.

Anyhow, it's not my call. I get to voice my opinion just like everyone else does. Like it or not.


  • Crew

No, it is not doing the work, that's just the thing; it is somehow bugging the output of the GPUPI benchmark. C'mon, imagine a Llano at 5 GHz doing a WR, beating Intel's superiority running at 7 GHz, but sucking at any other app like SuperPi or other simple 2D stuff... not gonna happen on my watch...

Your 2001 comparison is a benchmark where CPU speed has taken over from the GPU... nothing to do with a BIOS version that would double the output on identical hardware... Also, it proves the FX was all about sick clock speeds and crap at anything else.

Now, we have had special 2D BIOSes before (e.g. Rampage Extreme), but the efficiency gain was a few percent, not like this... This one goes from 2 min 40 to under a minute at the same clock speeds... now that's an unexpected boost :p Putting these APUs amongst processors that are more powerful at anything :p

I'll try the initial release BIOS on the Asus tonight and check back...

However, no matter the outcome, I think the most logical thing to do is to remove the current super-fast scores in this benchmark to maintain its validity, moderatability and credibility.

 

  • Like 2

11 minutes ago, Leeghoofd said:

C'mon, imagine a Llano at 5 GHz doing a WR, beating Intel's superiority running at 7 GHz

The 7350K WR is 22 sec 917 ms at 6.65 GHz.

My submission is 52 sec 979 ms at 3.9 GHz, so that would scale to ~41 s at 5 GHz or ~26 s at 8 GHz.

In what universe is that unrealistic?! We are even talking about 2c/4t vs. 4c/4t here.
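For reference, the scaling behind those numbers is just naive inverse-linear clock scaling. A quick sketch of that arithmetic (an idealisation; real chips never scale perfectly with clock):

    # Naive clock scaling: assume runtime is inversely proportional to core clock.
    def scale_runtime(runtime_s, clock_ghz, target_ghz):
        return runtime_s * clock_ghz / target_ghz

    llano_651k = 52.979  # seconds, GPUPI for CPU 100M at 3.9 GHz
    print(scale_runtime(llano_651k, 3.9, 5.0))  # ~41.3 s at 5 GHz
    print(scale_runtime(llano_651k, 3.9, 8.0))  # ~25.8 s at 8 GHz
    # 7350K WR for comparison: 22.917 s at 6.65 GHz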

Edited by unityofsaints

Obviously, this is a bug of some kind. It is not possible for one board with a specific BIOS version to be twice as fast compared to other boards and/or BIOS versions.

Also, Llano simply must be in line with other K10 quad cores running at a similar clock. For example: http://hwbot.org/submission/3314113_havli_gpupi_for_cpu___100m_phenom_x4_9650_2min_11sec_369ms

or http://hwbot.org/submission/3699925_havli_gpupi_for_cpu___100m_opteron_8380_2min_12sec_595ms


1 minute ago, Leeghoofd said:

No, it is not doing the work, that's just the thing; it is somehow bugging the output of the GPUPI benchmark. C'mon, imagine a Llano at 5 GHz doing a WR, beating Intel's superiority running at 7 GHz, but sucking at any other app like SuperPi or other simple 2D stuff... not gonna happen on my watch...

Your 2001 comparison is a benchmark where CPU speed has taken over from the GPU... nothing to do with a BIOS version that would double the output on identical hardware... Also, it proves the FX was all about sick clock speeds and crap at anything else.

Now, we have had special 2D BIOSes before (e.g. Rampage Extreme), but the efficiency gain was a few percent, not like this... This one goes from 2 min 40 to under a minute at the same clock speeds... now that's an unexpected boost :p Putting these APUs amongst processors that are more powerful at anything :p

I'll try the initial release BIOS on the Asus tonight and check back...

However, no matter the outcome, I think the most logical thing to do is to remove the current super-fast scores in this benchmark to maintain its validity, moderatability and credibility.

 

Firstly, I think you should defer to _mat_'s opinion as to whether it's bugging the output. I'll send my UD4H to Austria for him to go over it with a fine-tooth comb and replicate it himself if I have to. I do, however, hope this means you agree that if it is for sure doing the work, that's different and it's fine.

Secondly, it's not 2m40 to under a minute; it's 2m15 to 1m17. Point of information. The slowest 3570K score, at the stock 4 GHz, is 1m05 on the much slower GPUPI 2.3.4.
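Put differently, 2m15 to 1m17 is roughly a 1.75x gain, not the well over 2.5x that "2m40 to under a minute" would imply. A trivial check of that arithmetic, using the times quoted above:

    # Speed-up between the two BIOS revisions, using the times quoted above.
    late_agesa = 2 * 60 + 15   # 2m15 -> 135 s
    early_agesa = 1 * 60 + 17  # 1m17 -> 77 s
    print(f"{late_agesa / early_agesa:.2f}x")  # ~1.75x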

I agree that it may be worth temporarily removing the good scores, but only on the understanding that it's fully looked into, and when - in my opinion it is a when - it's agreed even by the developer that it's not bugging anything, that the work is being done and that the timing is correct, the scores should 100% be allowed.

Regarding the Intel comparison, Intel CPUs just suck as OpenCL targets. Intel's OpenCL runtime is bad, anyone who benches GPUPI knows this, and AMD's is obviously targeted at AMD chips. You can see this in Ryzen's relative GPUPI efficiency and in FX vs. LGA2011.


  • Crew

I just made a quick apples-to-oranges comparison as in the previous post, Johannes... but honestly, do you feel the 52 sec score is right for this CPU generation? Take another old 4-core: the 3570K at 4500 MHz being beaten by an APU with 500 MHz less on the cores and lower memory speed... Llano is far from that good... 1 min 40 seems reasonable to me...

Yes, Intel's OpenCL is inferior, hence why we mostly run the AMD SDK drivers :p

Yes, it is the AGESA code, but the million dollar question is why AMD hampered its performance in newer AGESA code. Did some other users run into issues when using software based on similar instruction sets? Normally we can expect a few percent loss in efficiency for added compatibility, but not this... this is out of this world...

Edited by Leeghoofd

10 minutes ago, havli said:

Also, Llano simply must be in line with other K10 quad cores running at a similar clock. For example: http://hwbot.org/submission/3314113_havli_gpupi_for_cpu___100m_phenom_x4_9650_2min_11sec_369ms

I'm sorry, but wut!? You're comparing 1st gen K10 to 3rd gen K10. They're not remotely the same, like at all. There's a huge difference in the amount of cache and the cache layout to start with, as well as actual slight improvements to the arch even over K10.5 (Deneb/Thuban), with things like a better HW prefetcher. A much better comparison, if you want to compare to another K10-based chip, would be a Phenom II.

Anyway, I'm not sure why you're all so confused that a certain AGESA is faster even if it's older; look at 1st gen Ryzen, where past a certain AGESA the wPrime scores suck. That's a modern platform where a newer AGESA totally messed up latency for certain operations. Hell, it could be just like the AGESA that launched with 1st gen K10 that disabled the TLB. Just because it's worse at OpenCL loads doesn't mean there was a bug related to OpenCL.

  • Like 1

5 minutes ago, Leeghoofd said:

the 3570K at 4500 MHz being beaten by an APU with 500 MHz less on the cores and lower memory speed... Llano is far from that good... 1 min 48 seems reasonable to me...

You have to be careful comparing to Ivy, because most of those results are on GPUPI 2.3.4 or earlier, which is much slower than 3.2. Also, many results use Intel OpenCL instead of the much faster AMD one.

A 3570K @ 5.25 GHz scores 32 sec 510 ms in GPUPI 3.2; my 651K result of 53 secs, scaled up to 5.25 GHz, would be around 39 seconds, or approx. 20% slower. This looks completely reasonable to me.
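Spelling that out with the same naive clock-for-clock scaling as before (a rough sanity check, not a claim about how Llano actually scales):

    # Project the 651K result from 3.9 GHz to the 3570K's 5.25 GHz and compare.
    ivy_3570k = 32.510                  # s, 3570K @ 5.25 GHz, GPUPI 3.2
    llano_scaled = 52.979 * 3.9 / 5.25  # ~39.4 s projected at 5.25 GHz
    print(f"{(llano_scaled / ivy_3570k - 1) * 100:.0f}% slower")  # ~21% slower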


1 minute ago, Leeghoofd said:

That result was done on 3.2...

Again, why would AMD destroy the advantage they have/had over the Intel counterparts so dramatically? That's the million dollar question... a coder's error?

 

Yes. My FM1 result is also on 3.2; I am comparing apples to apples.

It is only OpenCL, and not many benches use OpenCL on the CPU. E.g. CB15 and x265 run at the same speed on both BIOSes.


12 minutes ago, yosarianilives said:

I'm sorry, but wut!? You're comparing 1st gen K10 to 3rd gen K10. They're not remotely the same, like at all. There's a huge difference in the amount of cache and the cache layout to start with, as well as actual slight improvements to the arch even over K10.5 (Deneb/Thuban), with things like a better HW prefetcher. A much better comparison, if you want to compare to another K10-based chip, would be a Phenom II.

Anyway, I'm not sure why you're all so confused that a certain AGESA is faster even if it's older; look at 1st gen Ryzen, where past a certain AGESA the wPrime scores suck. That's a modern platform where a newer AGESA totally messed up latency for certain operations. Hell, it could be just like the AGESA that launched with 1st gen K10 that disabled the TLB. Just because it's worse at OpenCL loads doesn't mean there was a bug related to OpenCL.

Yes, I am... because they are the same. Cache and memory don't matter for GPUPI, and those other improvements are more or less paper dragons with very small performance impact. And btw, one of my links is a Phenom II...

A different AGESA can be faster... but by single-digit percentages at most, not twice as fast.

And one more thing - the CPU performance of GPUPI 2.x and 3.2 is very similar, as long as you are comparing 64-bit versions.

 

Edited by havli

2 hours ago, havli said:

A different AGESA can be faster... but by single-digit percentages at most, not twice as fast.

They can be way slower; you forget the TLB bug that was patched on 1st gen K10, which easily took 30% of the performance because it just disabled the TLB. You guys keep asking "why would AMD lower their OpenCL performance?" Well, if something raises performance in one very specific scenario that can't be seen anywhere else, but causes system lock-ups in some specific enterprise scenario, wouldn't you disable it? This could just as well be a patch to some obscure TLB or predictive cache algorithm that fixed an obscure system lockup scenario but absolutely murders performance for a different edge case. How would you propose that this AGESA is altering the workload, as it clearly hasn't affected timekeeping?

