HWBOT Community Forums

Zen SuperPI 32M 5G/4G Challenge


Massman


I dug up a couple more screens to show scaling with frequency.

Like I said, it's about 10s per 100 MHz of CPU clock, so if Zen 2 = Zen 1 the target for, say, 4.6 should be 7:15. But based on my 9:11 Win10 run on Zen 2, and knowing how fast Win7 is compared to Win10, you might want to shoot for even faster times, as I'm fairly certain Zen 2 is faster.
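A rough sketch of that rule of thumb in Python, assuming the time drops linearly by 10 s per 100 MHz and anchoring on the 7:15 @ 4.6 target mentioned above. The function names and the linear model are illustrative assumptions only; real scaling is unlikely to stay linear at higher clocks.

```python
def projected_time_s(cpu_mhz, anchor_mhz=4600, anchor_s=7 * 60 + 15, s_per_100mhz=10.0):
    """Linear projection: each 100 MHz above the anchor clock saves s_per_100mhz seconds."""
    return anchor_s - (cpu_mhz - anchor_mhz) / 100.0 * s_per_100mhz


def fmt(seconds):
    """Format a time in seconds as m:ss (rounded to whole seconds)."""
    minutes, rest = divmod(round(seconds), 60)
    return f"{minutes}:{rest:02d}"


if __name__ == "__main__":
    # Project a few clocks from the 4.6 GHz anchor.
    for mhz in (4000, 4200, 4600, 5000):
        print(mhz, "MHz ->", fmt(projected_time_s(mhz)))
```

Swap in your own anchor run and seconds-per-100-MHz figure to get targets for a different setup.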

 

8-15-1 crop.jpg

7-55.jpg


I think Zen+ has a small advantage in memory latency. It's not so easy to measure properly, but it's pretty hard to improve memory performance past a certain level with Zen 2 (in SuperPi, because in the end you need latency, not more bandwidth). Good game, however. :)

Edited by Alpi

Yeah, I'm not sure, but based on my run and l0ud_sil3nc3's untuned 5.2 run, compared with my tuned 5.0 run, I personally feel Zen 2 will be faster once I run some Win7. It's a different kind of fast, but faster nonetheless, I think. If you cold bug the IMC hard, though, and had to run both Zen 1 and Zen 2 @ 1333 10-9-9 etc., Zen 1 would win, but I don't think you will ever need to go that low on Zen 2 cold.

If you were able to maintain the mem speeds from your runs above cold @ 5.0, there is no doubt in my mind that it would be faster than my 6:53 5.0 @ 1333 DDR.

It's looking like a 6:50-6:40 range run to me if you ran @ 4999.

 

 

Edited by chew*

Played some more today. No screens yet as still testing. 
 

With 3133 C10 mem / 1566 fclk I got the following (time - CPU clock - time in seconds):

 

8m20.900 - 4000MHz - 500.900

7m58.404 - 4200MHz - 478.404

7m38.716 - 4400MHz - 458.716

7m30.896 - 4500MHz - 450.896

4600 not exact

 

Scaling really tapers off as clock speed increases. From 4.4G to 4.5G there is only a 7.8s gain, compared to an average of about 11s per 100MHz from 4G to 4.2G.
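The taper can be checked directly from the four runs listed above. This is a minimal Python sketch; the helper name `gain_per_100mhz` is made up for illustration, and the 4600 run is left out because it was "not exact".

```python
# (CPU MHz, SuperPi 32M time in seconds), copied from the runs posted above
runs = [(4000, 500.900), (4200, 478.404), (4400, 458.716), (4500, 450.896)]


def gain_per_100mhz(runs):
    """Seconds gained per 100 MHz between each pair of consecutive runs."""
    gains = []
    for (mhz_a, t_a), (mhz_b, t_b) in zip(runs, runs[1:]):
        gains.append((mhz_a, mhz_b, (t_a - t_b) / (mhz_b - mhz_a) * 100))
    return gains


if __name__ == "__main__":
    for low, high, gain in gain_per_100mhz(runs):
        print(f"{low}-{high} MHz: {gain:.2f} s per 100 MHz")
```

With these numbers the gain per 100 MHz comes out to roughly 11.2 s, 9.8 s and 7.8 s for the three steps.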
 

This is why the scores just seem to stop dead once we get over 5.4G. Still trying to find a workaround to help, though. Even though this is roughly 4 seconds better efficiency at 4G, it will likely only translate to 0.5 sec flat out.

 

Would love to try my old 2700X again with this current efficiency, as I feel it could be faster at the top end than the 3900X. No way I'll get it out of the wife's daily and have my life intact afterward.


On 12/21/2019 at 6:06 AM, Hardware_Numb3rs said:

The problem with Zen 2 and LN2 is the fabric: it's hard to get above 1466, and for the best latency we must run 1:1. With A2 sticks I can go down to C9, but in the end there's still not a big difference with Zen+.

What I can see from your runs is that your Win7 is very well optimized, in a way that nobody here can do :)

C9 might not be as fast as C10. It wasn't on Zen 1, at least according to the data I collected. At 1333, C9 is easily doable, yet I did not use it ;)

I always follow the philosophy I got from Tony @ OCZ: use what IS fast, not what you THINK is fast.

That's why we tested multiple boards at multiple speeds, etc. I must have tested at least 10+ motherboards on Zen 1. Even CPUs impacted the results; I tested a lot of those too.

For example, I ran multiple tests last week with a 3800X versus a 3900X. The closest I could get was within 1 second of the 3900X on good runs, and more often than not it was 2 secs slower.

I think the largest problem with 32M pi and Zen 2 will be the inability to use very low dividers and bump back up via ref clock, which gets the boards to "train" at tighter, otherwise inaccessible "settings" when you go cold, to make up for the lack of speed. You can try this on Intel and pay attention to RTL; in AMD land I think it was MRL on older archs. Not sure if they still call it that on AMD, but I know that as of now, and on Zen 1, it's not accessible in BIOS or software, only via lower dividers and ref clock.

On Zen+ it may not work because it's not achieved natively but via a chip separating PCI, but I never tested it, so who knows.

I also know that on Zen 1, when we set timings, they had no effect at higher speeds, but when we used a lower divider + ref clock to "trick" the CPU, they actually worked.

I can guarantee you 1 thing for certain. The OS is hardly optimized. I'm not even using max mem. The "spectacle" others put on about the OS was a bunch of dogs barking up the wrong tree. I found it mildly amusing for a little while but the amusement wore off quickly as the drama escalated and I decided to go back to hibernating as that's what's best for me to avoid drama.

The only thing people really did not try was a Giga or ASRock to beat us, and we never used the Asus to try to beat anyone (I had it, it was slow CPC). The ASRock and Giga used a ref memory trace layout. The Asus did not... the Asus clocked higher and easier. Quite obvious what was going on: they slacked the board to "gain" visual speed.

This is why infra's fastest run is with very high refclock and high mem speed. He got the MRL, or whatever they call it now, down and bypassed all the slacking crap Asus did to achieve higher speeds than other boards could do.

As for why we didn't bother with ref clock at 3333-3400, it was simple: our boards were not very fond of over 110, and you needed the 1866/2133/2400 dividers to get the real gains. 2933 gained nothing and 2666 was not all that fast either.
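To put numbers on the divider + ref clock point, here is a small Python sketch. It assumes the effective memory speed scales linearly with the reference clock from a 100 MHz base (effective = divider × refclk / 100); the function names are made up for illustration, and the sketch says nothing about how a given board actually trains.

```python
def effective_ddr_rate(divider_rate, refclk_mhz, base_refclk_mhz=100.0):
    """Effective memory speed for a divider (e.g. 2400) at a raised reference clock."""
    return divider_rate * refclk_mhz / base_refclk_mhz


def refclk_needed(target_rate, divider_rate, base_refclk_mhz=100.0):
    """Reference clock required to reach target_rate on a given divider."""
    return target_rate / divider_rate * base_refclk_mhz


if __name__ == "__main__":
    # How much ref clock each divider would need to land at 3333.
    for divider in (1866, 2133, 2400, 2666, 2933):
        print(divider, "->", round(refclk_needed(3333, divider), 1), "MHz refclk")
```

Under that assumption the 2400 divider would need close to 139 MHz of ref clock to reach 3333, far beyond a board that dislikes anything over 110.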

Once again use what IS fast not what you THINK is fast.

These CPUs have a lot of features that can be used to your advantage. I believe error correction is one, so using maxmem to gain stability because you're actually unstable... I wonder how that works with the CPU's error correction feature ;)

Edited by chew*

  • 2 weeks later...
On 12/23/2019 at 8:03 PM, chew* said:

c9 might not be as fast as c10. It wasn't on zen1 at least according to the data I collected. at 1333 c9 is easily doable yet I did not use it ;)


You are right, lower timings don't necessarily mean faster, so I've switched back to C10 and am now testing with 1566 fclk and 3133 RAM, which is around my max fclk at full pot with the CPU I'm using right now (3950X).
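For reference, the nominal numbers behind those settings can be worked out as below. This is a minimal sketch using the usual DDR arithmetic (memory clock = data rate / 2, CAS latency in ns = CL / memory clock); the function names are illustrative, and nominal CAS latency is only one part of the picture, so the SuperPi time is still what decides.

```python
def memclk_mhz(ddr_rate):
    """Memory clock in MHz for a DDR data rate; at 1:1 the fclk runs at this clock."""
    return ddr_rate / 2


def cas_latency_ns(ddr_rate, cl):
    """Nominal CAS latency in nanoseconds for a data rate and CL value."""
    return cl / memclk_mhz(ddr_rate) * 1000


if __name__ == "__main__":
    # Settings mentioned in this thread: (data rate, CL)
    for rate, cl in ((3133, 10), (3133, 9), (3200, 12), (3200, 14)):
        print(f"{rate} C{cl}: memclk {memclk_mhz(rate):.1f} MHz, "
              f"CAS {cas_latency_ns(rate, cl):.2f} ns")
```

Note that 3133 / 2 is 1566.5, which is the 1566 fclk quoted for 1:1.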

I'm still optimizing a bit of everything, and the OS I'm using is not at its best since I'm doing the tests remotely with VNC :) but so far I have this:

4500MHz: 7m32s557

4600MHz: 7m23s834 (-8,723)

4700MHz: 7m15s834 (-8,016)

I only have a few results, so I can't properly trace the gain per 100MHz, but the efficiency doesn't seem bad considering the OS I'm using.
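A small Python sketch for turning times written like `7m32s557` into seconds and recomputing the step-to-step deltas. The parser name and the assumption that the trailing digits are a decimal fraction of a second are illustrative.

```python
import re


def parse_time(text):
    """Parse a time written like '7m32s557' into seconds (digits after 's' are the fraction)."""
    match = re.fullmatch(r"(\d+)m(\d+)s(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + int(seconds) + int(fraction) / 10 ** len(fraction)


# CPU MHz -> time exactly as posted above
runs = {4500: "7m32s557", 4600: "7m23s834", 4700: "7m15s834"}

if __name__ == "__main__":
    clocks = sorted(runs)
    for low, high in zip(clocks, clocks[1:]):
        delta = parse_time(runs[high]) - parse_time(runs[low])
        print(f"{low} -> {high} MHz: {delta:+.3f} s")
```

With the times as typed, the 4500 to 4600 step matches the posted 8.723 s, while 4600 to 4700 comes out at 8.000 s rather than 8.016 s, so one of those figures may contain a typo.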

I also tried the ASRock X570 Taichi and some Gigabyte B450/X470 boards, but so far I don't see any particular performance deficit on my CH8. I also used 118 bclk with 2400 and lower dividers, but again it doesn't seem very different, at least at ambient clocks.

There are some interesting BIOS features that I'm exploring, though I can't say yet how much they help the score.

This CPU has a very, very strong first CCD, so at this point I might have a good base to try for sub 6m. Maybe I can even do an LN2 session tonight to see the behavior at "high altitude".

I'll keep you posted :)


Decided to burn up the last bit of LN2 I had before refilling, and still having no tools that work for W7, I started direct booting from BIOS. Unsure if this is safe, but not my CPU, not my problem :D

No waza, no tweaks just determining max clock, fuklock, and mem clock.

16/32 direct boot from bios:

5650

3200 c14

jeHxU7r.png

3200 c12

sTzcw49.png

3200 c12 even tighter

RjnwGUY.png

Then decided to try turning off some cores . . .

5750

3200 c12

ujkince.png

5800

3200 c12

mzRVKGo.png

5800

3200 c12 tightest

PtiGNk5.png

 

5850 passed but rebooted at the end after completing pi, and the time was 6:03:xxx, so with a tuned run it will be an easy sub 5 run. I think even 5800 should be enough at 3200 C12 to break the mythical 5 min barrier.

Now to sort waza on AMD . . .

 

 

Edited by l0ud_sil3nc3

Man. Them clocks will do it. 5.75g I think is possible, but my chip won’t do it reliably for me. 5.8g I would have been there last year. 
 

Going back to 2700x to see if I can get in the bracket, but it’s not looking promising at this stage still. I just can’t find the efficiency



I bet, but I am not an AMD pi master like you haha, only know how to go fast on the blue team for 32M.

Still learning, but at the end of the day it's kinda sad that an air cooled 4770K with DDR3 is still faster than any Ryzen this generation will ever be :(

Regardless still fun to play my favorite game, even if it's with the slow mode switch on lol


I just wish I had more chips and LN2 at my disposal. 
 

Very much a fun platform to try and tweak. I wouldn't call myself a master, more luck than anything. The days lost testing this stuff are insane for a person without a clue how to tweak an OS. It's all a guessing game on my end.



Agreed, testing cold shit is one thing; testing OS shit adds a whole other layer of fail.


5 minutes ago, KaRtA said:

Had a chance today with what LN2 I had left. Took a good 30L in total just today.

 

So many runs just short of the 6 minute mark for so long, its awesome to have actually done it now.

 

Sorry Max, got past you.

https://hwbot.org/submission/4324190_karta_superpi___32m_ryzen_9_3900x_5min_58sec_509ms

I told you that you had everything to make it! Well done!

I'm very happy you got past me. I took the first step through the 6m gate and that was my goal, but it was also like finishing a good TV series: I felt empty.

So our "fight" is still on, I'll try to get better!

Again, congratulations and welcome to our exclusive AMD SPi Sub 6 Club.
 

 


I am done on this for now. No real plans to re-bench now that I have made the 6 minute mark. I’m sure you can improve in time with those clocks available. On the fence to re-do the 2700X now with the little LN2 I have remaining. 
 

I have some pretesting screens to share when my headache eases. 
 

Anyone with a chip capable of 5.8G and over 1500fclk should be able to achieve 6 minutes with the right setup. 


From what I've seen, the recipe for sub 6 with Zen 2 is:

  • fclk at 1566 or higher (below that I wasn't able to do it even at nearly 5900MHz)
  • A CPU clock around 5750
  • RAM at 1:1, C10, with tight (and stable) subtimings
  • A well tweaked Win7
  • A well tweaked BIOS (especially the LLC settings for both CPU and SOC)
  • A strong waza (can give you up to 4s)
  • The patience of a saint
  • Some luck
  • State-of-the-art insulation that can hold up through a 3+ hour session (especially around the RAM)

The hardest part is balancing the fclk CB (cold bug) against CPU frequency. If you don't have an alien CPU capable of 1600+ at full pot, there's a tradeoff between the two, with the focus on fclk.

I'm doing some tests with the cascade, since I can run SPi 32M at 5450MHz and 1800+ fclk at -96°; this is where the clock starts to be inefficient at low fclk.

A deeper analysis with screenshots will follow in the next few days, at least to give those who want to try this a minimal base to start from.
