Mysticial, posted March 4, 2017 (edited March 16, 2017)

One of my internal benchmark applications is insta-hard-freezing on Ryzen.

Ryzen 7 1800X
Asus Prime B350M-A (BIOS 0502)
4 x 8GB Corsair CMK32GX4M4A2400C14 @ 2133 MHz
Nothing is overclocked. Everything is stock.
Windows 10 Anniversary Update

When I run the Haswell binary from here:
https://github.com/Mysticial/Flops/tree/master/version2/binaries-windows

The entire system usually freezes when it gets to:

Single-Precision - 128-bit FMA3 - Fused Multiply Add:

Sometimes it makes it past that, but it usually ends up crashing/freezing later in the test anyway.

For those who don't trust the binary, the program is completely open-sourced in that GitHub repo. If you have Visual Studio installed: open the project, build the x64 Haswell binary, and run.

For me this always hard freezes the computer:
- At all clock speeds.
- When running single-threaded, on any core that I pin it to.

The questions that I want to answer are:
- Is this specific to my setup? No - confirmed by multiple other people.
- Is this specific to Asus mobos or an immature BIOS? If so, can it be fixed with a later BIOS?
- Is this an issue with Windows? The crash does not seem to happen in Linux, but that is with slightly different code due to differing compilers.
- Is this a CPU erratum? (I hope not - however unlikely it might be.)

---------------------------

Current Testing Status:

All of these are running Windows, and are at stock settings or underclocked.

Confirmed Crashes:
- 1800X + Asus Prime B350M-A (BIOS 0502)
- 1700 + Asus Prime B350M-A (BIOS ???)
- 1700 + Asus Crosshair VI Hero
- 1700 + Asus Crosshair VI Hero (BIOS 5803) (two sets of memory, G.Skill + Kingston - also fails with overvolted SOC)
- 1800X + Asus Crosshair VI Hero (Windows 7) - one pass, mostly failures.

Confirmed No-Crash:
- none yet

For those interested in the technical details, I'm getting hard freezes for all types of FMAs (128-bit, 256-bit, single and double precision). But for some reason, it only affects this particular benchmark. Other programs (like prime95 and y-cruncher) aren't affected despite using FMAs.

---------------------------

Update 3/16/2017:

Although this was the outcome I least expected, it appears to have been confirmed as an erratum in the AMD Zen processor. In other words, the last bullet on my list (and the most serious). Fortunately, it's one that is fixable with a microcode update and will not result in something catastrophic like a recall or the disabling of features.

To everyone pouring in from the various news sites:

The important part is that a user-mode program should not be able to hard freeze the entire system. Because if it can (as is the case here), it makes it possible to perform DoS (denial-of-service) attacks. IOW, this erratum is a security issue.

Don't be fooled by the "Haswell binary". The benchmark is 5 years old and I've largely neglected it for the last 3, so I haven't updated it for Zen yet. Any processor will be able to run any of the binaries if it supports the underlying instruction sets. If it doesn't, the program merely crashes with an "illegal instruction" error. Under no circumstances should a user-mode application be able to bring down an entire system.

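For readers unfamiliar with the operation named in the failing test: a fused multiply-add computes a*b + c and rounds only once, whereas a separate multiply and add rounds twice. The sketch below is not code from the Flops benchmark and does not touch the hardware FMA3 instructions; it only emulates the single-rounding behaviour in software with exact rational arithmetic. The function names `fma_exact` and `mul_then_add` are made up for this illustration.

```python
from fractions import Fraction


def fma_exact(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: form a*b + c exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # the only rounding step


def mul_then_add(a: float, b: float, c: float) -> float:
    """Unfused version: the product is rounded to a double before the add."""
    return a * b + c


# Inputs chosen so that the exact product, 1 - 2**-54, is not a double.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = mul_then_add(a, b, c)  # product rounds to 1.0, so the sum is 0.0
fused = fma_exact(a, b, c)       # keeps the low bits: exactly -(2**-54)

print(unfused, fused)
```

The two results differ only in the last rounding step, which is why FMA-heavy code is a common numerical benchmark; nothing in this sketch reproduces the freeze described above.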
Ross Allan, posted March 4, 2017

Confirmed at stock clocks on a Ryzen 1700 on a Crosshair motherboard. The system becomes unresponsive, with the DRAM LED flashing and 8 on the Q-Code display.

Mysticial (Author), posted March 6, 2017

Uh oh... This doesn't look good. I also have one other confirmation on a different forum.

Other things to note:
- It doesn't always freeze instantly. I have a different Win10 installation that sometimes manages to survive the first FMA test only to crash on the second.
- The crash doesn't reproduce in Linux, but the code for Linux is slightly different since it uses a different compiler.

flanker, posted March 6, 2017

Guys, do you have the latest BIOSes for your motherboards? What's your voltage in the BIOS or in CPU-Z under load?

Mysticial (Author), posted March 6, 2017

> flanker wrote:
> Guys, do you have the latest BIOSes for your motherboards? What's your voltage in the BIOS or in CPU-Z under load?

For me, yes. BIOS 0502 (February 28).

The BIOS and AI Suite show a vcore of 1.350. CPU-Z shows it as 1.550. And it also happens when underclocked to 2.2 GHz.

The Windows Event Log occasionally manages to record which core it crashes on. It's pretty random among all 16 vcores. There's no single core that it always happens to. IOW, I don't see any signs of weakness in a specific core.

flanker, posted March 6, 2017

It's clear: the voltage seems too high... It's possible this voltage is for XFR/turbo. You can try disabling turbo in the BIOS, run the test again, and watch your voltage/temps.

The 1800X is a hot chip at that voltage; the 1700 or 1700X have lower temps at the same voltage.

Mysticial (Author), posted March 6, 2017 (edited March 6, 2017)

> flanker wrote:
> It's clear: the voltage seems too high... It's possible this voltage is for XFR/turbo. You can try disabling turbo in the BIOS, run the test again, and watch your voltage/temps.
> The 1800X is a hot chip at that voltage; the 1700 or 1700X have lower temps at the same voltage.

There's no option to disable XFR or turbo in my BIOS. I don't trust CPU-Z's vcore reading since it is clearly too high and it conflicts with AI Suite. These are at stock settings, so it shouldn't be getting anywhere near 1.5 anyway.

When I use AI Suite to manually downclock, it seems to disable both XFR and turbo, and it holds the frequency steady at 2.2 GHz. The vcore seems to stay at a static 1.35 (under load) according to AI Suite. Again, CPU-Z jumps all over the place, to as high as 1.550.

But that's beside the point. It really shouldn't be crashing at stock settings - let alone downclocked. Which is why I'm looking for more people to test this on different motherboards and from different manufacturers.

So far I have 3 positive confirmations (crash), and zero negative confirmations (did not crash).

The crashes have these setups - all running at stock and/or underclocked:
- 1800X + Asus Prime B350M-A (BIOS 0502)
- 1700 + Asus Prime B350M-A (BIOS ???)
- 1700 + Asus Crosshair

The unanswered questions that I want to know are:
- Specific to my setup? No - confirmed by two other people.
- Specific to Asus mobos or an immature BIOS? If so, can it be fixed with a later BIOS?
- Is this an issue with Windows?
- Is this a CPU erratum? (I hope not - however unlikely it might be.)

flanker, posted March 6, 2017 (edited March 6, 2017)

So, try it a different way: what do you see in HWiNFO for the voltage under load?
https://www.fosshub.com/HWiNFO.html/hw64_545_3090.zip

It's bad for me, because I don't have the Ryzen test setup here yet (tomorrow, with a Crosshair).

In my theory it could be:
- overheating of the CPU or VRM because the vcore fluctuates too high at auto settings
- a BIOS issue
- a Windows 10 issue (Win7 seems more ready for Ryzen, as a few guys on another forum wrote)

PS: Is HPET enabled via cmd in Windows?

chew*, posted March 8, 2017 (edited March 8, 2017)

Mysticial, can this program run in Win 7? If so, can you please try it for me?

flanker, I have an idea that it's not related to that stuff, nor to SMT. An erratum I doubt; possibly the system is misinformed as to how much cache really exists, or it's the memory.

Before you go rip out a Windows install, try one quick test. Use 1 stick, run it, and tell me whether the sticks are double-sided and whether it runs.

Also, what is the default SOC voltage on that board?

Mysticial (Author), posted March 8, 2017

> flanker wrote:
> So, try it a different way: what do you see in HWiNFO for the voltage under load?
> https://www.fosshub.com/HWiNFO.html/hw64_545_3090.zip
> It's bad for me, because I don't have the Ryzen test setup here yet (tomorrow, with a Crosshair).
> In my theory it could be:
> - overheating of the CPU or VRM because the vcore fluctuates too high at auto settings
> - a BIOS issue
> - a Windows 10 issue (Win7 seems more ready for Ryzen, as a few guys on another forum wrote)
> PS: Is HPET enabled via cmd in Windows?

Enabling/disabling HPET has no effect. Both instantly crash.

> chew* wrote:
> Mysticial, can this program run in Win 7? If so, can you please try it for me?

I can't install Win7 because the installer doesn't have USB drivers and I don't have a PS/2 mouse/keyboard.

chew*, posted March 8, 2017

> Mysticial wrote:
> Enabling/disabling HPET has no effect. Both instantly crash.
> I can't install Win7 because the installer doesn't have USB drivers and I don't have a PS/2 mouse/keyboard.

Pull 3 sticks and run.

Mysticial (Author), posted March 8, 2017

> chew* wrote:
> Pull 3 sticks and run.

Things I've tried:
- One stick of memory. Crashes both with my Corsair and G.Skill TridentZ.
- Two different video cards.
- Two different installations of Win10 on different devices. (SSD + HD)

The only parts I haven't changed are:
- The CPU. (I only have one Ryzen CPU.)
- The PSU. (I don't have any spare PSUs lying around and it's too much work to take apart my other builds.)
- The motherboard. (I only have one AM4 motherboard.)

Temperatures are always below 80C. So I doubt it's a cooling issue.

chew*, posted March 8, 2017

> Mysticial wrote:
> Things I've tried:
> - One stick of memory. Crashes both with my Corsair and G.Skill TridentZ.
> - Two different video cards.
> - Two different installations of Win10 on different devices. (SSD + HD)
> The only parts I haven't changed are:
> - The CPU. (I only have one Ryzen CPU.)
> - The PSU. (I don't have any spare PSUs lying around and it's too much work to take apart my other builds.)
> - The motherboard. (I only have one AM4 motherboard.)

Can you take a picture of the BIOS voltages for me, namely DRAM, DRAM termination, and SOC?

Mysticial (Author), posted March 8, 2017

> chew* wrote:
> Can you take a picture of the BIOS voltages for me, namely DRAM, DRAM termination, and SOC?

What's your hypothesis?

chew*, posted March 8, 2017

The reason I wanted you to test Win 7 is.....

Logical Processor to Cache Map:
*---------------  Data Cache          0, Level 1,   32 KB, Assoc   8, LineSize  64
*---------------  Instruction Cache   0, Level 1,   64 KB, Assoc   4, LineSize  64
*---------------  Unified Cache       0, Level 2,  512 KB, Assoc   8, LineSize  64
*---------------  Unified Cache       1, Level 3,   16 MB, Assoc  16, LineSize  64
-*--------------  Data Cache          1, Level 1,   32 KB, Assoc   8, LineSize  64
-*--------------  Instruction Cache   1, Level 1,   64 KB, Assoc   4, LineSize  64
-*--------------  Unified Cache       2, Level 2,  512 KB, Assoc   8, LineSize  64
-*--------------  Unified Cache       3, Level 3,   16 MB, Assoc  16, LineSize  64
--*-------------  Data Cache          2, Level 1,   32 KB, Assoc   8, LineSize  64
--*-------------  Instruction Cache   2, Level 1,   64 KB, Assoc   4, LineSize  64
--*-------------  Unified Cache       4, Level 2,  512 KB, Assoc   8, LineSize  64
--*-------------  Unified Cache       5, Level 3,   16 MB, Assoc  16, LineSize  64
---*------------  Data Cache          3, Level 1,   32 KB, Assoc   8, LineSize  64
---*------------  Instruction Cache   3, Level 1,   64 KB, Assoc   4, LineSize  64
---*------------  Unified Cache       6, Level 2,  512 KB, Assoc   8, LineSize  64
---*------------  Unified Cache       7, Level 3,   16 MB, Assoc  16, LineSize  64
----*-----------  Data Cache          4, Level 1,   32 KB, Assoc   8, LineSize  64
----*-----------  Instruction Cache   4, Level 1,   64 KB, Assoc   4, LineSize  64
----*-----------  Unified Cache       8, Level 2,  512 KB, Assoc   8, LineSize  64
----*-----------  Unified Cache       9, Level 3,   16 MB, Assoc  16, LineSize  64
-----*----------  Data Cache          5, Level 1,   32 KB, Assoc   8, LineSize  64
-----*----------  Instruction Cache   5, Level 1,   64 KB, Assoc   4, LineSize  64
-----*----------  Unified Cache      10, Level 2,  512 KB, Assoc   8, LineSize  64
-----*----------  Unified Cache      11, Level 3,   16 MB, Assoc  16, LineSize  64
------*---------  Data Cache          6, Level 1,   32 KB, Assoc   8, LineSize  64
------*---------  Instruction Cache   6, Level 1,   64 KB, Assoc   4, LineSize  64
------*---------  Unified Cache      12, Level 2,  512 KB, Assoc   8, LineSize  64
------*---------  Unified Cache      13, Level 3,   16 MB, Assoc  16, LineSize  64
-------*--------  Data Cache          7, Level 1,   32 KB, Assoc   8, LineSize  64
-------*--------  Instruction Cache   7, Level 1,   64 KB, Assoc   4, LineSize  64
-------*--------  Unified Cache      14, Level 2,  512 KB, Assoc   8, LineSize  64
-------*--------  Unified Cache      15, Level 3,   16 MB, Assoc  16, LineSize  64
--------*-------  Data Cache          8, Level 1,   32 KB, Assoc   8, LineSize  64
--------*-------  Instruction Cache   8, Level 1,   64 KB, Assoc   4, LineSize  64
--------*-------  Unified Cache      16, Level 2,  512 KB, Assoc   8, LineSize  64
--------*-------  Unified Cache      17, Level 3,   16 MB, Assoc  16, LineSize  64
---------*------  Data Cache          9, Level 1,   32 KB, Assoc   8, LineSize  64
---------*------  Instruction Cache   9, Level 1,   64 KB, Assoc   4, LineSize  64
---------*------  Unified Cache      18, Level 2,  512 KB, Assoc   8, LineSize  64
---------*------  Unified Cache      19, Level 3,   16 MB, Assoc  16, LineSize  64
----------*-----  Data Cache         10, Level 1,   32 KB, Assoc   8, LineSize  64
----------*-----  Instruction Cache  10, Level 1,   64 KB, Assoc   4, LineSize  64
----------*-----  Unified Cache      20, Level 2,  512 KB, Assoc   8, LineSize  64
----------*-----  Unified Cache      21, Level 3,   16 MB, Assoc  16, LineSize  64
-----------*----  Data Cache         11, Level 1,   32 KB, Assoc   8, LineSize  64
-----------*----  Instruction Cache  11, Level 1,   64 KB, Assoc   4, LineSize  64
-----------*----  Unified Cache      22, Level 2,  512 KB, Assoc   8, LineSize  64
-----------*----  Unified Cache      23, Level 3,   16 MB, Assoc  16, LineSize  64
------------*---  Data Cache         12, Level 1,   32 KB, Assoc   8, LineSize  64
------------*---  Instruction Cache  12, Level 1,   64 KB, Assoc   4, LineSize  64
------------*---  Unified Cache      24, Level 2,  512 KB, Assoc   8, LineSize  64
------------*---  Unified Cache      25, Level 3,   16 MB, Assoc  16, LineSize  64
-------------*--  Data Cache         13, Level 1,   32 KB, Assoc   8, LineSize  64
-------------*--  Instruction Cache  13, Level 1,   64 KB, Assoc   4, LineSize  64
-------------*--  Unified Cache      26, Level 2,  512 KB, Assoc   8, LineSize  64
-------------*--  Unified Cache      27, Level 3,   16 MB, Assoc  16, LineSize  64
--------------*-  Data Cache         14, Level 1,   32 KB, Assoc   8, LineSize  64
--------------*-  Instruction Cache  14, Level 1,   64 KB, Assoc   4, LineSize  64
--------------*-  Unified Cache      28, Level 2,  512 KB, Assoc   8, LineSize  64
--------------*-  Unified Cache      29, Level 3,   16 MB, Assoc  16, LineSize  64
---------------*  Data Cache         15, Level 1,   32 KB, Assoc   8, LineSize  64
---------------*  Instruction Cache  15, Level 1,   64 KB, Assoc   4, LineSize  64
---------------*  Unified Cache      30, Level 2,  512 KB, Assoc   8, LineSize  64
---------------*  Unified Cache      31, Level 3,   16 MB, Assoc  16, LineSize  64

Each Zen thread is being registered as an individual core with its own L2 and L3 cache.

I have a weird feeling that this and some other gremlins some are experiencing could be related......

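The observation about the cache map above can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the listing keeps the layout shown in the post (a mask of `-` and `*` with one position per logical processor, followed by the cache kind and level). The post does not name the tool that produced the listing, and the function names here are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   "*--------------- Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64"
CACHE_LINE = re.compile(
    r"^(?P<mask>[-*]+)\s+(?P<kind>Data|Instruction|Unified) Cache\s+\d+, "
    r"Level (?P<level>\d+), "
)


def parse_cache_map(listing: str) -> list[tuple[str, str, int]]:
    """Return (mask, kind, level) for every cache line in the listing."""
    entries = []
    for line in listing.splitlines():
        match = CACHE_LINE.match(line.strip())
        if match:
            entries.append((match["mask"], match["kind"], int(match["level"])))
    return entries


def caches_per_level(entries) -> Counter:
    """Count how many separate caches the listing reports at each level."""
    return Counter(level for _mask, _kind, level in entries)


def max_sharers_per_level(entries) -> dict[int, int]:
    """Largest number of logical processors ('*') attached to one cache, per level."""
    sharers: dict[int, int] = {}
    for mask, _kind, level in entries:
        sharers[level] = max(sharers.get(level, 0), mask.count("*"))
    return sharers


# First two logical processors from the listing in the post.
sample = """\
*--------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
*--------------- Instruction Cache 0, Level 1, 64 KB, Assoc 4, LineSize 64
*--------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64
*--------------- Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64
-*-------------- Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
-*-------------- Instruction Cache 1, Level 1, 64 KB, Assoc 4, LineSize 64
-*-------------- Unified Cache 2, Level 2, 512 KB, Assoc 8, LineSize 64
-*-------------- Unified Cache 3, Level 3, 16 MB, Assoc 16, LineSize 64
"""

entries = parse_cache_map(sample)
print(caches_per_level(entries))
print(max_sharers_per_level(entries))
```

A cache shared between logical processors would appear as a single line whose mask carries several `*` characters. In the posted listing no line has more than one, which is what the remark about every thread getting its own L2 and L3 refers to.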
chew*, posted March 8, 2017

OK, the SOC voltage is pretty damn low. It should be around 1.00 minimum, and the range is 1.0-1.20; bump it to gain stability.

The bottom voltage, DRAM termination voltage, should be equal to 50%

Mysticial (Author), posted March 8, 2017

> chew* wrote:
> OK, the SOC voltage is pretty damn low. It should be around 1.00 minimum, and the range is 1.0-1.20; bump it to gain stability.
> The bottom voltage, DRAM termination voltage, should be equal to 50%

The BIOS won't let me set an SOC voltage offset of more than 0.2. It puts a hard limit of 1.0 volts. Perhaps it can be forced higher via the LLC settings. But I'm hesitant to push settings up to the limit of what the BIOS allows.

I'll play around with that a bit tomorrow. But if what you say is true (SOC should be 1.0 - 1.2), and the BIOS number is correct, then perhaps ASUS is simply setting it too low to begin with?

chew*, posted March 8, 2017 (edited March 8, 2017)

> Mysticial wrote:
> The BIOS won't let me set an SOC voltage offset of more than 0.2. It puts a hard limit of 1.0 volts. Perhaps it can be forced higher via the LLC settings. But I'm hesitant to push settings up to the limit of what the BIOS allows.
> I'll play around with that a bit tomorrow. But if what you say is true (SOC should be 1.0 - 1.2), and the BIOS number is correct, then perhaps ASUS is simply setting it too low to begin with?

I'm running high LLC and 1.00 SOC; my board defaults to 1.1. I found that in my case, chasing the highest clocks, I can drop it by 0.100 to reduce heat.

For 4 DIMMs populated, yes, that would be a tad too low by default, IMO.

My biggest concern is not your board but the report of the Crosshair also exhibiting this issue.

My understanding is that they are working on updating top-tier boards first and mainstream boards second... but if the CH6 has this issue, it should already have the latest AGESA.

flanker, posted March 8, 2017

You can try installing Win7 from an ISO with USB3 drivers included (the download section of the Asus website has software to create a bootable Win7 USB stick with USB3 drivers).

Mysticial (Author), posted March 8, 2017 (edited March 8, 2017)

Before I go through the trouble of trying out Win7: have you guys tried running the benchmark? Did it crash?

(I'm desperately working to get the Zen tuning parameters for y-cruncher v0.7.2 done in time for March 14. So I don't really have that much time to keep debugging this.)

Massman, posted March 9, 2017

Quickly tested here too; it failed on both systems (changed memory). It also failed with higher SOC voltage.

chew*, posted March 9, 2017

Thx, I got someone on this. It's above my pay grade now.

shing3232, posted March 11, 2017

The bug is repeatable on Win10. I heard that Win10 is unable to recognize Ryzen's cache correctly but Win7 can. You should benchmark it on Win7.

I.nfraR.ed, posted March 11, 2017

I could reproduce it on Windows 7 x64 SP1 with everything on default in the BIOS, with an 1800X. The system shuts down (code 8 on the Hero). Tried with OC and manual settings, same thing. Had it pass once though (at 4.1 GHz).

Massman, posted March 13, 2017

Was told this issue will be fixed in new AGESA code. In other words: it was an AMD issue, not a C6H issue.

Thanks for finding this bug, @Mysticial!