HWBOT Community Forums

Sick Rig - r640 with current GPU - debugging hints needed


tmc


I may be a bit off-topic here, but I hope someone can give me a few hints in the right direction.

 

First off: I'm a former server technician for Cray servers. I know a few things, though I may be a bit rusty here and there with Intel-based technology.

 

Why it is "a bit off": this isn't exactly overclocking the system (I didn't change the clock), it is stretching the performance lifetime. That rig is just too fast to throw away when your kids can play on it. I want to keep it running at its edge.

 

I have a Fujitsu Siemens R640 workstation (X5000 chipset, 2x Xeon E5160 @ 3.0 GHz, 32 GB ECC RAM, 840 W Fujitsu PSU, more in the links below). I used it with an Asus AMD 7970 4GDDR5 OC2 in the past, an ASUS R9 390-X 6GDDR5 OC2 until last week, and a (temporary) R9 Fury Nano White this week. The 7970 once killed a power supply, but downclocking it kept it stable. The more modern cards draw less power but never ran stable.

 

The system runs Prime95 and memtest86 fine for days. FurMark? 3DMark? Both crash almost instantly to a hard freeze. Calculating the Windows 7 performance index freezes the same way during the buffer width setup for Direct3D 10. I swapped CPUs, GPUs, and PSUs. I updated firmware and drivers following a test matrix (current Windows 7 patches, AMD beta drivers as of this week) and checked the RAM. The symptoms stay the same: I do something in Direct3D 10 -> hard freeze.
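
A test matrix like this is easier to reason about when every combination and its outcome is written down. The following is only a minimal sketch of one way to do that, not the procedure actually used here: the component labels (`psu_a`, `psu_b`, `stable`, `beta`, `winsat_d3d10`) and the helper names `build_matrix`, `record`, and `suspects` are hypothetical placeholders. It flags any value that shows up in failing runs but never in a passing one.

```python
from itertools import product

# Hypothetical labels; substitute the parts and workloads actually tested.
GPUS = ["7970", "R9 390X", "R9 Nano"]
PSUS = ["psu_a", "psu_b"]
DRIVERS = ["stable", "beta"]
WORKLOADS = ["prime95", "memtest86", "furmark", "3dmark", "winsat_d3d10"]


def build_matrix(*axes):
    """Map every combination of the given axes to None (not yet tested)."""
    return {combo: None for combo in product(*axes)}


def record(matrix, combo, passed):
    """Store the outcome of one run; reject combinations outside the matrix."""
    if combo not in matrix:
        raise KeyError(f"unknown combination: {combo!r}")
    matrix[combo] = bool(passed)


def suspects(matrix):
    """Return values seen in at least one failing run and in no passing run."""
    failed = [c for c, ok in matrix.items() if ok is False]
    passed = [c for c, ok in matrix.items() if ok is True]
    found = set()
    if not failed:
        return found
    for axis in range(len(failed[0])):
        bad = {c[axis] for c in failed}
        good = {c[axis] for c in passed}
        found |= bad - good
    return found


if __name__ == "__main__":
    matrix = build_matrix(GPUS, PSUS, DRIVERS, WORKLOADS)
    # Outcomes as described in the post: CPU/RAM tests pass, 3D loads freeze.
    for combo in list(matrix):
        record(matrix, combo, combo[3] in ("prime95", "memtest86"))
    print(sorted(suspects(matrix)))
```

With outcomes entered as described above, only the 3D workloads come out as suspects, which matches the observation that the fault follows the Direct3D load rather than any single swapped part.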

The problem with that crash is that the machine will not even power on afterwards until I completely drain power from the mainboard (pull the plug from the mains, then hold the power button for at least 10 seconds; anything less and Ethernet stays down or similar ghost effects show up). The diagnostics report nothing, so the board apparently cannot even log PCI errors. Temperatures the whole time: CPU below 50 °C, GPU below 54 °C.

 

GPU-Z, CPU-Z, CoreTemp: all fine. Nothing looks out of the ordinary during the crash.

 

TL;DR: Any suggestions on which PCI-E / PCI-X tools to use for debugging, e.g. power edges (transients) on the PCI bus, or on what I may have overlooked?

People in "normal" forums seem to get very confused once you ask for PCI debuggers, oscilloscope readings, etc.

(That's why I ask here.)

 

 

I'm starting to doubt my sanity / debugging skills here, or whether the GPU manufacturers stick to the specs.

 

(Should this ever run stable, I would love any hints on how to get the CPU up by about 10%, i.e. one multiplier step, if anyone still remembers these...)

 

 

Links to the Board:

http://tech.zeiss.net-base.de/data/MIRAX/MANUALS/FSC/Celsius_R640/Mainboard%20D1808/A26361-D1808-Z120-en.pdf

(The board is a customized version of this one, so you cannot just use an off-the-shelf power supply. I am still trying to figure out the wiring for a current ATX one and might cannibalize the dead PSU from 2 years back.)

TYAN - Download BIOS: Tempest i5000XT (S2696)
