HWBOT Community Forums

GPUPI - SuperPI on the GPU


_mat_


Yes, use 2.3.4; it fixes all of the sync and validation issues with multiple cards.

 

Btw, 2.2 had no runtime validation; it only validated the final result. That's why it works.

 

I know, but 2.2 would pass the end validation.


  • 4 weeks later...

Hello, I'm having a bad day with GPUPI.

This message pops up: Error synchronizing device after kernel execution

Any hints for a solution? I've reinstalled the drivers etc., but no luck. Setup: 6700K, Ranger, and a GT 710 PCIe card.

Maybe I'll just swap to another system for the GPUPI bench.


That's a CUDA error that occurs directly after the calculation kernel, while waiting for the GPU to return the data. It happens, for example, when something went wrong with memory access (reads or writes in unallocated areas). Is the card heavily overclocked? Are you using high batch and reduction sizes? Try stock clocks and the lowest sizes and see if that fixes the problem.

 

Btw, you should have also gotten a detailed error message in square brackets right next to the error you posted. Please let me know what it is.



To run 1B on a GT 710 you need to set the batch size below 10M, or it will fail at loop 4.


  • 2 months later...

Since I can't flag my own submission: I noticed something about GPUPI 2.3.4.

Is there any particular reason why the data file lists my Phenom II X4 955 BE as a non-BE? I've gone in and edited the submission so it shows as a Black Edition, but is this a limitation of GPUPI itself?

 

WhiteWulfe`s GPUPI for CPU - 1B score: 20min 19sec 331ms with a Phenom II X4 955 for my submission in question.

 

EDIT: Actually, it won't let me edit it to a Black Edition, just winds up sitting at Apache Tomcat/7.0.59 - Error report and doesn't go back to my submission.


Moved the post to the GPUPI thread, maybe _mat_ can explain or help :)

 

It would be nice to know... I also didn't know about this thread ^_^

 

On the plus side, regarding the other bit about not being able to edit it: I was able to edit my 1B and 100M scores afterwards, so they now show the correct processor. It's still annoying that it registers as the incorrect processor, and that the server at first wouldn't let me manually correct it to a Black Edition.



I checked the sub; it is shown as Phenom II X4 955 BE for me... is this different for you? I don't think so, since the edit worked. Maybe it can be fixed so that no edit is needed in the future.

 

WhiteWulfe`s GPUPI for CPU - 1B score: 20min 19sec 331ms with a Phenom II X4 955 BE



...I just said I was finally able to edit it. :P I wasn't able to when it was originally submitted, which is why I had created the thread in the first place. GPUPI is identifying my processor as a 955 non-Black Edition for some reason.

 

I can resubmit the datafile as a no-points submission if that helps.


The name of the device is retrieved via the OpenCL driver, which normally just takes the CPUID brand string as it is shown in CPU-Z, a hardcoded value inside the CPU. GPUPI removes various prefixes and postfixes to be able to submit the result to HWBOT.
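To illustrate the kind of cleanup described above, here is a minimal sketch in Python. It is not GPUPI's actual code: the function name `clean_device_name`, the list of stripped tokens, and the two example brand strings are assumptions made for illustration only.

```python
import re


def clean_device_name(brand: str) -> str:
    """Strip common decorations from a CPUID-style brand string (illustrative only)."""
    # Drop trademark markers such as (R), (TM), (tm).
    name = re.sub(r"\((R|TM)\)", "", brand, flags=re.IGNORECASE)
    # Drop a trailing clock suffix such as "@ 3.00GHz".
    name = re.sub(r"\s+@.*$", "", name)
    # Drop generic words that carry no model information.
    name = re.sub(r"\b(CPU|Processor)\b", "", name)
    # Collapse the leftover whitespace.
    return " ".join(name.split())


# Assumed example brand strings, not taken from a real data file.
print(clean_device_name("Intel(R) Core(TM) i7-6950X CPU @ 3.00GHz"))
print(clean_device_name("AMD Phenom(tm) II X4 955 Processor"))
```

If the brand string itself carries no "Black Edition" marker, as in the assumed second example, no amount of prefix or postfix stripping can recover it, which would be consistent with the submission showing up as a non-BE.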


  • 7 months later...

I have a question. Before someone says "search" or that it was already discussed in this thread: I read from the last page of this thread all the way back to page 30, then fell asleep at my keyboard. I just woke up and decided to ask my question anyway, at the risk of being flamed or moderated. So here goes:

My understanding is that as long as I have a Skylake/Kaby Lake processor AND I'm on Windows 10, I'm not supposed to be forced to set the HPET settings in the operating system or the BIOS, since my system isn't "bugged". When I run GPUPI, there is no way for me to avoid setting it in the OS, or it complains about the HPET timer not being set.

 

How can I bypass this? If I can't because I'm totally misunderstanding the HPET bug thingy please feel free to re-educate me... right after I get a coffee. ;-)

 

Marco


Windows 8 and above are affected by the RTC skewing bug when BCLK is changed in Windows. I don't think that Skylake and Kaby Lake are any exception to this rule, but I haven't tested it myself yet.

 

To circumvent HPET you have to use Windows 7.

 

Edit: HWBOT's rules allow the legacy benchmarks on SL and KL, so I guess it has been tested and the RTC timer is not affected. :)

Well, in the next version of GPUPI I will remove the HPET restriction on SL and KL.

Edited by _mat_


Intel introduced a separate BCLK clockgen on SKL, which circumvents the RTC bug.
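The timer rule discussed in the posts above can be summarized as a small decision function. This is only a sketch of the policy as described in this thread, not GPUPI's actual code: the function name `hpet_required`, the version tuple, and the generation strings are hypothetical.

```python
from typing import Tuple


def hpet_required(windows_version: Tuple[int, int], cpu_generation: str) -> bool:
    """Return True if a benchmark should insist on HPET as its timer (policy sketch)."""
    # Windows 8 is NT 6.2; Windows 8 and above are described as affected.
    affected_os = windows_version >= (6, 2)
    # Skylake and Kaby Lake are described in this thread as exempt.
    exempt_cpu = cpu_generation.lower() in {"skylake", "kaby lake"}
    return affected_os and not exempt_cpu


print(hpet_required((6, 1), "Haswell"))   # Windows 7
print(hpet_required((10, 0), "Haswell"))  # Windows 10, older platform
print(hpet_required((10, 0), "Skylake"))  # Windows 10, Skylake
```

Under these assumptions, only the combination of an affected Windows version and a non-exempt platform forces HPET.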


A little preview of some of the features of GPUPI 3.0 for my fellow overclockers. Command line version for Windows:

 

[Attachment: attachment.php?attachmentid=223510]

 

Automatic selection of the compute platform, batch size, and reduction size, determined by pre-benchmarking them for the user:

 

$ ./GPUPI_x64 -c -d 100M

GPUPI 3.0 (64 bit)

 

API: OpenCL GPU with 1 devices

API: OpenCL CPU with 2 devices

API: CUDA with 1 devices

 

Testing device: OpenCL CPU -> Intel® OpenCL -> Intel Core i7-6950X

=> 1M, 16: 2.294076 (Kernel: 2.250068, Reduction: 0.043280)

=> 1M, 32: 2.248263 (Kernel: 2.209528, Reduction: 0.037883)

=> 1M, 64: 2.270596 (Kernel: 2.231809, Reduction: 0.038067)

=> 1M, 128: 2.245034 (Kernel: 2.207602, Reduction: 0.036715)

=> 1M, 256: 2.279390 (Kernel: 2.229491, Reduction: 0.049113)

=> 1M, 512: 2.266061 (Kernel: 2.193988, Reduction: 0.071337)

=> 2M, 16: 2.315099 (Kernel: 2.236380, Reduction: 0.078018)

=> 2M, 32: 2.288076 (Kernel: 2.219284, Reduction: 0.068005)

=> 2M, 64: 2.287389 (Kernel: 2.226804, Reduction: 0.059873)

=> 2M, 128: 2.249376 (Kernel: 2.191177, Reduction: 0.057482)

=> 2M, 256: 2.283105 (Kernel: 2.215427, Reduction: 0.066962)

=> 2M, 512: 2.254495 (Kernel: 2.194912, Reduction: 0.058892)

=> 4M, 16: 2.307497 (Kernel: 2.218491, Reduction: 0.088419)

=> 4M, 32: 2.260795 (Kernel: 2.183106, Reduction: 0.077162)

=> 4M, 64: 2.304972 (Kernel: 2.238267, Reduction: 0.066159)

=> 4M, 128: 2.255765 (Kernel: 2.196260, Reduction: 0.058924)

=> 4M, 256: 2.277544 (Kernel: 2.209126, Reduction: 0.067898)

=> 4M, 512: 2.249683 (Kernel: 2.191406, Reduction: 0.057736)

=> 5M, 16: 2.304984 (Kernel: 2.217214, Reduction: 0.087279)

=> 5M, 32: 2.265134 (Kernel: 2.187128, Reduction: 0.077524)

=> 5M, 64: 2.279445 (Kernel: 2.212483, Reduction: 0.066463)

=> 5M, 128: 2.238783 (Kernel: 2.180829, Reduction: 0.057460)

=> 5M, 256: 2.299566 (Kernel: 2.231994, Reduction: 0.067090)

=> 5M, 512: 2.267714 (Kernel: 2.197324, Reduction: 0.069908)

=> 10M, 16: 2.311983 (Kernel: 2.226683, Reduction: 0.084900)

=> 10M, 32: 2.271478 (Kernel: 2.194653, Reduction: 0.076431)

=> 10M, 64: 2.261646 (Kernel: 2.190358, Reduction: 0.070862)

=> 10M, 128: 2.238901 (Kernel: 2.181579, Reduction: 0.056898)

=> 10M, 256: 2.278743 (Kernel: 2.215327, Reduction: 0.062982)

=> 10M, 512: 2.271698 (Kernel: 2.204813, Reduction: 0.066495)

=> 20M, 16: 2.316387 (Kernel: 2.224665, Reduction: 0.091349)

=> 20M, 32: 2.264053 (Kernel: 2.185630, Reduction: 0.078063)

=> 20M, 64: 2.302941 (Kernel: 2.229473, Reduction: 0.073087)

=> 20M, 128: 2.256671 (Kernel: 2.193457, Reduction: 0.062854)

=> 20M, 256: 2.256185 (Kernel: 2.194374, Reduction: 0.061453)

=> 20M, 512: 2.239121 (Kernel: 2.177762, Reduction: 0.061003)

=> 100M, 16: 2.351734 (Kernel: 2.219785, Reduction: 0.131625)

=> 100M, 32: 2.284867 (Kernel: 2.182241, Reduction: 0.102328)

=> 100M, 64: 2.331753 (Kernel: 2.241809, Reduction: 0.089634)

=> 100M, 128: 2.314707 (Kernel: 2.239817, Reduction: 0.074468)

=> 100M, 256: 2.272911 (Kernel: 2.204067, Reduction: 0.068538)

=> 100M, 512: 2.262099 (Kernel: 2.198868, Reduction: 0.062920)

 

Testing device: OpenCL CPU -> Experimental OpenCL 2.1 CPU Only Platform -> Intel Core i7-6950X

=> 1M, 16: 2.256272 (Kernel: 2.208813, Reduction: 0.046615)

=> 1M, 32: 2.259307 (Kernel: 2.213463, Reduction: 0.045005)

=> 1M, 64: 2.261296 (Kernel: 2.217321, Reduction: 0.043054)

=> 1M, 128: 2.255290 (Kernel: 2.210472, Reduction: 0.043979)

=> 1M, 256: 2.276476 (Kernel: 2.228651, Reduction: 0.046952)

=> 1M, 512: 2.290972 (Kernel: 2.212072, Reduction: 0.078073)

=> 2M, 16: 2.281812 (Kernel: 2.198045, Reduction: 0.082942)

=> 2M, 32: 2.257342 (Kernel: 2.186708, Reduction: 0.069812)

=> 2M, 64: 2.274022 (Kernel: 2.210600, Reduction: 0.062644)

=> 2M, 128: 2.242322 (Kernel: 2.182171, Reduction: 0.059333)

=> 2M, 256: 2.284035 (Kernel: 2.214316, Reduction: 0.068943)

=> 2M, 512: 2.245358 (Kernel: 2.182079, Reduction: 0.062468)

=> 4M, 16: 2.314692 (Kernel: 2.223573, Reduction: 0.090524)

=> 4M, 32: 2.287976 (Kernel: 2.207894, Reduction: 0.079501)

=> 4M, 64: 2.280830 (Kernel: 2.212408, Reduction: 0.067748)

=> 4M, 128: 2.246577 (Kernel: 2.185500, Reduction: 0.060417)

=> 4M, 256: 2.267059 (Kernel: 2.196838, Reduction: 0.069591)

=> 4M, 512: 2.245285 (Kernel: 2.185138, Reduction: 0.059543)

=> 5M, 16: 2.297634 (Kernel: 2.208198, Reduction: 0.088819)

=> 5M, 32: 2.252606 (Kernel: 2.173261, Reduction: 0.078811)

=> 5M, 64: 2.289753 (Kernel: 2.219699, Reduction: 0.069525)

=> 5M, 128: 2.241125 (Kernel: 2.181959, Reduction: 0.058589)

=> 5M, 256: 2.272509 (Kernel: 2.203694, Reduction: 0.068293)

=> 5M, 512: 2.255514 (Kernel: 2.184216, Reduction: 0.070740)

=> 10M, 16: 2.283480 (Kernel: 2.197331, Reduction: 0.085667)

=> 10M, 32: 2.259312 (Kernel: 2.181606, Reduction: 0.077262)

=> 10M, 64: 2.273700 (Kernel: 2.201239, Reduction: 0.071997)

=> 10M, 128: 2.239782 (Kernel: 2.180927, Reduction: 0.058395)

=> 10M, 256: 2.288214 (Kernel: 2.223593, Reduction: 0.064161)

=> 10M, 512: 2.275962 (Kernel: 2.210551, Reduction: 0.064933)

=> 20M, 16: 2.298107 (Kernel: 2.206268, Reduction: 0.091459)

=> 20M, 32: 2.282686 (Kernel: 2.204328, Reduction: 0.077989)

=> 20M, 64: 2.264337 (Kernel: 2.189890, Reduction: 0.074074)

=> 20M, 128: 2.261340 (Kernel: 2.197828, Reduction: 0.063139)

=> 20M, 256: 2.254334 (Kernel: 2.191051, Reduction: 0.062873)

=> 20M, 512: 2.261168 (Kernel: 2.199776, Reduction: 0.060975)

=> 100M, 16: 2.363773 (Kernel: 2.221893, Reduction: 0.141445)

=> 100M, 32: 2.325590 (Kernel: 2.220586, Reduction: 0.104667)

=> 100M, 64: 2.281907 (Kernel: 2.196272, Reduction: 0.085290)

=> 100M, 128: 2.290375 (Kernel: 2.214583, Reduction: 0.075452)

=> 100M, 256: 2.274559 (Kernel: 2.203661, Reduction: 0.070590)

=> 100M, 512: 2.287639 (Kernel: 2.223986, Reduction: 0.063316)

 

Best device found: OpenCL CPU -> Intel® OpenCL -> Intel Core i7-6950X with 5M, 128.

 

Timer: HPET (14.32 MHz)

Init HWiNFO: Ok

 

OpenCL CPU: Intel Core i7-6950X (20 CUs, 3000 MHz)

Compiling OpenCL kernels ... done.

 

Calculating 100.000.000th digit of PI. 20 iterations.

 

Allocated device memory : 83.89 MB

Batch Size : 5M

Reduction Size : 128

 

00h 00m 00.480s Batch 1 finished.

00h 00m 00.945s Batch 2 finished.

00h 00m 01.403s Batch 3 finished.

00h 00m 01.850s Batch 4 finished.

00h 00m 02.263s Batch 5 finished.

00h 00m 02.734s Batch 6 finished.

00h 00m 03.201s Batch 7 finished.

00h 00m 03.649s Batch 8 finished.

00h 00m 04.089s Batch 9 finished.

00h 00m 04.502s Batch 10 finished.

00h 00m 04.980s Batch 11 finished.

00h 00m 05.450s Batch 12 finished.

00h 00m 05.915s Batch 13 finished.

00h 00m 06.367s Batch 14 finished.

00h 00m 06.784s Batch 15 finished.

00h 00m 07.257s Batch 16 finished.

00h 00m 07.724s Batch 17 finished.

00h 00m 08.187s Batch 18 finished.

00h 00m 08.639s Batch 19 finished.

00h 00m 09.055s PI value output -> CB840E219

 

Highest clocks measured:

CPU: 3800.11 MHz

GPU: 202.50 MHz

GPU memory: 101.25 MHz

 

Statistics:

Calculation + Reduction time: 8.822s + 0.231s

 

PI calculation is done!
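In the log above, the reported "best device" coincides with the pre-benchmark line that has the lowest total time. The following Python sketch assumes that this is the selection criterion and reproduces it by parsing lines in the same format; the names `LINE`, `best_setting`, and `SAMPLE` are hypothetical, and the sample contains only a few lines copied from the log.

```python
import re

# Matches lines such as "=> 5M, 128: 2.238783 (Kernel: 2.180829, Reduction: 0.057460)".
LINE = re.compile(
    r"=>\s*(\d+M),\s*(\d+):\s*([\d.]+)\s*"
    r"\(Kernel:\s*([\d.]+),\s*Reduction:\s*([\d.]+)\)"
)


def best_setting(log_text):
    """Return (total_time, device, batch_size, reduction_size) with the lowest total time."""
    best = None
    device = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Testing device:"):
            # Remember which device the following timing lines belong to.
            device = line[len("Testing device:"):].strip()
            continue
        match = LINE.match(line)
        if match:
            candidate = (float(match.group(3)), device, match.group(1), int(match.group(2)))
            if best is None or candidate[0] < best[0]:
                best = candidate
    return best


SAMPLE = """
Testing device: OpenCL CPU -> Intel® OpenCL -> Intel Core i7-6950X
=> 5M, 128: 2.238783 (Kernel: 2.180829, Reduction: 0.057460)
=> 10M, 128: 2.238901 (Kernel: 2.181579, Reduction: 0.056898)
=> 20M, 512: 2.239121 (Kernel: 2.177762, Reduction: 0.061003)
Testing device: OpenCL CPU -> Experimental OpenCL 2.1 CPU Only Platform -> Intel Core i7-6950X
=> 10M, 128: 2.239782 (Kernel: 2.180927, Reduction: 0.058395)
"""

total, device, batch, reduction = best_setting(SAMPLE)
print(device, batch, reduction, total)
```

On this sample the function picks 5M and 128 on the first device, which matches the "Best device found" line in the log. The differences between the top candidates are well under a millisecond per run, so repeated pre-benchmarks may well pick a different setting.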


Quick question: I'm having an issue with GPUPI. I'm doing 32B runs on a 1080 Ti; they complete just fine and the checksum appears to be right. If I save the validation file, I get the normal message saying that the file was saved successfully. However, if I try to immediately validate that same file, I get the following error: "The result file was successfully decrypted, but the data is invalid [invalid XML data]". I tried validating the file on another computer, to no avail. Any ideas?


Check GPUPI.log; it should contain an extended error description stating which XML node makes the file invalid. Please post it here, as it might be a bug in the validation.

Here's the relevant data from the log file.

Error while decrypting output: StreamTransformationFilter: invalid PKCS #7 block padding found

Error while decrypting output: StreamTransformationFilter: invalid PKCS #7 block padding found

XML validation: submission node not found!

Message box: The result file was successfully decrypted, but the data is invalid [invalid XML data] (Validation result)

Thanks for your help.
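For readers unfamiliar with the padding error in the log above: PKCS #7 padding fills the last cipher block with N bytes of value N, and a decryptor checks this pattern after decrypting. The Python sketch below shows only that check, under the assumption of a 16-byte block size; it does not diagnose the GPUPI issue itself, and the helper names `pkcs7_pad` and `pkcs7_unpad` are hypothetical.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append N bytes of value N so the length becomes a multiple of block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def pkcs7_unpad(data: bytes, block_size: int = 16) -> bytes:
    """Remove PKCS #7 padding, raising ValueError if the padding is malformed."""
    if not data or len(data) % block_size != 0:
        raise ValueError("invalid PKCS #7 block padding: bad length")
    n = data[-1]
    # The last byte states the padding length; all n trailing bytes must equal n.
    if n < 1 or n > block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS #7 block padding found")
    return data[:-n]


padded = pkcs7_pad(b"<submission/>")
print(pkcs7_unpad(padded))

# A single corrupted byte at the end is enough to fail the check.
try:
    pkcs7_unpad(padded[:-1] + b"\x00")
except ValueError as err:
    print(err)
```

A failed padding check typically means the decrypted bytes are not what was originally encrypted, for example because of a wrong key, a truncated file, or corrupted data, so a missing XML node afterwards would not be surprising.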

