HWBOT Community Forums

_mat_

Members
  • Posts

    1003
  • Joined

  • Last visited

  • Days Won

    41

Posts posted by _mat_

  1. Probably uploaded in the wrong category, right? The error messages are not very clear on that. Make sure to upload CPU scores to the "GPUPI for CPU" categories and GPU scores to the "GPUPI" categories.

     

    It's best to submit the scores from inside the benchmark if your bench system is connected to the internet.

  2. Ryzen is not split up in GPUPI. Each compatible OpenCL platform will show one Ryzen device. You can install multiple AMD OpenCL platforms to try out multiple OpenCL drivers; each will perform differently.

     

    Search for the AMD App SDK. Version 3.0 will install the OpenCL 2.0 driver, while version 2.9 should result in OpenCL 1.2.

  3. A little preview of some of the features of GPUPI 3.0 for my fellow overclockers. Command line version for Windows:

     

    [Attachment 223510]

     

    Automatic selection of the compute platform, Batch Size and Reduction Size by prebenching them for the user:

     

    $ ./GPUPI_x64 -c -d 100M

    GPUPI 3.0 (64 bit)

     

    API: OpenCL GPU with 1 devices

    API: OpenCL CPU with 2 devices

    API: CUDA with 1 devices

     

    Testing device: OpenCL CPU -> Intel® OpenCL -> Intel Core i7-6950X

    => 1M, 16: 2.294076 (Kernel: 2.250068, Reduction: 0.043280)

    => 1M, 32: 2.248263 (Kernel: 2.209528, Reduction: 0.037883)

    => 1M, 64: 2.270596 (Kernel: 2.231809, Reduction: 0.038067)

    => 1M, 128: 2.245034 (Kernel: 2.207602, Reduction: 0.036715)

    => 1M, 256: 2.279390 (Kernel: 2.229491, Reduction: 0.049113)

    => 1M, 512: 2.266061 (Kernel: 2.193988, Reduction: 0.071337)

    => 2M, 16: 2.315099 (Kernel: 2.236380, Reduction: 0.078018)

    => 2M, 32: 2.288076 (Kernel: 2.219284, Reduction: 0.068005)

    => 2M, 64: 2.287389 (Kernel: 2.226804, Reduction: 0.059873)

    => 2M, 128: 2.249376 (Kernel: 2.191177, Reduction: 0.057482)

    => 2M, 256: 2.283105 (Kernel: 2.215427, Reduction: 0.066962)

    => 2M, 512: 2.254495 (Kernel: 2.194912, Reduction: 0.058892)

    => 4M, 16: 2.307497 (Kernel: 2.218491, Reduction: 0.088419)

    => 4M, 32: 2.260795 (Kernel: 2.183106, Reduction: 0.077162)

    => 4M, 64: 2.304972 (Kernel: 2.238267, Reduction: 0.066159)

    => 4M, 128: 2.255765 (Kernel: 2.196260, Reduction: 0.058924)

    => 4M, 256: 2.277544 (Kernel: 2.209126, Reduction: 0.067898)

    => 4M, 512: 2.249683 (Kernel: 2.191406, Reduction: 0.057736)

    => 5M, 16: 2.304984 (Kernel: 2.217214, Reduction: 0.087279)

    => 5M, 32: 2.265134 (Kernel: 2.187128, Reduction: 0.077524)

    => 5M, 64: 2.279445 (Kernel: 2.212483, Reduction: 0.066463)

    => 5M, 128: 2.238783 (Kernel: 2.180829, Reduction: 0.057460)

    => 5M, 256: 2.299566 (Kernel: 2.231994, Reduction: 0.067090)

    => 5M, 512: 2.267714 (Kernel: 2.197324, Reduction: 0.069908)

    => 10M, 16: 2.311983 (Kernel: 2.226683, Reduction: 0.084900)

    => 10M, 32: 2.271478 (Kernel: 2.194653, Reduction: 0.076431)

    => 10M, 64: 2.261646 (Kernel: 2.190358, Reduction: 0.070862)

    => 10M, 128: 2.238901 (Kernel: 2.181579, Reduction: 0.056898)

    => 10M, 256: 2.278743 (Kernel: 2.215327, Reduction: 0.062982)

    => 10M, 512: 2.271698 (Kernel: 2.204813, Reduction: 0.066495)

    => 20M, 16: 2.316387 (Kernel: 2.224665, Reduction: 0.091349)

    => 20M, 32: 2.264053 (Kernel: 2.185630, Reduction: 0.078063)

    => 20M, 64: 2.302941 (Kernel: 2.229473, Reduction: 0.073087)

    => 20M, 128: 2.256671 (Kernel: 2.193457, Reduction: 0.062854)

    => 20M, 256: 2.256185 (Kernel: 2.194374, Reduction: 0.061453)

    => 20M, 512: 2.239121 (Kernel: 2.177762, Reduction: 0.061003)

    => 100M, 16: 2.351734 (Kernel: 2.219785, Reduction: 0.131625)

    => 100M, 32: 2.284867 (Kernel: 2.182241, Reduction: 0.102328)

    => 100M, 64: 2.331753 (Kernel: 2.241809, Reduction: 0.089634)

    => 100M, 128: 2.314707 (Kernel: 2.239817, Reduction: 0.074468)

    => 100M, 256: 2.272911 (Kernel: 2.204067, Reduction: 0.068538)

    => 100M, 512: 2.262099 (Kernel: 2.198868, Reduction: 0.062920)

     

    Testing device: OpenCL CPU -> Experimental OpenCL 2.1 CPU Only Platform -> Intel Core i7-6950X

    => 1M, 16: 2.256272 (Kernel: 2.208813, Reduction: 0.046615)

    => 1M, 32: 2.259307 (Kernel: 2.213463, Reduction: 0.045005)

    => 1M, 64: 2.261296 (Kernel: 2.217321, Reduction: 0.043054)

    => 1M, 128: 2.255290 (Kernel: 2.210472, Reduction: 0.043979)

    => 1M, 256: 2.276476 (Kernel: 2.228651, Reduction: 0.046952)

    => 1M, 512: 2.290972 (Kernel: 2.212072, Reduction: 0.078073)

    => 2M, 16: 2.281812 (Kernel: 2.198045, Reduction: 0.082942)

    => 2M, 32: 2.257342 (Kernel: 2.186708, Reduction: 0.069812)

    => 2M, 64: 2.274022 (Kernel: 2.210600, Reduction: 0.062644)

    => 2M, 128: 2.242322 (Kernel: 2.182171, Reduction: 0.059333)

    => 2M, 256: 2.284035 (Kernel: 2.214316, Reduction: 0.068943)

    => 2M, 512: 2.245358 (Kernel: 2.182079, Reduction: 0.062468)

    => 4M, 16: 2.314692 (Kernel: 2.223573, Reduction: 0.090524)

    => 4M, 32: 2.287976 (Kernel: 2.207894, Reduction: 0.079501)

    => 4M, 64: 2.280830 (Kernel: 2.212408, Reduction: 0.067748)

    => 4M, 128: 2.246577 (Kernel: 2.185500, Reduction: 0.060417)

    => 4M, 256: 2.267059 (Kernel: 2.196838, Reduction: 0.069591)

    => 4M, 512: 2.245285 (Kernel: 2.185138, Reduction: 0.059543)

    => 5M, 16: 2.297634 (Kernel: 2.208198, Reduction: 0.088819)

    => 5M, 32: 2.252606 (Kernel: 2.173261, Reduction: 0.078811)

    => 5M, 64: 2.289753 (Kernel: 2.219699, Reduction: 0.069525)

    => 5M, 128: 2.241125 (Kernel: 2.181959, Reduction: 0.058589)

    => 5M, 256: 2.272509 (Kernel: 2.203694, Reduction: 0.068293)

    => 5M, 512: 2.255514 (Kernel: 2.184216, Reduction: 0.070740)

    => 10M, 16: 2.283480 (Kernel: 2.197331, Reduction: 0.085667)

    => 10M, 32: 2.259312 (Kernel: 2.181606, Reduction: 0.077262)

    => 10M, 64: 2.273700 (Kernel: 2.201239, Reduction: 0.071997)

    => 10M, 128: 2.239782 (Kernel: 2.180927, Reduction: 0.058395)

    => 10M, 256: 2.288214 (Kernel: 2.223593, Reduction: 0.064161)

    => 10M, 512: 2.275962 (Kernel: 2.210551, Reduction: 0.064933)

    => 20M, 16: 2.298107 (Kernel: 2.206268, Reduction: 0.091459)

    => 20M, 32: 2.282686 (Kernel: 2.204328, Reduction: 0.077989)

    => 20M, 64: 2.264337 (Kernel: 2.189890, Reduction: 0.074074)

    => 20M, 128: 2.261340 (Kernel: 2.197828, Reduction: 0.063139)

    => 20M, 256: 2.254334 (Kernel: 2.191051, Reduction: 0.062873)

    => 20M, 512: 2.261168 (Kernel: 2.199776, Reduction: 0.060975)

    => 100M, 16: 2.363773 (Kernel: 2.221893, Reduction: 0.141445)

    => 100M, 32: 2.325590 (Kernel: 2.220586, Reduction: 0.104667)

    => 100M, 64: 2.281907 (Kernel: 2.196272, Reduction: 0.085290)

    => 100M, 128: 2.290375 (Kernel: 2.214583, Reduction: 0.075452)

    => 100M, 256: 2.274559 (Kernel: 2.203661, Reduction: 0.070590)

    => 100M, 512: 2.287639 (Kernel: 2.223986, Reduction: 0.063316)

     

    Best device found: OpenCL CPU -> Intel® OpenCL -> Intel Core i7-6950X with 5M, 128.

     

    Timer: HPET (14.32 MHz)

    Init HWiNFO: Ok

     

    OpenCL CPU: Intel Core i7-6950X (20 CUs, 3000 MHz)

    Compiling OpenCL kernels ... done.

     

    Calculating 100.000.000th digit of PI. 20 iterations.

     

    Allocated device memory : 83.89 MB

    Batch Size : 5M

    Reduction Size : 128

     

    00h 00m 00.480s Batch 1 finished.

    00h 00m 00.945s Batch 2 finished.

    00h 00m 01.403s Batch 3 finished.

    00h 00m 01.850s Batch 4 finished.

    00h 00m 02.263s Batch 5 finished.

    00h 00m 02.734s Batch 6 finished.

    00h 00m 03.201s Batch 7 finished.

    00h 00m 03.649s Batch 8 finished.

    00h 00m 04.089s Batch 9 finished.

    00h 00m 04.502s Batch 10 finished.

    00h 00m 04.980s Batch 11 finished.

    00h 00m 05.450s Batch 12 finished.

    00h 00m 05.915s Batch 13 finished.

    00h 00m 06.367s Batch 14 finished.

    00h 00m 06.784s Batch 15 finished.

    00h 00m 07.257s Batch 16 finished.

    00h 00m 07.724s Batch 17 finished.

    00h 00m 08.187s Batch 18 finished.

    00h 00m 08.639s Batch 19 finished.

    00h 00m 09.055s PI value output -> CB840E219

     

    Highest clocks measured:

    CPU: 3800.11 MHz

    GPU: 202.50 MHz

    GPU memory: 101.25 MHz

     

    Statistics:

    Calculation + Reduction time: 8.822s + 0.231s

     

    PI calculation is done!
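A minimal sketch of how such an auto-selection could work, assuming the benchmark simply picks the Batch Size / Reduction Size combination with the lowest total time. The log above is consistent with that, but the actual GPUPI logic is not shown here. The helper name `pick_best` and the regular expression are made up for illustration; they only parse prebench lines in the format printed above.

```python
import re

# Hypothetical parser for prebench lines such as:
#   => 5M, 128: 2.238783 (Kernel: 2.180829, Reduction: 0.057460)
PREBENCH_LINE = re.compile(r"=>\s*(\d+M),\s*(\d+):\s*([\d.]+)")


def pick_best(log_text):
    """Return (batch_size, reduction_size, total_seconds) with the lowest total time.

    Assumption: "best" means the smallest total time of a prebench run.
    """
    results = [
        (m.group(1), int(m.group(2)), float(m.group(3)))
        for m in PREBENCH_LINE.finditer(log_text)
    ]
    if not results:
        raise ValueError("no prebench lines found")
    return min(results, key=lambda r: r[2])


# A few lines copied from the log above.
sample = """
=> 5M, 64: 2.279445 (Kernel: 2.212483, Reduction: 0.066463)
=> 5M, 128: 2.238783 (Kernel: 2.180829, Reduction: 0.057460)
=> 10M, 128: 2.238901 (Kernel: 2.181579, Reduction: 0.056898)
=> 20M, 512: 2.239121 (Kernel: 2.177762, Reduction: 0.061003)
"""

best = pick_best(sample)
print(best)
```

For these sample lines the pick is 5M with a Reduction Size of 128, which matches the "Best device found" line in the log.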

  4. Windows 8 and above are affected by the RTC skewing bug when the BCLK is changed in Windows. I don't think that Skylake and Kaby Lake are any exception to this rule, but I haven't tested it myself yet.

     

    To circumvent HPET you have to use Windows 7.

     

    Edit: Rules of HWBOT allow the legacy benchmarks on SL and KL, so I guess it has been tested and it's not affecting the RTC timer. :)

    Well, with the next version of GPUPI I will remove the HPET restriction on SL and KL.

  5. Oops, I meant that I am avoiding QPC when HPET is not enabled. Sorry, I currently have a lot on my plate.

     

    I can't remember if it's precisely ACPI that's vulnerable, but on Windows 7 (which is not affected by the RTC bug) QPC gets skewed if HPET is disabled. My best guess is that it falls back to ACPI, otherwise the fallback to RTC would not produce skewed results. See my results here: https://www.overclockers.at/articles/gpupi-2-1 ... I should have displayed the timer frequencies as well, hrmpf.
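Displaying the timer frequency helps because the frequency that QPC reports usually hints at the hardware counter behind it. Here is a rough sketch, not taken from GPUPI: the function names and the 1% tolerance are hypothetical. It assumes the commonly seen values of about 14.318 MHz for HPET (the log above shows "HPET (14.32 MHz)") and about 3.58 MHz for the ACPI PM timer. Real hardware can deviate, so treat the result as a hint only.

```python
import ctypes
import sys

# Typical counter frequencies in Hz. These are common values, not guarantees.
HPET_HZ = 14_318_180
ACPI_PM_HZ = 3_579_545


def classify_timer(freq_hz, tolerance=0.01):
    """Guess the timer source from a QPC frequency (hypothetical helper)."""
    def close(value, reference):
        return abs(value - reference) <= reference * tolerance

    if close(freq_hz, HPET_HZ):
        return "HPET"
    if close(freq_hz, ACPI_PM_HZ):
        return "ACPI PM timer"
    return "other (possibly TSC-based)"


def query_qpc_frequency():
    """Return the QPC frequency in Hz on Windows, or None on other systems."""
    if sys.platform != "win32":
        return None
    freq = ctypes.c_int64()
    ctypes.windll.kernel32.QueryPerformanceFrequency(ctypes.byref(freq))
    return freq.value


freq = query_qpc_frequency()
if freq is not None:
    print(f"QPC: {freq / 1e6:.2f} MHz -> {classify_timer(freq)}")
```

On a non-Windows system the script prints nothing, since `query_qpc_frequency` returns `None` there.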

  6. The broad problem is that only the TSC clock is affected by the clock skew on Windows 8+10. Does anyone mind if I change clock enforcement to blacklist TSC rather than whitelist HPET and ACPI? This should allow all other platform clocks.
    I researched this topic for several days back when I was developing GPUPI, and in my opinion the only manageable option for me was to ban TSC on Windows 8 and 10 and only allow HPET there. Using ACPI as a timer, if available, is possible but depends on how it's done. I would not advise using Windows' QPC functions, for example.