HWBOT Community Forums

havli

Members
  • Posts

    413
  • Joined

  • Last visited

  • Days Won

    3

Posts posted by havli

  1. 11 minutes ago, _mat_ said:

    SSE2 and SSE3 are the most important extensions for GPUPI using OpenCL on CPUs. SSE4a (also supported by Llano), SSE4.1 and SSE4.2 don't add anything particularly interesting for the calculation, so that shouldn't make any difference in comparison to Ivy Bridge.

    Well, in that case, how do you explain the huge performance advantage of 45nm Core2 (SSE4.1) over 65nm Core2 (SSSE3)? For example http://hwbot.org/submission/3678638_havli_gpupi_for_cpu___100m_core_2_duo_e8300_1min_53sec_951ms

    and http://hwbot.org/submission/3408835_kintaro_gpupi_for_cpu___100m_core_2_duo_e6750_2min_16sec_234ms/

     

    Or AMD 15h or 16h (AVX, SSE4.2) being much faster than K10 (SSE3)? In GPUPI, K10 is a lot slower, while in older benchmarks (Cinebench, for instance) it is the other way around. http://hwbot.org/submission/3691480_havli_gpupi_for_cpu___100m_a10_7800_1min_8sec_764ms

    and http://hwbot.org/submission/3554886_noms_gpupi_for_cpu___100m_phenom_ii_x4_965_be_1min_28sec_609ms
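To put numbers on the two comparisons linked above, here is a minimal Python sketch that reads the run time out of each submission URL and computes the ratio. It assumes the time at the end of the URL slug matches the submitted score, and it ignores clock speed differences between the chips, so treat the result as a rough illustration only. The function names are made up for this example.

```python
import re

# Matches the time embedded in an HWBOT submission slug,
# e.g. "..._1min_53sec_951ms". The minutes part is treated as optional.
_TIME_RE = re.compile(r"(?:(\d+)min_)?(\d+)sec_(\d+)ms")


def slug_time_seconds(url: str) -> float:
    """Return the run time in seconds encoded in a submission URL."""
    match = _TIME_RE.search(url)
    if match is None:
        raise ValueError(f"no time found in {url!r}")
    minutes = int(match.group(1) or 0)
    seconds = int(match.group(2))
    millis = int(match.group(3))
    return minutes * 60 + seconds + millis / 1000


def speedup(slower_url: str, faster_url: str) -> float:
    """How many times faster the second result is than the first."""
    return slug_time_seconds(slower_url) / slug_time_seconds(faster_url)


# The four submissions linked in the post above.
E8300 = "http://hwbot.org/submission/3678638_havli_gpupi_for_cpu___100m_core_2_duo_e8300_1min_53sec_951ms"
E6750 = "http://hwbot.org/submission/3408835_kintaro_gpupi_for_cpu___100m_core_2_duo_e6750_2min_16sec_234ms/"
A10_7800 = "http://hwbot.org/submission/3691480_havli_gpupi_for_cpu___100m_a10_7800_1min_8sec_764ms"
PHENOM_II_965 = "http://hwbot.org/submission/3554886_noms_gpupi_for_cpu___100m_phenom_ii_x4_965_be_1min_28sec_609ms"

print(f"E8300 vs E6750: {speedup(E6750, E8300):.2f}x")
print(f"A10-7800 vs Phenom II X4 965: {speedup(PHENOM_II_965, A10_7800):.2f}x")
```

For these links the ratios come out at roughly 1.20x and 1.29x, which is the gap the post is asking about.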

  2. All available evidence speaks against some magic OCL performance boost with this specific AGESA. All other K10 CPUs perform more or less the same. Llano performs the same with all boards and BIOS versions except this one specific combination. If there were such a performance gain on Llano (even at the price of instability), it would be publicly known.

    No change in the Llano architecture would allow such a performance boost compared to all other K10 CPUs. Judging from the other GPUPI results, the AMD OCL driver seems to benefit greatly from SSE4.1, and to some extent even from SSSE3. This is the reason K10 is slow in this benchmark compared to Core2, Nehalem or even 15h-based processors: K10 lacks these instructions.

    I'm sorry, but this really sounds like a bug, either in the OCL driver or the benchmark itself, or maybe something else entirely. It is not a random thing, as it can be reproduced. After all, not so long ago there was a similar problem with GPUPI on dual-socket 1366 machines, which also seemed to be much faster than common sense would suggest, and as it turned out, it was a bug.

    • Thanks 1
    • Sad 1
  3. 12 minutes ago, yosarianilives said:

    I'm sorry, but wut!? You're comparing 1st gen k10 to 3rd gen k10. They're not remotely the same, like at all. Huge difference in amount of cache and cache layout to start as well as actual slight improvements to the arch even over k10.5 (deneb/thuban) with things like a better hw prefetcher. A much better comparison if you want to compare to another k10 based chip would be a phenom II. 

    Anyways not sure why you're all so confused that a certain agesa is faster even if it's older, look at 1st gen ryzen where past a certain agesa wprime scores suck. That's a modern platform where newer agesa totally messed up latency for certain operations. Hell it could be just like the agesa that launched with 1st gen k10 that disabled the tlb. Just because it's worse at opencl loads doesn't mean there was a bug related to opencl.  

    Yes, I am... because they are the same. Cache and memory don't matter for GPUPI, and those other improvements are more or less paper dragons with very small performance impact. And btw, one of my links is a Phenom II...

    A different AGESA can be faster... but by single-digit percentages at most, not twice as fast.

    And one more thing - CPU performance of GPUPI 2.x to 3.2 is very similar, as long as you are comparing 64 bit versions.

     

  4. Obviously, this is a bug of some kind. It is not possible for one board with a specific BIOS version to be twice as fast as other boards and/or BIOS versions.

    Also, Llano simply must be in line with other K10 quad cores running at a similar clock. For example http://hwbot.org/submission/3314113_havli_gpupi_for_cpu___100m_phenom_x4_9650_2min_11sec_369ms

    or http://hwbot.org/submission/3699925_havli_gpupi_for_cpu___100m_opteron_8380_2min_12sec_595ms

  5. Stage 2 - does Wolfdale-3M mean the C2D E7xxx series only, or something else too? For example the Pentium E5xxx, which uses this exact die but with only 2MB of cache active.

    Stage 9 (and the others that are done the same way) - does "single server CPU allowed" mean only Xeon / Opteron is allowed, or can C2D / Phenom / etc. be used as well?

  6. SLI can be enabled using one of the SLI hacks - for example https://www.techpowerup.com/forums/threads/sli-with-different-cards.158907/

    The problem is when the chipset or CPU (as the PCI-E controller is built in there) is newer than the GPU drivers... in that case SLI isn't supported (I think). The mentioned hack probably fools the driver into thinking it runs on X58 or something similar... and therefore SLI will work, even in XP. http://hwbot.org/submission/3502846_havli_3dmark06_2x_geforce_8800_gtx_27622_marks

    • Like 4
  7. The best solution would be to simply add an option to the user's profile - just a simple checkbox would be enough. Either show the HW library as it is now, or ignore the VGA brand and merge the HW categories into one of each type.

    Going through 50 results and trying to find the one that is marked MSI really isn't very effective.

  8. 12 hours ago, Leeghoofd said:

    Sorry Matt but this new version is ridiculous in performance... 5200 beating 6900MHz, nothing to do with tweaking OS, finding right CL driver...  You provided the boost for them

     

    It is simple, just rebench the 6.9GHz CPU and done.

    Those who want to stay on top must rebench stuff regularly anyway. For example, in 3D every new 115x platform launch means a rebench of all 3DMarks up to 06... and a new HEDT platform means a rebench of Vantage and later.

    • Like 2
    • Thanks 2
  9. Version 2.2.0 is ready for release. :)

    http://hw-museum.cz/data/hwbot/HWBOT_X265_2.2.ZIP

    What is new:

    1. There was a mistake in the HPET detection of V2.1.0: on systems that require HPET, only an error message was shown, but the Run button wasn't deactivated. It was therefore possible to run and submit the benchmark even without HPET (such results can still be recognized on the screenshot - they contain the red message "HPET timer not active"). This issue is now fixed.

    2. Coffee Lake added to the non-HPET whitelist. Currently the list contains: Skylake, Skylake-X, Kaby Lake, Kaby Lake-X, Coffee Lake.

    3. Added an option to select the CPU name to submit. The first two options are 1) the name as CPU-Z detects it and 2) the BIOS string. The third option leaves the field empty - this should solve the problem with unlocked AMD CPUs that are misdetected and can't be edited later.


    4. Increased the score precision to 3 decimal places. There is a catch, however: it seems HWBOT doesn't support it properly after all. Internally there are 3 decimal places, as can be seen on the pre-submit screen and also later when editing the submission. On the score page there are only two, and I'm not sure how the rankings are calculated. Let's see if better precision can be implemented on the HWBOT side at some point in the future; x265 is now ready for it.

    http://hwbot.org/submission/3793276_ 

    5. updated CPU-Z to version 1.83.
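The HPET fix from point 1, combined with the whitelist from point 2, boils down to a simple gate on the Run button. Here is a minimal Python sketch of that logic. It is not the benchmark's actual source: the function names are hypothetical, "Haswell" is only an example of a platform that is not on the list, and the assumption that whitelisted platforms show no message is mine.

```python
# Platforms allowed to run without HPET, as listed in the release notes.
NON_HPET_WHITELIST = frozenset(
    {"Skylake", "Skylake-X", "Kaby Lake", "Kaby Lake-X", "Coffee Lake"}
)


def run_button_enabled(cpu_family: str, hpet_active: bool) -> bool:
    """Return True if the Run button should be clickable.

    The run is allowed when HPET is active, or when the platform is on
    the non-HPET whitelist.
    """
    return hpet_active or cpu_family in NON_HPET_WHITELIST


def status_message(cpu_family: str, hpet_active: bool) -> str:
    """Return the warning text to show, or an empty string if all is fine."""
    if run_button_enabled(cpu_family, hpet_active):
        return ""
    return "HPET timer not active"


# V2.1.0 showed the message but still allowed the run; the fix is to
# drive both the message and the button state from the same check.
for family, hpet in [("Coffee Lake", False), ("Haswell", False), ("Haswell", True)]:
    print(family, hpet, run_button_enabled(family, hpet), status_message(family, hpet))
```

Driving the button and the message from one function keeps the two from drifting apart again.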

    • Like 4
  10. At the moment, I'm finishing my work on a small update of the x265 bench.

     

    Not so long ago there was some discussion concerning the granularity of the score. Since the beginning there have been (and still are) just two decimal places. For most systems this is good enough to reflect even a very small change in performance, but not for all of them. And since the attention is shifting more towards the 4k preset, maybe it is time to consider adding a 3rd decimal place.

     

    Getting 3 decimal places is easy, and after a quick test it seems the HWBOT API supports it as well.


     

    So the question is - switch to 3 decimal places or stay at 2? This is a double-edged sword: some people might benefit from it, others would lose points.
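To show why the extra digit matters for rankings, here is a small Python sketch with two made-up fps values that tie at 2 decimal places but not at 3. The values are hypothetical, and half-up rounding is an assumption; the benchmark may round or truncate differently.

```python
from decimal import Decimal, ROUND_HALF_UP


def rounded_score(raw_fps: str, places: int) -> Decimal:
    """Round a raw fps value (given as a string) to the given precision."""
    quantum = Decimal(10) ** -places  # 0.01 for 2 places, 0.001 for 3
    return Decimal(raw_fps).quantize(quantum, rounding=ROUND_HALF_UP)


def is_tie(fps_a: str, fps_b: str, places: int) -> bool:
    """True if both results show the same score at this precision."""
    return rounded_score(fps_a, places) == rounded_score(fps_b, places)


# Hypothetical results from two different systems.
result_a = "5.231"
result_b = "5.234"

for places in (2, 3):
    print(
        places,
        rounded_score(result_a, places),
        rounded_score(result_b, places),
        "tie" if is_tie(result_a, result_b, places) else "no tie",
    )
```

With 2 places both results share a rank; with 3 places one of them drops a position, which is the double-edged part.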

    • Like 1