HWBOT Community Forums

Tharamis

Members · 23 posts

Posts posted by Tharamis

  1. hi,

     

    What happened to "Super Pi for GPUs", which was mentioned @ XS a long time ago? CUDA-Z sounds good, too. I think within the next 6-12 months we will see some progress when it comes to raw GPU calculation benchmarks. Fermi will help here for sure, but we need something suitable for graphics cards that meets certain requirements to run the benchmark (like DX10 for Vantage), instead of just the right manufacturer logo.

     

    well, this "Super Pi" for GPUs will misslead graphiccards performance. this means:

    a card which will be able to calculate PI as the *fastest one,

    does *not mean that its the fastest while playing or doing other graphical things.

    be carefull.

     

    so i think there need to be benchmarks that measure (for example) vertex performance via OpenGL. why OpenGL? because it is an identical interface on all systems, DX-independent.

     

    cu

  2. hi,

     

    MaxxPI² - PreView - Multi

     

    [screenshot: 42e25e15.png]

     

    This benchmark uses the Chudnovsky algorithm as its formula, unlike MaxxPI² - PreView - Single, which uses the Gauss-Legendre algorithm.

     

    The advantage of the Chudnovsky algorithm is that, in principle, it can be run multi-core. MaxxPI² - PreView - Multi makes use of this. That means all available CPU cores work together on a single calculation.
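
    (just to illustrate the idea: a minimal Python sketch of the standard binary-splitting formulation of Chudnovsky, not MaxxPI's actual code. the term range is cut into one chunk per core, each chunk is computed in its own process, and the partial (P, Q, T) results are merged at the end.)

    from math import isqrt
    from multiprocessing import Pool, cpu_count

    def binary_split(a, b):
        # return the Chudnovsky partial products (P, Q, T) for terms a .. b-1
        if b - a == 1:
            if a == 0:
                P = Q = 1
            else:
                P = (6 * a - 5) * (2 * a - 1) * (6 * a - 1)
                Q = a * a * a * (640320 ** 3) // 24
            T = P * (13591409 + 545140134 * a)
            return (P, Q, -T if a & 1 else T)
        m = (a + b) // 2
        P1, Q1, T1 = binary_split(a, m)
        P2, Q2, T2 = binary_split(m, b)
        return (P1 * P2, Q1 * Q2, Q2 * T1 + P1 * T2)

    def merge(left, right):
        # combine two adjacent partial results, same rule as the recursion above
        P1, Q1, T1 = left
        P2, Q2, T2 = right
        return (P1 * P2, Q1 * Q2, Q2 * T1 + P1 * T2)

    def chudnovsky_pi(digits, workers):
        terms = digits // 14 + 2                     # ~14.18 digits per term
        workers = max(1, min(workers, terms))        # never hand out empty chunks
        bounds = [terms * i // workers for i in range(workers + 1)]
        chunks = list(zip(bounds[:-1], bounds[1:]))
        with Pool(workers) as pool:                  # one chunk per core
            parts = pool.starmap(binary_split, chunks)
        P, Q, T = parts[0]
        for part in parts[1:]:
            P, Q, T = merge((P, Q, T), part)
        one = 10 ** digits
        return Q * 426880 * isqrt(10005 * one * one) // T   # floor(Pi * 10^digits)

    if __name__ == "__main__":
        digits = 10_000
        pi_scaled = chudnovsky_pi(digits, cpu_count())
        print(pi_scaled // 10 ** (digits - 49))      # first 50 digits: 31415926535...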

     

    Technical:

    MaxxPI² - PreView - Multi needs at least a dual-core processor and, in the current version 1.07, supports CPUs with 2, 3, 4 and 8 cores.

     

    Note:

    an HT core counts as a real core, so 1 core + 1 HT core is accepted.

     

    Maximum depth of calculation:

    268,435,456 decimal places

     

    • v1.07, initial public release (16/07/2009) NEW!

     

    cu

  3. hi,

     

    MaxxMem is very interesting

     

    yes :)

     

    do you think the author could come up with a total score for MaxxMem, something like MaxxMem-Total = (MemCopy + MemRead + MemWrite) * (1 / MemLatency)?

     

    this, maybe in combination with CPU-Z's memory tab algorithm detection, would allow for auto-submit to HWbot.

     

    for now, the *memory score is the arithmetic average of "read" and "write", the same as its big brother MaxxPI² does.

     

    Memory copy is not part of the memory score, because the big MaxxPI² doesn't use memory copy at all (no need for it).

     

    Reached memory / latency scores will be comparable to MaxxPI².
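
    (only a toy illustration of the two scoring ideas above; the readings and units below, MB/s and ns, are made up for the example and are not real MaxxMEM output.)

    # hypothetical example readings, not real MaxxMEM output
    read_mb_s, write_mb_s, copy_mb_s = 14200.0, 13800.0, 15900.0
    latency_ns = 52.3

    # the quoted proposal: (copy + read + write) * (1 / latency)
    proposed_total = (copy_mb_s + read_mb_s + write_mb_s) * (1.0 / latency_ns)

    # the current memory score: arithmetic average of read and write, copy excluded
    memory_score = (read_mb_s + write_mb_s) / 2

    print(f"proposed total score : {proposed_total:.1f}")
    print(f"memory score (r+w)/2 : {memory_score:.1f}")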

     

    For further suggestions concerning HWbot, you should contact him directly via:

     

    http://www.maxxpi.net/pages/contact.php

     

    Regards

     

    Tharamis

  4. hi all,

     

    a little bit of news, now that i'm *authorised to post this:

     

    1. MaxxPI² MultiCore (Pre-Alpha ;)):

     

    [screenshot: 2dc36569.jpg]

     

    screenshot of a Q6600 at 4100 MHz, 1 core first, 4 cores below (scaling)

     

    it will support 2, 3, 4 and 8 cores (for now) and calculate up to 256M (for now). Chudnovsky is used, incl. binary splitting.

     

    it puts about >78%(!) constant load (PMU) on *all cores, so be careful. it has no CPU-specific optimizations for any CPU manufacturer. it uses MMX/SSE.

     

    the main problems were load balancing (especially with Chudnovsky) and not favouring any CPU manufacturer. both slow down the calculation speed, but i think for CPU/PC benchmarking this doesn't matter at all, because comparability is the key.
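
    (to show what i mean by load balancing, a toy Python sketch of the general problem, not MaxxPI's actual scheme: equal-sized contiguous term ranges give unequal work, because binary splitting over later terms multiplies bigger numbers, so the chunk boundaries have to be weighted somehow.)

    def equal_count_chunks(n_terms, workers):
        # naive split: the same number of terms per worker
        bounds = [n_terms * i // workers for i in range(workers + 1)]
        return list(zip(bounds[:-1], bounds[1:]))

    def weighted_chunks(n_terms, workers):
        # give earlier (cheaper) workers more terms, assuming the per-term cost
        # grows roughly linearly with the term index (a crude model)
        total_cost = n_terms * (n_terms + 1) / 2     # sum of 1..n as a cost proxy
        bounds, acc, b = [0], 0.0, 0
        for k in range(1, workers):
            target = total_cost * k / workers
            while acc < target and b < n_terms:
                b += 1
                acc += b
            bounds.append(b)
        bounds.append(n_terms)
        return list(zip(bounds[:-1], bounds[1:]))

    print(equal_count_chunks(1000, 4))   # [(0, 250), (250, 500), (500, 750), (750, 1000)]
    print(weighted_chunks(1000, 4))      # [(0, 500), (500, 707), (707, 866), (866, 1000)]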

     

    if it were optimized for a major CPU manufacturer, a performance gain of +12% to +18% would be possible.

     

    @allegratorial, i know it's important for you:

     

    MaxxPI² MultiCore (Pre-Alpha ;), all x86):
    256M, i7 at 4 GHz, 4 cores/4 threads (not 8): 473 sec.
    256M, same machine, 4 cores/4 threads (not 8), QPI: 402 sec.
    256M, same machine, 4 cores/?? threads (not 8??), y-cruncher: 229 sec.

     

    2. much more interesting, i think, is this:

     

    MaxxMEM²

    [screenshot: b8232a5c.jpg]

     

    this will be released soon :)

     

    cu

  5. hi,

     

    MaxxPi uses GMP? :confused: So the Chudnovsky Pi implementation it uses is merely this?

     

    no, i mean that y-cruncher's characteristics match GMP's in wide areas... ;)

     

    for MaxxPI i don't know at all, but i don't think so, because it uses the Gauss algorithm.

     

    But anyways, I'll let this thread get back to Maxxpi. Sorry I interrupted. :o

     

    no problem, fine :)

     

    cu

  6. hi,

     

    Why do MaxxPi's and QuickPi's implementations of Chudnovsky's formula scale so poorly with multiple cores, whereas y-cruncher achieves near-linear scaling?

     

    nearly linear scaling with cores/threads is not possible.
    i personally think that y-cruncher also uses binary splitting and relies heavily on GMP. so if this is true (or close to the truth), linear scaling is impossible.
    this is one of the reasons why MaxxPI² will calculate parallel Pi results for each core, to keep as much load as possible on the cores.
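
    (a minimal sketch of that "parallel Pi results per core" idea, my own Python illustration using the Gauss-Legendre/AGM iteration rather than MaxxPI's code: every core runs its own complete calculation, so the cores never wait on each other and each one stays fully loaded.)

    import math
    from decimal import Decimal, getcontext
    from multiprocessing import Pool, cpu_count

    def gauss_legendre_pi(digits):
        # one complete, independent Gauss-Legendre (AGM) Pi calculation
        getcontext().prec = digits + 10              # a few guard digits
        a, b = Decimal(1), Decimal(1) / Decimal(2).sqrt()
        t, p = Decimal("0.25"), Decimal(1)
        # precision roughly doubles every iteration, so log2(digits) rounds suffice
        for _ in range(int(math.log2(digits)) + 2):
            a_next = (a + b) / 2
            b = (a * b).sqrt()
            t -= p * (a - a_next) ** 2
            a, p = a_next, 2 * p
        return str((a + b) ** 2 / (4 * t))[:digits]

    if __name__ == "__main__":
        workers = cpu_count()
        with Pool(workers) as pool:
            # one full calculation per core, all with the same workload
            results = pool.map(gauss_legendre_pi, [10_000] * workers)
        print(workers, "cores, all results identical:", len(set(results)) == 1)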

     

    An interesting thing to notice is that the author of y-cruncher is merely a junior in college. His purpose for writing the program was to smash a few size records (and he did). (And, as you mentioned: for record breaking, speed matters.)

     

    only one thing matters here: TIME. if you have time, everything is possible, and when i was in college... i had time. much time.

     

    So from a software benchmarker's standpoint, this has me wondering what y-cruncher could turn into given that it's already a killer program in terms of pure-multithreaded speed. I also wonder what will happen when this kid gets older and becomes more experienced.

     

    hmm... GMP... i think, but anyways i wish him luck!

     

    And by "fast", I mean something comparable to QuickPi at the least. If you can get him to release a GUI version of his Chudnovsky implementation, then it will satisfy both "fast" and "pretty". :)

     

    and again: MaxxPI² (Pi and all the other calculation benchmarks) is only comparable to itself.

     

    well, for me personally, MaxxPI is fast enough; it provides reliable, consistent results. this is the most important thing.

     

    there is no need to fuss about 512M in 1 sec. that is useless, and the author of MaxxPI² shares this opinion with me (i strongly think).

     

    and don't forget this thread is about MaxxPI², not about y-cruncher or comparisons against it.

     

    cu

  7. hi,

     

    True, it's pretty fast. Here's what the numbers look like at 32M on a friend's 2.66 GHz Harpertowns BSEL'd to 3.2.

     

    MaxxPi 1.35 - 213.36

    PiFast 4.3 - 101.81

    QuickPi 4.5 (x64) - 44.51

    y-cruncher 0.3.2 (x64 SSE3) - 14.68

     

    great results! here with an i7 at 4 GHz:

     

    PiFast 4.3: 65.6 sec
    MaxxPi 1.35: 127.1 sec
    SuperPi: 581.5 sec

     

    pretty fast... :-)

     

    again: MaxxPi does *not claim to be the fastest Pi application. it does not need to be.

     

    fast enough to bench without falling asleep, *and long enough to clearly show the differences between setups. raw speed doesn't matter at all; it's the comparison between PCs that makes a benchmark a benchmark.

     

    Any idea where his Chudnovsky implementation stands? I'm sure the pi-community would like to see it.

     

    hmm good question...!?!

     

    (since all they seem to care about is speed)

     

    i think you have to read this to understand:
    http://en.wikipedia.org/wiki/Benchmark_(computing)
    if you are willing to get a world record by calculating Pi to xxxxM, then you are right: speed matters.

     

    Also, if it's single-threaded (since it clearly is), why would it matter which formula (Gauss vs. Chudnovsky) is used? Regardless of resource distribution, it would still be 100% CPU on 1 core anyway.

     

    well, as i said: *binary splitting*, which means multi-core (multi-thread) for one calculation.

     

    Chudnovsky *is currently the fastest formula, but it will not give as consistent a CPU load as Gauss does.

     

    Doesn't look like fast and pretty will ever go together...

     

    they surely do, look at MaxxPI :-)

     

    but anyway, if your favorite is y-cruncher, then use it! it's a piece of wonderful and incredibly fast software.

     

    As for speed... the slower the better. We need marathons as well as sprints for benching. SPi 1M is the first bench tried under LN2, let's be honest. We need bigger challenges, not smaller ones.

     

    that's the point!

     

    cu

  8. hi,

     

    Interesting. :)

     

    Speed-wise it's absolutely pathetic compared to:

     

    http://www.numberworld.org/y-cruncher/

    and it's also a lot slower than PiFast and QuickPi.

    But it has a GUI :) Apparently, the author of the y-cruncher program has plans for a GUI, but nothing fancy like this.

    http://www.numberworld.org/y-cruncher/version_history.html#Future

     

    it's pretty fast, fast enough for benching *and(!) comparing. it doesn't have to be the *fastest.
    it's precise, uses CRC and the HW-based clock (not the Windows clock), and provides one result (k/sec.) which is very easy to compare. it does not need *any additional libraries / installation.
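
    (a small illustration of the clock point; i don't know MaxxPI's exact timer API, but on Windows Python's time.perf_counter() wraps the same kind of hardware performance counter (QueryPerformanceCounter), while time.time() has historically been tied to the coarse system tick.)

    import time

    def busy_work():
        # a short dummy workload, standing in for one benchmark run
        return sum(i * i for i in range(200_000))

    # coarse wall clock: short runs can round to zero or jump in tick-sized steps,
    # and it can also move when the system clock is adjusted
    t0 = time.time()
    busy_work()
    coarse = time.time() - t0

    # high-resolution monotonic counter: sub-microsecond resolution and immune to
    # clock adjustments, which is what makes scores comparable between runs
    t0 = time.perf_counter()
    busy_work()
    fine = time.perf_counter() - t0

    print(f"time.time():         {coarse:.6f} s")
    print(f"time.perf_counter(): {fine:.6f} s")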

     

    I know him; he specifically chose the Gauss algorithm because of its high *and continuous (no fluctuation) CPU load.

     

    there are no optimizations for any CPU manufacturer on board. all of them are benched with the same non-optimized code. as he said, this shows the real world better.

     

    he also has an *incredibly fast Chudnovsky algorithm (incl. binary splitting), but it will not produce as clean a load on the CPU/memory as Gauss does. you can see that via the CPU's Performance Monitoring Unit (PMU).

     

    so i don't think he will include it in MaxxPI.

     

    His MaxxPI² is very professional; i was one of the first beta testers on board.

     

    example:
    is there a difference between dual/triple channel on X58?
    search the web and you will find nothing. try MaxxPI² and you will see it
    (memory <> overall memory).

     

    this is also very interesting:

     

    [screenshot: 5174ab2a.jpg]

     

    and viewing/exporting your own results (Excel):

     

    [screenshot: f16392ba.jpg]

     

    cu
