A new way to address CPUs (Cores/Threads) on HWBot

der8auer · August 11, 2021

Hey there!

In Q3/Q4 we will see new Intel CPUs, and these CPUs are going to open up a new era.

From the information that is publicly available, Alder Lake (i9 12900K) will be made of 2 different types of cores: up to 8 Golden Cove and up to 8 Gracemont cores. Previous leaks described these as P-Cores and E-Cores. (8P + 8E)

The Golden Cove cores will be the "fast" cores and the Gracemont cores will be the "slow" ones. Now we're thinking about how to list these CPUs at HWBot. The obvious first thought is to list the CPU as a 16C CPU, as we have 16C in total. However, from what we know about future AMD CPUs, AMD will eventually move to the big/little concept as well. I personally also expect that this kind of CPU design will be the future for desktop CPUs and that we might see it for a long time. So it's probably good to think about this carefully before throwing it into the database or applying sudden changes.

Things to consider:

- Only the P-Cores will feature Hyperthreading. So even though we have a 16C CPU with HT, it will have 24T in total (8P × 2 threads + 8E × 1 thread)
- Future CPU designs from both Intel and AMD might have strange big/little configurations such as 8 big cores and 20 little cores - some with HT and some without HT
- Simply throwing the 12900K into the 8C ranking would be unfair for the rest of the 8C CPUs because of the additional E-Cores
- Simply throwing the 12900K into the 16C ranking could mean that the CPU won't be competitive in some rankings. For the future it would also mean that with an obscure big/little configuration we might see desktop CPUs with 28C that perform worse than a 16C from 3 years ago.
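To make the core/thread arithmetic from the list above concrete, here is a minimal Python sketch. The `HybridCpu` class and its field names are made up for illustration, and the 8 big + 20 little entry assumes that only the big cores have HT, which is just one of the possible configurations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridCpu:
    """Hypothetical description of a big/little CPU (not a HWBot data model)."""
    name: str
    p_cores: int
    e_cores: int
    p_threads_per_core: int = 2  # assumption: P-Cores have Hyperthreading
    e_threads_per_core: int = 1  # assumption: E-Cores have no Hyperthreading

    @property
    def cores(self) -> int:
        return self.p_cores + self.e_cores

    @property
    def threads(self) -> int:
        return (self.p_cores * self.p_threads_per_core
                + self.e_cores * self.e_threads_per_core)


alder_lake = HybridCpu("i9 12900K", p_cores=8, e_cores=8)
big_little = HybridCpu("8 big + 20 little (hypothetical)", p_cores=8, e_cores=20)

for cpu in (alder_lake, big_little):
    print(f"{cpu.name}: {cpu.cores}C / {cpu.threads}T")
```

Under these assumptions the 12900K comes out as 16C/24T, and the hypothetical 8 + 20 part as 28C/36T.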

Possible solutions we could think of so far:

1. Change the entire HWBot and judge CPUs by the number of threads instead of the number of cores. While this might make sense from a technical perspective, especially for future CPUs, it would result in a massive change in our database. For example a 6700K (4C) would suddenly compete with a 9700K (8C). It would change a lot regarding rankings and points, and it would be a lot of work because we'd have to change benchmarks and CPU listings
2. We simply list the 12900K as a 16C CPU. This might sound like an easy option for now, but I think it would make things very difficult in future years, especially given that AMD will eventually also use differently performing cores on one single chip
3. We simply list the 12900K as an 8C CPU. With this method we would just go by the number of "fast" cores, and the small cores would act as a booster to the CPU. The issue I see here is that it won't reflect the performance of an old-fashioned 8C CPU. Could be pretty unfair
4. We list the 12900K twice. This solution would be a mix of #2 and #3. We list the 12900K as:
- i9 12900K (8P + 8E) [this would be in the 16C ranking]
- i9 12900K (8P + 0E) [this would be in the 8C ranking, the user would have to manually disable the E-Cores in BIOS to participate]

At this point we would prefer option number 4 because it offers both ways to judge the CPU performance and we don't have to make fundamental changes to HWBot itself.
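As a rough illustration of how the four options would sort a CPU, here is a small Python sketch. The `ranking` function, its parameters and the label format ("16C", "24T") are hypothetical and are not part of HWBot or its API.

```python
def ranking(option: int, p_cores: int, e_cores: int = 0,
            ht: bool = True, e_enabled: bool = True) -> str:
    """Return a hypothetical ranking label for a CPU under options 1-4."""
    active_e = e_cores if e_enabled else 0
    if option == 1:  # judge by thread count (E-Cores assumed to have no HT)
        threads = p_cores * (2 if ht else 1) + active_e
        return f"{threads}T"
    if option == 2:  # judge by total core count
        return f"{p_cores + e_cores}C"
    if option == 3:  # judge by "fast" cores only
        return f"{p_cores}C"
    if option == 4:  # listed twice: depends on whether the E-Cores are enabled
        return f"{p_cores + active_e}C"
    raise ValueError(f"unknown option: {option}")


# i9 12900K (8P + 8E) under each option
for option in (1, 2, 3, 4):
    print(option, ranking(option, p_cores=8, e_cores=8))

# Option 4 with the E-Cores disabled in BIOS
print(ranking(4, p_cores=8, e_cores=8, e_enabled=False))

# Option 1 side effect: 6700K (4C/8T) and 9700K (8C/8T) share a ranking
print(ranking(1, p_cores=4), ranking(1, p_cores=8, ht=False))
```

The last line shows the disruption mentioned under option 1: both the 6700K and the 9700K would land in the same 8T ranking.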

If you have other brilliant ideas we are open for suggestions.

Thanks!

Noxinite · August 11, 2021

I like the idea of having the small cores enabled/disabled for different rankings (no. 4). It could be implemented similar to the "unlocked shaders" tick box on submissions for graphics cards - but obviously you would have it enabled by default so the newer members wouldn't flood the database with incorrect submissions.

Leeghoofd · August 11, 2021

@Noxinite: So listing the 12900K and a 12900K(8P)

Negative_Feedback · August 11, 2021

Would it be possible to create a "mixed core" category but still go by core count? For example the 12900K scores would be under the "16c mixed core" rankings instead of in the 16c rankings.

der8auer · August 11, 2021

1 minute ago, Negative_Feedback said:

Would it be possible to create a "mixed core" category but still go by core count? For example the 12900K scores would be under the "16c mixed core" rankings instead of in the 16c rankings.

Technically possible. Things to consider with "xC Mixed" ranking would be:

- the native 8C, 12C, 16C... rankings might not change anymore as future CPUs might always feature mixed cores. In the end similar to 1C or 2C rankings which are pretty much dead after certain years.

- We might see strange CPU combinations in one ranking. E.g. 8+8, 4+12, 12+4 which could result in big performance gaps

Lucky_n00b · August 11, 2021

I like no. 4, listing the 12900K as default to compete in 16C, and 12900K (8P+0E)

And maybe use same rules for other mixed core CPUs (e.g on its own as total number of cores, and only the number of performance cores)

denvys5 · August 11, 2021

Question: do we have verification from Intel that P and E cores can do same workload in parallel? Or do they move workload, like on ARM bigLITTLE?

#4 sounds reasonable. I would also add 0P+8E core configuration to that list. Why? Coz we might see 8P+64E cpus in near future, if this architecture succeeds. And that means little cores become competitive in MT benchmarks on their own.

But, this all ranking split is possible only if core configuration can be verified for each individual submission. So 2D only benchmate subs, as far as I understand.

aperacer · August 11, 2021

In my opinion the solution @Negative_Feedback offered is the best one, but else solution #4 is also good for me

Negative_Feedback · August 11, 2021

44 minutes ago, der8auer said:

Technically possible. Things to consider with "xC Mixed" ranking would be:

- the native 8C, 12C, 16C... rankings might not change anymore as future CPUs might always feature mixed cores. In the end similar to 1C or 2C rankings which are pretty much dead after certain years.

- We might see strange CPU combinations in one ranking. E.g. 8+8, 4+12, 12+4 which could result in big performance gaps

These are some good points to consider. I think if we were to do a modified version of option 4, that would be a good solution. So "mixed core" CPUs would by default go under "xC mixed core" rankings but if you disable the little cores, you could still sub under the regular "xC" rankings.

Basically this:
- i9 12900K (8P + 8E) [this would be in the 16C Mixed Core ranking]
- i9 12900K (8P + 0E) [this would be in the 8C ranking, the user would have to manually disable the E-Cores in BIOS to participate]

This way you could still keep the old rankings alive but also have the "mixed cores" in their own ranking as well. You still kinda have the issue with weird core configs causing performance gaps but you would have that issue as well if you were to just group them by core count.

Splave · August 11, 2021

My 11900k results suck, let's put alder lake as 8 core. Thanks, same PayPal account?

Strunkenbold · August 11, 2021

We might want to wait till official benchmarks appear. If the 12900k is barely faster than any 8 core, we might want to consider a change. But I could imagine that architectural changes will make the gap you might see today disappear with the next gen anyway. And in 1-2 years hybrid categories might be superfluous.

But if the leaks are true, there is nothing to worry about, unless you already know more...

Leeghoofd · August 11, 2021

26 minutes ago, Splave said:

My 11900k results suck, let's put alder lake as 8 core. Thanks, same PayPal account?

2Rolundo@paypal.com...

Rauf · August 12, 2021

Most of the proposals feature massively increased numbers of CPU rankings. Some also mean you can use a single golden cpu for multiple rankings, saturating your profile points-wise with mostly one good chip. Meanwhile we have one single GPU-ranking... (considering SLI is more or less dead)

I have always liked the idea of dividing into low-, mid-, high- and ultrahigh end. For both CPUs and GPUs. It's possible CPUs would require more categories, but you get the idea. That way the global points can always stay relevant and we don't have to introduce strange categories that in one way or another won't be fair. Sure, it means a lot of work for the old CPUs, dividing them into categories. But the alternative would be to have possibly hundreds of CPU-rankings in the near future.

One big benefit is for the GPU-rankings also. It would mean that you could compete for globals without being rich or have good sponsors as you could bench for globals in the lower ranges.

der8auer · August 12, 2021

I like the idea from a global points perspective. Realistically speaking I don't see how we could implement this within 2021. There are still too many bugs we have to clean up but at a certain point we want to change the global points anyway so we can talk about that again early next year.

cbjaust · August 12, 2021

Option 1, categorising by thread count, would be very disruptive but possibly also very interesting and worthwhile.

Option 2 maintaining the status quo and going by actual core count is the way forward.

I think it is a mistake to pretend the little cores without HT/SMT don't exist. Having one sku eligible for more than one category (as in 8P + 8E and 8P + 0E) is also not an optimal strategy.

There are always more desirable parts to have for any given category and this will not change.

buildzoid · August 12, 2021

I really don't like the idea of 1 CPU being eligible for 2 different sets of globals. So the 12900K should only be in 1 category.

Ranking CPUs by thread count would be extremely disruptive to the current ranking system; however, I don't feel like it's fundamentally unfair. Especially since the whole point of hyperthreading is to make a single CPU core perform more like 2. The idea behind mixed core CPUs is the same.

I feel like the distinction between ranking by threads vs cores is basically:

Best performance in X thread benchmark VS Best performing X core CPU

A 7700K is one of the fastest quad core CPUs, however it isn't the fastest at 8 thread benchmarks.

The current system on HWbot screws over CPUs like the 7600K, 8350K, 9600K, 9700K because they aren't "proper" 4/6/8 core CPUs. The 12900K has 16 cores so for the same reason that a 9700K is grouped with the 9900K the 12900K should be grouped with the 5950X and other 16 cores. Also if we get an 8C/16T+20C CPU in the future that's not really an issue as it would be in the 28C rankings. If it's slower than say 28C/56T or 16C/32T+12C CPUs in things like cinebench it's back to the same situation as the 9700K vs 9900K and I fail to see the problem.

Noxinite · August 12, 2021

I feel we should be thinking less about point distribution being "unfair" or "unbalanced", as that is still an ongoing process with HWBot since the change in ownership. Aiming for simplicity and fitting into the database in a way that makes sense seems more important right now IMO.

_mat_ · August 12, 2021

Something to consider with option 1 and 4 is that the HWBOT submission API does not support a separation or thread count yet. And benchmarks don't support it either. It needs to be added on both ends. Otherwise all CPUs would end up in the same category with their CPU name.

I will add support in BenchMate for whatever is decided of course.

Little chance for GPUPI 3 as I lost the source code for both 3.2 and 3.3 due to a failed SSD and pure stupidity. Might be good timing to finally get rid of the two versions and finish GPUPI 4.

Edited August 12, 2021 by _mat_

Splave · August 13, 2021

What if you just went by p core count but if they have big and little cores then you only count the big, and then you take that value and call it say 8P or 8BL for example:

11900k 8 x

12900k 8P x

It will be limited in its value of points to start sure, unless you juiced the globals a bit without requiring a certain number of subs. Or just wait they will eventually fill up. Certainly this style processor is the future and this way people can still enjoy benching standard processors without diluting the new ones into multiple categories.

This seems like the easiest way to fix this and can always be revisited down the line if we see crazy 8P 64L cpus with 80 threads etc.

SparkysAdventure · August 13, 2021

So, let's say you're running Cinebench R20.

Would the benchmark run on both the high performance and the high efficiency cores? Or will it run on just one set?

Leeghoofd · August 13, 2021

2 hours ago, Sparky's__Adventure said:

So, let's say you're running Cinebench R20.

Or will it run on just one set?

if it was the latter we would not have to debate

_mat_ · August 13, 2021

I think that's a very important question. Theoretically it depends a lot on the benchmark and the performance of the little cores. Let's say we have the perfect parallel workload, that really scales to the number of threads big+little cores offer.

A benchmark has to have some kind of workload scheduler to divide the calculation into smaller tasks that can be run in parallel. When a processor thread is finished with its last task, it gets the next one until all tasks are finished.

The size of a task depends on its overhead, weight of calculation, memory dependencies and so on. For example, if you split up the tasks into one simple addition each, the overhead of the scheduling is too large and the scaling will be off (32 threads will not be 32x faster than 1 thread).
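A very simplified model of that overhead effect, as a Python sketch. It assumes perfectly balanced threads, a fixed scheduling cost per task, and made-up numbers (1,000,000 units of work, 10 units of overhead per task); none of these values come from a real benchmark.

```python
def speedup(total_work: float, n_tasks: int,
            overhead_per_task: float, n_threads: int) -> float:
    """Speedup over a single thread that runs the work without any scheduling."""
    serial_time = total_work
    # Every task pays the scheduling overhead once; work is spread evenly.
    parallel_time = (total_work + n_tasks * overhead_per_task) / n_threads
    return serial_time / parallel_time


WORK = 1_000_000   # hypothetical units of calculation
OVERHEAD = 10      # hypothetical scheduling cost per task, same units
THREADS = 32

tiny_tasks = speedup(WORK, n_tasks=1_000_000, overhead_per_task=OVERHEAD, n_threads=THREADS)
big_tasks = speedup(WORK, n_tasks=320, overhead_per_task=OVERHEAD, n_threads=THREADS)

print(f"one unit of work per task: {tiny_tasks:.2f}x")
print(f"320 large tasks:           {big_tasks:.2f}x")
```

With one unit of work per task, the 32 threads end up less than 3x faster than a single thread in this model, while 320 large tasks get close to the full 32x.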

Normally the tasks have a significant size to eliminate any overhead. For CB the task size depends on the number of threads available according to the yellow boxes being rendered in parallel. In GPUPI you can select the task size with the batch size.

This normally works pretty well when you have equally fast threads. The scheduler assumes that all tasks will take an equal amount of time and will schedule the tasks in a way so that all threads will be busy until all tasks are done. It's really important that they are done at nearly the same time, because the benchmark has to wait until all tasks are finished to show you the juicy final number. That's often the reason for score variance and why single-threaded benches are more stable than running workloads with 64 threads or more. A good run might just be a little bit of luck because the tasks finished nearly at the same time.

With Big.Little we now have slower threads available to the scheduler. They will need more time to finish a single task. If the thread scheduler still assumes that all threads are equally fast and always gives out tasks to the next available thread, this will become a problem at the end of the calculation. It might hand off the last task to a slow thread and you will have to wait for it to finish while all other threads are already idle.
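Here is a toy simulation of that situation in Python: a scheduler that always hands the next task to whichever thread becomes free first, without knowing how fast each thread is. The per-task times (1 second on a fast thread, 4 seconds on a slow one) are invented for illustration and say nothing about real P-Core or E-Core performance.

```python
import heapq


def makespan(task_times: list[float], n_tasks: int) -> float:
    """Time until all tasks are done when each task goes to the next free thread.

    task_times[i] is how long thread i needs for one task.
    """
    # Heap of (time the thread becomes free, thread index)
    free_at = [(0.0, i) for i in range(len(task_times))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(n_tasks):
        start, i = heapq.heappop(free_at)
        done = start + task_times[i]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish


fast_only = [1.0] * 8            # 8 fast threads
mixed = [1.0] * 8 + [4.0] * 8    # plus 8 slow threads (assumed 4x slower)

for n_tasks in (56, 640):
    print(n_tasks, "tasks:",
          "fast only", makespan(fast_only, n_tasks),
          "| mixed", makespan(mixed, n_tasks))
```

With only 56 large tasks, the mixed setup finishes later than the 8 fast threads alone, because the slow threads pick up tasks near the end and everyone waits for them. With 640 tasks the slow threads help, since the tail is small compared to the total run time.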

You can try this btw in GPUPI when you mix one or more high-end cards with a single low-end card. The low-end card will hurt the final score instead of helping it with increased CU count.

It's all highly theoretical but in my opinion B.L will hurt some benches just the same way HT can hurt. So my guess is that it will have to be disabled for some benches depending on how they are implemented.

Maybe a flag in the ranking next to the score will be good enough to show that B.L was enabled and gave a boost.

yosarianilives · August 15, 2021

Let's remember that the little cores are supposed to be roughly as fast as Skylake so not exactly completely weak cores. It should be total number of cores, if Intel OR amd choose to make half their cores suck then it just will do worse in its core rankings. No need to make hwbot any more complicated, count all the cores and that's the ranking the cpu goes in.

Mr.Scott · August 15, 2021

9 hours ago, yosarianilives said:

Let's remember that the little cores are supposed to be roughly as fast as Skylake so not exactly completely weak cores. It should be total number of cores, if Intel OR amd choose to make half their cores suck then it just will do worse in its core rankings. No need to make hwbot any more complicated, count all the cores and that's the ranking the cpu goes in.

Exactly.

Leeghoofd · August 16, 2021

There's more to it than just the sum of the cores. As Mat stated above, what matters is how well the task scheduler/benchmark can address them (hopefully Win10 will support it properly too). My point of view is that we can't simply add up the cores and compare these new CPU gens to current core models.
