I worry about runtime checks that make decisions about how to run the benchmark.
Geekbench 3 and Geekbench 4 include runtime checks that modify the benchmark to accommodate older, less-capable hardware. These checks are what folks are using to "tweak" their Geekbench scores.
My preference would be a "one size fits all" benchmark that runs the same workloads the same way regardless of the underlying hardware. That may limit the range of hardware the benchmark supports, but I think it's the best, most robust approach to having a sane benchmark that is resistant to hacks.
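To make the distinction concrete, here is a minimal sketch of the two approaches. It is not Geekbench's actual code: the `Device` type, the memory threshold, and the workload sizes are all hypothetical, chosen only to show why a check that adapts the workload can be gamed while a fixed workload cannot.

```python
from dataclasses import dataclass

# Hypothetical values, for illustration only (not taken from Geekbench).
FIXED_WORKLOAD_ITEMS = 1_000_000
MIN_MEMORY_MB = 2048


@dataclass(frozen=True)
class Device:
    """Hypothetical description of what the hardware reports about itself."""
    name: str
    memory_mb: int


def adaptive_workload(device: Device) -> int:
    """Runtime-check style: shrink the workload on less-capable hardware.

    Whatever the device reports decides what gets measured, so a device
    that misreports its capabilities runs a different benchmark.
    """
    if device.memory_mb < MIN_MEMORY_MB:
        return FIXED_WORKLOAD_ITEMS // 4
    return FIXED_WORKLOAD_ITEMS


def fixed_workload(device: Device) -> int:
    """One-size-fits-all style: same workload everywhere, or refuse to run."""
    if device.memory_mb < MIN_MEMORY_MB:
        raise RuntimeError(f"{device.name}: below minimum requirements")
    return FIXED_WORKLOAD_ITEMS


honest = Device("honest", memory_mb=4096)
spoofed = Device("spoofed", memory_mb=1024)  # same hardware, misreported

print(adaptive_workload(honest))   # full workload
print(adaptive_workload(spoofed))  # reduced workload: scores not comparable
print(fixed_workload(honest))      # full workload
```

In the fixed variant the only thing a misreported capability can do is stop the benchmark from running, which is the trade-off: less hardware is supported, but every score comes from the same work.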