Storage benchmarks are hardly as stable in their output as, e.g., SuperPi... this might be one of those run-and-re-run benchmarks. The 2nd run is always faster here on my daily system, the 3rd run slower... It's widely used by review sites, so it can't be that bad, right?
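For what it's worth, one way to deal with a run-and-re-run benchmark is to do several passes and quote the median plus the spread instead of a single score. A rough sketch in Python, assuming you note the scores by hand; the function name `summarize_runs` and the numbers below are made up for illustration and are not real results:

```python
import statistics

def summarize_runs(scores):
    """Return the median score and the relative spread (max - min) / median."""
    if len(scores) < 3:
        raise ValueError("need at least 3 runs to judge run-to-run variation")
    med = statistics.median(scores)
    spread = (max(scores) - min(scores)) / med
    return med, spread

# Hypothetical scores from five back-to-back runs (illustrative only)
runs = [4980, 5120, 4890, 5050, 5010]
median, spread = summarize_runs(runs)
print(f"median={median}, spread={spread:.1%}")
```

If the spread between runs is bigger than the gap between the two results you are comparing, the difference is probably just noise.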
And if we count only the GPU score, are you sure it does not scale with processor architecture? A 3D score will only be purely GPU-bound IF the tested GPU is fully maxed out by the benchmark and/or we impose low CPU clock speeds.