Note 3’s benchmarking “adjustments” inflate scores by up to 20%

We dig into the boosting shenanigans and find they apply specifically to benchmark apps.

We noticed an odd thing while testing the Samsung Galaxy Note 3: it scores really, really well in benchmark tests—puzzlingly well, in fact. A quick comparison of its scores to the similarly specced LG G2 makes it clear that something fishy is going on, because Samsung's 2.3GHz Snapdragon 800 blows the doors off LG's 2.3GHz Snapdragon 800. What makes one Snapdragon so different from the other?

After a good bit of sleuthing, we can confidently say that Samsung appears to be artificially boosting the US Note 3's benchmark scores with a special, high-power CPU mode that kicks in when the device runs a large number of popular benchmarking apps. Samsung did something similar with the international Galaxy S 4's GPU, but this is the first time we've seen the boost on a US device. We also found a way to disable this special CPU mode, so for the first time we can see just how much Samsung's benchmark optimizations affect benchmark scores.

Left: The Note 3 idling normally, with 3 cores off, and one in a low-power mode. Right: The Note 3 in a benchmarking app, unable to idle.

The smoking gun here is CPU idle speeds, which can be viewed with a system monitor app while using the phone. The above picture shows how differently the CPU treats a benchmarking app from a normal app. Normally, while the Note 3 is idling, three of the four cores shut off to conserve power; the remaining core drops down to a low-power 300MHz mode. However, if you load up just about any popular CPU benchmarking app, the Note 3 CPU locks into 2.3GHz mode, the fastest speed possible, and none of the cores ever shut off. Stopping the CPU from idling shouldn't in and of itself affect the benchmark scores a whole lot, so this was our first sign that something was wrong. Benchmarks exist to measure the performance of a phone during normal usage, and a device should never treat a benchmark app differently than a normal app.
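The same idle behavior can be observed without a system monitor app. What follows is a minimal Java sketch, assuming the standard Linux cpufreq sysfs layout under /sys/devices/system/cpu (the per-core `online` and `cpufreq/scaling_cur_freq` files). The class and method names are illustrative and not taken from any Samsung software, and some devices restrict or omit these files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch: report each core's state from the Linux cpufreq sysfs tree.
final class CpuIdleProbe {

    /**
     * Turns raw sysfs values into a label.
     * online: contents of cpuN/online, or null if the file is missing.
     * curFreqKhz: contents of cpuN/cpufreq/scaling_cur_freq (in kHz), or null if missing.
     */
    static String coreState(String online, String curFreqKhz) {
        if (online != null && online.trim().equals("0")) {
            return "offline";
        }
        if (curFreqKhz == null || curFreqKhz.trim().isEmpty()) {
            return "unknown";
        }
        long khz = Long.parseLong(curFreqKhz.trim());
        return (khz / 1000) + " MHz";
    }

    // Reads a small text file, returning null when it is missing or unreadable.
    private static String readOrNull(Path p) {
        try {
            return new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
        } catch (IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get("/sys/devices/system/cpu");
        for (int core = 0; core < 4; core++) { // four cores, as on the Note 3
            Path dir = root.resolve("cpu" + core);
            String online = readOrNull(dir.resolve("online"));
            String freq = readOrNull(dir.resolve("cpufreq").resolve("scaling_cur_freq"));
            System.out.println("cpu" + core + ": " + coreState(online, freq));
        }
    }
}
```

Polling this once a second while switching between a normal app and a benchmark would, under these assumptions, show the difference described above: cores going offline and dropping to low clocks in one case, and staying pinned at the top frequency in the other.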

While it's difficult to determine every bit of special programming that affects the CPU while a benchmark is running, one sure-fire way to see what's going on is to trick the phone into not entering a special "benchmark mode" during a benchmark. If we could defeat this behavior, we could have before and after benchmark numbers and thus see just how deep the rabbit hole goes. A bit of testing showed that the device's boosted benchmark mode is triggered by the package names of the most popular benchmarking apps—loading Geekbench, for example, starts this mode. So we slapped together "Stealthbench," a renamed version of Geekbench 3. By disassembling the benchmarking app, changing only the package name, and reassembling it, we could run the app without the CPU knowing we were running a benchmark app. The Note 3 should treat our benchmark like any other app and give a true representation of the phone's performance relative to other devices.
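The renaming step can be sketched in a few lines of Java. This is not the tooling we used, only an illustration that assumes the app has already been decoded so that AndroidManifest.xml is plain text with a `package` attribute on its root `<manifest>` element. The class name, the sample manifest, and the replacement name `com.example.stealthbench` are all hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: swap the package attribute in a decoded AndroidManifest.xml.
final class ManifestRenamer {
    // Matches the package="..." attribute on the root <manifest> element.
    private static final Pattern PACKAGE_ATTR =
            Pattern.compile("(<manifest\\b[^>]*?\\bpackage=\")[^\"]*(\")");

    static String renamePackage(String manifestXml, String newPackage) {
        Matcher m = PACKAGE_ATTR.matcher(manifestXml);
        if (!m.find()) {
            throw new IllegalArgumentException("no package attribute on <manifest>");
        }
        // quoteReplacement keeps '$' or '\' in the new name from being read as group references.
        return m.replaceFirst("$1" + Matcher.quoteReplacement(newPackage) + "$2");
    }

    public static void main(String[] args) {
        // Hypothetical manifest; the old package name is one that appears in Samsung's list.
        String manifest = "<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n"
                + "    package=\"ca.primatelabs.geekbench2\">\n</manifest>";
        System.out.println(renamePackage(manifest, "com.example.stealthbench"));
    }
}
```

A real rename usually needs more than this one substitution: component names written relative to the old package have to be updated, and the rebuilt app has to be signed again before it will install.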

Left: Geekbench, which triggers the benchmark booster. Right: A renamed version of Geekbench, which defeats the booster.

Above is a picture of Geekbench and of Stealthbench, which is identical to Geekbench in every way except for a different package name. With Geekbench, System Monitor shows that the CPU is locked into 2.3GHz mode and all cores are active, but in Stealthbench, the CPU is allowed to idle, shut off cores, and switch power modes, the same way it does in any other app. We have successfully disabled the special benchmark mode.

Geekbench is a popular benchmarking app, so the Note is programmed to give it special treatment. It has never heard of "Stealthbench," though, so despite being the exact same app, it does not get the special benchmark boost. The Note will run this benchmark like 99.99999 percent of the other apps on the device. The next step, then, is to run the two benchmarks and compare the CPU's benchmark mode with its non-benchmark mode.

The difference is remarkable. In Geekbench's multicore test, the Note 3's benchmark mode gives the device a 20 percent boost over its "natural" score. With the benchmark boosting logic stripped away, the Note 3 drops down to LG G2 levels, which is where we initially expected the score to be, given the identical SoCs. A boost this big means that the Note 3 is not just messing with the CPU idle levels; significantly more oomph is unlocked when the device runs a benchmark.
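For clarity, "boost" here means the boosted score's gain relative to the unboosted one. A small Java sketch of that arithmetic follows; the class name is illustrative, and the two scores in `main` are placeholders chosen to make the math easy, not measured results.

```java
// Illustrative sketch: percentage gain of a boosted score over the natural score.
final class BoostMath {

    static double percentBoost(double boosted, double natural) {
        if (natural <= 0) {
            throw new IllegalArgumentException("natural score must be positive");
        }
        return (boosted - natural) * 100.0 / natural;
    }

    public static void main(String[] args) {
        // Placeholder scores, NOT measurements from the Note 3.
        double natural = 2500;
        double boosted = 3000;
        System.out.println(percentBoost(boosted, natural) + " percent"); // prints "20.0 percent"
    }
}
```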

The next step was to figure out just which apps are affected by this. In Samsung's official statement about the international Galaxy S 4's GPU "benchmark booster," the company said that the GPU frequency boost happened in a number of other apps, "such as the S Browser, Gallery, Camera, [and] Video Player." The previously provided reasons for the speed variations seemed at least somewhat defensible, since they could help preserve battery life and generate less heat. If battery life preservation and heat reduction were the reasons behind what was going on here, the "boosting" might be similarly defensible.

But that doesn't appear to be the reason behind the Note's boosting at all. We found the file that triggers the boost behavior, and after a lot of extracting, disassembly, and file conversion, we have human-readable Java code for it:

{
BOARD_PLATFORM = SystemProperties.get("ro.board.platform");
mToken = 0;
PACKAGES_FOR_LCD_FRAME_RATE_ADJUSTMENT = new PackageInfo[0];
isEngBinary = "eng".equals(Build.TYPE);
PackageInfo[] arrayOfPackageInfo = new PackageInfo[26];
arrayOfPackageInfo[0] = new PackageInfo("com.aurorasoftworks.quadrant.ui.standard", false);
arrayOfPackageInfo[1] = new PackageInfo("com.aurorasoftworks.quadrant.ui.advanced", false);
arrayOfPackageInfo[2] = new PackageInfo("com.aurorasoftworks.quadrant.ui.professional", false);
arrayOfPackageInfo[3] = new PackageInfo("com.redlicense.benchmark.sqlite", false);
arrayOfPackageInfo[4] = new PackageInfo("com.antutu.ABenchMark", false);
arrayOfPackageInfo[5] = new PackageInfo("com.greenecomputing.linpack", false);
arrayOfPackageInfo[6] = new PackageInfo("com.greenecomputing.linpackpro", false);
arrayOfPackageInfo[7] = new PackageInfo("com.glbenchmark.glbenchmark27", false);
arrayOfPackageInfo[8] = new PackageInfo("com.glbenchmark.glbenchmark25", false);
arrayOfPackageInfo[9] = new PackageInfo("com.glbenchmark.glbenchmark21", false);
arrayOfPackageInfo[10] = new PackageInfo("ca.primatelabs.geekbench2", false);
arrayOfPackageInfo[11] = new PackageInfo("com.eembc.coremark", false);
arrayOfPackageInfo[12] = new PackageInfo("com.flexycore.caffeinemark", false);
arrayOfPackageInfo[13] = new PackageInfo("eu.chainfire.cfbench", false);
arrayOfPackageInfo[14] = new PackageInfo("gr.androiddev.BenchmarkPi", false);
arrayOfPackageInfo[15] = new PackageInfo("com.smartbench.twelve", false);
arrayOfPackageInfo[16] = new PackageInfo("com.passmark.pt_mobile", false);
arrayOfPackageInfo[17] = new PackageInfo("se.nena.nenamark2", false);
arrayOfPackageInfo[18] = new PackageInfo("com.samsung.benchmarks", false);
arrayOfPackageInfo[19] = new PackageInfo("com.samsung.benchmarks:db", false);
arrayOfPackageInfo[20] = new PackageInfo("com.samsung.benchmarks:es1", false);
arrayOfPackageInfo[21] = new PackageInfo("com.samsung.benchmarks:es2", false);
arrayOfPackageInfo[22] = new PackageInfo("com.samsung.benchmarks:g2d", false);
arrayOfPackageInfo[23] = new PackageInfo("com.samsung.benchmarks:fs", false);
arrayOfPackageInfo[24] = new PackageInfo("com.samsung.benchmarks:ks", false);
arrayOfPackageInfo[25] = new PackageInfo("com.samsung.benchmarks:cpu", false);
PACKAGES_FOR_BOOST_ALL_ADJUSTMENT = arrayOfPackageInfo;
mCameraCPUBooster = null;
mCameraCPUCoreNumBooster = null;
mCPUFrequencyTable = null;
mCPUCoreTable = null;
mRotationCPUCoreNumBooster = null;
mRotationGPUBooster = null;
}

The file we ended up with is called "DVFSHelper.java," and it contains a hard-coded list of every package that is affected by the special CPU boosting mode. According to this file, the boost is applied exclusively to benchmarks, and it seems to cover all the popular ones. There's Geekbench, Quadrant, Antutu, Linpack, GFXBench, and even some of Samsung's own benchmarks. Two package arrays are declared here: "PACKAGES_FOR_BOOST_ALL_ADJUSTMENT," which holds the benchmark list and is no doubt the CPU booster, and "PACKAGES_FOR_LCD_FRAME_RATE_ADJUSTMENT," which is empty in this snippet but whose name makes it sound like the phone can also alter the display frame rate.
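A hard-coded list like this explains why a renamed app slips through. The Java sketch below is our own illustration, not Samsung's code: it assumes the check is an exact string match on the package name, which is consistent with what we observed but not something the decompiled excerpt proves. The class and method names and the `com.example.stealthbench` package are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch: exact-match lookup against a hard-coded package list.
final class BoostListCheck {
    // A few entries copied from the decompiled array; the original holds 26.
    private static final Set<String> BOOSTED = new HashSet<>(Arrays.asList(
            "com.aurorasoftworks.quadrant.ui.standard",
            "com.antutu.ABenchMark",
            "com.greenecomputing.linpack",
            "ca.primatelabs.geekbench2"));

    // Assumption: the boost is keyed on an exact package-name match.
    static boolean wouldBoost(String packageName) {
        return BOOSTED.contains(packageName);
    }

    public static void main(String[] args) {
        System.out.println(wouldBoost("ca.primatelabs.geekbench2")); // true
        System.out.println(wouldBoost("com.example.stealthbench"));  // false
    }
}
```

Under that assumption, nothing about the app's code or workload matters to the lookup. Only the name does, which is why changing the name alone is enough to get normal treatment.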

The inclusion of GFXBench is surprising given that it shows no unusual idling behavior in System Monitor. Between the inclusion of that and the suspicious "frame rate adjustment" string, it's clear that Samsung is doing something to the GPU as well, though those clock speeds are more difficult to access than the CPU speeds (a method used by AnandTech on the international S 4 no longer works on the Note 3).

The "DVFS" in "DVFSHelper" stands for "dynamic voltage and frequency scaling," the mechanism behind CPU throttling, which has many legitimate uses in managing both heat and power draw. This file contains a few special settings for the camera, Gallery, and some other packed-in apps, but nothing like what is in the above section. Benchmarking apps are the only type of app that is systematically called out and boosted.

To see how some other benchmarks are affected, we made "stealth" versions of those, too: the exact same app, just with a different package name. These results back up the Geekbench findings. We're seeing artificial benchmark increases of about 20 percent across the board, and Linpack showed a gap of about 50 percent between its boosted and unboosted runs.

The ironic thing is that even with the benchmark booster disabled, the Note 3 still comes out faster than the G2 in this test. If the intent behind the boosting was simply to ensure that the Note 3 came out ahead in the benchmark race, it doesn't appear to have been necessary in the first place.

While benchmarks are often a good way to get an idea of a phone or tablet's general performance, they only work if they're treated the same as any other app. Now that this artificial boosting has shown up in multiple Android devices, we'll be keeping a much closer eye on how phones and tablets behave while running these benchmarks in the future. We're also seeing some evidence that Samsung's new Galaxy Note 10 tablet is exhibiting the same behavior—we'll be posting that review later today.
