This article presents a few informal benchmarks comparing the original Raspberry Pi with the Raspberry Pi 2. The original Pi has a single-core ARMv6 processor. The Pi 2 is quad-core, ARMv7, and clocked faster than the Pi 1. But is it really six times as fast, as the makers claim? Short answer: yes, it is. And then some.
Single Core Test
Gzipping a file is a CPU intensive task. Let’s see how both Pis perform when asked to gzip a large file of 237 MB. The Pi1 will go first:
    pi1 $ ls -lh
    -rwxr-xr-x 1 root root 237M Feb  7 19:56 bigfile.mp4
    pi1 $ time gzip bigfile.mp4
    real    4m25.768s
    user    4m2.430s
    sys     0m11.770s
266 seconds on the Pi1.
Now the Pi2:
    pi2 $ ls -lh
    -rwxr-xr-x 1 root root 237M Feb  7 19:56 bigfile.mp4
    pi2 $ time gzip bigfile.mp4
    real    1m31.396s
    user    1m24.870s
    sys     0m3.520s
91 seconds on the Pi2.
A speed improvement of around 2.9x (266/91). Repeating the test gave speed ratios of around 3, sometimes just over, sometimes just under.
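The ratio can be checked directly from the "real" values printed by time. The following is a minimal sketch, assuming a POSIX shell and awk; the helper names to_seconds and speedup are made up for illustration and are not part of any tool.

```shell
# Convert a `time`-style value such as 4m25.768s into seconds.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, part, "m")        # part[1] = minutes, part[2] = "25.768s"
        sub(/s$/, "", part[2])     # drop the trailing "s"
        printf "%.3f\n", part[1] * 60 + part[2]
    }'
}

# Speed-up = slower time / faster time, to two decimal places.
speedup() {
    awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", slow / fast }'
}

speedup 4m25.768s 1m31.396s    # Pi1 gzip vs Pi2 gzip; prints 2.91
```

Working from the unrounded timings gives 2.91, slightly below the ratio obtained from the rounded 266 and 91 seconds.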
Yes, the Pi2 is 3 times as fast as the Pi1. Just as Eben Upton said it was, in his interview with The Register in February 2015.
Multi Core Test
Hang on though, doesn’t the Pi Foundation claim a performance improvement of “at least 6x” for the new Pi2, as mentioned in the same Register article? Ah, well, the Pi2 has four CPU cores, and only one of them was used in the test above. Fine, let’s get all four cores running and see what happens.
Split that big file into four equal chunks as follows.
    pi2 $ split -n 4 bigfile.mp4
    pi2 $ ls -lh
    -rwxr-xr-x 1 root root 237M Feb  7 19:56 bigfile.mp4
    -rwxr-xr-x 1 root root  60M Feb 22 16:53 xaa
    -rwxr-xr-x 1 root root  60M Feb 22 16:53 xab
    -rwxr-xr-x 1 root root  60M Feb 22 16:53 xac
    -rwxr-xr-x 1 root root  60M Feb 22 16:53 xad
The big 237MB file has been split into 4 pieces of roughly 60MB each, called xaa, xab, xac and xad. The following command will create 4 processes, gzipping all four pieces at the same time:
    pi2 $ time ls x* | xargs -t -n 1 -P 4 gzip
    real    0m32.472s
    user    1m56.110s
    sys     0m5.790s
By using four cores in parallel, the Pi2 has zipped all the data in 32 seconds, improving on the Pi1’s performance by a factor of about 8.2 (265.8/32.5). That, as they say, is a spicy meat ball. And it certainly backs up claims of an “at least 6x” performance improvement.
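The split-and-compress approach can be tried on any Linux machine without a 237MB video to hand. The sketch below assumes GNU coreutils (for split -n) and an xargs that supports -P. The file sample.bin and the function name parallel_gzip_demo are stand-ins invented for this example. It also checks that the four compressed chunks, concatenated in order, decompress back to the original data.

```shell
# Split a file into 4 chunks, gzip them in parallel, verify the round trip.
parallel_gzip_demo() (
    workdir=$(mktemp -d) || exit 1
    cd "$workdir" || exit 1

    # Small stand-in for bigfile.mp4: 4 MiB of random data.
    head -c 4194304 /dev/urandom > sample.bin

    split -n 4 sample.bin                        # creates xaa xab xac xad
    printf '%s\n' x?? | xargs -n 1 -P 4 gzip     # one gzip per chunk, 4 at once

    # Concatenated gzip members decompress to the concatenated input.
    if cat x??.gz | gzip -dc | cmp -s - sample.bin; then
        echo "round trip ok"
    else
        echo "round trip FAILED"
    fi

    cd / && rm -rf "$workdir"
)

parallel_gzip_demo
```

On a machine with a different number of cores, -P "$(nproc)" is one way to match the number of parallel jobs to the hardware.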
Note. Here is an example of a top command run during the gzip test. Each of the Pi2’s four CPUs is running at nearly 100%.
    PID  USER  PR  NI  VIRT  RES   SHR   S  %CPU   %MEM  TIME+    COMMAND
    2398 root  20   0  2292  1328  1044  R  100.0   0.2  0:28.84  gzip
    2400 root  20   0  2292  1320  1032  R  100.0   0.2  0:29.02  gzip
    2401 root  20   0  2292  1280   996  R  100.0   0.2  0:29.02  gzip
    2399 root  20   0  2292  1280   988  R   92.1   0.2  0:27.39  gzip
More CPU Intensive
The Pi2 is outperforming the Pi1 by a factor of about 8.2. But can it do more? To flex those Pi2 muscles properly, let’s take an even more CPU-bound task. Calculating a checksum on that big file will get those CPU cores really humming. Check out the Pi1:
    pi1 $ time sha512sum bigfile.mp4
    real    5m37.850s
    user    5m36.140s
    sys     0m1.270s
and the Pi2:
    pi2 $ time sha512sum bigfile.mp4
    real    1m28.688s
    user    1m28.120s
    sys     0m0.560s
With a more CPU-intensive task, the Pi2, using only one core, outperformed the Pi1 by a factor of 338/89 = 3.8.
And when we let all 4 cores in on the act…
    pi2 $ time ls x* | xargs -t -n 1 -P 4 sha512sum
    real    0m23.397s
    user    1m29.300s
    sys     0m0.500s
The Pi2 outperformed the Pi1 by a factor of about 14.4 (337.9/23.4). The Pi2 also exceeded its own single-core performance by a factor of 88.7/23.4 = 3.79, close to the factor of 4 that the number of cores would naturally suggest.
Note: The “time” command records 1 minute and 29 seconds of “user” time, even though only 23.397 seconds passed in real time. This is not an error: “user” time is CPU time, summed across all the processes and therefore across all four cores, while “real” time is elapsed wall-clock time.
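Dividing CPU time by elapsed time gives a rough figure for how many cores were kept busy on average. This is a small sketch using the timings above, converted to seconds. It assumes a POSIX shell and awk, and the function name cores_busy is invented for the example.

```shell
# (user + sys) / real = average number of cores kept busy.
# Arguments are in seconds: real, user, sys.
cores_busy() {
    awk -v real="$1" -v user="$2" -v sys="$3" \
        'BEGIN { printf "%.2f\n", (user + sys) / real }'
}

cores_busy 23.397 89.300 0.500    # four parallel sha512sum runs; prints 3.84
cores_busy 88.688 88.120 0.560    # single sha512sum run; prints 1.00
```

A result close to 4 for the parallel run and close to 1 for the single run fits the picture of one busy core per process.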
For single-threaded, CPU-intensive processes, the Pi2 is typically 3 times faster than the Pi1, rising to 3.8 or so for very CPU-bound processes. For workloads that can use all four cores, the improvement is 6x or more: about 8.2x when compressing a file, and about 14x for purely CPU-bound activities such as checksumming data. This is easily demonstrated on the command line and confirms the Pi Foundation’s claims.
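All of these ratios can be recomputed in one pass from the "real" timings reported earlier, converted to seconds. The following is a sketch only. The labels pi2-1core and pi2-4cores and the function name speedup_table are made up for the example, and a POSIX shell with awk is assumed.

```shell
# Print each Pi2 result as a speed-up over the Pi1 baseline for that workload.
speedup_table() {
    awk '
        $2 == "pi1" { base[$1] = $3; next }    # remember the Pi1 time per workload
        { printf "%-10s %-12s %.2fx\n", $1, $2, base[$1] / $3 }
    ' <<'EOF'
gzip      pi1          265.768
gzip      pi2-1core    91.396
gzip      pi2-4cores   32.472
sha512sum pi1          337.850
sha512sum pi2-1core    88.688
sha512sum pi2-4cores   23.397
EOF
}

speedup_table
```

With the unrounded timings this gives roughly 2.91x and 8.18x for gzip, and 3.81x and 14.44x for sha512sum.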
The performance jump offered by the Pi2 will also be obvious to anyone using the LXDE GUI, the Epiphany web browser or any other large, multi-threaded application. To check if an app is multi-threaded, use ps -eLfP. For example, Epiphany is now running on my Pi2 with 11 threads, scattered like snow across the 4 processors. The PSR column identifies the core:
    pi2 $ ps -eLfP | grep epip
    UID   PID  PPID   LWP PSR  C NLWP STIME TTY   TIME     CMD
    pi   2336     1  2336   2  6   11 23:03 tty1  00:00:09 epiphany-browser
    pi   2336     1  2337   2  0   11 23:03 tty1  00:00:00 epiphany-browser
    pi   2336     1  2338   3  0   11 23:03 tty1  00:00:00 epiphany-browser
    pi   2336     1  2342   2  0   11 23:03 tty1  00:00:00 epiphany-browser
    pi   2336     1  2348   3  0   11 23:03 tty1  00:00:00 epiphany-browser
    pi   2336     1  2350   1  0   11 23:03 tty1  00:00:00 epiphany-browser
    pi   2336     1  2353   0  0   11 23:03 tty1  00:00:00 epiphany-browser
    pi   2336     1  2354   1  0   11 23:03 tty1  00:00:00 epiphany-browser
    pi   2336     1  2355   0  0   11 23:03 tty1  00:00:00 epiphany-browser
    pi   2336     1  2356   2  0   11 23:03 tty1  00:00:00 epiphany-browser
    pi   2336     1  2377   0  0   11 23:04 tty1  00:00:00 epiphany-browser
The same command (without the grep) will show many LXDE background processes and threads running across all four cores.
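For scripting, the per-thread core numbers can also be read straight from /proc instead of parsing ps output. This is a sketch under the assumption of a Linux /proc filesystem, where field 39 of each thread's stat file is the CPU the thread last ran on. The function name threads_per_core is invented for the example.

```shell
# Count a process's threads per CPU core using /proc (Linux only).
threads_per_core() {
    for task in /proc/"$1"/task/*; do
        # Strip "pid (comm) " first, because comm may contain spaces.
        # What remains starts at field 3, so field 39 becomes $37.
        sed 's/^.*) //' "$task/stat" 2>/dev/null | awk '{ print $37 }'
    done | sort -n | uniq -c |
        awk '{ printf "core %s: %s thread(s)\n", $2, $1 }'
}

threads_per_core "$$"    # the current shell: a single thread on one core
```

Passing the PID of a multi-threaded program, such as the browser PID shown above, would list one line per core in use. The value is only a snapshot, since the scheduler is free to move threads between cores.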
Question: In the test on the Pi2 above, how do we know that there is a gzip process running on each CPU core? Might they not all be running on one core?
Answer: Take a look at the top output above. Four gzip processes, each taking over 90% CPU time, for a total of about 392%. A single core cannot deliver more than 100%, so that is only possible if the gzip processes are spread across separate CPU cores. Also, note the “R” in the Status (“S”) column. R means running or runnable, which is consistent with all 4 processes being on a CPU (in a core) at the same time.
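The total can be confirmed by summing the %CPU column of the top output shown earlier. This is a minimal sketch assuming awk and the default top column order, where %CPU is the ninth field. The function name sum_cpu is made up for the example.

```shell
# Sum the %CPU column (field 9) for every gzip line in top-style output.
sum_cpu() {
    awk '$NF == "gzip" { total += $9 } END { printf "%.1f\n", total }'
}

sum_cpu <<'EOF'
2398 root 20 0 2292 1328 1044 R 100.0 0.2 0:28.84 gzip
2400 root 20 0 2292 1320 1032 R 100.0 0.2 0:29.02 gzip
2401 root 20 0 2292 1280  996 R 100.0 0.2 0:29.02 gzip
2399 root 20 0 2292 1280  988 R  92.1 0.2 0:27.39 gzip
EOF
# prints 392.1
```

Anything above 100% cannot come from a single core, so a total near 400% points to all four cores being in use.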
We could also have typed ps -elfP while the test was running:
    pi2 $ ps -elfP | grep gzip
    F S UID   PID  PPID PSR  C PRI  NI ADDR  SZ WCHAN STIME TTY   TIME     CMD
    0 R root 2935  2934   2 91  80   0 -    573 ?     20:29 pts/0 00:00:11 gzip xaa
    0 R root 2936  2934   3 91  80   0 -    573 ?     20:29 pts/0 00:00:11 gzip xab
    0 R root 2937  2934   1 90  80   0 -    573 ?     20:29 pts/0 00:00:11 gzip xac
    0 R root 2938  2934   0 87  80   0 -    573 ?     20:29 pts/0 00:00:11 gzip xad
The PSR column shows the ID of the CPU running each process. Each gzip command is indeed on a different CPU (i.e. core). And again, the S (“Status”) column shows every gzip in the R state.
All tests were performed using the same external USB2 disk.
The timings shown in the article are averaged, calculated from 3 or 4 runs of each test.
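Averaging over several runs can be scripted. The sketch below is not the procedure used for the figures above. It assumes GNU date (for the %N nanosecond format) as a stand-in for the "real" value reported by time, and the function name avg_real_time is invented for the example.

```shell
# Average wall-clock time of a command over N runs, in seconds.
# Usage: avg_real_time RUNS COMMAND [ARGS...]
avg_real_time() (
    runs=$1
    shift
    total=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s.%N)
        "$@" > /dev/null 2>&1                # output is discarded, only time matters
        end=$(date +%s.%N)
        total=$(awk -v t="$total" -v s="$start" -v e="$end" \
            'BEGIN { printf "%.3f", t + e - s }')
        i=$((i + 1))
    done
    awk -v t="$total" -v n="$runs" 'BEGIN { printf "%.3f\n", t / n }'
)

# Example: avg_real_time 3 sha512sum bigfile.mp4
```

Two caveats apply. Repeat runs may read the file from the page cache rather than the disk, which can flatter later runs. And gzip replaces its input file, so a repeatable test would need something like gzip -c bigfile.mp4, so that every run sees the same input.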