Raspberry Pi Zero W Network Throughput

Raspberry Pi Zero W in operation

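The Raspberry Pi Zero W (hereafter RPi0W) has only a wireless LAN interface, so for this test I placed it 1 m from the wireless LAN AP with no obstacles in between, and measured with the bare board so that a case would not affect the result. The performance of the wireless LAN AP may also influence the numbers, but I am deliberately ignoring that here. Between the router and the RPi0W there are one L2 switch and one wireless LAN AP. Only the link between the wireless LAN AP and the RPi0W is wireless; everything else is wired. In network terms the two are one hop apart.
Each test was run three times and the median value was taken.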
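
The "run three times, take the median" step can be sketched as below. This is only an illustration, assuming the iperf 2 summary line format shown in the transcripts of this article. The helper names (`parse_mbps`, `median_mbps`) are hypothetical, and two of the three sample lines are made-up values, not measurements.

```python
import re
from statistics import median

# Matches the summary line printed by the iperf 2 client, e.g.
# "[  3]  0.0-10.0 sec  50.1 MBytes  42.0 Mbits/sec"
BANDWIDTH_RE = re.compile(r"([\d.]+)\s+Mbits/sec")


def parse_mbps(iperf_output):
    """Return the last Mbits/sec figure found in iperf 2 client output."""
    matches = BANDWIDTH_RE.findall(iperf_output)
    if not matches:
        raise ValueError("no Mbits/sec line found")
    return float(matches[-1])


def median_mbps(outputs):
    """Median bandwidth across several runs of the same test."""
    return median(parse_mbps(text) for text in outputs)


# Only the 42.0 Mbits/sec line comes from this article.
# The other two lines are hypothetical values used to show the selection.
runs = [
    "[  3]  0.0-10.0 sec  50.1 MBytes  42.0 Mbits/sec",
    "[  3]  0.0-10.0 sec  48.7 MBytes  40.8 Mbits/sec",
    "[  3]  0.0-10.0 sec  51.6 MBytes  43.2 Mbits/sec",
]
print(median_mbps(runs))
```

Taking the median rather than the mean keeps a single unusually good or bad run from shifting the reported figure.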

pi@raspberrypi:~ $ iperf -c 192.168.6.1
------------------------------------------------------------
Client connecting to 192.168.6.1, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.6.226 port 58398 connected with 192.168.6.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  50.1 MBytes  42.0 Mbits/sec

RPi0W → PC router (IPv4): 42.0 Mbps
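
As a sanity check on the iperf output, the Transfer and Bandwidth columns can be related with a little arithmetic. The sketch below assumes iperf 2's default units, where MBytes are binary (1024 × 1024 bytes) and Mbits are decimal (1,000,000 bits). The numbers in the transcript are consistent with that assumption, but treat it as an assumption rather than a statement about every iperf build.

```python
# Assumption: iperf 2 prints Transfer in binary megabytes
# (1 MByte = 1024 * 1024 bytes) and Bandwidth in decimal megabits
# (1 Mbit = 1,000,000 bits).
MBYTE = 1024 * 1024


def bandwidth_mbps(transfer_mbytes, seconds):
    """Convert a Transfer figure and an interval into Mbits/sec."""
    return transfer_mbytes * MBYTE * 8 / seconds / 1_000_000


# 50.1 MBytes in 10.0 s, taken from the RPi0W -> PC router IPv4 run.
print(round(bandwidth_mbps(50.1, 10.0), 1))
```

Because the interval is printed to only one decimal place, the other runs can differ from this calculation by about 0.1 Mbits/sec.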

pc@router:~ % iperf -c 192.168.6.226
------------------------------------------------------------
Client connecting to 192.168.6.226, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 192.168.6.1 port 37838 connected with 192.168.6.226 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  40.9 MBytes  34.2 Mbits/sec

PC router → RPi0W (IPv4): 34.2 Mbps

pi@raspberrypi:~ $ iperf -V -c fdc1::1
------------------------------------------------------------
Client connecting to fdc1::1, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[  3] local fdc1::8000 port 38738 connected with fdc1::1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  40.2 MBytes  33.7 Mbits/sec

RPi0W → PC router (IPv6): 33.7 Mbps

pc@router:~ % iperf -V -c fdc1::8000
------------------------------------------------------------
Client connecting to fdc1::8000, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local fdc1::1 port 27810 connected with fdc1::8000 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  44.5 MBytes  37.2 Mbits/sec

PC router → RPi0W (IPv6): 37.2 Mbps

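So roughly 30 to 40 Mbps. That is not just a little slow, it is quite slow. Ideally I would like about ten times this.
On the receive side, IPv6 does not drop compared with IPv4. Only sending over IPv6 drops, for some reason.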
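
To put numbers on the IPv4 versus IPv6 difference, here is a small sketch that computes the relative change per direction from the four figures measured above. The dictionary layout and the function name `relative_change` are just illustrative choices.

```python
# Median results from the four iperf runs above, in Mbits/sec.
results_mbps = {
    "RPi0W -> PC router": {"IPv4": 42.0, "IPv6": 33.7},
    "PC router -> RPi0W": {"IPv4": 34.2, "IPv6": 37.2},
}


def relative_change(ipv4, ipv6):
    """Percentage change of the IPv6 figure relative to the IPv4 figure."""
    return (ipv6 - ipv4) / ipv4 * 100


for direction, r in results_mbps.items():
    change = relative_change(r["IPv4"], r["IPv6"])
    print(f"{direction}: {change:+.1f}%")
```

With these inputs the send direction comes out at about -19.8% and the receive direction at about +8.8%.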

I also tried connecting to a public iperf server.

pi@raspberrypi:~ $ iperf -c iperf.he.net
------------------------------------------------------------
Client connecting to iperf.he.net, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.6.226 port 50184 connected with 216.218.227.10 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.1 sec  44.0 MBytes  36.4 Mbits/sec

At speeds this low, the result was not much different from measuring against the neighbor one hop away.

Raspberry Pi Zero W Benchmark

The Raspberry Pi Zero W (hereafter RPi0W) carries the Broadcom BCM2835 SoC, whose single-core ARM1176JZF-S CPU (ARM11 family, ARMv6) reportedly runs at 1 GHz. The Raspberry Pi Zero, the model without wireless, is the same.

Viewing CPU information on Raspbian

pi@raspberrypi:~ $ lscpu
Architecture:          armv6l
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
Model:                 7
Model name:            ARMv6-compatible processor rev 7 (v6l)
CPU max MHz:           1000.0000
CPU min MHz:           700.0000
BogoMIPS:              697.95
Flags:                 half thumb fastmult vfp edsp java tls

The CPU max frequency is shown as 1000.0000, but wasn't the ARM1176JZF-S supposed to be 990 MHz? Is the value being rounded?
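
One way to look into the rounding question is to read the raw value the kernel reports. The sketch below assumes the Linux cpufreq sysfs interface is available at its usual location. I have not confirmed this path on the RPi0W, and as far as I know `lscpu` takes its "CPU max MHz" from the same attribute, so take that as an assumption too. The helper name `khz_to_mhz` is hypothetical.

```python
from pathlib import Path

# Assumed location: the cpufreq sysfs attribute for CPU 0.
# It only exists when a cpufreq driver is loaded.
MAX_FREQ_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq")


def khz_to_mhz(text):
    """cpufreq sysfs values are integers in kHz; convert to MHz."""
    return int(text.strip()) / 1000


if MAX_FREQ_PATH.exists():
    print(f"{khz_to_mhz(MAX_FREQ_PATH.read_text()):.4f} MHz")
else:
    print("cpufreq sysfs attribute not available on this system")
```

If the raw value is exactly 1000000 kHz, the 1000.0000 shown by `lscpu` is what the kernel reports rather than something rounded at display time.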

Measuring with UnixBench

pi@raspberrypi:~ $ ./UnixBench/Run -c 1 -c 2 -c 4 
make all
make[1]: Entering directory '/home/pi/UnixBench'
Checking distribution of files
./pgms  exists
./src  exists
./testdir  exists
./tmp  exists
./results  exists
make[1]: Leaving directory '/home/pi/UnixBench'
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
sh: 1: 3dinfo: not found

   #    #  #    #  #  #    #          #####   ######  #    #   ####   #    #
   #    #  ##   #  #   #  #           #    #  #       ##   #  #    #  #    #
   #    #  # #  #  #    ##            #####   #####   # #  #  #       ######
   #    #  #  # #  #    ##            #    #  #       #  # #  #       #    #
   #    #  #   ##  #   #  #           #    #  #       #   ##  #    #  #    #
    ####   #    #  #  #    #          #####   ######  #    #   ####   #    #

   Version 5.1.3                      Based on the Byte Magazine Unix Benchmark

   Multi-CPU version                  Version 5 revisions by Ian Smith,
                                      Sunnyvale, CA, USA
   January 13, 2011                   johantheghost at yahoo period com

Wide character in print at ./Run line 1511.
Wide character in printf at ./Run line 1542.
Use of uninitialized value in printf at ./Run line 1379.
Use of uninitialized value in printf at ./Run line 1380.
Use of uninitialized value in printf at ./Run line 1589.
Use of uninitialized value in printf at ./Run line 1590.

1 x Dhrystone 2 using register variables  1 2 3 4 5 6 7 8 9 10

1 x Double-Precision Whetstone  1 2 3 4 5 6 7 8 9 10

1 x Execl Throughput  1 2 3

1 x File Copy 1024 bufsize 2000 maxblocks  1 2 3

1 x File Copy 256 bufsize 500 maxblocks  1 2 3

1 x File Copy 4096 bufsize 8000 maxblocks  1 2 3

1 x Pipe Throughput  1 2 3 4 5 6 7 8 9 10

1 x Pipe-based Context Switching  1 2 3 4 5 6 7 8 9 10

1 x Process Creation  1 2 3

1 x System Call Overhead  1 2 3 4 5 6 7 8 9 10

1 x Shell Scripts (1 concurrent)  1 2 3

1 x Shell Scripts (8 concurrent)  1 2 3
Wide character in printf at ./Run line 1484.

2 x Dhrystone 2 using register variables  1 2 3 4 5 6 7 8 9 10

2 x Double-Precision Whetstone  1 2 3 4 5 6 7 8 9 10

2 x Execl Throughput  1 2 3

2 x File Copy 1024 bufsize 2000 maxblocks  1 2 3

2 x File Copy 256 bufsize 500 maxblocks  1 2 3

2 x File Copy 4096 bufsize 8000 maxblocks  1 2 3

2 x Pipe Throughput  1 2 3 4 5 6 7 8 9 10

2 x Pipe-based Context Switching  1 2 3 4 5 6 7 8 9 10

2 x Process Creation  1 2 3

2 x System Call Overhead  1 2 3 4 5 6 7 8 9 10

2 x Shell Scripts (1 concurrent)  1 2 3

2 x Shell Scripts (8 concurrent)  1 2 3
Wide character in printf at ./Run line 1484.

4 x Dhrystone 2 using register variables  1 2 3 4 5 6 7 8 9 10

4 x Double-Precision Whetstone  1 2 3 4 5 6 7 8 9 10

4 x Execl Throughput  1 2 3

4 x File Copy 1024 bufsize 2000 maxblocks  1 2 3

4 x File Copy 256 bufsize 500 maxblocks  1 2 3

4 x File Copy 4096 bufsize 8000 maxblocks  1 2 3

4 x Pipe Throughput  1 2 3 4 5 6 7 8 9 10

4 x Pipe-based Context Switching  1 2 3 4 5 6 7 8 9 10

4 x Process Creation  1 2 3

4 x System Call Overhead  1 2 3 4 5 6 7 8 9 10

4 x Shell Scripts (1 concurrent)  1 2 3

4 x Shell Scripts (8 concurrent)  1 2 3
Wide character in printf at ./Run line 1484.

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: raspberrypi: GNU/Linux
   OS: GNU/Linux -- 4.9.41+ -- #1023 Tue Aug 8 15:47:12 BST 2017
   Machine: armv6l (unknown)
   Language: en_US.utf8 (charmap="ANSI_X3.4-1968", collate="ANSI_X3.4-1968")
   CPU 0: ARMv6-compatible processor rev 7 (v6l) (0.0 bogomips)
          
   15:40:32 up 3 min,  3 users,  load average: 0.90, 0.85, 0.38; runlevel 5

------------------------------------------------------------------------
Benchmark Run: 日  9月 03 2017 15:40:32 - 16:10:03
1 CPU in system; running 1 parallel copy of tests

Dhrystone 2 using register variables        2279562.3 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                      462.4 MWIPS (10.0 s, 7 samples)
Execl Throughput                                236.2 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks         45718.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           14660.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        119375.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                              186043.5 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  27950.5 lps   (10.0 s, 7 samples)
Process Creation                                628.8 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                    542.7 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                     69.8 lpm   (60.2 s, 2 samples)
System Call Overhead                         469639.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0    2279562.3    195.3
Double-Precision Whetstone                       55.0        462.4     84.1
Execl Throughput                                 43.0        236.2     54.9
File Copy 1024 bufsize 2000 maxblocks          3960.0      45718.5    115.5
File Copy 256 bufsize 500 maxblocks            1655.0      14660.5     88.6
File Copy 4096 bufsize 8000 maxblocks          5800.0     119375.5    205.8
Pipe Throughput                               12440.0     186043.5    149.6
Pipe-based Context Switching                   4000.0      27950.5     69.9
Process Creation                                126.0        628.8     49.9
Shell Scripts (1 concurrent)                     42.4        542.7    128.0
Shell Scripts (8 concurrent)                      6.0         69.8    116.3
System Call Overhead                          15000.0     469639.0    313.1
                                                                   ========
System Benchmarks Index Score                                         113.6

------------------------------------------------------------------------
Benchmark Run: 日  9月 03 2017 16:10:03 - 16:41:23
1 CPU in system; running 2 parallel copies of tests

Dhrystone 2 using register variables        2377183.7 lps   (10.1 s, 7 samples)
Double-Precision Whetstone                      913.7 MWIPS (9.9 s, 7 samples)
Execl Throughput                                247.5 lps   (29.8 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks         45899.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           14610.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        119704.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                              180469.6 lps   (10.1 s, 7 samples)
Pipe-based Context Switching                  25944.8 lps   (10.0 s, 7 samples)
Process Creation                                608.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                    536.9 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                     68.7 lpm   (61.2 s, 2 samples)
System Call Overhead                         476431.6 lps   (10.1 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0    2377183.7    203.7
Double-Precision Whetstone                       55.0        913.7    166.1
Execl Throughput                                 43.0        247.5     57.5
File Copy 1024 bufsize 2000 maxblocks          3960.0      45899.5    115.9
File Copy 256 bufsize 500 maxblocks            1655.0      14610.9     88.3
File Copy 4096 bufsize 8000 maxblocks          5800.0     119704.5    206.4
Pipe Throughput                               12440.0     180469.6    145.1
Pipe-based Context Switching                   4000.0      25944.8     64.9
Process Creation                                126.0        608.2     48.3
Shell Scripts (1 concurrent)                     42.4        536.9    126.6
Shell Scripts (8 concurrent)                      6.0         68.7    114.5
System Call Overhead                          15000.0     476431.6    317.6
                                                                   ========
System Benchmarks Index Score                                         119.6

------------------------------------------------------------------------
Benchmark Run: 日  9月 03 2017 16:41:23 - 17:17:00
1 CPU in system; running 4 parallel copies of tests

Dhrystone 2 using register variables        2365565.8 lps   (10.1 s, 7 samples)
Double-Precision Whetstone                     1836.2 MWIPS (10.0 s, 7 samples)
Execl Throughput                                247.9 lps   (29.3 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks         42586.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           13446.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        113641.8 KBps  (30.0 s, 2 samples)
Pipe Throughput                              178788.4 lps   (10.1 s, 7 samples)
Pipe-based Context Switching                  23390.6 lps   (10.1 s, 7 samples)
Process Creation                                609.8 lps   (30.1 s, 2 samples)
Shell Scripts (1 concurrent)                    531.9 lpm   (60.3 s, 2 samples)
Shell Scripts (8 concurrent)                     66.4 lpm   (61.4 s, 2 samples)
System Call Overhead                         472401.4 lps   (10.1 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0    2365565.8    202.7
Double-Precision Whetstone                       55.0       1836.2    333.9
Execl Throughput                                 43.0        247.9     57.7
File Copy 1024 bufsize 2000 maxblocks          3960.0      42586.5    107.5
File Copy 256 bufsize 500 maxblocks            1655.0      13446.9     81.3
File Copy 4096 bufsize 8000 maxblocks          5800.0     113641.8    195.9
Pipe Throughput                               12440.0     178788.4    143.7
Pipe-based Context Switching                   4000.0      23390.6     58.5
Process Creation                                126.0        609.8     48.4
Shell Scripts (1 concurrent)                     42.4        531.9    125.4
Shell Scripts (8 concurrent)                      6.0         66.4    110.7
System Call Overhead                          15000.0     472401.4    314.9
                                                                   ========
System Benchmarks Index Score                                         122.9

So the index score came out at 113.6 for a single copy and 122.9 for 4 parallel copies. The difference is small, but the 4-copy run actually scores higher...
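
For reference, the INDEX column and the final score can be recomputed from the BASELINE and RESULT columns. The sketch below assumes each index is RESULT / BASELINE × 10 and that the overall score is the geometric mean of the indices. The figures in the table are consistent with that, but the function names here are my own, not UnixBench's.

```python
from math import exp, log

# (baseline, result) pairs copied from the 1-copy run in the output above.
single_run = {
    "Dhrystone 2 using register variables": (116700.0, 2279562.3),
    "Double-Precision Whetstone": (55.0, 462.4),
    "Execl Throughput": (43.0, 236.2),
    "File Copy 1024 bufsize 2000 maxblocks": (3960.0, 45718.5),
    "File Copy 256 bufsize 500 maxblocks": (1655.0, 14660.5),
    "File Copy 4096 bufsize 8000 maxblocks": (5800.0, 119375.5),
    "Pipe Throughput": (12440.0, 186043.5),
    "Pipe-based Context Switching": (4000.0, 27950.5),
    "Process Creation": (126.0, 628.8),
    "Shell Scripts (1 concurrent)": (42.4, 542.7),
    "Shell Scripts (8 concurrent)": (6.0, 69.8),
    "System Call Overhead": (15000.0, 469639.0),
}


def index(baseline, result):
    """Per-test index: the result relative to the baseline, scaled by 10."""
    return result / baseline * 10


def index_score(run):
    """Geometric mean of the per-test indices."""
    logs = [log(index(b, r)) for b, r in run.values()]
    return exp(sum(logs) / len(logs))


print(round(index(116700.0, 2279562.3), 1))
print(round(index_score(single_run), 1))
```

Because the score is a geometric mean, one test with a very large index (such as System Call Overhead here) pulls the total up less than it would in an arithmetic mean.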

Comparing UnixBench results

UnixBench comparison
While I was at it, I also compared the results with the NanoPi NEO and NEO2 that I measured earlier.

The NanoPi NEO (at 1.2 GHz) scored 197.4 for a single copy and 536.4 for 4 parallel copies, so comparing index scores, the RPi0W comes in at 58% and 23% of the original NanoPi NEO respectively.

The NanoPi NEO2 scored 298.9 for a single copy and 771.2 for 4 parallel copies, so comparing index scores, the RPi0W comes in at 38% and 16% of the NanoPi NEO2 respectively, which is a rather sad result.
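
The percentages above are simple ratios of the index scores. A short sketch that recomputes them, using only the scores quoted in this article (the dictionary layout and the name `percent_of` are illustrative):

```python
# UnixBench index scores quoted in this article.
scores = {
    "RPi0W": {"single": 113.6, "4 parallel": 122.9},
    "NanoPi NEO": {"single": 197.4, "4 parallel": 536.4},
    "NanoPi NEO2": {"single": 298.9, "4 parallel": 771.2},
}


def percent_of(board, reference, mode):
    """Score of `board` as a percentage of `reference` for the given mode."""
    return scores[board][mode] / scores[reference][mode] * 100


for reference in ("NanoPi NEO", "NanoPi NEO2"):
    for mode in ("single", "4 parallel"):
        pct = percent_of("RPi0W", reference, mode)
        print(f"RPi0W vs {reference} ({mode}): {pct:.0f}%")
```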

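The RPi0W has a single-core, single-thread CPU, so its score does not rise with 4 parallel copies and it looks quite underpowered.
Still, it generates little heat, so cooling does not seem to need much attention. And as long as you pick the right use for it, the performance is sufficient.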
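
How little the RPi0W gains from parallel copies can be expressed as a scaling factor, the 4-parallel score divided by the single-copy score. This is only a rough sketch using the index scores quoted in this article, and the name `parallel_scaling` is hypothetical.

```python
# (single-copy score, 4-parallel score) as quoted in this article.
index_scores = {
    "RPi0W": (113.6, 122.9),
    "NanoPi NEO": (197.4, 536.4),
    "NanoPi NEO2": (298.9, 771.2),
}


def parallel_scaling(single, parallel):
    """How many times higher the 4-parallel score is than the single score."""
    return parallel / single


for board, (single, parallel) in index_scores.items():
    print(f"{board}: x{parallel_scaling(single, parallel):.2f}")
```

With these inputs the RPi0W scales by about 1.08, while the two NanoPi boards scale by roughly 2.6 to 2.7.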
