KVM Performance Testing

w12dw · Posted 2015-5-20 09:55:31

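The main goal of this test is to measure how much performance a virtual machine running on KVM loses relative to the physical hardware. It compares CPU, memory, and I/O performance under KVM virtualization as follows: run a benchmark on the non-virtualized native system, run the same benchmark in a guest whose configuration is close to the native system's, and compare the two results. To keep the comparison fair, the native environment should match the guest as closely as possible. Adding maxcpus=2 nr_cpus=2 mem=2G to the kernel boot line in /boot/grub/grub.cfg limits the number of CPU cores and the amount of memory the Linux kernel uses. Because update-grub regenerates grub.cfg, a persistent change belongs in GRUB_CMDLINE_LINUX in /etc/default/grub.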

set root='(hd0,msdos1)'
      search --no-floppy --fs-uuid --set=root 3940bb4d-c220-4cb5-b4f5-6dd11c5ecb44
      linux /boot/vmlinuz-3.2.0-83-generic root=UUID=3940bb4d-c220-4cb5-b4f5-6dd11c5ecb44 ro quiet maxcpus=2 nr_cpus=2 mem=2G
      initrd  /boot/initrd.img-3.2.0-83-generic

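The above is from /boot/grub/grub.cfg on Ubuntu; Red Hat-based systems differ slightly.
Both the native system and the guest have one CPU with two cores and 2 GB of memory. Because the I/O test only reads and writes a 512 MB file, the difference in disk size between the physical machine and the guest has little effect.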
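After rebooting with the parameters above, it is worth confirming that the kernel really sees two cores and about 2 GB before running any benchmark. The following is a minimal sketch, assuming a Linux system with /proc mounted; cmdline_param is a hypothetical helper name introduced here for illustration, not part of any tool.

```shell
#!/bin/sh
# Print the value of one key=value kernel parameter.
# $1 = parameter name; stdin = a kernel command line (e.g. /proc/cmdline).
cmdline_param() {
    tr ' ' '\n' | awk -F= -v key="$1" '$1 == key { print $2 }'
}

# Report what the running kernel actually sees (Linux only).
if [ -r /proc/cmdline ]; then
    echo "nr_cpus on cmdline: $(cmdline_param nr_cpus < /proc/cmdline)"
    echo "mem on cmdline:     $(cmdline_param mem < /proc/cmdline)"
    echo "online CPUs:        $(grep -c '^processor' /proc/cpuinfo)"
    grep '^MemTotal' /proc/meminfo || true
fi
```

MemTotal is normally somewhat below the mem= value, because the kernel reserves part of the memory for itself.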

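CPU performance test
Super PI is used for the CPU test. The benchmark computes π to 2^20 and to 2^24 digits after the decimal point; when the calculation finishes, the program prints the time it took. The commands are: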

root@ubuntu:~/super_pi# ./super_pi 20 ...

root@ubuntu:~/super_pi# ./super_pi 24 ...

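On an x86_64 system, the Super PI executable may fail because the ld-linux.so.2 shared library cannot be found. Super PI is an old 32-bit program; on Ubuntu, installing the libc6-i386 package fixes this.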

root@ubuntu:~/super_pi# apt-cache search libc6-i386
libc6-i386 - Embedded GNU C Library: 32-bit shared libraries for AMD64
root@ubuntu:~/super_pi# apt-get install libc6-i386
...

When the program finishes, it prints Total calculation(I/O) time:

./super_pi 20 (seconds)	Run 1	Run 2	Run 3	Run 4	Run 5
host_ubuntu	12.037	11.785	11.744	11.911	11.852
virt_ubuntu	11.986	11.925	11.994	12.04	11.919

./super_pi 24 (seconds)	Run 1	Run 2	Run 3	Run 4	Run 5
host_ubuntu	333.967	332.558	331.512	335.048	331.745
virt_ubuntu	342.457	342.003	339.275	342.685	343.375

Comparing mean run times, CPU performance under KVM is about 99% of native for the 2^20 run and about 97% for the 2^24 run.
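As a quick cross-check of the Super PI timings above, the ratio of mean run times can be computed with a few lines of shell. This is only a sketch: relative_perf is a hypothetical helper name, and it assumes the timings for each system are passed as space-separated strings.

```shell
#!/bin/sh
# relative_perf HOST_TIMES VIRT_TIMES
# Prints mean(host time) / mean(guest time) as a percentage.
# For run times smaller is better, so a value below 100 means the guest is slower.
relative_perf() {
    awk -v host="$1" -v virt="$2" 'BEGIN {
        nh = split(host, h, " ")
        nv = split(virt, v, " ")
        for (i = 1; i <= nh; i++) hs += h[i]
        for (i = 1; i <= nv; i++) vs += v[i]
        printf "%.1f\n", 100 * (hs / nh) / (vs / nv)
    }'
}

# Timings copied from the Super PI tables.
relative_perf "12.037 11.785 11.744 11.911 11.852" "11.986 11.925 11.994 12.04 11.919"
relative_perf "333.967 332.558 331.512 335.048 331.745" "342.457 342.003 339.275 342.685 343.375"
```

For these timings it prints 99.1 and 97.4.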

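Memory performance test

LMbench is used for the memory test. LMbench bundles many simple benchmarks covering file reads and writes, memory operations, pipes, system calls, context switches, process creation and teardown, networking, and more. It can also compare systems of the same class to show their relative strengths and weaknesses, and comparing different library functions shows how those functions perform.

Download LMbench (lmbench3.tar.gz), extract it, and run make to build it.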

root@ubuntu:/home/luyi# mkdir lmbench3
root@ubuntu:/home/luyi# tar -zx -f lmbench3.tar.gz -C lmbench3
root@ubuntu:/home/luyi# cd lmbench3/lmbench3/
root@ubuntu:/home/luyi/lmbench3/lmbench3# make
...

The build may fail with the following error:

make[2]: *** No rule to make target `../SCCS/s.ChangeSet', needed by `bk.ver'. Stop.
make[2]: Leaving directory `/home/luyi/lmbench3/lmbench3/src'
make[1]: *** [lmbench] Error 2
make[1]: Leaving directory `/home/luyi/lmbench3/lmbench3/src'
make: *** [build] Error 2

Creating the missing directory and file works around the error. Then run make results to start the tests:

root@ubuntu:/home/luyi/lmbench3/lmbench3# mkdir SCCS ; touch SCCS/s.ChangeSet
root@ubuntu:/home/luyi/lmbench3/lmbench3# make
...
root@ubuntu:/home/luyi/lmbench3/lmbench3# make results
...

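After make results starts, and before the actual tests run, it asks a series of interactive questions to confirm the configuration. For most prompts, pressing Enter to accept the default is enough. In this test, three settings did not use the default: the amount of memory LMbench tests, the processor clock frequency, and whether to mail the results to the official LMbench3 address.
The CPU clock frequency can be read from /proc/cpuinfo: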

luyi@ubuntu:~$ cat /proc/cpuinfo | grep "model name"
model name : Pentium(R) Dual-Core CPU E5700 @ 3.00GHz
model name : Pentium(R) Dual-Core CPU E5700 @ 3.00GHz

Settings that did not use the default:

MB [default 2744] 1024 # the more memory tested, the longer the run takes
Checking to see if you have 1024 MB; please wait for a moment...
...
Processor mhz [default 2997 MHz, 0.3337 nanosec clock] 3000
...
Mail results [default yes] no
OK, no results mailed.

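Once LMbench has run the tests selected in its configuration file, it creates a subdirectory under results named after the system type, system name, and operating system type, and stores the result files there, named "hostname + sequence number". Run make see to view the report and its description.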

root@ubuntu:/home/luyi/lmbench3/lmbench3/results# make see

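Put all result files under lmbench3/lmbench3/results/x86_64-linux-gnu and run make see to get a clear side-by-side comparison report. Below are two sets of data, three runs each on the native system (host baby-ubun) and in the virtualized environment (host virt-ubun):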

                 L M B E N C H  3 . 0   S U M M A R Y
                 ------------------------------------
                 (Alpha software, do not distribute)

Basic system parameters
------------------------------------------------------------------------------
Host                 OS Description              Mhz  tlb  cache  mem   scal
                                                     pages line   par   load
                                                           bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
baby-ubun Linux 3.16.0-        x86_64-linux-gnu 2300    32   128 3.6500    1
baby-ubun Linux 3.16.0-        x86_64-linux-gnu 2300    32   128 3.6000    1
baby-ubun Linux 3.16.0-        x86_64-linux-gnu 2300    32   128 3.7100    1
virt-ubun Linux 3.16.0-        x86_64-linux-gnu 2300    32   128 3.7800    1
virt-ubun Linux 3.16.0-        x86_64-linux-gnu 2300    32   128 3.7800    1

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
baby-ubun Linux 3.16.0- 2300 0.05 0.12 0.34 1.12 3.04 0.12 0.78 91.6 261. 607.
baby-ubun Linux 3.16.0- 2300 0.05 0.12 0.35 1.11 2.93 0.12 0.78 97.8 266. 610.
baby-ubun Linux 3.16.0- 2300 0.05 0.12 0.34 1.11 2.91 0.12 0.77 95.8 261. 605.
virt-ubun Linux 3.16.0- 2300 0.05 0.12 0.35 1.20 2.97 0.12 0.88 99.0 289. 653.
virt-ubun Linux 3.16.0- 2300 0.05 0.12 0.35 1.11 2.93 0.12 0.86 98.4 278. 641.
virt-ubun Linux 3.16.0- 2300 0.05 0.12 0.35 1.14 2.92 0.12 0.90 105. 290. 660.

Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host                 OS  intgr  intgr  intgr  intgr  intgr
                           bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
baby-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8100 8.7300
baby-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8300 8.7600
baby-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8100 8.7400
virt-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8200 8.7600
virt-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8300 8.7600
virt-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8500 8.7500

Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host                 OS  float  float  float  float
                           add    mul    div   bogo
--------- ------------- ------ ------ ------ ------
baby-ubun Linux 3.16.0- 1.0400 1.7300 4.9600 5.0000
baby-ubun Linux 3.16.0- 1.0400 1.7300 4.9600 5.0000
baby-ubun Linux 3.16.0- 1.0400 1.7300 4.9500 5.0000
virt-ubun Linux 3.16.0- 1.0400 1.7300 4.9600 4.9700
virt-ubun Linux 3.16.0- 1.0400 1.7400 4.9700 4.9800
virt-ubun Linux 3.16.0- 1.0400 1.7300 4.9600 4.9900

Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host                 OS double double double double
                           add    mul    div   bogo
--------- ------------- ------ ------ ------ ------
baby-ubun Linux 3.16.0- 1.0400 1.7300 7.7200 7.6100
baby-ubun Linux 3.16.0- 1.0400 1.7300 7.7200 7.6100
baby-ubun Linux 3.16.0- 1.0400 1.7300 7.7400 7.6100
virt-ubun Linux 3.16.0- 1.0400 1.7300 7.7400 7.6300
virt-ubun Linux 3.16.0- 1.0400 1.7300 7.7400 7.6300
virt-ubun Linux 3.16.0- 1.0400 1.7300 7.7300 7.6200

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                         ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
baby-ubun Linux 3.16.0- 1.2700 1.1700 1.2900 1.6000 1.9200 1.75000 2.06000
baby-ubun Linux 3.16.0- 1.2500 1.2100 1.2700 1.5600 1.9300 1.73000 2.16000
baby-ubun Linux 3.16.0- 1.2800 1.2400 1.2400 1.5800 1.9800 1.72000 2.04000
virt-ubun Linux 3.16.0- 1.2600 1.2000 1.4500 1.6200 2.2200 1.83000 2.42000
virt-ubun Linux 3.16.0- 1.2000 1.2300 1.4800 1.7000 2.1900 1.81000 2.32000
virt-ubun Linux 3.16.0- 1.2900 1.2800 1.7200 1.7900 2.5500 2.06000 2.72000

*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
baby-ubun Linux 3.16.0- 1.270 3.210 4.51  13.2        15.6        28.
baby-ubun Linux 3.16.0- 1.250 3.211 4.36  13.0        15.3        27.
baby-ubun Linux 3.16.0- 1.280 3.266 4.38  13.2        15.6        27.
virt-ubun Linux 3.16.0- 1.260 3.230 4.59 7.849        10.9        18.
virt-ubun Linux 3.16.0- 1.200 3.095 4.31 7.716        31.5        33.
virt-ubun Linux 3.16.0- 1.290 3.373 4.61 7.964        11.2        32.

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS    0K File      10K File     Mmap   Prot   Page   100fd
                        Create Delete Create Delete Latency Fault  Fault  selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
baby-ubun Linux 3.16.0- 7.3678 5.6366   16.6 9.1331  6604.0 0.234 0.21320 1.222
baby-ubun Linux 3.16.0- 7.0883 5.6481   16.8 9.1536  6637.0 0.239 0.21540 1.221
baby-ubun Linux 3.16.0- 7.0470 5.7672   16.2 8.9707  6625.0 0.248 0.21420 1.223
virt-ubun Linux 3.16.0- 7.1741 5.8160   16.4 9.0638  7085.0 0.297 0.23500 1.224
virt-ubun Linux 3.16.0- 7.0921 5.7670   16.4 9.0966  7162.0 0.296 0.23360 1.225
virt-ubun Linux 3.16.0- 7.1873 5.8700   16.9 9.2117  7485.0 0.258 0.28000 1.227

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host                 OS Pipe AF   TCP   File   Mmap   Bcopy  Bcopy Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
baby-ubun Linux 3.16.0- 5196 5689 4410 5486.4 8603.2 4637.3 3200.9 7920 4828.
baby-ubun Linux 3.16.0- 5205 5639 4350 5463.1 8593.6 4634.6 3201.2 7917 4827.
baby-ubun Linux 3.16.0- 5199 5648 4473 5472.4 8599.0 4636.9 3201.2 7920 4828.
virt-ubun Linux 3.16.0- 4810 5518 3499 6370.3  11.0K 3176.1 3144.7 7828 4745.
virt-ubun Linux 3.16.0- 4909      3711 6013.2  10.2K 3179.5 3145.5 7795 4739.
virt-ubun Linux 3.16.0- 4587 5221 3606 5524.6 8882.7 3153.5 3116.3 7750 4688.

Memory latencies in nanoseconds - smaller is better
(WARNING - may not be correct, check graphs)
------------------------------------------------------------------------------
Host                 OS  Mhz   L1 $   L2 $ Main mem Rand mem Guesses
--------- ------------- ---- ------ ------ -------- -------- -------
baby-ubun Linux 3.16.0- 2300 1.3830 4.1490     20.9     76.8
baby-ubun Linux 3.16.0- 2300 1.3830 4.1500     21.4     78.7
baby-ubun Linux 3.16.0- 2300 1.3830 4.1490     21.4     77.5
virt-ubun Linux 3.16.0- 2300 1.3850 5.4010     21.4    122.5
virt-ubun Linux 3.16.0- 2300 1.3860 4.3540     22.0    125.5
virt-ubun Linux 3.16.0- 2300 1.3850 4.1840     22.4    125.8

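The results show that memory bandwidth and latency under KVM are mostly close to native: memory read and write bandwidth reach about 98% of the native figures, and L1 and main-memory latency are within a few percent. The exceptions are libc bcopy (about 68% of native) and random memory latency (about 125 ns versus about 78 ns). As a rough conclusion, with hardware memory virtualization support (such as Intel EPT), QEMU/KVM memory virtualization performs well and reaches 95% or more of native performance on most of these measures.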
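To compare a column of the LMbench summary between the two hosts, the three runs per host can be averaged with awk. The sketch below assumes the rows are pasted in as whitespace-separated text; column_mean_by_host is a hypothetical helper name. Rows with an empty cell (such as the row missing its AF UNIX value in the bandwidth table) shift the field numbers and would need to be handled by hand.

```shell
#!/bin/sh
# column_mean_by_host FIELD
# Reads whitespace-separated LMbench summary rows on stdin and prints the mean
# of field number FIELD for each host name (field 1), sorted by host name.
column_mean_by_host() {
    awk -v col="$1" '
        BEGIN { col = col + 0 }
        NF >= col { sum[$1] += $col; n[$1]++ }
        END { for (h in sum) printf "%s %.1f\n", h, sum[h] / n[h] }
    ' | sort
}

# Rows copied from the "Memory latencies" table; field 8 is "Rand mem" (ns).
column_mean_by_host 8 <<'EOF'
baby-ubun Linux 3.16.0- 2300 1.3830 4.1490 20.9 76.8
baby-ubun Linux 3.16.0- 2300 1.3830 4.1500 21.4 78.7
baby-ubun Linux 3.16.0- 2300 1.3830 4.1490 21.4 77.5
virt-ubun Linux 3.16.0- 2300 1.3850 5.4010 21.4 122.5
virt-ubun Linux 3.16.0- 2300 1.3860 4.3540 22.0 125.5
virt-ubun Linux 3.16.0- 2300 1.3850 4.1840 22.4 125.8
EOF
```

For these rows it prints baby-ubun 77.7 and virt-ubun 124.6, the random memory latency gap noted above.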
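
Disk I/O performance test
IOzone is used for this test. IOzone measures file system performance through a variety of file operations, such as sequential read and write, re-read, re-write, and random read and write.
Download the IOzone source, extract it, change into iozone3_414/src/current, and run make linux-AMD64 to build. When the build finishes, the iozone executable is in the current directory.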

root@ubuntu:/home/luyi/iozone3_414/iozone3_414/src/current# ./iozone -s 512m -r 8k -S 2048 -L 64 -I -i 0 -i 1 -i 2 -Rab iozone.xls

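In the command above, -s 512m sets the test file size to 512 MB; -r 8k sets the record size (the size of one read or write operation) to 8 KB; -S 2048 states that the processor cache size is 2048 KB; -L 64 states that the cache line size is 64 bytes; -I uses direct I/O, bypassing the page cache; -i 0 -i 1 -i 2 runs three tests (0 = write/rewrite, 1 = read/re-read, 2 = random read/write); and -Rab iozone.xls runs in full automatic mode and writes an Excel-format report to iozone.xls. The -S and -L values come from the command below; IOzone can also determine them itself: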

root@ubuntu:/home/luyi/iozone3_414/iozone3_414/src/current# cat /proc/cpuinfo | grep cache
cache size : 2048 KB
cache_alignment : 64
cache size : 2048 KB
cache_alignment : 64
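Rather than copying the two values by hand, they can be turned into IOzone flags directly. This is a minimal sketch, assuming /proc/cpuinfo reports "cache size" in KB and "cache_alignment" in bytes as in the output above (not every architecture does); iozone_cache_flags is a hypothetical helper name.

```shell
#!/bin/sh
# iozone_cache_flags
# Reads /proc/cpuinfo-style text on stdin and prints "-S <KB> -L <bytes>"
# using the first "cache size" and "cache_alignment" entries found.
# Prints nothing if either entry is missing.
iozone_cache_flags() {
    awk -F'[ \t]*:[ \t]*' '
        /^cache size/      && !size { size = $2 + 0 }   # "2048 KB" -> 2048
        /^cache_alignment/ && !line { line = $2 + 0 }
        END { if (size && line) printf "-S %d -L %d\n", size, line }
    '
}

if [ -r /proc/cpuinfo ]; then
    iozone_cache_flags < /proc/cpuinfo
fi
```

On the machine shown above this would print -S 2048 -L 64, which can be pasted into the iozone command line.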

Record size 1k (MB/s)	Writer Report	Re-writer Report	Reader Report	Re-reader Report	Random Read Report	Random Write Report
host (physical machine)	1.67	9.31	14.02	13.89	0.17	0.26
virt-none (guest, cache=none)	1.47	6.71	7.37	7.65	0.17	0.25
virt-default (guest, cache=default (writeback))	17.56	16.15	17.87	18.62	21.17	2.52
virt-writethrough (guest, cache=writethrough)	0.11	0.11	21.84	21.83	21.65	0.12

Record size 8k (MB/s)	Writer Report	Re-writer Report	Reader Report	Re-reader Report	Random Read Report	Random Write Report
host	63.15	67.02	71.75	70.45	1.10	1.83
virt-none	41.02	40.75	43.78	45.30	1.01	1.82
virt-default	125.04	146.71	161.26	161.11	160.65	16.01
virt-writethrough	0.91	0.91	164.16	164.19	163.01	0.85

Record size 1m (MB/s)	Writer Report	Re-writer Report	Reader Report	Re-reader Report	Random Read Report	Random Write Report
host	98.03	98.58	101.67	101.67	57.26	61.18
virt-none	95.73	98.34	100.54	100.22	56.71	65.30
virt-default	168.07	173.64	2609.87	2676.79	2777.94	141.64
virt-writethrough	52.83	52.52	3283.17	3317.91	3221.16	40.71

Record size 8m (MB/s)	Writer Report	Re-writer Report	Reader Report	Re-reader Report	Random Read Report	Random Write Report
host	100.30	100.80	102.05	101.88	92.71	82.89
virt-none	81.90	86.50	97.41	98.81	90.49	80.74
virt-default	210.12	171.43	2691.20	2700.01	2682.85	185.32
virt-writethrough	62.02	64.35	2546.63	2624.13	2663.87	59.44

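The data above were obtained by varying the virtual disk's cache mode and the record size used in the test.
Setting the virtual disk's cache mode to none bypasses the host page cache. The page cache greatly speeds up access to the virtual disk, which is why the disk performs so well with cache=writeback, but an unexpected power loss may then cause data loss. To see the overhead of the virtual disk itself, compare the host and virt-none rows. I/O performance depends on the record size. With large records the virtual disk comes close to the physical disk: at 1m every virt-none figure is at least 97% of the host's, and at 8m the figures range from 82% to 98%. With small records, sequential throughput drops to roughly 53% to 88% of the physical disk at 1k and 61% to 65% at 8k, while random read and write stay above 90%.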
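The per-column overhead can be read off by dividing each virt-none figure by the matching host figure. The sketch below does this for the 8k table; throughput_ratio is a hypothetical helper name, and it assumes two rows on stdin with a label in the first field and the baseline row first.

```shell
#!/bin/sh
# throughput_ratio
# stdin: two rows "label v1 v2 ..." with the baseline (host) row first.
# Prints each value of the second row as a percentage of the baseline value.
throughput_ratio() {
    awk '
        NR == 1 { for (i = 2; i <= NF; i++) base[i] = $i; next }
        NR == 2 {
            for (i = 2; i <= NF; i++)
                printf "%s%.0f%%", (i > 2 ? " " : ""), 100 * $i / base[i]
            print ""
        }
    '
}

# Record size 8k, throughput in MB/s: write, rewrite, read, re-read,
# random read, random write.
printf '%s\n' \
    'host 63.15 67.02 71.75 70.45 1.10 1.83' \
    'virt-none 41.02 40.75 43.78 45.30 1.01 1.82' | throughput_ratio
```

For the 8k rows it prints 65% 61% 61% 64% 92% 99%: sequential access loses about a third of its throughput, while random access is almost unaffected.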
