Software RAID Perf Test

Today I'd like to summarize the results of a performance test of software RAID built from ten 1TB disks and one NVMe drive. The tests below were run against the block device configuration shown next.

  • Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz
  • CentOS 7.4 / 3.10.0-693.11.6.el7.x86_64
  • I/O Controller: Onboard SATA Controller
  • 64GB Memory, 2400MHz
  • 1TB 7200RPM SATA3 Disk x 10
  • 1.6TB NVMe Disk x 1

Block device list

[root@hci1 ~]# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda           8:0    0 931.5G  0 disk 
└─sda1        8:1    0 931.5G  0 part 
sdb           8:16   0 931.5G  0 disk 
└─sdb1        8:17   0 931.5G  0 part 
sdc           8:32   0 931.5G  0 disk 
└─sdc1        8:33   0 931.5G  0 part 
sdd           8:48   0 931.5G  0 disk 
└─sdd1        8:49   0 931.5G  0 part 
sde           8:64   0 931.5G  0 disk 
└─sde1        8:65   0 931.5G  0 part 
sdf           8:80   0 931.5G  0 disk 
└─sdf1        8:81   0 931.5G  0 part 
sdg           8:96   0 931.5G  0 disk 
└─sdg1        8:97   0 931.5G  0 part 
sdh           8:112  0 931.5G  0 disk 
└─sdh1        8:113  0 931.5G  0 part 
sdi           8:128  0 931.5G  0 disk 
└─sdi1        8:129  0 931.5G  0 part 
sdj           8:144  0 931.5G  0 disk 
└─sdj1        8:145  0 931.5G  0 part 
nvme0n1     259:0    0   1.5T  0 disk 
├─nvme0n1p1 259:1    0   200M  0 part /boot/efi
├─nvme0n1p2 259:2    0     1G  0 part /boot
├─nvme0n1p3 259:3    0     1G  0 part [SWAP]
├─nvme0n1p4 259:4    0   128G  0 part /
└─nvme0n1p5 259:5    0   1.3T  0 part 
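
The throughput captures in the sections below have the column layout of dstat's default view, so the monitoring was presumably done as follows (an assumption; the post does not name the tool):

# Assumed monitoring tool: dstat with no arguments prints the
# total-cpu-usage / dsk-total / net-total / paging / system
# columns seen in the captures below, refreshing once per second
dstat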

SATA 1TB 7200RPM 6Gbps Link Disk x 10EA Performance, MD-RAID0, Write
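
The array assembly and the write workload are not shown in the post; a minimal sketch of how this test could be reproduced, assuming a 10-member RAID0 named /dev/md0 built from the partitions listed above and a simple sequential writer:

# Assumption: build a 10-member RAID0 from sda1..sdj1
mdadm --create /dev/md0 --level=0 --raid-devices=10 /dev/sd[a-j]1

# Assumption: sequential writes to the raw md device; oflag=direct
# bypasses the page cache so dstat reflects real disk throughput
dd if=/dev/zero of=/dev/md0 bs=1M count=100000 oflag=direct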

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  0   5  95   0   0   0|   0   537M| 726B  350B|   0     0 |2496   439 
  0   7  92   1   0   0|   0  1805M| 666B  358B|   0     0 |6071  1933 
  0   7  92   1   0   0|   0  1837M| 666B  358B|   0     0 |6148  1962 
  0   7  92   1   0   1|   0  1835M| 666B  358B|   0     0 |6144  2177 
  0   7  92   1   0   1|   0  1730M| 666B  358B|   0     0 |5940  2281 
  0   7  92   1   0   0|   0  1686M| 726B  358B|   0     0 |5900  2456 
  0   7  92   1   0   1|   0  1834M| 666B  358B|   0     0 |6264  2645 
  0   5  92   2   0   0|   0  1496M| 666B  358B|   0     0 |5189  2270 
  0   6  92   2   0   1|   0  1725M| 666B  358B|   0     0 |5626  2331 
  0   6  92   2   0   1|   0  1746M| 666B  358B|   0     0 |5864  2933 
  0   6  92   2   0   1|   0  1578M| 726B  358B|   0     0 |5443  2724 
  0   6  92   2   0   1|   0  1743M| 606B  358B|   0     0 |5665  2202 
  0   6  92   2   0   1|   0  1722M| 666B  358B|   0     0 |5767  2775 
  0   5  92   2   0   1|   0  1610M| 666B  358B|   0     0 |  13k   51k
  0   2  93   5   0   0|   0  1027M| 666B  358B|   0     0 |  33k  167k
  0   3  95   2   0   0|   0  1883M| 726B  358B|   0     0 |5213  1712 
  0   2  95   2   0   1|   0  1846M| 606B  358B|   0     0 |5153  2122 
  0   3  94   2   0   1|   0  1788M| 666B  358B|   0     0 |5192  2296 
  0   3  95   2   0   1|   0  1818M| 666B  358B|   0     0 |5148  2262 
  0   2  95   2   0   0|   0  1434M| 666B  358B|   0     0 |4143  2050 
  0   2  97   1   0   0|   0    92M| 726B  358B|   0     0 | 965   649

SATA 1TB 7200RPM 6Gbps Link Disk x 10EA Performance, MD-RAID6, Sync (Checking). Because AVX-512 is used for the RAID6 parity computation, CPU usage stays at only around 5% during the check.
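
How the check was kicked off is not shown; a sketch, again assuming the array is /dev/md0:

# Rebuild the same ten partitions as RAID6
mdadm --create /dev/md0 --level=6 --raid-devices=10 /dev/sd[a-j]1

# Trigger a consistency check (the read-heavy workload captured
# below) and watch its progress
echo check > /sys/block/md0/md/sync_action
cat /proc/mdstat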

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  0   1  98   0   0   0| 309M  103M|   0     0 |   0     0 |1993  3843 
  0   4  95   0   0   0|1736M    0 | 666B  830B|   0     0 |5860    12k
  0   4  95   0   0   0|1524M    0 | 726B  358B|   0     0 |6477    19k
  0   4  95   0   0   0|1605M   30k| 726B  358B|   0     0 |5540    11k
  0   5  95   0   0   1|1847M    0 | 726B  366B|   0     0 |6188    12k
  0   5  95   0   0   0|1652M   25k| 666B  358B|   0     0 |6111    11k
  0   5  95   0   0   1|1870M    0 | 666B  366B|   0     0 |7082    12k
  0   4  95   0   0   0|1730M    0 | 666B  358B|   0     0 |5820    11k
  0   4  95   0   0   0|1377M   30k| 666B  358B|   0     0 |5834    16k
  0   5  95   0   0   1|1858M    0 | 726B  366B|   0     0 |6137    13k
  0   5  95   0   0   0|1833M    0 | 666B  358B|   0     0 |6147    12k
  0   5  94   0   0   0|1879M    0 | 666B  358B|   0     0 |6247    13k
  0   5  95   0   0   0|1705M    0 | 666B  358B|   0     0 |5866    12k
  0   4  96   0   0   0|1277M   30k| 666B  358B|   0     0 |5706    16k
  0   5  95   0   0   1|1873M    0 | 726B  366B|   0     0 |6126    13k
  0   4  95   0   0   1|1711M   25k| 666B  358B|   0     0 |5833    11k
  0   5  94   0   0   0|1865M    0 | 666B  366B|   0     0 |6212    13k
  0   5  95   0   0   0|1786M    0 | 666B  358B|   0     0 |6088    12k
  0   4  96   0   0   0|1187M   30k| 666B  358B|   0     0 |5757    15k
  0   5  95   0   0   1|1885M    0 | 726B  366B|   0     0 |6162    13k
  0   5  95   0   0   1|1853M    0 | 666B  358B|   0     0 |6188    12k
  0   5  95   0   0   1|1834M    0 | 666B  358B|   0     0 |6021    12k
  0   5  94   0   0   0|1845M    0 | 666B  358B|   0     0 |6081    12k

SATA 1TB 7200RPM 6Gbps Link Disk x 10EA Performance, MD-RAID6, Write. The numbers come out nearly the same with or without a filesystem: raw block-device performance and xfs-based filesystem performance are similar. Throughput is noticeably lower than RAID0, and the run below uses roughly 8~9% of the CPU.
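
The raw-device and xfs workloads themselves are not shown; a sketch of both, with the device name and mount point as assumptions:

# Raw block-device write to the RAID6 array
dd if=/dev/zero of=/dev/md0 bs=1M count=100000 oflag=direct

# The same write through xfs on top of the array
mkfs.xfs /dev/md0
mount /dev/md0 /mnt
dd if=/dev/zero of=/mnt/testfile bs=1M count=100000 oflag=direct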

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  0   6  92   2   0   0| 948k  970M| 726B  366B|   0     0 |4453  1518 
  0   6  92   2   0   0| 192k 1065M| 666B  366B|   0     0 |4604  1576 
  0   7  91   2   0   0| 492k 1134M| 666B  366B|   0     0 |5884  2197 
  0   7  91   2   0   0| 496k 1067M| 666B  366B|   0     0 |4659  1804 
  0   7  91   2   0   0| 168k 1066M| 666B  366B|   0     0 |4658  1610 
  0   7  91   2   0   0| 792k 1140M| 726B  366B|   0     0 |4877  1668 
  0   7  91   2   0   0| 292k 1105M| 666B  366B|   0     0 |4845  1773 
  0   7  91   2   0   0| 456k 1075M| 666B  366B|   0     0 |4713  1685 
  0   7  91   2   0   0| 220k 1130M| 666B  366B|   0     0 |4895  1839 
  0   7  91   2   0   0| 604k 1101M| 666B  366B|   0     0 |4964  1702 
  0   7  91   2   0   0| 224k 1099M| 726B  366B|   0     0 |5033  2009 
  0   7  91   2   0   0| 116k 1134M| 666B  366B|   0     0 |5046  1876 
  0   7  91   2   0   0|1812k 1060M| 850B  366B|   0     0 |4875  1795 
  0   7  91   2   0   0| 220k 1196M| 726B  366B|   0     0 |5288  1858 
  0   7  91   2   0   0| 284k 1046M| 910B  366B|   0     0 |4714  1752 
  0   7  91   2   0   0| 200k 1119M| 726B  366B|   0     0 |4815  1640 
  0   7  91   2   0   0| 156k 1075M|1043B  366B|   0     0 |4875  1756 
  0   7  91   2   0   0|1352k 1145M| 726B  366B|   0     0 |4964  1714 
  0   7  91   2   0   0| 228k 1086M| 726B  366B|   0     0 |4877  1710 
  0   7  91   2   0   0| 168k 1094M| 666B  366B|   0     0 |4761  1924

Intel NVMe Performance, Single Disk
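
The workload is not shown; a sketch, assuming writes went to nvme0n1p5, the only NVMe partition not carrying a system filesystem:

# Sequential direct writes to the spare NVMe partition
dd if=/dev/zero of=/dev/nvme0n1p5 bs=1M count=100000 oflag=direct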

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  0   6  93   1   0   0|   0  1305M| 666B  350B|   0     0 |6946   356 
  0   6  92   2   0   0|   0  2189M| 666B  358B|   0     0 |  10k  554 
  0   6  92   2   0   0|   0  2754M| 726B  358B|   0     0 |  13k  657 
  0   5  92   3   0   0|   0  2683M| 666B  358B|   0     0 |  12k  732 
  0   5  92   3   0   0|   0  2656M| 726B  358B|   0     0 |  12k  638 
  0   5  92   3   0   0|   0  2771M| 666B  358B|   0     0 |  13k  639 
  0   5  92   3   0   0|   0  2745M| 666B  358B|   0     0 |  13k  645 
  0   5  92   3   0   0|   0  2648M| 726B  358B|   0     0 |  12k  686 
  0   5  92   3   0   0|   0  2738M| 666B  358B|   0     0 |  13k  637 
  0   5  92   3   0   0|   0  2731M| 666B  358B|   0     0 |  13k  638 
  0   5  92   3   0   0|   0  2625M| 726B  358B|   0     0 |  12k  649 
  0   5  92   3   0   0|   0  2616M| 726B  358B|   0     0 |  12k  682 
  0   5  92   3   0   0|   0  2722M| 726B  358B|   0     0 |  12k  622 
  0   5  92   3   0   0|   0  2695M| 666B  358B|   0     0 |  12k  701 
  0   5  92   3   0   0|   0  2617M| 666B  358B|   0     0 |  12k  666 
  0   5  92   3   0   0|   0  2658M| 666B  358B|   0     0 |  12k  697 
  0   5  92   3   0   0|   0  2697M| 666B  358B|   0     0 |  12k  656 
  0   5  92   3   0   0|   0  2628M| 726B  358B|   0     0 |  23k   50k
  0   3  93   4   0   0|   0  2584M| 666B  358B|   0     0 |  51k  146k
  0   4  93   3   0   0|   0  2777M| 666B  358B|   0     0 |  61k  171k
  0   3  94   3   0   0|   0  2809M| 666B  358B|   0     0 |  31k   52k
  0   2  96   2   0   0|   0  2822M| 666B  358B|   0     0 |  12k  621 
  0   2  96   2   0   0|   0  2869M| 726B  358B|   0     0 |  12k  642 
  0   2  96   2   0   0|   0  2872M| 666B  358B|   0     0 |  12k  708 
  0   1  96   3   0   0|   0  2326M| 666B  358B|   0     0 |  10k 1592

SATA 1TB 7200RPM 6Gbps Link Disk x 10EA Performance, MD-RAID6, xfs, iozone throughput test. As the captures above show, RAID6 write performance is much higher than the earlier single-disk result, but it does not scale linearly from 1 disk to 10; each disk covers roughly 100~120MB/sec (1,247,005 kB/sec across 10 disks is about 120MB/sec per disk). For random I/O, random write performance comes out similar between a single disk and the 10-disk RAID6 configuration, regardless of the number of spindles.

[root@hci1 home]# ./iozone -s 4g -t 8 -i 0 -i 1 -i 2
	Iozone: Performance Test of File I/O
	        Version $Revision: 3.471 $
		Compiled for 64 bit mode.
		Build: linux-AMD64 

	Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
	             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
	             Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
	             Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
	             Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
	             Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
	             Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
	             Vangel Bojaxhi, Ben England, Vikentsi Lapa,
	             Alexey Skidanov.

	Run began: Wed Jan 17 22:25:07 2018

	File size set to 4194304 kB
	Command line used: ./iozone -s 4g -t 8 -i 0 -i 1 -i 2
	Output is in kBytes/sec
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 kBytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.
	Throughput test with 8 processes
	Each process writes a 4194304 kByte file in 4 kByte records

	Children see throughput for  8 initial writers 	= 1247005.56 kB/sec
	Parent sees throughput for  8 initial writers 	=  694897.33 kB/sec
	Min throughput per process 			=  148029.62 kB/sec 
	Max throughput per process 			=  159457.70 kB/sec
	Avg throughput per process 			=  155875.70 kB/sec
	Min xfer 					= 3895900.00 kB

	Children see throughput for  8 rewriters 	= 1151809.86 kB/sec
	Parent sees throughput for  8 rewriters 	=  685879.07 kB/sec
	Min throughput per process 			=  137109.09 kB/sec 
	Max throughput per process 			=  146567.69 kB/sec
	Avg throughput per process 			=  143976.23 kB/sec
	Min xfer 					= 3925492.00 kB

	Children see throughput for  8 readers 		= 20429868.50 kB/sec
	Parent sees throughput for  8 readers 		= 18093564.42 kB/sec
	Min throughput per process 			= 2377411.00 kB/sec 
	Max throughput per process 			= 3031219.00 kB/sec
	Avg throughput per process 			= 2553733.56 kB/sec
	Min xfer 					= 3289624.00 kB

	Children see throughput for 8 re-readers 	= 18819642.00 kB/sec
	Parent sees throughput for 8 re-readers 	= 17961653.25 kB/sec
	Min throughput per process 			= 2113620.50 kB/sec 
	Max throughput per process 			= 3047904.75 kB/sec
	Avg throughput per process 			= 2352455.25 kB/sec
	Min xfer 					= 2908644.00 kB

	Children see throughput for 8 random readers 	= 15020925.50 kB/sec
	Parent sees throughput for 8 random readers 	= 14607780.82 kB/sec
	Min throughput per process 			= 1851182.75 kB/sec 
	Max throughput per process 			= 1887379.25 kB/sec
	Avg throughput per process 			= 1877615.69 kB/sec
	Min xfer 					= 4113880.00 kB

	Children see throughput for 8 random writers 	=   72139.77 kB/sec
	Parent sees throughput for 8 random writers 	=   38052.40 kB/sec
	Min throughput per process 			=    8953.10 kB/sec 
	Max throughput per process 			=    9042.77 kB/sec
	Avg throughput per process 			=    9017.47 kB/sec
	Min xfer 					= 4152944.00 kB



iozone test complete.

On the Purley platform, MD-RAID performance does not seem to be significantly affected even after applying the recently much-discussed security patches, unlike on the earlier Broadwell platform. This platform also adds AVX-512, which considerably improves RAID6 performance.

[root@hci1 ~]# dmesg | grep -i avx
[    1.109820] sha1_ssse3: Using AVX optimized SHA-1 implementation
[    1.109871] sha256_ssse3: Using AVX2 optimized SHA-256 implementation
[ 2085.000908]    avx       : 14896.000 MB/sec
[ 2085.086700] raid6: avx2x1   gen() 18562 MB/s
[ 2085.103659] raid6: avx2x2   gen() 21414 MB/s
[ 2085.120618] raid6: avx2x4   gen() 22753 MB/s
[ 2085.137578] raid6: avx512x1 gen() 21757 MB/s
[ 2085.154535] raid6: avx512x2 gen() 27000 MB/s
[ 2085.171494] raid6: avx512x4 gen() 31460 MB/s
[ 2085.171495] raid6: using algorithm avx512x4 gen() (31460 MB/s)
[ 2085.171497] raid6: using avx512x2 recovery algorithm

These results should be useful for anyone planning to run software RAID on a 12-bay x64 server. Unlike in the past, even without a SAS controller the throughput is enough to drive the full bandwidth of a 10G NIC (10Gbit/s ≈ 1.25GB/sec; the ~1.7GB/sec RAID0 and ~1.1GB/sec RAID6 write numbers above are in that range), so with software RAID you can build and operate very flexible RAID configurations at minimal cost.
