
#1 2021-02-10 08:40:01

Registered: 2016-01-06
Posts: 64

[solved] md0 3x slower than single ssd?

[Edit]: Got the answer. Sometimes one just has to write things down to see clearly. The single drives of course had higher parallelism, but in terms of bytes per second they were actually slower. Sorry for the noise.
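A quick back-of-envelope check of that conclusion, using the fio numbers reported below. The assumption here is that the four single-device runs executed concurrently, so 4 devices x 4 jobs = 16 fio jobs shared the workload, versus 4 jobs on md0:

```python
# Sketch only: per-job bandwidth from the reported fio results.
# Numbers taken from the run summaries below; "per job" is an
# illustrative way to compare, not something fio itself reports.

# Single-SSD runs: ~17.0 MiB/s per device, 4 devices in parallel
single_aggregate = 4 * 17.0              # 68.0 MiB/s total
single_per_job = single_aggregate / 16   # 16 concurrent fio jobs

# md raid0 run: 27.7 MiB/s total, with 4 fio jobs
md0_per_job = 27.7 / 4

print(f"single SSDs: {single_per_job:.2f} MiB/s per job")  # 4.25
print(f"md0:         {md0_per_job:.2f} MiB/s per job")     # 6.93
# The single drives win on aggregate only because four times as many
# jobs ran in parallel; per writer they were slower than md0.
```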


I've just run some fio benchmarks and am a bit baffled by the results. I put four ~500G SSDs into an md raid5, md raid0 and single-drive configuration and ran fio over the respective raw devices. Then I put a filesystem onto each and ran fstrim on the empty devices. After each run I wipefs'd and blkdiscard'ed the SSDs before the next run.
In particular, the IOPS at times dropped to 10/sec and below. I did not expect that to happen with SSDs.

An example, here from a single SSD:
Jobs: 4 (f=4): [w(4)][97.6%][w=768KiB/s][w=12 IOPS][eta 51m:22s]       

The results:
md raid5:

# mdadm --create /dev/md5 --metadata 1.2 --chunk=64K --level=5 --raid-devices=4 /dev/raidmd5/md5disk[1234]
# time fio --rw=randwrite --name=test --filename=/dev/md5  --numjobs=4 --direct=1 --group_reporting --bs=64k
Run status group 0 (all jobs):
  WRITE: bw=9.00MiB/s (10.5MB/s), 9.00MiB/s-9.00MiB/s (10.5MB/s-10.5MB/s), io=5364GiB (5760GB), run=549357751-549357751msec

Disk stats (read/write):
    md5: ios=193/87883852, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=44232540/92896209, aggrmerge=657895998/655741185, aggrticks=164471017/362353523, aggrin_queue=301009632, aggrutil=60.03%
  sdn: ios=44231217/92900345, merge=657893757/655737139, ticks=154006437/336971822, in_queue=265684940, util=54.34%
  sdm: ios=44232615/92886653, merge=657898023/655750715, ticks=171414414/379706964, in_queue=322384590, util=58.29%
  sdl: ios=44233935/92879654, merge=657895439/655757670, ticks=178172274/396188413, in_queue=346081510, util=60.03%
  sdk: ios=44232395/92918187, merge=657896776/655719217, ticks=154290943/336546893, in_queue=269887490, util=55.55%

real    9155m58,264s
user    42m9,441s
sys     238m54,187s

That's over 6 days for filling the volume 4 times. When given a filesystem, the fstrim command runs for about 3 hours.

md raid0:

# mdadm --create /dev/md0 --metadata 1.2 --chunk=64K --level=0 --raid-devices=4 /dev/raidmd5/md5disk[1234]
# time fio --rw=randwrite --name=test --filename=/dev/md0  --numjobs=4 --direct=1 --group_reporting --bs=64k
Run status group 0 (all jobs):
  WRITE: bw=27.7MiB/s (29.1MB/s), 27.7MiB/s-27.7MiB/s (29.1MB/s-29.1MB/s), io=7152GiB (7679GB), run=264174828-264174828msec

Disk stats (read/write):
    md0: ios=305/117179292, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=75/29294836, aggrmerge=6/0, aggrticks=724/260996913, aggrin_queue=224584482, aggrutil=47.47%
  sdo: ios=64/29294836, merge=10/0, ticks=361/218596537, in_queue=187761190, util=40.77%
  sdn: ios=54/29294836, merge=5/0, ticks=851/311932153, in_queue=269415340, util=47.47%
  sdm: ios=34/29294836, merge=10/0, ticks=308/271047711, in_queue=232777870, util=44.97%
  sdl: ios=148/29294836, merge=0/0, ticks=1378/242411253, in_queue=208383530, util=42.41%

real    4402m55,224s
user    32m16,145s
sys     83m15,420s

That's 3+ days. The fstrim command runs for around 90-100 minutes when given a filesystem.

I also ran fio simultaneously on all 4 single devices. Here is one example; they all finished within ~2h of each other:

# time fio --rw=randwrite --name=test --filename=/dev/sdl  --numjobs=4 --direct=1 --group_reporting --bs=64k 
Run status group 0 (all jobs):                                                                                                    
  WRITE: bw=17.0MiB/s (17.9MB/s), 17.0MiB/s-17.0MiB/s (17.9MB/s-17.9MB/s), io=1789GiB (1920GB), run=107467219-107467219msec       
Disk stats (read/write):                                                                                                          
  sdl: ios=139/29303214, merge=0/0, ticks=692/428209730, in_queue=364955680, util=100.00% 

real    1791m7,506s                                                                                                               
user    9m18,451s                                                                                                                 
sys     27m7,265s   

That's around a day and a quarter. I may add that the write cache had been disabled on all 4 SSDs. What is going on with mdraid? Have I made a fundamental mistake to get such lousy numbers?

Thanks for any insight

Last edited by EdeWolf (2021-02-10 08:46:08)
