
#1 2024-10-08 09:40:26

FireCulex
Member
Registered: 2024-04-10
Posts: 2

[SOLVED] Extremely Poor Raid 5 Write Performance

When copying to my RAID 5 array on Arch over Samba, about a gigabyte transfers and then it stalls, for all intents and purposes. I assume that first gigabyte is just the cache filling. The load average goes above 20, and I don't know what it's waiting for.

Linux Titan 6.11.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 04 Oct 2024 21:51:11 +0000 x86_64 GNU/Linux

iostat output on Arch:

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
loop20           0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00 100.00
loop21           0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
loop22           0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
md127            0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00 100.00

fio reports only about 800 kB/s?!

[culex@Titan share]$ fio --name=writefile --ioengine=libaio --rw=write --bs=1M --size=1G --numjobs=1 --runtime=30 --time_based --direct=1
writefile: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=1
fio-3.37
Starting 1 process
writefile: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [W(1)][2.3%][eta 22m:13s]
writefile: (groupid=0, jobs=1): err= 0: pid=10174: Mon Oct  7 23:11:58 2024
  write: IOPS=0, BW=781KiB/s (800kB/s)(25.0MiB/32765msec); 0 zone resets
    slat (usec): min=162, max=29141, avg=1356.42, stdev=5788.61
    clat (usec): min=1473, max=32672k, avg=1309198.25, stdev=6533901.21
     lat (usec): min=1652, max=32672k, avg=1310554.67, stdev=6533664.34
    clat percentiles (usec):
     |  1.00th=[    1467],  5.00th=[    1614], 10.00th=[    1614],
     | 20.00th=[    1647], 30.00th=[    1680], 40.00th=[    1762],
     | 50.00th=[    1893], 60.00th=[    1909], 70.00th=[    1975],
     | 80.00th=[    2008], 90.00th=[    2147], 95.00th=[   16188],
     | 99.00th=[17112761], 99.50th=[17112761], 99.90th=[17112761],
     | 99.95th=[17112761], 99.99th=[17112761]
   bw (  KiB/s): min=49152, max=49152, per=100.00%, avg=49152.00, stdev= 0.00, samples=1
   iops        : min=   48, max=   48, avg=48.00, stdev= 0.00, samples=1
  lat (msec)   : 2=76.00%, 4=16.00%, 20=4.00%, >=2000=4.00%
  cpu          : usr=0.02%, sys=0.00%, ctx=29, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,25,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=781KiB/s (800kB/s), 781KiB/s-781KiB/s (800kB/s-800kB/s), io=25.0MiB (26.2MB), run=32765-32765msec

Disk stats (read/write):
    dm-1: ios=0/24, sectors=0/49152, merge=0/0, ticks=0/80, in_queue=80, util=99.89%, aggrios=0/141, aggsectors=0/52128, aggrmerge=0/0, aggrticks=0/3148260, aggrin_queue=3148260, aggrutil=99.70%
    md127: ios=0/141, sectors=0/52128, merge=0/0, ticks=0/3148260, in_queue=3148260, util=99.70%, aggrios=38/3282, aggsectors=309/26251, aggrmerge=0/0, aggrticks=12/32824, aggrin_queue=65493, aggrutil=99.77%
  loop21: ios=44/3277, sectors=352/26209, merge=0/0, ticks=15/32874, in_queue=65594, util=99.77%
  loop22: ios=72/3249, sectors=576/25985, merge=0/0, ticks=21/32741, in_queue=65343, util=99.40%
  loop20: ios=0/3321, sectors=0/26561, merge=0/0, ticks=0/32858, in_queue=65543, util=99.71%
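
As a sanity check on the headline number, the average can be recomputed from the totals in the fio "write:" line (25.0 MiB in 32765 msec). This is only a sketch; the helper name `avg_kib_per_s` is made up for illustration.

```shell
# avg_kib_per_s MIB MSEC
# Average throughput in KiB/s from MiB written and elapsed milliseconds.
avg_kib_per_s() {
    awk -v mib="$1" -v ms="$2" 'BEGIN { printf "%.0f\n", mib * 1024 / (ms / 1000) }'
}

avg_kib_per_s 25.0 32765   # prints 781, matching BW=781KiB/s in the fio output
```

The single bw sample of 49152 KiB/s comes from one sampling interval only (samples=1), so the run-wide average is the figure to compare against.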

When I hook up my old box, running Ubuntu, to the same RAID, fio reports 39.0 MB/s. iostat on Ubuntu:

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
loop20           0.00      0.00     0.00   0.00    0.00     0.00 1368.00   5472.00     0.00   0.00    1.52     4.00    0.00      0.00     0.00   0.00    0.00     0.00    4.00  109.00    2.52  51.20
loop21           0.00      0.00     0.00   0.00    0.00     0.00 1366.00   5464.00     0.00   0.00    0.34     4.00    0.00      0.00     0.00   0.00    0.00     0.00    2.00   35.50    0.54  14.00
loop22           0.00      0.00     0.00   0.00    0.00     0.00 1376.00   5504.00     0.00   0.00    5.59     4.00    0.00      0.00     0.00   0.00    0.00     0.00    6.00  193.17    8.85 122.00
md127            0.00      0.00     0.00   0.00    0.00     0.00   14.00  10796.00     0.00   0.00  534.29   771.14    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    7.48 132.80

I'm uncertain how to diagnose what's causing this. When I do a sync_action check on Ubuntu or Arch, the speed is the same. I tried both kernels 6.8.7 and 6.11.2.
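
A minimal sketch of a first look at the md layer, for comparing the two machines. It assumes the array is md127 as in the iostat output and that the usual md sysfs attributes are present; the helper name `md_report` and the overridable sysfs root are only for illustration.

```shell
# md_report ARRAY [SYSFS_ROOT]
# Print a few md attributes that affect write and resync behaviour.
# SYSFS_ROOT defaults to /sys/block; it is a parameter only so the
# function can be pointed at a copy of the tree.
md_report() {
    md_dir="${2:-/sys/block}/$1/md"
    for attr in sync_action stripe_cache_size sync_speed_min sync_speed_max; do
        if [ -r "$md_dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$md_dir/$attr")"
        fi
    done
}

# Usage (array name taken from the iostat output):
#   cat /proc/mdstat
#   md_report md127
```

Running this on both Arch and Ubuntu and diffing the output would show whether any of these tunables differ between the two setups.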

Edit: I suspect it might be the AX88179A USB Ethernet adapter. Back-to-back iperf runs give very different results:

[culex@Titan share]$ iperf -c 192.168.1.11
------------------------------------------------------------
Client connecting to 192.168.1.11, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.210 port 59160 connected with 192.168.1.11 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.3027 sec  14.1 MBytes  11.5 Mbits/sec
[culex@Titan share]$ iperf -c 192.168.1.11
------------------------------------------------------------
Client connecting to 192.168.1.11, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.210 port 58666 connected with 192.168.1.11 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.0522 sec   420 MBytes   351 Mbits/sec

Client connecting to 192.168.1.13, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.210 port 56552 connected with 192.168.1.13 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.2295 sec  14.0 MBytes  11.5 Mbits/sec
[culex@Titan share]$ iperf -c 192.168.1.13
------------------------------------------------------------
Client connecting to 192.168.1.13, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.210 port 60200 connected with 192.168.1.13 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.0309 sec   419 MBytes   351 Mbits/sec
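
Since consecutive runs swing between 11.5 and 351 Mbits/sec, a single iperf run says little. A small sketch for collecting several runs in a row; it assumes iperf 2 output in the format shown above, and `iperf_mbits` is a hypothetical helper name.

```shell
# Extract the bandwidth figure from iperf 2 client output on stdin.
# Only matches result lines reported in Mbits/sec; a run reported in
# other units (for example Gbits/sec) is skipped.
iperf_mbits() {
    awk '$NF == "Mbits/sec" { print $(NF - 1) }'
}

# Usage: five runs against the same server as above
#   for i in 1 2 3 4 5; do iperf -c 192.168.1.11 | iperf_mbits; done
```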

Update: I switched from the AX88179A driver to cdc_ncm and the issue is gone.
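
For reference, a sketch of how to check which driver an interface is bound to. The exact steps used for the switch are not given above, so the blacklist line in the comments is an assumption (a commonly reported approach that may let the adapter bind to cdc_ncm instead of the ax88179_178a module). The interface name `enp0s20u1`, the file name under /etc/modprobe.d, and the helper name `nic_driver` are placeholders.

```shell
# nic_driver IFACE [SYSFS_ROOT]
# Print the kernel driver currently bound to a network interface,
# read from the device/driver symlink in sysfs.
nic_driver() {
    link="${2:-/sys/class/net}/$1/device/driver"
    [ -L "$link" ] && basename "$(readlink "$link")"
}

# Usage (interface name is a placeholder):
#   nic_driver enp0s20u1
#
# Assumed way to prefer cdc_ncm (not confirmed above):
#   echo 'blacklist ax88179_178a' | sudo tee /etc/modprobe.d/ax88179.conf
#   then unplug and replug the adapter, and run nic_driver again
```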

Last edited by FireCulex (2024-10-11 09:02:54)
