Hi!
I have a computer running Arch Linux with a RAID6 consisting of 14 HDDs.
/dev/md/raid6:
Version : 1.2
Creation Time : Thu Dec 28 23:41:01 2023
Raid Level : raid6
Array Size : 5852196864 (5.45 TiB 5.99 TB)
Used Dev Size : 487683072 (465.09 GiB 499.39 GB)
Raid Devices : 14
Total Devices : 14
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Feb 27 11:20:08 2025
State : active
Active Devices : 14
Working Devices : 14
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : hp:raid6
UUID : b9b3f1e2:4f51943d:67f524d8:971d9a77
Events : 19703
Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 8 18 1 active sync /dev/sdb2
2 8 34 2 active sync /dev/sdc2
3 8 50 3 active sync /dev/sdd2
4 8 66 4 active sync /dev/sde2
5 8 82 5 active sync /dev/sdf2
6 8 98 6 active sync /dev/sdg2
7 8 114 7 active sync /dev/sdh2
8 8 130 8 active sync /dev/sdi2
9 8 146 9 active sync /dev/sdj2
10 8 162 10 active sync /dev/sdk2
11 8 178 11 active sync /dev/sdl2
12 8 210 12 active sync /dev/sdn2
13 8 194 13 active sync /dev/sdm2
The CPU is not super high-end and the machine is equipped with 4GB of RAM:
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: AuthenticAMD
Model name: AMD Athlon(tm) 5150 APU with Radeon(tm) R3
CPU family: 22
Model: 0
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: 1
CPU(s) scaling MHz: 100%
CPU max MHz: 1600.0000
CPU min MHz: 800.0000
BogoMIPS: 3195.26
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt topoext perfctr_nb bpext perfctr_llc hw_pstate proc_feedback ssbd vmmcall bmi1 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale flushbyasid decodeassists pausefilter pfthreshold overflow_recov
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 128 KiB (4 instances)
L1i: 128 KiB (4 instances)
L2: 2 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-3
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Reg file data sampling: Not affected
Retbleed: Mitigation; untrained return thunk; SMT disabled
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Srbds: Not affected
Tsx async abort: Not affected
Now I have the problem that an update via pacman takes essentially forever (more than 2 hours), with e.g. "Total Installed Size: 1526.67 MiB". The issue is not the download speed; it is the installation of the packages that takes really long. On the other hand, when I use that machine to store backups, that process is really fast.
A small test using dd also does not suggest that I/O is terribly slow:
time dd if=/dev/zero bs=1M count=512 of=test.bin conv=sync status=progress
525336576 bytes (525 MB, 501 MiB) copied, 6 s, 86.2 MB/s
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 6.36534 s, 84.3 MB/s
real 0m6.429s
user 0m0.008s
sys 0m1.497s
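For comparison, a variant that would force the data to disk before dd returns (conv=fdatasync is a GNU dd option; conv=sync only pads input blocks and does not flush anything) looks like this:
time dd if=/dev/zero bs=1M count=512 of=test.bin conv=fdatasync status=progress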
Does anyone have an idea what this could be related to?
Regards
Kay
Offline
On a hunch I'd suspect the package signing checks, which might take long on your APU? Out of my comfort zone here, but depending on the algorithms used this might need to be done in software, since your APU may not have hardware support for the crypto part, which would make it even slower. Glad if someone can correct me if I'm horribly wrong, though. It could also be the unpacking: while you're downloading "just" 1.5 gigs of data, it all needs to be decompressed, which again might be slow on your APU.
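A rough way to check would be to time both steps in isolation, roughly like this (just a sketch - the package name is a placeholder for anything sitting in your cache, the .sig file has to be present there, and reading the pacman keyring may need root):
time zstd -q -dc /var/cache/pacman/pkg/some-package.pkg.tar.zst > /dev/null
time gpg --homedir /etc/pacman.d/gnupg --verify /var/cache/pacman/pkg/some-package.pkg.tar.zst.sig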
Offline
Thank you for your thoughts!
What I can say is that the actual package installation also takes very long (2 hours). I did not observe how long the "Checking package integrity" step takes (where the signatures are cryptographically verified), but it is at least not the only thing being slow.
I made the following additional observations:
$> time cp /var/cache/pacman/pkg/linux-lts-6.12.16-1-x86_64.pkg.tar.zst ~
real 1m39.182s
user 0m0.000s
sys 0m0.655s
$> du -h /var/cache/pacman/pkg/linux-lts-6.12.16-1-x86_64.pkg.tar.zst
138M /var/cache/pacman/pkg/linux-lts-6.12.16-1-x86_64.pkg.tar.zst
$> time tar xf linux-lts-6.12.16-1-x86_64.pkg.tar.zst
real 0m1.503s
user 0m0.462s
sys 0m1.747s
So copying seems to be very slow, but extraction wasn't. Why, though?
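Though the tar timing probably only measures writing into the page cache; to make the comparison fair I would have to flush to disk as well, something like:
time sh -c 'tar xf linux-lts-6.12.16-1-x86_64.pkg.tar.zst && sync'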
Regards
Kay
Offline
Hm. What FS do you use on top of the RAID? Did you run SMART checks recently?
Also, please post the output of lsblk.
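For the SMART part, something along these lines per member disk should do (device name just an example):
smartctl -t long /dev/sda      # start the long self-test
smartctl -l selftest /dev/sda  # check the result once it is done
smartctl -H -A /dev/sda        # overall health and the attribute table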
Offline
had to look up socket AM1 and the list of CPUs - even though socket AM1 is from the mid-2010s it seems it was designed for ultra-low-end systems, roughly equivalent to Intel's Celeron family
a 14-wide raid6 coming up at just 6TB? that sounds like a wild mix of all kinds of different drives
how are they connected? I use one of those chinese pci-e 3.0 x1 to s-ata hba myself - but with my 8-drive array using zfs I get about 800mb/s - pretty much maxing out the pci-e link
sounds very interesting - I would like to get a closer look at that frankenstein system
Offline
@Whoracle
Thanks for your input.
1. I am using ext4.
2. I just ran smartctl --test=long on all drives without any errors reported. For two HDDs, strangely, I had to poll the status every 10 seconds, otherwise they would suspend and the test would be "Aborted by user" (probably a much longer interval than 10 s would have worked as well).
3. Output of lsblk:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 1 465.8G 0 disk
|-sda1 8:1 1 32M 0 part
`-sda2 8:2 1 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdb 8:16 1 465.8G 0 disk
|-sdb1 8:17 1 32M 0 part
`-sdb2 8:18 1 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdc 8:32 0 465.8G 0 disk
|-sdc1 8:33 0 32M 0 part
`-sdc2 8:34 0 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdd 8:48 0 465.8G 0 disk
|-sdd1 8:49 0 32M 0 part
`-sdd2 8:50 0 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sde 8:64 0 465.3G 0 disk
|-sde1 8:65 0 32M 0 part
`-sde2 8:66 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdf 8:80 0 465.3G 0 disk
|-sdf1 8:81 0 32M 0 part
`-sdf2 8:82 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdg 8:96 0 465.3G 0 disk
|-sdg1 8:97 0 32M 0 part
`-sdg2 8:98 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdh 8:112 0 465.3G 0 disk
|-sdh1 8:113 0 32M 0 part
`-sdh2 8:114 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdi 8:128 0 465.3G 0 disk
|-sdi1 8:129 0 32M 0 part
`-sdi2 8:130 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdj 8:144 0 465.3G 0 disk
|-sdj1 8:145 0 32M 0 part
`-sdj2 8:146 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdk 8:160 0 465.3G 0 disk
|-sdk1 8:161 0 32M 0 part
`-sdk2 8:162 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdl 8:176 0 465.3G 0 disk
|-sdl1 8:177 0 32M 0 part
`-sdl2 8:178 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdm 8:192 0 465.8G 0 disk
|-sdm1 8:193 0 32M 0 part
`-sdm2 8:194 0 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdn 8:208 0 465.8G 0 disk
|-sdn1 8:209 0 32M 0 part
`-sdn2 8:210 0 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdo 8:224 1 0B 0 disk
sdp 8:240 1 0B 0 disk
sdq 65:0 1 0B 0 disk
sdr 65:16 1 0B 0 disk
I know I could have avoided that 32M partition on every disk, e.g. by using a single boot disk, but I found this to be the best and easiest solution for my case.
Also, I am not sure what sdo, sdp, sdq and sdr are.
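Something like this should tell me, I guess (just a sketch):
lsblk -o NAME,SIZE,RM,TRAN,VENDOR,MODEL
udevadm info --query=all --name=/dev/sdo | grep -E 'ID_BUS|ID_MODEL'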
Offline
@cryptearth
Yes, the machine is also pretty old and already third-hand. xD But I thought that for a simple storage system it should really suffice.
Regarding the size: I am using used 500GB HDDs I bought off ebay for only a few € a piece. I found this to be friendly both to the resources of the planet and to my wallet (note: the system is not running 24/7, but normally only an hour a week at most... if I don't do updates!).
Frankenstein system is pretty much on point indeed. xD After obtaining the basic machine for my build, I also got my hands on two retired Dell PowerEdge 2900 systems. I first planned on using them to crunch numbers for BOINC, but that would have been insanely costly and inefficient due to the huge power consumption. I then decided to slaughter them and obtained two RAID controllers with 8 SATA ports each, including two SATA backplanes, one of which magically just fit into the actually pretty small housing of the other computer. The second set is backup hardware. I also used some fans from the PowerEdges, connected PWM to GND, and now have some very quiet but not too weak fans to keep the 14 HDDs and especially the RAID controller cool. I also added a mini PCIe x1 SATA card with two ports and used the onboard SATA ports as well. With that, the PCIe capabilities of the mainboard are maxed out.
Maybe I can take a picture or two tomorrow to demonstrate how much of a Frankenstein system it actually is, if you happen to be interested. xD
Offline
It is strange to me that you didn't watch the update process on this system (even once) to see when it slows down. My bet is that once the whole RAM is used and the system starts to swap, the update slows down. Additionally, it has to manage data across all those drives in the RAID. And 4GB is too little for a smooth medium-to-big update without tuning. How many packages do you have installed on this system (pacman -Qsq | wc -l)?
What you can do:
1. Do updates more often, so they are smaller and maybe fit in that amount of RAM.
2. Set up ZSWAP (compressed swap), which minimizes IO between RAM and drives. You may even make it 4GB big (a sketch follows at the end of this post).
3. You may clear the cache to free RAM just before the update (but not during the update!), and optionally after it, with one of these commands:
sync; echo 1 > /proc/sys/vm/drop_caches # or
sync; echo 3 > /proc/sys/vm/drop_caches
4. Hardware solution (brute force, like nvidia does): buy more RAM (8GB or even 16GB), or swap the hdds for ssds (probably you would like to avoid this).
I have an old laptop with 4GB of RAM, an hdd and arch installed on it. ZSWAP helps a lot. I previously tested ZRAM on that system, but ZSWAP is better.
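A minimal sketch for trying ZSWAP at runtime, as root (the values are only an example; to make it permanent the same settings go on the kernel command line as zswap.enabled=1 zswap.compressor=zstd zswap.max_pool_percent=25, and the chosen compressor must be available in your kernel):
echo 1 > /sys/module/zswap/parameters/enabled
echo zstd > /sys/module/zswap/parameters/compressor
echo 25 > /sys/module/zswap/parameters/max_pool_percent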
Last edited by xerxes_ (2025-02-27 21:47:41)
Offline
Hi @xerxes_,
thanks for your reply.
I did observe the update process more than once. I just have not paid special attention to the cryptographic verification step, so I would therefore assume that it does not take very long. What I did observe, as said, is that the actual installation of each package takes very long. And I could see (as shown above) that copying a file from /var/cache/pacman/pkg to ~ was very slow.
To be honest, I am not sure why a small or large update should be a problem with "only" 4G of RAM when packages are processed one after another. Also, the idle RAM utilization of that machine is very low (500M-1G at most). No GUI at all, only a VT. So even if you expected all packages (1500M in total, when extracted) to be cached, that would still fit in RAM easily, probably even including the compressed package archives.
pacman, of course, also isn't really memory hungry.
Also, I never observed swap being used at all.
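What I could do is watch it from a second terminal during the next update, something like (vmstat comes with procps-ng, iostat with sysstat):
vmstat 5      # si/so show swap activity, wa shows iowait
iostat -x 5   # per-device utilization and wait times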
Regards
Kay
EDIT: I just checked: idle RAM usage is 171M (MiB, probably).
Last edited by kay94 (2025-02-28 12:18:20)
Offline
oh - that's exactly my kind of tech gore
somewhat like that one line in gta v: "Here's the problem. I don't know what I want. It's a bit, well, like pornography or a perfect turd, I can't quite describe it, but I'll know it when I see it."
would like to see some pictures
I'm not sure, but the low amount of ram could be an issue even with MD - with zfs it would be a problem for sure
the main issue I see is that the swap sits within the raid, not alongside it - so I would either try something like
sda - 465g total
sda1 - 4g swap
sda2 - 460g raid
+md
++vg-raid
or use one drive for the os with a big swap like 32g that is not part of the raid
but that depends on whether you're able to kill the setup and start over
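a very rough sketch of the per-disk-swap variant, if you ever rebuild (device names purely illustrative - and obviously this wipes everything):
# per disk: sdX1 = 4g swap, sdX2 = raid member
mkswap /dev/sda1 && swapon /dev/sda1
mdadm --create /dev/md0 --level=6 --raid-devices=14 /dev/sd[a-n]2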
as for re-using a lot of old cheap small drives: that's pretty much what raid was made for at first: redundant array of inexpensive disks - before storage got cheap and the meaning changed from inexpensive to independent - the only downside is the need for a lot of ports and the high power demand
to keep such a system bootable one can enable power-up-in-standby so the drives start up one after another - this keeps the inrush current low, so even an old 300w psu can power 10 drives
as you have a backplane you could also use staggered spinup
you mentioned you got some raid cards - are these actual hardware raid controllers? if so: with MD you'd use them as simple dumb HBAs - a mode professional hw-raid cards are not designed for without a reflash to an "IT-mode" firmware
Offline
@cryptearth
Regarding swap you have a good point. Regarding RAM I am still not sure, to be honest, because I never saw high RAM utilization (or any swap usage, as said).
Here are some pictures (clickable links):
I even forgot that I have one 3.5" drive in there and two USB drives (which must be 16 years old by now, but with very few hours). I could connect even more via the mainboard USB ports on the back, which are still untouched. Or even USB hubs, of course.
As you can see, a lot of cardboard is involved, too. xD
But regardless of that, it has been running without issues for a while now, if I leave out the performance issues during updates (only). And I'm happy about how small the entire machine still is.
While running the SMART test yesterday, I even noticed that one of the HDDs I bought off ebay already has 46000 running hours! That is more than 5 years non-stop. Since I bought enough spares, I will just watch how long it lasts and replace it someday. But no mercy for now!
to keep such a system bootable one can enable power-up-in-standby so the drives start up one after another - this keeps the inrush current low, so even an old 300w psu can power 10 drives
Wouldn't that have to be a BIOS feature? The BIOS of this mainboard is very limited, unfortunately.
you mentioned you got some raid cards - are these actual hardware raid controllers? if so: with MD you'd use them as simple dumb HBAs - a mode professional hw-raid cards are not designed for without a reflash to an "IT-mode" firmware
Yes, I have a HW RAID controller for 8 of the 14 HDDs and told it to treat each of its HDDs as a single-disk RAID1 (or was it 0?), because otherwise they won't appear as block devices. To be honest, I don't understand the last sentence of that quote; any chance you could explain it in other words?
Offline
ah - this could be a lead
HBAs come in two flavours:
- hardware raid controller
- jbod adapter
the main difference:
a hw-raid controller is designed to offload the raid calculations from the cpu onto its own chip and to present any array back as a logical drive rather than as physical drives - when used in the intended way this is no problem and exactly what you want: one logical volume presented to the os while the raid card does the magic - the downside is that there's often no direct smart access to the individual drives, that has to be monitored through the raid card
a jbod hba on the other hand is just a dumb controller providing physical ports and doesn't do anything else logically - it just provides low-level raw access to the drives directly - this is also called IT-mode and requires special firmware on hw-raid controllers
with MD this shouldn't be much of an issue - but don't try zfs with such a card: zfs expects dumb HBAs as it brings its own i/o stack
so if you want to use your hw-raid card as a jbod hba, search for whether an it-mode firmware is available and how to flash it - otherwise, in its current hw-raid configuration, it's just unsuited for how you want to use it
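to find out which chip is actually on the card (and thus what firmware to search for), something like this helps:
lspci -nnk | grep -i -A3 -E 'raid|sas|sata'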
// edit
a bit of an addition about power-up-in-standby / staggered spinup
power-up in stand-by is a feature of the controller - and only relevant to the bios/uefi if you want to boot from a drive in that mode - which I recommend against
for my setup with my cheap asmedia sata hba, udev does the job: the system boots up with all drives not spinning, and after selecting arch from grub udev starts them up one by one while showing those first few lines about udev and systemd
it takes some sweet time - about 7-10 seconds per hdd - so it adds about 1 minute to 1m30s to the overall boot - but it's easier for the power supply to handle the slow ramp-up instead of the high inrush current when all drives try to start up along with the system, gpu, fans and water cooling pump
fun fact: I tried this with a less powerful psu: it tried, but it often kept failing and just shutting off when all drives were connected - instead I had to connect them one by one by hand (my hba luckily is set to hot-plug) - after setting the drives to power-up-in-standby it was no problem
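side note: on drives that support it, power-up-in-standby can be toggled with hdparm (a sketch - be careful, some bios/controllers can't see drives set this way, so don't do it on the boot drive):
hdparm -s1 /dev/sdX   # enable power-up in standby
hdparm -s0 /dev/sdX   # disable it again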
staggered spinup is something that has to be supported by the backplane: one pin of the power connector plays double duty in such a setup
simply explained: instead of wiring that pin straight to ground, which makes the drive spin up as soon as power is applied, the backplane keeps it high-Z, which tells the drive not to spin up until either the backplane pulls the pin to ground or the controller (often a sas hba) tells the drive to spin up
it's a feature rarely seen on consumer sata drives and more on professional sas drives (although it's part of the sata spec)
I never had a backplane or special power connectors to play around with that feature - and from the pictures it looks like you have hooked up SSDs to the backplane, for which that doesn't matter anyway
Last edited by cryptearth (2025-03-03 22:37:49)
Offline
Thank you, that is a good hint! I will try to find out whether there is another firmware available for that particular controller and report the results.
Regarding your EDIT:
I only have HDDs in that system, no SSDs.
Offline