Hi!
I have a computer running Arch Linux with a RAID6 consisting of 14 HDDs.
/dev/md/raid6:
Version : 1.2
Creation Time : Thu Dec 28 23:41:01 2023
Raid Level : raid6
Array Size : 5852196864 (5.45 TiB 5.99 TB)
Used Dev Size : 487683072 (465.09 GiB 499.39 GB)
Raid Devices : 14
Total Devices : 14
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Feb 27 11:20:08 2025
State : active
Active Devices : 14
Working Devices : 14
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : hp:raid6
UUID : b9b3f1e2:4f51943d:67f524d8:971d9a77
Events : 19703
Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 8 18 1 active sync /dev/sdb2
2 8 34 2 active sync /dev/sdc2
3 8 50 3 active sync /dev/sdd2
4 8 66 4 active sync /dev/sde2
5 8 82 5 active sync /dev/sdf2
6 8 98 6 active sync /dev/sdg2
7 8 114 7 active sync /dev/sdh2
8 8 130 8 active sync /dev/sdi2
9 8 146 9 active sync /dev/sdj2
10 8 162 10 active sync /dev/sdk2
11 8 178 11 active sync /dev/sdl2
12 8 210 12 active sync /dev/sdn2
13 8 194 13 active sync /dev/sdm2
The CPU is not super high-end and the machine is equipped with 4GB of RAM:
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: AuthenticAMD
Model name: AMD Athlon(tm) 5150 APU with Radeon(tm) R3
CPU family: 22
Model: 0
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: 1
CPU(s) scaling MHz: 100%
CPU max MHz: 1600.0000
CPU min MHz: 800.0000
BogoMIPS: 3195.26
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt topoext perfctr_nb bpext perfctr_llc hw_pstate proc_feedback ssbd vmmcall bmi1 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale flushbyasid decodeassists pausefilter pfthreshold overflow_recov
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 128 KiB (4 instances)
L1i: 128 KiB (4 instances)
L2: 2 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-3
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Reg file data sampling: Not affected
Retbleed: Mitigation; untrained return thunk; SMT disabled
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Srbds: Not affected
Tsx async abort: Not affected
Now I have the problem that an update via pacman takes essentially forever (more than 2 hours), with e.g. "Total Installed Size: 1526.67 MiB". The issue is not the download speed; it is the installation of the packages that takes really long. On the other hand, when I use that machine to store backups, that process is really fast.
A small test using dd also does not suggest that I/O is terribly slow:
time dd if=/dev/zero bs=1M count=512 of=test.bin conv=sync status=progress
525336576 bytes (525 MB, 501 MiB) copied, 6 s, 86.2 MB/s
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 6.36534 s, 84.3 MB/s
real 0m6.429s
user 0m0.008s
sys 0m1.497s
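For comparison, a variant that would force the data to disk before dd returns (conv=fdatasync is a GNU dd option; conv=sync only pads input blocks and does not flush anything) looks like this:
time dd if=/dev/zero bs=1M count=512 of=test.bin conv=fdatasync status=progress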
Does anyone have an idea what this could be related to?
Regards
Kay
Offline
On a hunch I'd suspect the package signing checks, which might take long on your APU? Out of my comfort zone here, but depending on the algorithms used this might need to be done in software, since your APU may not have hardware support for the crypto part, which would make it even slower. Glad if someone can correct me if I'm horribly wrong, though. It could also be the unpacking: while you're downloading "just" 1.5 gigs of data, it all needs to be decompressed, which again might be slow on your APU.
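A rough way to check would be to time both steps in isolation, roughly like this (just a sketch - the package name is a placeholder for anything sitting in your cache, the .sig file has to be present there, and reading the pacman keyring may need root):
time zstd -q -dc /var/cache/pacman/pkg/some-package.pkg.tar.zst > /dev/null
time gpg --homedir /etc/pacman.d/gnupg --verify /var/cache/pacman/pkg/some-package.pkg.tar.zst.sig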
Offline
Thank you for your thoughts!
What I can say is that the actual package installation also takes very long (2 hours). I did not observe how long the "Checking package integrity" step takes (where the signatures are cryptographically verified), but it is at least not the only thing being slow.
I made the following additional observations:
$> time cp /var/cache/pacman/pkg/linux-lts-6.12.16-1-x86_64.pkg.tar.zst ~
real 1m39.182s
user 0m0.000s
sys 0m0.655s
$> du -h /var/cache/pacman/pkg/linux-lts-6.12.16-1-x86_64.pkg.tar.zst
138M /var/cache/pacman/pkg/linux-lts-6.12.16-1-x86_64.pkg.tar.zst
$> time tar xf linux-lts-6.12.16-1-x86_64.pkg.tar.zst
real 0m1.503s
user 0m0.462s
sys 0m1.747s
So copying seems to be very slow, but extraction wasn't. Why, though?
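Though the tar timing probably only measures writing into the page cache; to make the comparison fair I would have to flush to disk as well, something like:
time sh -c 'tar xf linux-lts-6.12.16-1-x86_64.pkg.tar.zst && sync'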
Regards
Kay
Offline
Hm. What FS do you use on top of the RAID? Did you run SMART checks recently?
Also, please post the output of lsblk.
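For the SMART part, something along these lines per member disk should do (device name just an example):
smartctl -t long /dev/sda      # start the long self-test
smartctl -l selftest /dev/sda  # check the result once it is done
smartctl -H -A /dev/sda        # overall health and the attribute table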
Offline
had to look up socket AM1 and the list of CPUs - even though socket AM1 is from the mid-2010s it seems it was designed for ultra-low-end systems, roughly equivalent to Intel's Celeron family
a 14-wide raid6 coming up at just 6TB? that sounds like a wild mix of all kinds of different drives
how are they connected? I use one of those chinese pci-e 3.0 x1 to s-ata hba myself - but with my 8-drive array using zfs I get about 800mb/s - pretty much maxing out the pci-e link
sounds very interesting - I would like to get a closer look at that frankenstein system
Offline
@Whoracle
Thanks for your input.
1. I am using ext4.
2. I just ran smartctl --test=long on all drives without any errors reported. For two HDDs, strangely, I had to poll the status every 10 seconds, otherwise they would suspend and the test would be "Aborted by user" (probably a much longer interval than 10 s would have worked as well).
3. Output of lsblk:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 1 465.8G 0 disk
|-sda1 8:1 1 32M 0 part
`-sda2 8:2 1 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdb 8:16 1 465.8G 0 disk
|-sdb1 8:17 1 32M 0 part
`-sdb2 8:18 1 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdc 8:32 0 465.8G 0 disk
|-sdc1 8:33 0 32M 0 part
`-sdc2 8:34 0 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdd 8:48 0 465.8G 0 disk
|-sdd1 8:49 0 32M 0 part
`-sdd2 8:50 0 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sde 8:64 0 465.3G 0 disk
|-sde1 8:65 0 32M 0 part
`-sde2 8:66 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdf 8:80 0 465.3G 0 disk
|-sdf1 8:81 0 32M 0 part
`-sdf2 8:82 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdg 8:96 0 465.3G 0 disk
|-sdg1 8:97 0 32M 0 part
`-sdg2 8:98 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdh 8:112 0 465.3G 0 disk
|-sdh1 8:113 0 32M 0 part
`-sdh2 8:114 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdi 8:128 0 465.3G 0 disk
|-sdi1 8:129 0 32M 0 part
`-sdi2 8:130 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdj 8:144 0 465.3G 0 disk
|-sdj1 8:145 0 32M 0 part
`-sdj2 8:146 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdk 8:160 0 465.3G 0 disk
|-sdk1 8:161 0 32M 0 part
`-sdk2 8:162 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdl 8:176 0 465.3G 0 disk
|-sdl1 8:177 0 32M 0 part
`-sdl2 8:178 0 465.2G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdm 8:192 0 465.8G 0 disk
|-sdm1 8:193 0 32M 0 part
`-sdm2 8:194 0 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdn 8:208 0 465.8G 0 disk
|-sdn1 8:209 0 32M 0 part
`-sdn2 8:210 0 465.7G 0 part
`-md127 9:127 0 5.5T 0 raid6
`-md127p1 259:0 0 5.5T 0 part
|-vg--arch-swap 253:0 0 4G 0 lvm [SWAP]
`-vg--arch-root 253:1 0 5.4T 0 lvm /
sdo 8:224 1 0B 0 disk
sdp 8:240 1 0B 0 disk
sdq 65:0 1 0B 0 disk
sdr 65:16 1 0B 0 disk
I know I could have avoided that 32M partition on every disk, e.g. by using a single boot disk, but I found this to be the best and easiest solution for my case.
Also, I am not sure what sdo, sdp, sdq and sdr are.
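Something like this should tell me, I guess (just a sketch):
lsblk -o NAME,SIZE,RM,TRAN,VENDOR,MODEL
udevadm info --query=all --name=/dev/sdo | grep -E 'ID_BUS|ID_MODEL'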
Offline
@cryptearth
Yes, the machine is also pretty old and already third-hand. xD But I thought that for a simple storage system it should really suffice.
Regarding the size: I am using used 500GB HDDs I bought off ebay for only a few € a piece. I found this to be friendly both to the resources of the planet and to my wallet (note: the system is not running 24/7, but normally only an hour a week at most... if I don't do updates!).
Frankenstein system is pretty much on point indeed. xD After obtaining the basic machine for my build, I also got my hands on two retired Dell PowerEdge 2900 systems. I first planned on using them to crunch numbers for BOINC, but that would have been insanely costly and inefficient due to the huge power consumption. I then decided to slaughter them and obtained two RAID controllers with 8 SATA ports each, including two SATA backplanes, one of which magically just fit into the actually pretty small housing of the other computer. The second set is backup hardware. I also used some fans from the PowerEdges, connected PWM to GND, and now have some very quiet but not too weak fans to keep the 14 HDDs and especially the RAID controller cool. I also added a mini PCIe x1 SATA card with two ports and used the onboard SATA ports as well. With that, the PCIe capabilities of the mainboard are maxed out.
Maybe I can take a picture or two tomorrow to demonstrate how much of a Frankenstein system it actually is, if you happen to be interested. xD
Offline
It is strange to me that you didn't watch the update process on this system (even once) to see when it slows down. My bet is that once the whole RAM is used and the system starts to swap, the update slows down. Additionally, it has to manage data across all those drives in the RAID. And 4GB is too little for a smooth medium-to-big update without tuning. How many packages do you have installed on this system (pacman -Qsq | wc -l)?
What you can do:
1. Do updates more often, so they are smaller and maybe fit in that amount of RAM.
2. Set up ZSWAP (compressed swap), which minimizes IO between RAM and drives. You may even make it 4GB big (a sketch follows at the end of this post).
3. You may clear the cache to free RAM just before the update (but not during the update!), and optionally after it, with one of these commands:
sync; echo 1 > /proc/sys/vm/drop_caches # or
sync; echo 3 > /proc/sys/vm/drop_caches
4. Hardware solution (brute force, like nvidia does): buy more RAM (8GB or even 16GB), or swap the hdds for ssds (probably you would like to avoid this).
I have an old laptop with 4GB of RAM, an hdd and arch installed on it. ZSWAP helps a lot. I previously tested ZRAM on that system, but ZSWAP is better.
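A minimal sketch for trying ZSWAP at runtime, as root (the values are only an example; to make it permanent the same settings go on the kernel command line as zswap.enabled=1 zswap.compressor=zstd zswap.max_pool_percent=25, and the chosen compressor must be available in your kernel):
echo 1 > /sys/module/zswap/parameters/enabled
echo zstd > /sys/module/zswap/parameters/compressor
echo 25 > /sys/module/zswap/parameters/max_pool_percent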
Last edited by xerxes_ (2025-02-27 21:47:41)
Offline
Hi @xerxes_,
thanks for your reply.
I did observe the update process more than once. I just have not paid special attention to the cryptographic verification step, so I would therefore assume that it does not take very long. What I did observe, as said, is that the actual installation of each package takes very long. And I could see (as shown above) that copying a file from /var/cache/pacman/pkg to ~ was very slow.
To be honest, I am not sure why a small or large update should be a problem with "only" 4G of RAM when packages are processed one after another. Also, the idle RAM utilization of that machine is very low (500M-1G at most). No GUI at all, only a VT. So even if you expected all packages (1500M in total, when extracted) to be cached, that would still fit in RAM easily, probably even including the compressed package archives.
pacman, of course, also isn't really memory hungry.
Also, I never observed swap being used at all.
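What I could do is watch it from a second terminal during the next update, something like (vmstat comes with procps-ng, iostat with sysstat):
vmstat 5      # si/so show swap activity, wa shows iowait
iostat -x 5   # per-device utilization and wait times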
Regards
Kay
EDIT: I just checked: idle RAM usage is 171M (MiB, probably).
Last edited by kay94 (2025-02-28 12:18:20)
Offline
oh - that's exactly my kind of tech gore
somewhat like that one line in gta v: "Here's the problem. I don't know what I want. It's a bit, well, like pornography or a perfect turd, I can't quite describe it, but I'll know it when I see it."
would like to see some pictures
I'm not sure, but the low amount of ram could be an issue even with MD - with zfs it would be a problem for sure
the main issue I see is that the swap sits within the raid, not alongside it - so I would either try something like
sda - 465g total
sda1 - 4g swap
sda2 - 460g raid
+md
++vg-raid
or use one drive for the os with a big swap like 32g that is not part of the raid
but that depends on whether you're able to kill the setup and start over
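a very rough sketch of the per-disk-swap variant, if you ever rebuild (device names purely illustrative - and obviously this wipes everything):
# per disk: sdX1 = 4g swap, sdX2 = raid member
mkswap /dev/sda1 && swapon /dev/sda1
mdadm --create /dev/md0 --level=6 --raid-devices=14 /dev/sd[a-n]2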
as for re-using a lot of old cheap small drives: that's pretty much what raid was made for at first: redundant array of inexpensive disks - before storage got cheap and the meaning changed from inexpensive to independent - the only downside is the need for a lot of ports and the high power demand
to keep such a system bootable one can enable power-up-in-standby so the drives start up one after another - this keeps the inrush current low, so even an old 300w psu can power 10 drives
as you have a backplane you could also use staggered spinup
you mentioned you got some raid cards - are these actual hardware raid controllers? if so: with MD you'd use them as simple dumb HBAs - a mode professional hw-raid cards are not designed for without a reflash to an "IT-mode" firmware
Offline
@cryptearth
Regarding swap you have a good point. Regarding RAM I am still not sure, to be honest, because I never saw high RAM utilization (or any swap usage, as said).
Here are some pictures (clickable links):
I even forgot that I have one 3.5" drive in there and two USB drives (which must be 16 years old by now, but with very few hours). I could connect even more via the mainboard USB ports on the back, which are still untouched. Or even USB hubs, of course.
As you can see, a lot of cardboard is involved, too. xD
But regardless of that, it has been running without issues for a while now, if I leave out the performance issues during updates (only). And I'm happy about how small the entire machine still is.
While running the SMART test yesterday, I even noticed that one of the HDDs I bought off ebay already has 46000 running hours! That is more than 5 years non-stop. Since I bought enough spares, I will just watch how long it lasts and replace it someday. But no mercy for now!
to keep such a system bootable one can enable power-up-in-standby so the drives start up one after another - this keeps the inrush current low, so even an old 300w psu can power 10 drives
Wouldn't that have to be a BIOS feature? The BIOS of this mainboard is very limited, unfortunately.
you mentioned you got some raid cards - are these actual hardware raid controllers? if so: with MD you'd use them as simple dumb HBAs - a mode professional hw-raid cards are not designed for without a reflash to an "IT-mode" firmware
Yes, I have a HW RAID controller for 8 of the 14 HDDs and told it to treat each of its HDDs as a single-disk RAID1 (or was it 0?), because otherwise they won't appear as block devices. To be honest, I don't understand the last sentence of that quote; any chance you could explain it in other words?
Offline
ah - this could be a lead
HBAs come in two flavours:
- hardware raid controller
- jbod adapter
the main difference:
a hw-raid controller is designed to offload the raid calculations from the cpu onto its own chip and to present any array back as a logical drive rather than as physical drives - when used in the intended way this is no problem and exactly what you want: one logical volume presented to the os while the raid card does the magic - the downside is that there's often no direct smart access to the individual drives, that has to be monitored through the raid card
a jbod hba on the other hand is just a dumb controller providing physical ports and doesn't do anything else logically - it just provides low-level raw access to the drives directly - this is also called IT-mode and requires special firmware on hw-raid controllers
with MD this shouldn't be much of an issue - but don't try zfs with such a card: zfs expects dumb HBAs as it brings its own i/o stack
so if you want to use your hw-raid card as a jbod hba, search for whether an it-mode firmware is available and how to flash it - otherwise, in its current hw-raid configuration, it's just unsuited for how you want to use it
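to find out which chip is actually on the card (and thus what firmware to search for), something like this helps:
lspci -nnk | grep -i -A3 -E 'raid|sas|sata'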
// edit
a bit of an addition about power-up-in-standby / staggered spinup
power-up in stand-by is a feature of the controller - and only relevant to the bios/uefi if you want to boot from a drive in that mode - which I recommend against
for my setup with my cheap asmedia sata hba, udev does the job: the system boots up with all drives not spinning, and after selecting arch from grub udev starts them up one by one while showing those first few lines about udev and systemd
it takes some sweet time - about 7-10 seconds per hdd - so it adds about 1 minute to 1m30s to the overall boot - but it's easier for the power supply to handle the slow ramp-up instead of the high inrush current when all drives try to start up along with the system, gpu, fans and water cooling pump
fun fact: I tried this with a less powerful psu: it tried, but it often kept failing and just shutting off when all drives were connected - instead I had to connect them one by one by hand (my hba luckily is set to hot-plug) - after setting the drives to power-up-in-standby it was no problem
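side note: on drives that support it, power-up-in-standby can be toggled with hdparm (a sketch - be careful, some bios/controllers can't see drives set this way, so don't do it on the boot drive):
hdparm -s1 /dev/sdX   # enable power-up in standby
hdparm -s0 /dev/sdX   # disable it again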
staggered spinup is something that has to be supported by the backplane: one pin of the power connector plays double duty in such a setup
simply explained: instead of wiring that pin straight to ground, which makes the drive spin up as soon as power is applied, the backplane keeps it high-Z, which tells the drive not to spin up until either the backplane pulls the pin to ground or the controller (often a sas hba) tells the drive to spin up
it's a feature rarely seen on consumer sata drives and more on professional sas drives (although it's part of the sata spec)
I never had a backplane or special power connectors to play around with that feature - and from the pictures it looks like you have hooked up SSDs to the backplane, for which that doesn't matter anyway
Last edited by cryptearth (2025-03-03 22:37:49)
Offline
Thank you, that is a good hint! I will try to find out whether there is another firmware available for that particular controller and report the results.
Regarding your EDIT:
I only have HDDs in that system, no SSDs.
Offline