Hi,
After updating kernel to version 5.0.0, lvm2 monitor and pvscan services began to slow down boot times.
➜ % ~ uname -r
5.0.0-arch1-1-ARCH
➜ % ~ pacman -Qi lvm2
Name: lvm2
Version: 2.02.183-2
➜ % ~ systemd-analyze blame
23.841s lvm2-monitor.service
23.509s lvm2-pvscan@8:17.service
23.412s lvm2-pvscan@254:1.service
23.173s lvm2-pvscan@254:3.service
I think the problem is related to the lvmetad cache:
➜ % ~ journalctl -b 0 | grep lvmetad
mar 06 14:40:08 lvm[369]: WARNING: lvmetad is being updated by another command (pid 472).
mar 06 14:40:08 lvm[369]: WARNING: Not using lvmetad because cache update failed.
Here is the boot log: https://termbin.com/zr90
Reboot is also slower because systemd waits for lvm2 monitor to stop.
Last edited by Jason P. (2019-04-08 15:34:18)
I am getting the same issue after updating to linux 5.0, which is slowing down my boot time tremendously. I am not currently even using any LVM volumes on my system. I haven't had time to find a solution for the problem. For the time being, I booted the lts kernel.
Hello everyone,
AFAIK, lvmetad is deprecated and not needed anymore. How about disabling and removing it? Does that solve it for you?
Regards
Last edited by bugsmanagement (2019-03-06 19:29:45)
AFAIK, lvmetad is deprecated and not needed anymore. How about disabling and removing it? Does that solve it for you?
The PKGBUILD enables it; see:
https://git.archlinux.org/svntogit/pack … s/lvm2#n46
https://git.archlinux.org/svntogit/pack … s/lvm2#n52
It is deprecated and has been removed in 2.03, but that is not a stable release yet.
Edit:
brom, as you do not use lvm2 on your system, you could try masking lvm2-monitor.service.
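For reference, masking the unit is a one-liner. This is only safe on a system with no LVM volumes at all; on a system that uses LVM it would break volume monitoring:

```shell
# Prevent systemd from ever starting the unit (run as root).
systemctl mask lvm2-monitor.service

# Undo later with:
#   systemctl unmask lvm2-monitor.service
```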
Last edited by loqs (2019-03-06 19:36:28)
Yeah, thanks for the heads up; in fact, it is running here. Though I am not using kernel 5.0.0, so it might be something else. AFAIK, according to Linus, 5.0 didn't introduce a lot of changes; he mainly wanted to end the 4.x numbering scheme.
After reviewing the log, I think this might be the culprit:
mar 06 14:39:46 lvm[556]: /dev/sdd: open failed: No medium found
mar 06 14:39:56 lvm[369]: /dev/sde: open failed: No medium found
mar 06 14:39:56 lvm[456]: /dev/sde: open failed: No medium found
mar 06 14:39:56 lvm[472]: /dev/sde: open failed: No medium found
mar 06 14:39:56 lvm[556]: /dev/sde: open failed: No medium found
mar 06 14:40:02 lvm[369]: /dev/sdf: open failed: No medium found
mar 06 14:40:02 lvm[456]: /dev/sdf: open failed: No medium found
mar 06 14:40:02 lvm[472]: /dev/sdf: open failed: No medium found
mar 06 14:40:02 lvm[556]: /dev/sdf: open failed: No medium found
mar 06 14:40:08 lvm[369]: /dev/sdg: open failed: No medium found
mar 06 14:40:08 lvm[472]: /dev/sdg: open failed: No medium found
mar 06 14:40:08 lvm[556]: /dev/sdg: open failed: No medium found
mar 06 14:40:08 lvm[456]: /dev/sdg: open failed: No medium found
mar 06 14:40:08 lvm[369]: WARNING: lvmetad is being updated by another command (pid 472).
mar 06 14:40:08 lvm[369]: WARNING: Not using lvmetad because cache update failed.
mar 06 14:40:08 lvm[369]: /dev/sdd: open failed: No medium found
mar 06 14:40:08 lvm[369]: /dev/sde: open failed: No medium found
mar 06 14:40:08 lvm[369]: /dev/sdf: open failed: No medium found
mar 06 14:40:08 lvm[369]: /dev/sdg: open failed: No medium found
Which takes roughly 22 seconds.
Last edited by bugsmanagement (2019-03-06 19:51:32)
Those /dev entries correspond to slots of a multi card reader which is plugged in without any media inserted. See this previous issue: https://bbs.archlinux.org/viewtopic.php?id=243969
What happens since upgrading the kernel is that it takes much more time to filter out those devices. The only recent change in lvm2 was [1], which doesn't seem like it could be the cause.
I wonder if something could be done to ignore those devices if you're not planning on using them for LVM?
I wonder if something could be done to ignore those devices if you're not planning on using them for LVM?
See /etc/lvm/lvm.conf
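As a sketch, a global_filter rejecting the empty card-reader slots might look like this in /etc/lvm/lvm.conf. The device names sdd through sdg are taken from the log earlier in this thread; check lsblk and adjust for your own system:

```
devices {
    # Reject the card-reader slots so LVM never tries to open them.
    # These names are examples from this thread, not universal.
    global_filter = [ "r|^/dev/sd[d-g]$|", "a|.*|" ]
}
```

If your initramfs uses the lvm2 hook, you may also need to regenerate it (mkinitcpio -P) for the filter to take effect at boot.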
Those /dev entries correspond to slots of a multi card reader which is plugged in without any media inserted. See this previous issue: https://bbs.archlinux.org/viewtopic.php?id=243969
What happens since upgrading the kernel is that it takes much more time to filter out those devices. The only recent change in lvm2 was [1], which doesn't seem like it could be the cause.
It appears to be a USB multi card reader causing the issue for me as well. This was never an issue for me prior to kernel 5.0. With the device plugged in, startup hangs on loading the kernel for over a minute; unplugged, startup time is more reasonable. I tried masking lvm2-monitor.service, but that does not solve the issue.
On kernel 4.19 LTS with the card reader connected:
Startup finished in 30.686s (firmware) + 2.967s (loader) + 26.692s (kernel) + 7.239s (userspace) = 1min 7.585s
graphical.target reached after 7.239s in userspace
On kernel 5.0 with the card reader connected:
Startup finished in 30.721s (firmware) + 3.867s (loader) + 1min 9.524s (kernel) + 24.374s (userspace) = 2min 8.487s
graphical.target reached after 24.374s in userspace
Just wondering if anyone else is still dealing with this; at the moment, my options are to unplug the multi card reader prior to each boot or to use the LTS kernel.
There have been a number of workarounds, but I am not aware of anyone investigating the cause of the issue, and bugsmanagement reported having the issue without the 5.0 upgrade.
For the workarounds: if the system does not use lvm2, you can mask lvm2-monitor.service. If the system uses lvm2, try setting use_lvmetad = 0 in /etc/lvm/lvm.conf or adding a filter entry.
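For the lvmetad workaround, the relevant knob in /etc/lvm/lvm.conf looks like this (a sketch; with this set, LVM commands scan devices directly instead of querying the cache daemon):

```
global {
    # Disable the lvmetad metadata cache daemon.
    use_lvmetad = 0
}
```

You will likely also want to stop and disable the lvm2-lvmetad socket/service units, and regenerate the initramfs if it embeds lvm.conf, so the setting applies at boot as well.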
If someone has the time they could bisect between 4.20 and 5.0 to see if they can identify the cause.
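For anyone attempting it, the bisect would look roughly like this. It assumes a clone of the mainline kernel tree and that you can reproduce the slow boot with each test kernel:

```shell
# In a checkout of the mainline kernel source:
git bisect start
git bisect bad v5.0      # slow boot observed on this release
git bisect good v4.20    # boot was fine on this release

# git now checks out a midpoint revision. For each one:
# build and install the kernel, reboot, test, then record the
# result with `git bisect good` or `git bisect bad` until git
# prints the first bad commit. Finish with `git bisect reset`.
```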
Hi there,
I did the bisect and this is the result:
5d2ee7122c73be6a3b6bfe90d237e8aed737cfaa is the first bad commit
commit 5d2ee7122c73be6a3b6bfe90d237e8aed737cfaa
Author: Jens Axboe <axboe@kernel.dk>
Date: Thu Nov 29 17:36:41 2018 -0700
sbitmap: optimize wakeup check
Even if we have no waiters on any of the sbitmap_queue wait states, we
still have to loop every entry to check. We do this for every IO, so
the cost adds up.
Shift a bit of the cost to the slow path, when we actually have waiters.
Wrap prepare_to_wait_exclusive() and finish_wait(), so we can maintain
an internal count of how many are currently active. Then we can simply
check this count in sbq_wake_ptr() and not have to loop if we don't
have any sleepers.
Convert the two users of sbitmap with waiting, blk-mq-tag and iSCSI.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
:040000 040000 7c8497c534b348087910aca6ca11325728f9fab8 67bcab7342745007920d132983f52f832ef8cfe9 M block
:040000 040000 f565266256f93bcd4d67a53fbbf346544acf2922 f913bae357375333e2255f17c43a27ee6b6f0b48 M drivers
:040000 040000 7c2c82561fa9869c42ff0539b611e32e18ef4418 cd1c3f9aca356edb3cfadb7d79f8bb9744a87cc3 M include
:040000 040000 6d95881497cbecc01af72b497b0c387201840040 e17e4cb991d3b93a2b3f718d1ebda42b790400ea M lib
I don't have more time to dig deeper in this tonight.
Can I join the party? https://bbs.archlinux.org/viewtopic.php?id=244900
Another effect: I can't install GRUB while using a cache disk.
sorry for my english...
@cipher did the workarounds other than a kernel downgrade help with the cache disk installation?
@cipher did the workarounds other than a kernel downgrade help with the cache disk installation?
Downloading 4.20 to check if that fixes it.
Update: the cache disk still doesn't work, but the system boots fast again, with zero errors related to LVM.
Update 2: downgraded to 4.19 and the cache still doesn't work; maybe a systemd bug?
Last edited by Cipher (2019-03-13 02:35:10)
Regarding the multi card reader related slowdown, I did a little digging and narrowed it down to this:
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index a89fbe7cf6ca..ca2595875514 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -510,8 +510,8 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
{
int i, wake_index;
- if (!atomic_read(&sbq->ws_active))
- return NULL;
+// if (!atomic_read(&sbq->ws_active))
+// return NULL;
wake_index = atomic_read(&sbq->wake_index);
for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
If I apply this patch, my system starts as fast as before.
Those two lines were added by the first bad commit mentioned above.
Could someone check this out, to verify this is not a red herring?
My boot times were fine; I only had a very slow reboot. Gone with use_lvmetad = 0, as I don't need lvmetad.
The problem seems to be gone (in my case). I don't really know since when, but at least since 5.0.5.
A diff shows a lot of small changes to the offending code parts, so, since time is short, I think I will forget about this.
For me as well, since the last updates to the lvm2 package.
My system boots faster than before with the latest updates, but I wouldn't say it's completely fixed.
Checking the output of systemd-analyze blame, I notice that lvm2-pvscan@254:0.service still takes 1.686s. On 4.20.13 it took 235ms.
EDIT: 75ms with 5.0.7. I don't know if there were changes to lvm in this release, but you are right, it has been fixed.
Last edited by jkhsjdhjs (2019-04-11 14:21:28)
I still have this problem: with kernel 5.1.3, reboot/shutdown hangs indefinitely because of LVM. Setting use_lvmetad = 0 solves the problem.