
#1 2020-04-23 19:51:54

PeterJansen
Member
Registered: 2019-12-01
Posts: 8

dhcpcd seems to be buggy since the last update

I hope that I post it in the correct (sub)forum.

Since the last round of updates my system no longer shuts down properly. If I interpret journalctl correctly, dhcpcd fails to stop. The last shutdown before I downgraded dhcpcd was initiated at 21:17:59 and took around 90 seconds to complete. Here is the relevant part of the output (I removed the username):

Apr 23 21:17:59 audit: AUDIT1334 prog-id=7 op=UNLOAD
Apr 23 21:19:29 systemd[1]: dhcpcd@enp4s0.service: State 'stop-sigterm' timed out. Killing.
...
Apr 23 21:19:29 systemd-journald[414]: Journal stopped
-- Reboot --
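
For reference, the gap between the first and last timestamps above can be checked with a small shell sketch. It assumes GNU `date` and takes the year from the post date, since the journal excerpt does not show one; `elapsed_seconds` is a name made up for illustration. The result matches systemd's default stop timeout of 90 seconds (`DefaultTimeoutStopSec`), after which a unit that ignores SIGTERM is killed.

```shell
# Print the number of seconds between two timestamps.
# Assumption: GNU date (the -d option is not portable to BSD date).
elapsed_seconds() {
    start=$(date -u -d "$1" +%s) || return 1
    end=$(date -u -d "$2" +%s) || return 1
    echo $((end - start))
}

# Shutdown initiated vs. dhcpcd@enp4s0.service being killed
elapsed_seconds '2020-04-23 21:17:59' '2020-04-23 21:19:29'
```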

I see from the pacman log that dhcpcd was upgraded during the last round of updates. I downgraded dhcpcd to the previous version and the system shut down normally again, after 4 seconds.

As I see it I have two options: I can freeze that package, at least for a few rounds of updates, or I can install another network service/manager (not sure whether the manager includes the service?). Which is the better choice? If I were to change to another service, which one has the best track record for stability and performance?

Offline

#2 2020-04-23 21:18:51

seth
Member
Registered: 2012-09-03
Posts: 50,957

Re: dhcpcd seems to be buggy since the last update

https://bbs.archlinux.org/viewtopic.php?id=254961

There are various threads regarding dhcpcd 9.0.x, and frequent updates:
https://git.archlinux.org/svntogit/pack … ges/dhcpcd

I'd just sit it out. If I had to pick an alternative, it'd be dhclient.
Every tool has its hiccups. E.g. NetworkManager quite recently had massive problems when they completely changed the DHCP implementation base.
But the more complex the stack is, the more likely those hiccups are (and dhcpcd is already on the less complex side…)

Offline

#3 2020-04-24 11:07:52

rsmarples
Member
Registered: 2009-05-12
Posts: 287

Re: dhcpcd seems to be buggy since the last update

dhcpcd-9 gained a lot of complexity with the new privilege separation code.
I don't think any other DHCP related tool on Linux does this, so dhcpcd is a lot more secure by default.

EDIT: There may be an issue with dhcpcd and netctl ... a fix is here if someone can test it.
https://roy.marples.name/cgit/dhcpcd.gi … aea166eac0

Last edited by rsmarples (2020-04-24 11:09:02)

Offline

#4 2020-04-26 10:22:08

stefan
Member
Registered: 2013-03-22
Posts: 104

Re: dhcpcd seems to be buggy since the last update

Referring to

    https://bbs.archlinux.org/viewtopic.php?id=255058
   
I'm still seeing adverse effects with `dhcpcd-9.0.2-1`.  It boots now,
`/etc/{hostname,resolv.conf,resolvconf.conf}` seem good.  But `xterm`
fails with

    xterm: Error 32, errno 19: No such device
    Reason: get_pty: not enough ptys

With `dhcpcd-9.0.2-1`, my box gets a different IPv4 address than
before.

Login via SSH fails with

    PTY allocation request failed

Firefox fails with

    [1364, Main Thread] WARNING: failed to open shm: Permission denied: file /build/firefox/src/firefox-75.0/ipc/chromium/src/base/shared_memory_posix.cc, line 246
    ExceptionHandler::GenerateDump cloned child 1391
    ExceptionHandler::SendContinueSignalToChild sent continue signal to child
    ExceptionHandler::WaitForContinueSignal waiting for continue signal...

And I cannot create named semaphores (which might be the same issue as
with Firefox), `sem_open`(3) fails with "access denied".

Referring to above comment #3: `netctl` is not installed here.

Offline

#5 2020-04-26 11:42:19

rsmarples
Member
Registered: 2009-05-12
Posts: 287

Re: dhcpcd seems to be buggy since the last update

None of those errors look even remotely connected to dhcpcd.

Offline

#6 2020-04-26 11:54:53

seth
Member
Registered: 2012-09-03
Posts: 50,957

Re: dhcpcd seems to be buggy since the last update

cat /proc/sys/kernel/pty/max /proc/sys/kernel/pty/nr
sudo lsof -n | grep pts
ls -l /dev/pts 
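
A minimal sketch of how the two numbers from the first command could be compared, assuming `max` is the kernel's pty limit and `nr` the number of ptys currently allocated. `pty_headroom` is a hypothetical helper, not an existing tool.

```shell
# Report how many ptys are still available, given the limit and the count in use.
# $1 = contents of /proc/sys/kernel/pty/max, $2 = contents of /proc/sys/kernel/pty/nr
pty_headroom() {
    echo "$(( $1 - $2 )) of $1 ptys free"
}

# Typical use on a Linux box:
#   pty_headroom "$(cat /proc/sys/kernel/pty/max)" "$(cat /proc/sys/kernel/pty/nr)"
```

If plenty of ptys are free and opening /dev/ptmx still fails, the limit is not the problem and the devpts mount itself is the next thing to inspect.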

But since I also doubt that this has any direct connection to dhcpcd, you'd better start a new thread about this.

Offline

#7 2020-04-28 09:22:22

stefan
Member
Registered: 2013-03-22
Posts: 104

Re: dhcpcd seems to be buggy since the last update

Ok, the problem is solved, seth has pointed out a misconfiguration on my side: I've had a `dhcpcd` on all interfaces, plus a `dhcpcd@wlp2s0` running.  Disabling the latter made the problems disappear.

rsmarples wrote:

None of those errors look even remotely connected to dhcpcd.

I totally agree, that's why I was so freaked out about it.

Unfortunately, while I was able to reproducibly trigger these effects (no PTYs, access denied to shared memory) with up- and downgrading of `dhcpcd`, I'm not able to reproduce them now with enabling/disabling of `dhcpcd@wlp2s0`.  In other words, there must have been a (positive) side effect of disabling `dhcpcd@wlp2s0`, maybe some cleaning up, that is not reverted.

So I don't know how my misconfiguration stopped the box from booting. But I'm fine now (kinda unsatisfied, though).

Cheers
Stefan

Offline

#8 2020-05-13 21:30:53

elbeardmorez
Member
Registered: 2020-05-13
Posts: 1

Re: dhcpcd seems to be buggy since the last update

Hi,

Thanks for the info!

Same experience, something's definitely awry with the current set of packages. I performed my update May 05 (took me over a week to get to rebooting though). I also had the same underlying config faux-pas and so am currently content with the 'fix':

systemctl disable dhcpcd@wlp3s0.service

But air quotes indeed, because this isn't a fix, it's a workaround. The misconfiguration had been in place for going on a year without this no-pts issue. Given that the workaround results in only a single dhcpcd binary being called, and having seen the peculiar mount results below, my 'wild stab in the dark' guess is some sort of race condition on resources the dhcpcd binary requires?

Having 'fixed' it, I've just gone back and deliberately broken it (enabling the dhcp client daemon on the single interface) to ensure that I wasn't going mad with some of the command results I'd seen the night before.

So, from the top: after rebooting following pacman -Syu, I started an X session and was greeted with:

> DISPLAY=:0 tilix
2020-05-13T20:38:23.809 [error] terminal.d:2683:outputError Environment used to execute process:
2020-05-13T20:38:23.809 [error] terminal.d:2685:outputError 	env 0=TILIX_ID=7c5525c7-e067-4de1-90b7-a1e049a34b30
2020-05-13T20:38:23.809 [error] terminal.d:2685:outputError 	env 1=PWD=/root
2020-05-13T20:38:38.996 [error] terminal.d:2678:outputError Unexpected error occurred: Failed to open PTY: No such file or directory
2020-05-13T20:38:38.996 [error] terminal.d:2679:outputError Working Directory=/root
2020-05-13T20:38:38.996 [error] terminal.d:2680:outputError Arguments used to execute process:
2020-05-13T20:38:38.996 [error] terminal.d:2682:outputError 	arg 0=tmux

sad so then

> DISPLAY=:0 xterm
xterm: Error 32, errno 19: No such device
Reason: get_pty: not enough ptys

sad ..straight to `strace`

> DISPLAY=:0 strace xterm

...
openat(AT_FDCWD, "/dev/ptmx", O_RDWR)   = -1 ENODEV (No such device)
write(2, "xterm: Error 32, errno 19: ", 27) = 27
...
write(2, "No such device\n", 15)        = 15
write(2, "Reason: get_pty: not enough ptys\n", 33) = 33
ioctl(-1, TCFLSH, TCOFLUSH)             = -1 EBADF (Bad file descriptor)
close(-1)                               = -1 EBADF (Bad file descriptor)

..which pointed me at:

> ls -l /dev/ptmx /dev/pts
crw-rw-rw- 1 root tty  5, 2 May 13 20:35 /dev/ptmx

/dev/pts:
total 0

> ls -l /dev | grep pts
drwxr-xr-x 2 root root          40 May 13 20:35 pts

..but the file does exist? There's nothing in the /dev/pts/ directory though, so is that fs mounted?

> mount | grep devpts
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)

Yes.. but actually NO! ..and that's what I wasted quite some time on!

See..

> umount devpts
umount: /dev/pts: not mounted.

Erm, did I not type that correctly? I must have, given 'devpts' was resolved (devpts -> /dev/pts). What state is this? mount suggests it's mounted, umount says otherwise. Now that it's working again, I get the familiar:

> umount devpts
umount: /dev/pts: target is busy 

So that left the obvious step of explicitly mounting the unmounted fs:

> mount -t devpts devpts /dev/pts

I tried this prior to seeing this post regarding disabling the second dhcpcd service, and it worked, in so far as I had pts working again! However, look at this final command output, taken immediately after the last:

> mount | grep pts
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
devpts on /dev/pts type devpts (rw,relatime,mode=600,ptmxmode=000)
devpts on /var/lib/dhcpcd/dev/pts type devpts (rw,relatime,mode=600,ptmxmode=000)
devpts on /dev/pts type devpts (rw,relatime,mode=600,ptmxmode=000)
devpts on /var/lib/dhcpcd/dev/pts type devpts (rw,relatime,mode=600,ptmxmode=000)

..the original single entry and then some! /var/lib/dhcpcd/dev/pts is some kind of fallback, taking its rundir as root??
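
To make stacked mounts like these easier to spot, here is a rough sketch that counts devpts entries per mount point. It assumes the classic `mount` output format shown above (`devpts on <dir> type devpts (...)`); `count_devpts_mounts` is a name invented for this example.

```shell
# Read `mount` output on stdin and print "<count> <mountpoint>" for every
# devpts mount point. A count above 1 means mounts are stacked on one directory.
count_devpts_mounts() {
    awk '$1 == "devpts" && $2 == "on" && $4 == "type" && $5 == "devpts" { n[$3]++ }
         END { for (m in n) print n[m], m }' | sort -k2
}

# Typical use:
#   mount | count_devpts_mounts
```

For the output above this reports three mounts on /dev/pts and two on /var/lib/dhcpcd/dev/pts. `findmnt -t devpts` shows the same information as a tree, if util-linux's findmnt is available.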

What's strange is that throughout, my interface has been working just fine (requesting / picking up IPs from my router). In fact, if it weren't for the NEW no-pts side effect as of this latest round of updates on May 5 (previous full update about a month earlier), I would never have known of the config faux-pas at all. Spending a day having to work from 'real' ttys and flicking back 'n' forth to an X instance for graphical stuff was a little painful in the end though.

PS. Note another recent post, 255433, with what I imagine is the same root cause. I just tried ssh'ing into the box and got the same error too.

Offline

#9 2020-05-13 21:48:07

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,645

Re: dhcpcd seems to be buggy since the last update

Are you sure this is still the only dhcpcd client running? If you also have a non-instanced dhcpcd service enabled, it is again racing for access and leading to breakage. It was always known that this can lead to issues. There were some internal changes to dhcpcd that amplified the effect, but this was never a good or supported method.

Offline

#10 2020-05-14 05:55:26

seth
Member
Registered: 2012-09-03
Posts: 50,957

Re: dhcpcd seems to be buggy since the last update

elbeardmorez wrote:

Having 'fixed' it, I've just gone back and deliberately broken it

From what I understood, fixing the misconfiguration solved the problem as expected, but he doesn't consider that a fix and in a somewhat unrelated effort re-broke it to ensure the collision was the actual cause of the PTS situation.

@elbeardmorez, as the network wiki notes, you can and always could have *ONE* service per NIC; that should™ include *ONE* dhcpcd instance per NIC (regardless of whether that makes any sense). But it would seem you run a global dhcpcd instance and a special one on wlp3s0, resulting in *TWO* services banging on wlp3s0.
While the PTY depletion is a weird and somewhat nasty outcome and may or may not hint at a bug in dhcpcd, the collision was always troublesome and caused undefined behavior.
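
A quick way to check for that collision might look like the sketch below. It only inspects unit names passed on stdin, so it assumes the first column of each line is a unit name as printed by `systemctl list-units`; `check_dhcpcd_units` is a hypothetical helper, not part of systemd or dhcpcd.

```shell
# Read unit names (first column of each line) on stdin and report whether the
# global dhcpcd.service is loaded alongside one or more dhcpcd@<iface> instances.
# Prints a "conflict" line and returns 1 in that case, otherwise prints "ok".
check_dhcpcd_units() {
    awk '
        $1 == "dhcpcd.service"      { global = 1 }
        $1 ~ /^dhcpcd@.+\.service$/ { inst = inst " " $1 }
        END {
            if (global && inst != "") {
                print "conflict: dhcpcd.service plus" inst
                exit 1
            }
            print "ok"
        }'
}

# Typical use on a systemd machine:
#   systemctl list-units --type=service --plain --no-legend 'dhcpcd*' | check_dhcpcd_units
```

If it reports a conflict, disabling either the global service or the per-interface instance leaves a single dhcpcd handling the NIC.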

In case you worry that you now can't "dhcpcd -n" or so, this is not a problem. The second instance will talk to the running master process (given you started the general dhcpcd service/daemon).

Offline
