
#1 2019-01-30 10:44:15

jesusofsuburb1a
Member
Registered: 2016-07-29
Posts: 6

sshd seems to eat up pids

Hi everyone,

today I couldn't reach my server anymore.

I run sshd.service, PubKey only, and use fail2ban.


After rebooting I noticed these recurring errors from sshd:

error: fork: Resource temporarily unavailable
fatal: fork of unprivileged child failed

Google suggests this is an issue with too few PIDs available.

The exact maximum PID at that point is unknown, but logs show PIDs at least as high as 25000.
I think I hit pid_max, which was set to the default 32768 and is now temporarily raised to the maximum for 64-bit systems.

I also noticed that after reboot, the number of <defunct> ssh child processes builds up quite quickly and consumes PIDs.
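
To put numbers on this, one can read kernel.pid_max and count zombie processes straight from /proc. Each <defunct> process keeps its PID until its parent reaps it. A minimal Python sketch, assuming a Linux /proc filesystem; the helper names (parse_state, count_zombies, read_pid_max) are made up for illustration:

```python
import os

def parse_state(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line.

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split on the LAST ')' instead of on whitespace.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]

def count_zombies(proc_root="/proc"):
    """Count processes in state 'Z' (shown as <defunct> by ps)."""
    zombies = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                if parse_state(fh.read()) == "Z":
                    zombies += 1
        except (OSError, IndexError):
            continue  # process exited (or stat was empty) while scanning
    return zombies

def read_pid_max(proc_root="/proc"):
    """Read the current kernel.pid_max value."""
    with open(os.path.join(proc_root, "sys", "kernel", "pid_max")) as fh:
        return int(fh.read())
```

Calling read_pid_max() and count_zombies() a few minutes apart should show whether zombies are really piling up or the PID counter is merely cycling through a busy range.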


Is this issue caused by sshd eating up my pids and then not being able to spawn shells on connect?
How could I possibly prevent this? Is this some kind of fork bomb?

Fail2ban doesn't seem to help a lot, almost no bans.
But I see plenty of denied attempts, each with a different IP.


How can I protect sshd better? Can I somehow prevent sshd from eating up new PIDs?

Any help on this issue? My google-fu failed me on this topic, which I find strange. Can I really be the only one?


Thanks in advance!


#2 2019-01-30 15:00:45

seth
Member
Registered: 2012-09-03
Posts: 51,213

Re: sshd seems to eat up pids

Don't raise pid_max; that will allow sshd attacks to bleed into your system.
Instead use the "MaxStartups=start:rate:full" option for sshd (man 5 sshd_config) and consider lowering LoginGraceTime. There's not much you can do against a DDoS attack unless you can whitelist or blacklist IPs (ranges), or you face a stupid script kiddie attack and can move the ssh Port away from 22.
Also make sure not to allow root logins, and preferably enforce key logins (no password logins).
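
As a rough illustration of what "start:rate:full" means: per sshd_config(5), sshd begins refusing new unauthenticated connections with a probability of rate percent once start of them are pending, and that probability rises linearly until every attempt is refused at full. The Python sketch below only models that description. The function name is hypothetical, and 10:30:100 is an example value (the man page for recent OpenSSH releases lists it as the default).

```python
def drop_percent(pending, start, rate, full):
    """Chance (in percent) that a new unauthenticated connection is refused.

    pending: number of connections that have not authenticated yet
    start, rate, full: the three fields of MaxStartups
    """
    if pending < start:
        return 0          # below the threshold nothing is dropped
    if pending >= full:
        return 100        # at or above the hard limit everything is dropped
    # Linear ramp from `rate` at `start` up to 100 at `full`.
    return rate + (100 - rate) * (pending - start) // (full - start)

# Example: how aggressive MaxStartups 10:30:100 is at various loads.
for pending in (5, 10, 55, 100):
    print(pending, drop_percent(pending, 10, 30, 100))
```

Lowering start and full makes sshd shed load earlier, which caps how many pre-authentication child processes an attack can keep alive at once.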


#3 2019-01-31 01:16:06

ngoonee
Forum Fellow
From: Between Thailand and Singapore
Registered: 2009-03-17
Posts: 7,356

Re: sshd seems to eat up pids

seth wrote:

Don't raise pid_max; that will allow sshd attacks to bleed into your system.
Instead use the "MaxStartups=start:rate:full" option for sshd (man 5 sshd_config) and consider lowering LoginGraceTime. There's not much you can do against a DDoS attack unless you can whitelist or blacklist IPs (ranges), or you face a stupid script kiddie attack and can move the ssh Port away from 22.
Also make sure not to allow root logins, and preferably enforce key logins (no password logins).

Does anyone still run ssh publicly on port 22? The amount of drive-by logins you'd get with that is insane.


Allan-Volunteer on the (topic being discussed) mailing lists. You never get the people who matters attention on the forums.
jasonwryan-Installing Arch is a measure of your literacy. Maintaining Arch is a measure of your diligence. Contributing to Arch is a measure of your competence.
Griemak-Bleeding edge, not bleeding flat. Edge denotes falls will occur from time to time. Bring your own parachute.


#4 2019-01-31 02:32:04

jesusofsuburb1a
Member
Registered: 2016-07-29
Posts: 6

Re: sshd seems to eat up pids

Thanks seth.

For some reason, the port really was 22. I'm quite sure it has been that way for a while, and I'm unsure why. I think someone complained about "not nice looking git clone paths".
Oh my.

Anyway, I changed LoginGraceTime and the port. I will consider other options in the future.

Edit: I just found out that running sudo in an ssh session creates a defunct child process. Not sure what that causes.

Last edited by jesusofsuburb1a (2019-01-31 02:51:03)


#5 2019-01-31 03:20:02

eschwartz
Fellow
Registered: 2014-08-08
Posts: 4,097

Re: sshd seems to eat up pids

ngoonee wrote:

Does anyone still run ssh publicly on port 22? The amount of drive-by logins you'd get with that is insane.

Yes, lots of people. Kind of like the way we publicly run websites on ports 80 and 443 despite drive-by spiders and automated vulnerability exploitation suites.

Since a non-standard port makes it super awkward to log in anywhere without remembering/looking up ssh ports and adding them as command-line options or tons of ssh_config boilerplate, I'd assume most people running sshd are doing so on port 22 with root disabled, password-auth turned off, and possibly fail2ban as well.

In fact I've never bothered with fail2ban myself, but I've never been explicitly targeted for DDoS, merely seen the usual drive-by failures failing to get in. If I were being explicitly targeted for DDoS, I don't see how changing the port would help, since the attacker would make sure to find whichever other port is actually showing up in scans as being opened...


Managing AUR repos The Right Way -- aurpublish (now a standalone tool)


#6 2019-01-31 08:57:44

seth
Member
Registered: 2012-09-03
Posts: 51,213

Re: sshd seems to eat up pids

eschwartz wrote:

I don't see how changing the port would help, since the attacker would make sure to find whichever other port is actually showing up in scans as being opened...

seth wrote:

or face a stupid script kiddie attack and can move the ssh Port away from 22

It only works against the unskilled.

fail2ban will most likely not protect you against a massive DDoS attack (since the connections come from everywhere), but shorter grace times and heuristic connection failures help to keep the system open.
You should not, and I mean *never*, allow password logins on a public ssh, especially not on a multi-user system (unless you can convince users to use "correcthorsebatterystaple" - the only truly secure password ;-)
Odds are that some genius has something out of a dictionary (because hackers surely don't know 1337…)


#7 2019-01-31 10:11:13

jesusofsuburb1a
Member
Registered: 2016-07-29
Posts: 6

Re: sshd seems to eat up pids

Okay, so I just completely resolved this problem.

In case anyone ever stumbles here from google, here is what happened:


A few users have access to this system, and they mainly work with docker.
The system has multiple external IPs and one is used for applications running on the host (ssh and git only, to be precise),
others are used for routing domains to Traefik (reverse proxy) running in a container or for other containers.


Now the issue was that another sshd instance ran in one of the docker containers.
It belonged to some automation process, and it even ran as an unprivileged user (we try to avoid root in the containers).

Even though the instance ran in a separate container, the host sshd (and only the host, according to pstree) had some trouble recycling PIDs.
I don't really know why this happens.

What I do know is that the container sshd was under attack as well and had about 3 times the number of login attempts, which spawned lots of children.
Apparently this caused current_PID to race up to 6 digits within a day.
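
One piece of background that may be relevant: a process inside a container lives in its own PID namespace, but it also holds a PID in the host's namespace, so children forked inside the container advance the host's PID numbering as well. On Linux 4.1 and later this is visible in the NSpid line of /proc/<pid>/status. A small Python sketch for reading it; the helper names are made up for illustration, and this does not by itself explain the <defunct> children of the host sshd:

```python
def parse_nspid(status_text):
    """Return the PIDs of one process, outermost PID namespace first.

    For a containerised process seen from the host, the first number is
    the host PID and the last one is the PID inside the container.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # kernels older than 4.1 do not expose NSpid

def nspid_of(pid, proc_root="/proc"):
    """Read and parse the NSpid line of a running process."""
    with open(f"{proc_root}/{pid}/status") as fh:
        return parse_nspid(fh.read())
```

A result with more than one entry, when run on the host, means the process belongs to a nested PID namespace such as a container.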

After killing the container, everything went back to normal.

I think this is some very special kind of a fork bomb, unintended by the attackers, and undoubtedly stupid on my side as the main admin of this server.


Having port 22 open is not an issue at all for sshd, even with all the drive-by logins.
I moved the port back to 22, just to try (and as users prefer it), and it makes no difference.
sshd kills its children without any problem, no more <defunct> processes.


This shows once more that containers can have unintended side effects; and if you're no real expert you might run into trouble.


Big thanks to everyone!


#8 2019-01-31 15:28:05

eschwartz
Fellow
Registered: 2014-08-08
Posts: 4,097

Re: sshd seems to eat up pids

seth wrote:
eschwartz wrote:

I don't see how changing the port would help, since the attacker would make sure to find whichever other port is actually showing up in scans as being opened...

seth wrote:

or face a stupid script kiddie attack and can move the ssh Port away from 22

It only works against the unskilled.

Yes, and those are already stopped by their inability to log in to the low-hanging fruit usernames with low-hanging fruit passwords. Any targeted attack is usually not a mere script kiddie.

fail2ban will most likely not protect you against a massive DDoS attack (since the connections come from everywhere), but shorter grace times and heuristic connection failures help to keep the system open.

And fail2ban is at least as good as moving the port, because it totally destroys a one-host script kiddie's ability to attack under any conditions, and reduces log noise greatly as well. Anyone sophisticated enough to attack from everywhere via a botnet is likely sophisticated enough to know how to use nmap or Shodan to find the ports with sshd on them.

You should not, and I mean *never*, allow password logins on a public ssh, especially not on a multi-user system (unless you can convince users to use "correcthorsebatterystaple" - the only truly secure password ;-)
Odds are that some genius has something out of a dictionary (because hackers surely don't know 1337…)

https://mostsecure.pw/ is much more secure than using "correcthorsebatterystaple".


