Proper SIGTERM handling

foowtf · 2015-08-12 21:41:56

How should SIGTERM best be handled by a program that spawns child processes, such as a shell executing a script? I ask about SIGTERM specifically because it has two use cases, whereas SIGINT, for instance, is in practice only ever sent to the terminal's process group.
Those two cases are:
(a) A user calls kill/pkill to terminate a blocking program.
(b) /bin/init sends SIGTERM to process groups when shutting down.

As far as I can tell, there are four possible ways to handle SIGTERM, which gives eight combinations because it is, I think, impossible to know whether the signal came from (a) or (b).
(1) Terminate right away (this is what SIG_DFL does).
  (a) This leaves child processes running in the background.
  (b) Each process can do cleanup work, but the parent won't know, for example, what exit status its children had.
(2) Wait for children, then terminate.
  (a) Nothing happens until a child terminates for other reasons.
  (b) Status of children can be logged/output.
(3) Send SIGTERM to own process group, then wait for children.
  (a) Parent gets two SIGTERMs (easily handled).
  (b) Each process gets two SIGTERMs, rest like (2b).
(4) Send SIGTERM to all direct children (not grand-children, etc.) whose pids are kept in a custom list, then wait for them.
  (a) This depends to some extent on the children also having this policy. Imagine anacron as a child (see below).
  (b) Each direct child gets two SIGTERMs.
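
To make (4) concrete, here is a minimal Python sketch, not a definitive implementation. The helper names spawn_sleeper and terminate_children are made up for illustration, the children are stand-in sleepers, and the SIGKILL escalation after a timeout is my own assumption rather than part of option (4) as described above.

```python
import signal
import subprocess
import sys


def spawn_sleeper():
    """Start a stand-in child that would otherwise run for a minute."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )


def terminate_children(children, timeout=5.0):
    """Option (4): SIGTERM every tracked direct child, then reap them all.

    Returns each child's return code. subprocess reports a child that
    died from signal N as -N.
    """
    for child in children:
        if child.poll() is None:  # still running
            child.send_signal(signal.SIGTERM)
    codes = []
    for child in children:
        try:
            codes.append(child.wait(timeout=timeout))
        except subprocess.TimeoutExpired:
            # Assumption (not in the list above): escalate to SIGKILL
            # for a child that ignores SIGTERM.
            child.kill()
            codes.append(child.wait())
    return codes
```

A parent would call terminate_children from its own SIGTERM handler with the list of Popen objects it has kept, log the returned codes, and then exit. Grand-children are untouched, which is exactly the dependency described in (4a).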

Web servers that use processes to handle connections normally wait for them to finish in one way or another. Apache does something like (4), I think. Anacron takes approach (1), as does bash unless a trap handler is specified. In anacron's case especially, I am not sure this isn't a bug.

Normal execution of anacron:
* locks timestamp files of jobs about to be run
* runs jobs
* unlocks files, mails output, updates timestamps, etc.

Scenario with (a):
* locks files
* runs jobs
* terminates while jobs are still running, files are automatically unlocked
Now if anacron is run again, it will execute the jobs a second time, even though it otherwise ensures that at most one instance of a job is running at a time.
Using (3) would terminate the jobs properly while using (2) would not terminate anything.

It seems (3) is the most correct and most universal solution despite the double SIGTERM problem it generates.
However, a custom SIGTERM handler usually only cleans up, restores the default action, and then sends SIGTERM to itself to make WIFSIGNALED true, so two signals shouldn't be a problem for most programs.
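
The "clean up, then signal yourself" pattern can be sketched in Python as follows. This is an illustration under assumptions: run_child_and_terminate is a made-up name, the "cleanup" is just a byte written to a pipe, and the fork is only there so the effect can be observed from a parent.

```python
import os
import signal
import time


def run_child_and_terminate():
    """Fork a child with a cleanup handler, SIGTERM it, return (status, note)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child
        try:
            os.close(read_fd)

            def on_sigterm(signum, frame):
                os.write(write_fd, b"C")               # stand-in for real cleanup
                signal.signal(signum, signal.SIG_DFL)  # restore the default action
                os.kill(os.getpid(), signum)           # now die "by signal"

            signal.signal(signal.SIGTERM, on_sigterm)
            os.write(write_fd, b"R")                   # tell the parent we are ready
            while True:
                time.sleep(0.1)
        finally:
            os._exit(1)  # never fall back into the caller's code
    # parent
    os.close(write_fd)
    assert os.read(read_fd, 1) == b"R"  # wait until the handler is installed
    os.kill(pid, signal.SIGTERM)
    _, status = os.waitpid(pid, 0)
    note = os.read(read_fd, 1)          # b"C" if the cleanup ran
    os.close(read_fd)
    return status, note
```

In the parent, os.WIFSIGNALED(status) is true and os.WTERMSIG(status) is SIGTERM, even though the child ran its cleanup first. If the handler called exit() instead, the parent would see a normal exit and could not tell that the child had been terminated.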
If the parent process is itself spawned by another process (e.g. a shell script), as in
-script
--parent
---child
then a SIGTERM to the process group would also kill script, which might be a problem.
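
The shared process group can be checked with a short Python sketch. child_process_group is a made-up helper, and the start_new_session variant is my own addition for comparison, not something proposed above. It uses setsid(), which creates a new session as well as a new process group.

```python
import os
import subprocess
import sys


def child_process_group(own_group):
    """Spawn a stand-in child and return (child pid, child process group id)."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        # With True, the child calls setsid() before exec and thereby
        # becomes the leader of a new process group.
        start_new_session=own_group,
    )
    try:
        return child.pid, os.getpgid(child.pid)
    finally:
        child.kill()
        child.wait()
```

With own_group=False the returned group id equals os.getpgrp() of the caller, so a SIGTERM sent to that group reaches the caller, its child and everything above them in the same group. With own_group=True the child's group id equals its own pid, so signalling that group would not touch script.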

So how should a proper unix program behave?
I guess my actual question is whether the definition of SIGTERM is rather “Terminate the process with the specified pid as soon as possible while avoiding at least rudimentary data corruption” or more like “Inform the program that it should terminate soon, but let it decide for itself when, in order to make sure no inconsistent states arise”?

dimich · 2015-08-15 01:58:08

Concerning case (b): in systemd it is possible to specify how SIGTERM is sent to the processes of a unit. So the unit file should be written to match the behaviour of the process.
I think the only things a SIGTERM handler should do are prevent data corruption and avoid failure on the next execution.
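
For illustration, a unit file fragment along these lines selects the kill method. The ExecStart path is hypothetical and the timeout value is arbitrary. Note that systemd works on the unit's control group rather than a POSIX process group.

```ini
[Service]
# Hypothetical program that handles SIGTERM and stops its own children.
ExecStart=/usr/local/bin/parent-daemon
# control-group (default): SIGTERM goes to every process in the unit.
# mixed: SIGTERM goes to the main process only; leftovers get SIGKILL later.
# process: only the main process is signalled.
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=30
```

With KillMode=mixed the double-SIGTERM situation from (3) and (4) does not arise on shutdown, because only the main process receives SIGTERM from systemd.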
