Doing something before systemd shuts your supervisord down
If you are running your server applications via supervisord on a Linux distro running systemd, you may find this post useful.
Problem Scenario
An example scenario to illustrate the problem this post addresses:
- systemd starts the shutdown process
- systemd stops supervisord
- supervisord stops your processes
- You see in-flight requests being dropped
 
Solution
What we want to do is prevent in-flight requests from being dropped when a system is shutting down as part of a power-off cycle (AWS instance termination, for example). We can do so in two ways:
- Our server application is intelligent enough not to exit (and hence hold up instance shutdown) while a request is in progress
- We hook into the shutdown process above so that, once shutdown has started, we stop new requests from coming in and give our application server enough time to finish what it is doing.
 
The first approach offers a stronger theoretical “guarantee” of what we want, but can be hard to implement correctly. In fact, I couldn’t get it right even after trying all sorts of signal handling tricks. Your mileage may vary, of course, and if you have a working example, please let me know.
So, I went ahead with the very unclean second approach:
- Register a shutdown “hook” which gets invoked when systemd wants to stop supervisord
- This hook takes the service instance out of the healthy pool
 - The proxy/load balancer detects the above event and stops sending traffic
- As part of the “hook”, after we have taken ourselves out of the healthy service pool, we sleep for an arbitrary time so that existing requests can finish (sketched as a script right after this list)
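To make the hook concrete before wiring it into systemd, here is a minimal sketch of it as a standalone shell script; the path and the -reason text are hypothetical, and the unit below inlines the same two commands directly instead of calling a script.

#!/bin/bash
# Hypothetical /usr/local/bin/drain.sh - the shutdown hook in one place.
# Step 1: take this instance out of the healthy pool by putting the
# consul node into maintenance mode
/usr/local/bin/consul maint -enable -reason "host is shutting down"
# Step 2: give in-flight requests time to finish; 300 seconds is
# arbitrary - tune it to your longest expected request
sleep 300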
 
When you are using software like linkerd as your RPC proxy, even long-lived connections are not a problem: linkerd will see that your service instance is unhealthy and will not proxy any more requests to it.
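If you want to watch the drain happen, you can toggle consul maintenance mode by hand and query the health API; the service name my-service below is a placeholder for whatever your application registers as.

# Enable node maintenance mode (the same command our shutdown hook runs)
consul maint -enable -reason "testing drain"

# Only passing instances are returned with ?passing, so this node's
# instance should have disappeared from the results
curl "http://127.0.0.1:8500/v1/health/service/my-service?passing"

# Put the node back into normal operation
consul maint -disable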
Proposed solution implementation
The proposed solution is a systemd unit - let’s call it drain-connections - which is defined as follows:
#  cat /etc/systemd/system/drain-connections.service
[Unit]
Description=Shutdown hook to run before supervisord is stopped
After=supervisord.service networking.service
PartOf=supervisord.service
Conflicts=shutdown.target reboot.target halt.target
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/usr/local/bin/consul maint -enable
ExecStop=/bin/sleep 300
TimeoutSec=301
[Install]
WantedBy=multi-user.target
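Assuming the file lives at the path shown above, a quick sanity check followed by the usual enable steps makes the unit active:

# Catch obvious mistakes in the unit file before enabling it
systemd-analyze verify /etc/systemd/system/drain-connections.service

# Pick up the new unit, then enable and start it
sudo systemctl daemon-reload
sudo systemctl enable --now drain-connections.service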
Let’s go over the key systemd directives used above in the Unit section:
- After ensures that drain-connections is started after supervisord, but stopped before supervisord
- PartOf ensures that drain-connections is stopped/restarted whenever supervisord is stopped/restarted
The Service section has the following key directives:
- Type=oneshot (learn more about it here)
- The first ExecStop takes the service instance out of the pool by enabling consul maintenance mode
- The second ExecStop then gives our application 300 seconds to finish what it is currently doing
- TimeoutSec overrides the systemd default timeout of 90 seconds to 301 seconds so that the earlier sleep of 300 seconds can finish
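A nice side effect of keeping the hook in ExecStop is that you can exercise it without rebooting; stopping the unit by hand runs exactly the same commands:

# Stopping the unit runs both ExecStop commands: consul maintenance
# mode is enabled, then the 300 second sleep runs
sudo systemctl stop drain-connections.service

# Watch the hook's output from another terminal
journalctl -u drain-connections.service -f

# Afterwards, take the node out of maintenance and re-arm the hook
/usr/local/bin/consul maint -disable
sudo systemctl start drain-connections.service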
In addition, we set up a systemd unit override for supervisord as follows:
# /etc/systemd/system/supervisord.service.d/supervisord.conf
[Unit]
Wants=drain-connections.service
This ensures that the drain-connections service gets started when supervisord is started.
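After creating the drop-in, reload systemd and confirm the dependency took effect:

sudo systemctl daemon-reload
# The output should list drain-connections.service
systemctl show supervisord.service -p Wants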
Discussion
Let’s see how the above fits into our scenario:
- systemd starts the shutdown process and tries to stop supervisord
- This triggers drain-connections to be stopped, which runs the commands we want executed
- The above commands take the instance out of the pool and sleep for an arbitrary period of time
- drain-connections finishes “stopping”
- systemd stops supervisord
- shutdown proceeds
 
What if drain-connections is stopped first? That is okay, because stopping it executes the commands we want executed. supervisord can then be stopped, which will stop our application server, but the drain-connections unit has already done its job by that time.
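You can simulate the whole sequence without powering off the machine, since stopping supervisord directly exercises the same PartOf/After machinery as a real shutdown:

# PartOf= pulls drain-connections into this stop job, and After=
# orders its ExecStop commands before supervisord is stopped
sudo systemctl stop supervisord.service

# In another terminal: you should see drain-connections finish
# "stopping" (consul maint + sleep) before supervisord stops
journalctl -u drain-connections.service -u supervisord.service -f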