
Last updated: August 30, 2024
systemd is a system and service manager for Linux operating systems. It’s responsible for managing and maintaining system processes, services, and daemons. Sometimes, these services can fail, and it’s important that we get notified of such an event so that we can address the issue quickly.
In this tutorial, we’ll learn how to use the systemd OnFailure feature to trigger notifications and how to configure notification channels over Slack and email.
Before we begin, let’s look at what systemd unit files are, their purpose, and their format.
A unit file is a plain text, ini-style file that encodes information about a service or any other entity controlled and supervised by systemd.
Let’s look at a typical unit file:
[Unit]
Description=The NGINX HTTP and reverse proxy server
After=syslog.target network.target remote-fs.target nss-lookup.target
[Service]
Type=forking
PIDFile=/run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t
ExecStart=/usr/sbin/nginx
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true
[Install]
WantedBy=multi-user.target
Typically, a unit file consists of three sections. The common configuration items are in the [Unit] and [Install] sections, while the service-specific configurations are in the [Service] section.
We can view the complete list of systemd section options by running the commands man systemd.unit and man systemd.service.
The [Unit] section accepts an OnFailure option. This is a space-separated list of one or more units that are activated when this unit enters the “failed” state.
We should note that OnFailure can’t execute a command directly; it only accepts unit names. So, in the next section, we’ll create a unit that sends a notification when it’s started, and then we’ll reference that unit in the OnFailure setting of the service we’d like to monitor.
Let’s start by creating a sample notification service. This service doesn’t do anything useful, but it will serve to explain some important concepts.
We can create a systemd unit file to describe our notification service by creating a file in one of the locations where systemd expects to find them. Typically, unit files created by an administrator are placed under the /usr/local/lib/systemd/system directory.
Let’s create our pseudo-notify service:
$ sudo mkdir -p /usr/local/lib/systemd/system
$ sudo tee /usr/local/lib/systemd/system/pseudo-notify@.service > /dev/null <<'EOF'
[Unit]
Description=Send Pseudo Notification
[Service]
Type=oneshot
ExecStart=/bin/echo 'Notification triggered for service %i'
[Install]
WantedBy=multi-user.target
EOF
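Let’s look at a few important concepts here:
The “@” before the .service suffix in the file name makes this a template unit, so we can start instances such as pseudo-notify@nginx.service.
The %i specifier expands to the instance name, that is, the text between the “@” and the .service suffix.
Type=oneshot suits short-lived commands: systemd waits for the command to exit before it considers the unit started.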
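To make the %i expansion concrete, the following sketch mimics it in plain shell. The instance_name function is a hypothetical helper written for illustration, not part of systemd; it only approximates the rule that the instance name is the text between the first “@” and the type suffix:

```shell
# instance_name: hypothetical helper that approximates how systemd derives %i.
instance_name() {
    unit="$1"
    case "$unit" in
        *@*) ;;                       # template instance: keep going
        *) echo ""; return 0 ;;       # no "@": the instance name is empty
    esac
    rest="${unit#*@}"                 # drop everything up to the first "@"
    echo "${rest%.*}"                 # drop the type suffix, such as ".service"
}

instance_name "pseudo-notify@nginx.service"   # prints: nginx
instance_name "nginx.service"                 # prints an empty line
```

On a real system, we can trigger an instance by hand with sudo systemctl start pseudo-notify@nginx.service and read its output with journalctl -u pseudo-notify@nginx.service.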
Now that we understand the basics of how to use OnFailure to trigger notifications when a systemd service fails, we’re ready to create our first notification.
To start, let’s create a script that utilizes the Slack API to post messages to a channel:
$ sudo tee /usr/local/bin/slackNotify.sh > /dev/null <<'EOF'
#!/bin/bash
# Bash script to send systemd notifications to Slack

# Edit the following variables to match your requirements
SLACK_WEBHOOK_URL="https://hooks.slack.com/services/XXXXXXXXX/XXXXXXXXX/XXXXXXXXXXXXXXXXXXXXXXXX"
SLACK_CHANNEL="#general"
SLACK_USERNAME="Notification Bot"
SLACK_ICON=":zap:"
SLACK_COLOR="danger"
# End of variables

function usage {
    programName=$0
    echo "description: use this script to post systemd service failure message to Slack channel"
    echo "usage: $programName -s \"service name\""
    echo "    -s    the systemd service name e.g. nginx"
    exit 1
}

# Escape backslashes, double quotes, newlines, and tabs for use inside a JSON string
function json_escape {
    local s=$1
    s=${s//\\/\\\\}
    s=${s//\"/\\\"}
    s=${s//$'\n'/\\n}
    s=${s//$'\t'/\\t}
    printf '%s' "$s"
}

# Get service name from options
while getopts ":s:" opt; do
    case $opt in
        s)
            SERVICE_NAME=$OPTARG
            ;;
        \?)
            echo "Invalid option: -$OPTARG" >&2
            exit 1
            ;;
        :)
            echo "Option -$OPTARG requires an argument." >&2
            exit 1
            ;;
    esac
done

if [[ -z "${SERVICE_NAME}" ]]; then
    echo "Service name is required"
    usage
fi

# Build the message fields only after SERVICE_NAME has been parsed
SLACK_TITLE=$(json_escape "Service $SERVICE_NAME failed on $(hostname)")
SLACK_PRETEXT=$(json_escape "Service $SERVICE_NAME failed")
SLACK_TEXT=$(json_escape "$(systemctl status --no-pager "$SERVICE_NAME")")
SLACK_FOOTER=$(json_escape "Notification Bot at $(hostname) on $(date)")

SLACK_ATTACHMENT='[{"fallback": "'"$SLACK_TITLE"'", "color": "'"$SLACK_COLOR"'", "title": "'"$SLACK_TITLE"'", "pretext": "'"$SLACK_PRETEXT"'", "text": "'"$SLACK_TEXT"'", "footer": "'"$SLACK_FOOTER"'"}]'

# Send notification to Slack; --fail makes curl exit non-zero on an HTTP error
curl --fail -X POST --data-urlencode 'payload={"channel": "'"$SLACK_CHANNEL"'", "username": "'"$SLACK_USERNAME"'", "icon_emoji": "'"$SLACK_ICON"'", "attachments": '"$SLACK_ATTACHMENT"'}' "$SLACK_WEBHOOK_URL"
EOF
We should replace the value of SLACK_WEBHOOK_URL with a valid webhook URL created under our Slack account. We can find more information about how to create a webhook URL in the Slack API Documentation.
Next, let’s make the script executable using chmod:
$ sudo chmod +x /usr/local/bin/slackNotify.sh
Then, let’s create the “notify-slack” service:
$ sudo tee /usr/local/lib/systemd/system/notify-slack@.service > /dev/null <<'EOF'
[Unit]
Description=Send Systemd Notifications to Slack
[Service]
Type=oneshot
ExecStart=/usr/local/bin/slackNotify.sh -s %i
[Install]
WantedBy=multi-user.target
EOF
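Now, we’re ready to add this notification service to any systemd service. To do that, we add OnFailure=notify-slack@%n.service under the [Unit] section of the service we’d like to monitor. The %n specifier expands to the full name of the monitored unit (for example, nginx.service), which becomes the instance name (%i) of the notification unit. We can’t use %i here, because it’s empty for a unit that isn’t a template instance. For example:
[Unit]
Description=The NGINX HTTP and reverse proxy server
After=syslog.target network.target remote-fs.target nss-lookup.target
OnFailure=notify-slack@%n.service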
[Service]
Type=forking
PIDFile=/run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t
ExecStart=/usr/sbin/nginx
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true
[Install]
WantedBy=multi-user.target
Before our changes become effective, we need to inform systemd of the changes:
$ sudo systemctl daemon-reload
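A unit file shipped by a package may be overwritten on upgrade, so an alternative to editing it is a drop-in file. The sketch below assumes the standard drop-in location /etc/systemd/system/<unit>.d/; the write_onfailure_dropin helper and the onfailure.conf file name are made up for this example. It uses %n (the full unit name) as the instance name:

```shell
# write_onfailure_dropin: hypothetical helper that writes a drop-in adding OnFailure.
# Arguments: unit to monitor, notification template name, base directory (optional).
write_onfailure_dropin() {
    unit="$1"
    handler="$2"
    base_dir="${3:-/etc/systemd/system}"
    dropin_dir="$base_dir/$unit.d"
    mkdir -p "$dropin_dir"
    # "%%n" prints a literal "%n", which systemd later expands to the full unit name
    printf '[Unit]\nOnFailure=%s@%%n.service\n' "$handler" > "$dropin_dir/onfailure.conf"
}

# Example, run as root:
#   write_onfailure_dropin nginx.service notify-slack
#   systemctl daemon-reload
```

The optional third argument exists only so the helper can be tried against a scratch directory before touching the real configuration.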
Another common channel for receiving notifications is email. But first, we need to ensure our system can send email reliably through an authenticated SMTP server. One option is msmtp, a lightweight SMTP client.
After we’ve installed and configured msmtp, let’s build a simple email notification service:
$ sudo tee /usr/local/lib/systemd/system/notify-email@.service > /dev/null <<'EOF'
[Unit]
Description=Send Systemd Notifications to Email
[Service]
Type=oneshot
ExecStart=/usr/bin/bash -c 'echo "Subject: Service Failed\n\nService %i failed on $(hostname)\n$(systemctl status %i)" | /bin/msmtp user@example.com'
[Install]
WantedBy=multi-user.target
EOF
We must replace the placeholder “user@example.com” with the email address where we’d like to receive notifications.
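Inline shell inside ExecStart can be awkward to quote, so one option is to move the message construction into a small script. The following is a sketch under assumptions: the build_failure_mail function and the idea of passing the recipient as a second script argument are illustrative and not part of the setup above:

```shell
# build_failure_mail: hypothetical helper that prints a minimal mail message,
# that is, a Subject header, a blank line, and then the body.
build_failure_mail() {
    service="$1"
    host="$2"
    status="$3"
    printf 'Subject: Service %s failed on %s\n\n' "$service" "$host"
    printf '%s\n' "$status"
}

# Intended use inside a notification script, with the recipient as "$2":
#   build_failure_mail "$1" "$(hostname)" "$(systemctl status --no-pager "$1")" | msmtp "$2"
```

Keeping the formatting in a function also makes it possible to check the message text without sending any email.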
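Now, we’re ready to add this notification service to any systemd service. To do that, we add OnFailure=notify-email@%n.service to the [Unit] section of the service we’d like to monitor, in the same way as for Slack. Here, %n expands to the full name of the monitored unit, which the notification unit then receives as %i.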
In this article, we learned how the systemd OnFailure option works and how to use it to get notified when a service enters the “failed” state.
Additionally, we explored two common notification channels and how to create simple services to receive notifications over Slack and email.