Effective system administration relies on robust monitoring scripts. This guide explores best practices for implementing service health checks and automated recovery using Bash conditional logic.
1. System Resource Monitoring
To monitor system memory and alert administrators, we can use the free utility. The following script checks whether available memory has dropped below a defined threshold, expressed in megabytes.
#!/bin/bash
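THRESHOLD=1000  # Alert threshold in MB
# Read the "available" column of the Mem: row (procps-ng 3.3.10 and later)
AVAILABLE_MEM=$(free -m | awk '/^Mem:/ {print $7}')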
ALERT_MSG="Warning: Low memory detected. Current: ${AVAILABLE_MEM}MB"
if [ "$AVAILABLE_MEM" -lt "$THRESHOLD" ]; then
    echo "$ALERT_MSG"
    # Integration point for mailx or API-based alerting
fi
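The column layout of free differs between procps versions, so a fixed field number can silently read the wrong value. As a minimal sketch, the hypothetical helper below (mem_available_mb is an illustrative name, not part of any package) locates the column by its header and falls back to the "free" column when no "available" header is present. It reads free -m output on standard input, which also makes it easy to exercise with captured samples.

```bash
#!/bin/bash
# Hypothetical helper: print the "available" value from `free -m` output on stdin.
# The header row has no label column, so header field i maps to data field i + 1.
mem_available_mb() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "available") col = i + 1
                if ($i == "free") fallback = i + 1
            }
            next
        }
        /^Mem:/ {
            idx = col ? col : fallback   # prefer "available", else "free"
            if (idx) print $idx
            exit
        }
    '
}
```

A call such as AVAILABLE_MEM=$(free -m | mem_available_mb) could then replace the fixed-column awk expression in the script above.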
2. Service Health Check Strategies
Monitoring web servers and databases requires reliable verification methods. Comparing raw command output as strings is fragile; instead, reduce the output to a numeric count with grep -c or wc -l so that conditionals can use integer comparisons.
A. Process and Socket Validation
Rather than parsing specific columns from output, identify the presence of a process or listener count:
# Validate service via process count
if [ "$(ps -ef | grep -v grep | grep -c 'mysqld')" -gt 0 ]; then
    echo "Database is active."
fi
# Validate service via network socket (the trailing space avoids matching :33060)
if [ "$(ss -lntu | grep -c ':3306 ')" -gt 0 ]; then
    echo "Database port is listening."
fi
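A plain substring match on a port number can also count unrelated listeners, for example :33060 when looking for :3306. The sketch below shows one way to compare the port exactly. It assumes the default ss -lntu column order, in which the fifth field is the local address and port; count_listeners is a hypothetical name chosen for illustration.

```bash
#!/bin/bash
# Hypothetical helper: count listeners on an exact port.
# Reads `ss -lntu` output on stdin; field 5 is assumed to be Local Address:Port.
count_listeners() {
    local port="$1"
    awk -v port="$port" '
        {
            # The port is whatever follows the last colon (works for [::]:3306 too)
            n = split($5, parts, ":")
        }
        n > 1 && parts[n] == port { count++ }
        END { print count + 0 }
    '
}
```

It would be used as: if [ "$(ss -lntu | count_listeners 3306)" -gt 0 ]; then ... fi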
B. Remote Verification (Nmap/NC)
For high-availability scenarios, verifying service status from the perspective of a network client is often more reliable than local process checks.
# Using nc (netcat) to verify service responsiveness
if nc -z -w 2 127.0.0.1 80 &>/dev/null; then
    echo "Web service is reachable."
fi
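A single failed probe does not always mean the service is down; a brief network hiccup or a restart in progress can cause one attempt to fail. One possible refinement, sketched here under the assumption that a few retries are acceptable, is to wrap the probe in a small retry function. The name retry and its argument order are illustrative choices, not an existing utility.

```bash
#!/bin/bash
# Hypothetical helper: run a command up to ATTEMPTS times,
# sleeping DELAY seconds between failed tries.
# Usage: retry ATTEMPTS DELAY command [args...]
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        "$@" && return 0              # success: stop retrying
        if ((i < attempts)); then
            sleep "$delay"            # wait before the next attempt
        fi
    done
    return 1                          # every attempt failed
}
```

For example, retry 3 1 nc -z -w 2 127.0.0.1 80 would report failure only after three consecutive unsuccessful connection attempts.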
3. Implementation of Service Control Scripts
A unified start/stop script ensures predictable state management. The structure below demonstrates a robust pattern for managing a background service.
#!/bin/bash
SERVICE_NAME="rsync"
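case "$1" in
    start)
        /usr/bin/rsync --daemon
        sleep 1  # Give the daemon a moment to fork
        # -x matches the process name exactly, so this script is never counted
        [ "$(pgrep -c -x "$SERVICE_NAME")" -gt 0 ] && echo "Service started."
        ;;
    stop)
        pkill -x "$SERVICE_NAME"
        sleep 1  # Allow the process time to exit before checking
        [ "$(pgrep -c -x "$SERVICE_NAME")" -eq 0 ] && echo "Service stopped."
        ;;
    restart)
        "$0" stop
        sleep 2
        "$0" start
        ;;
    *)
        echo "Usage: $0 {start|stop|restart}"
        exit 1
        ;;
esac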
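A fixed pause after pkill is a guess: the process may take longer to exit, or may already be gone. A hedged alternative is to poll until the process disappears or a timeout expires. The function name wait_for_exit and the 10-second default below are assumptions made for this sketch.

```bash
#!/bin/bash
# Hypothetical helper: wait until no process named NAME remains.
# Returns 0 once the process is gone, 1 if TIMEOUT seconds pass first.
wait_for_exit() {
    local name="$1" timeout="${2:-10}"
    local waited=0
    while pgrep -x "$name" >/dev/null; do
        if ((waited >= timeout)); then
            return 1                  # still running after the timeout
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the stop branch, a call such as wait_for_exit "$SERVICE_NAME" 5 && echo "Service stopped." could take the place of the immediate pgrep count check.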
4. Advanced Web Verification
For web applications, checking the HTTP status code is more definitive than simply verifying if a port is open. Use curl to ensure the service is returning valid responses.
# Check for successful HTTP status codes (200, 301, 302)
HTTP_STATUS=$(curl -I -s -o /dev/null -w "%{http_code}" http://localhost)
if [[ "$HTTP_STATUS" =~ ^(200|301|302)$ ]]; then
    echo "Web application healthy."
else
    echo "Web application returned error code: $HTTP_STATUS"
fi
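When matching status codes with a regular expression, the alternation needs to be grouped so that both anchors apply to every alternative; without parentheses, ^ binds only to the first code and $ only to the last, so a value containing 301 anywhere would pass. The sketch below isolates the check in a hypothetical function, is_healthy_status, so it can be exercised without a running web server. Note that curl reports 000 as the http_code when it receives no HTTP response at all, which this check treats as unhealthy.

```bash
#!/bin/bash
# Hypothetical helper: succeed only for the exact codes 200, 301, or 302.
is_healthy_status() {
    # The parentheses make ^ and $ apply to each alternative
    [[ "$1" =~ ^(200|301|302)$ ]]
}
```

It could be combined with the curl call as: is_healthy_status "$HTTP_STATUS" && echo "Web application healthy."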