A good way to check whether a network share such as NFS or CIFS
is still available is to monitor an existing file on the share itself.
Doing this with SMB/CIFS is a little easier than with NFS
when the NFS share is hard mounted.
In that case the check waits forever if the NFS server is unavailable and never
returns an error. Your checks can also pile up in the process list.
Here is an example that makes it work by using /usr/bin/timeout.
timeout runs stat to check for the file, but kills the stat command
if it does not return within a specified time period.
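To get a feel for what timeout does when a command hangs, here is a minimal sketch. It assumes GNU coreutils timeout, and sleep 5 only stands in for a stat call that would block on a dead hard mount; it is not part of the actual check.

```shell
#!/bin/bash
# "sleep 5" is a stand-in for a stat call that never returns (assumption for demo purposes).
start=$(date +%s)

# Give the command 1 second, then let timeout terminate it.
timeout 1 sleep 5
rc=$?

elapsed=$(( $(date +%s) - start ))

# GNU coreutils timeout exits with status 124 when the time limit was hit.
echo "exit status: $rc after about ${elapsed}s"
```

The point is that the command comes back after roughly the time limit with a non-zero status instead of blocking for the full duration.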
For demonstration purposes I use monit,
but this can be done with any other monitoring solution,
such as OpenNMS, for example by executing the check through net-snmp's extend feature.
apt-get install monit
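For the net-snmp route, a minimal sketch of the snmpd.conf side could look like the following. The extend name nfs_share is made up for illustration, and the location of snmpd.conf is an assumption that depends on your distribution.

```text
# /etc/snmp/snmpd.conf (path assumed; "nfs_share" is a hypothetical name)
extend nfs_share /usr/bin/timeout 1 /usr/bin/stat -t /media/nfs/test.txt
```

After restarting snmpd, the exit status of the command should be readable from NET-SNMP-EXTEND-MIB::nsExtendResult."nfs_share", which a monitoring system such as OpenNMS can poll.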
Monit Version >= 5.7
Since monit 5.7, "check program" supports arguments.
/etc/monit/conf.d/cifs_nfs
check program CIFS with path "/usr/bin/timeout 1 /usr/bin/stat -t /media/cifs/test.txt"
  if status != 0 then alert
check program NFS with path "/usr/bin/timeout 1 /usr/bin/stat -t /media/nfs/test.txt"
  if status != 0 then alert
Monit Version < 5.7
With older versions of monit
you have to use a wrapper script for the check.
mkdir /etc/monit/check_scripts/
/etc/monit/conf.d/cifs_nfs
check program CIFS with path "/etc/monit/check_scripts/check_stale_cifs.sh"
  if status != 0 then alert
check program NFS with path "/etc/monit/check_scripts/check_stale_nfs.sh"
  if status != 0 then alert
/etc/monit/check_scripts/check_stale_cifs.sh
#!/bin/bash
# Check that a file on the CIFS share can be stat'ed within TIMEOUT seconds.
CHECK_FILE="/media/cifs/test.txt"
TIMEOUT=1
BIN_TIMEOUT=/usr/bin/timeout
BIN_STAT=/usr/bin/stat
# timeout kills stat if it has not returned after TIMEOUT seconds.
"$BIN_TIMEOUT" "$TIMEOUT" "$BIN_STAT" -t "$CHECK_FILE" > /dev/null 2>&1
RETVAL=$?
# 0 = file found, 124 = stat was killed by timeout, anything else = stat failed.
[ "$RETVAL" -eq 0 ] && echo "Ok. Found $CHECK_FILE" && exit "$RETVAL"
[ "$RETVAL" -eq 124 ] && echo "Timed out checking for $CHECK_FILE" >&2 && exit "$RETVAL"
echo "Could not find $CHECK_FILE" >&2
exit "$RETVAL"
/etc/monit/check_scripts/check_stale_nfs.sh
#!/bin/bash
# Check that a file on the NFS share can be stat'ed within TIMEOUT seconds.
CHECK_FILE="/media/nfs/test.txt"
TIMEOUT=1
BIN_TIMEOUT=/usr/bin/timeout
BIN_STAT=/usr/bin/stat
# timeout kills stat if it has not returned after TIMEOUT seconds.
"$BIN_TIMEOUT" "$TIMEOUT" "$BIN_STAT" -t "$CHECK_FILE" > /dev/null 2>&1
RETVAL=$?
# 0 = file found, 124 = stat was killed by timeout, anything else = stat failed.
[ "$RETVAL" -eq 0 ] && echo "Ok. Found $CHECK_FILE" && exit "$RETVAL"
[ "$RETVAL" -eq 124 ] && echo "Timed out checking for $CHECK_FILE" >&2 && exit "$RETVAL"
echo "Could not find $CHECK_FILE" >&2
exit "$RETVAL"
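The two wrapper scripts differ only in CHECK_FILE. One possible way to avoid the duplication, sketched here under my own assumptions, is a single script that is symlinked under both names and picks the file from the name it was called with. The function names check_file_for and run_check are made up for this example; the paths and messages are taken from the scripts above.

```shell
#!/bin/bash
# Hypothetical shared wrapper: install once, then symlink it as
# check_stale_cifs.sh and check_stale_nfs.sh.

# Map the name the script was invoked as to the file that should be probed.
check_file_for() {
    case "$1" in
        check_stale_cifs.sh) echo "/media/cifs/test.txt" ;;
        check_stale_nfs.sh)  echo "/media/nfs/test.txt" ;;
        *) return 1 ;;
    esac
}

# Probe a file with a time limit (default 1 second) and report the result.
# Returns the exit status of timeout/stat unchanged.
run_check() {
    local file="$1" limit="${2:-1}"
    timeout "$limit" stat -t "$file" > /dev/null 2>&1
    local rc=$?
    case $rc in
        0)   echo "Ok. Found $file" ;;
        124) echo "Timed out checking for $file" >&2 ;;
        *)   echo "Could not find $file" >&2 ;;
    esac
    return $rc
}

# Run the check only when invoked under one of the known names.
# Under any other name the script does nothing, so the functions can
# also be sourced and tried out by hand.
if file=$(check_file_for "${0##*/}"); then
    run_check "$file"
    exit $?
fi
```

Installed as, say, /etc/monit/check_scripts/check_stale_share.sh (a hypothetical name) and symlinked with ln -s as check_stale_cifs.sh and check_stale_nfs.sh, the monit configuration above would stay unchanged.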