For interest, here is a copy of the script I wanted to use:
#!/bin/bash
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
MYVOL=$1
WARNTHRESH=$2
CRITTHRESH=$3
RET=$?
if [[ $RET -ne 0 ]]
then
    echo "query problem - No data received from host"
    exit $STATE_UNKNOWN
fi
vdf -h -P | grep -E '^/vmfs/volumes/' | awk '{ print $2 " " $3 " " $4 " " $5 " " $6 }' | while read output ; do
    DISKSIZE=$(echo $output | awk '{ print $1 }' )
    DISKUSED=$(echo $output | awk '{ print $2 }' )
    DISKAVAILABLE=$(echo $output | awk '{ print $3 }' )
    PERCENTINUSE=$(echo $output | awk '{ print $4 }' )
    VOLNAME=$(echo $output | awk '{ print $5 }' )
    CUTPERC=$(echo $PERCENTINUSE | cut -d'%' -f1 )
    if [ "/vmfs/volumes/$MYVOL" = $VOLNAME ] ; then
        if [ $CUTPERC -lt $WARNTHRESH ] ; then
            echo "OK - $PERCENTINUSE used | Volume=$MYVOL Size=$DISKSIZE Used=$DISKUSED Available=$DISKAVAILABLE PercentUsed=$PERCENTINUSE"
            exit $STATE_OK
        fi
        if [ $CUTPERC -ge $CRITTHRESH ] ; then
            echo "CRITICAL - *$PERCENTINUSE used* | Volume=$MYVOL Size=$DISKSIZE Used=$DISKUSED Available=$DISKAVAILABLE PercentUsed=$PERCENTINUSE"
            exit $STATE_CRITICAL
        fi
        if [ $CUTPERC -ge $WARNTHRESH ] ; then
            echo "WARNING - *$PERCENTINUSE used* | Volume=$MYVOL Size=$DISKSIZE Used=$DISKUSED Available=$DISKAVAILABLE PercentUsed=$PERCENTINUSE"
            exit $STATE_WARNING
        fi
    fi
    #echo "No data returned"
    #exit $STATE_UNKNOWN
done
However, in a SAN environment where every host in an ESX environment has the same LUNs attached for HA, having every host report on space utilisation for the same list of LUNs is a bit over the top, and I'm actually more interested in knowing if a host has lost its path(s) to a LUN.
So I thought I would make a small modification to the script so that it would actually complain if the VMFS (LUN) disappeared. In its original form the script would produce no output to cover this eventuality, and I thought I could see why: the two commented lines just before the end of the while loop needed to be uncommented, changed to report a missing vmfs and moved outside the loop to pick up anything that didn't hit any of the exit statements. It should have taken about 30 seconds.
Well, I spent about an hour grappling with a very curious symptom: if any of the conditions inside the loop were met, the script would give the correct response but then, instead of exiting, would carry on executing the statements after the loop. This meant that the script would report that it had found a VMFS volume AND report that it couldn't find it. I tried setting variables inside the loop to pick up back in the main section, even exporting them to make them exist outside the script, but I seemed to be stuck with some sort of scoping problem.
It was only when I came across an article by Craig Russell on BASH variable scope inside a While loop that I realised what the problem was: the while loop was sitting on the end of a pipe, and in Bash that meant the loop was running in a separate process (a subshell). So an exit statement inside the loop only ends that subshell, and any variables set inside the loop are lost when the subshell finishes; the outer script simply carries on with whatever follows the loop.
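The behaviour described above can be reproduced in a few lines. This is a minimal sketch rather than part of the original check script; the function name `demo_piped_loop` and the sample input are made up for illustration, and it assumes Bash with its default settings (in particular, the `lastpipe` option not enabled).

```shell
#!/bin/bash
# Minimal demonstration: a while loop on the receiving end of a pipe
# runs in a subshell. Names and sample data here are illustrative only.

demo_piped_loop() {
    local found="no"
    printf 'alpha\nbeta\n' | while read -r line; do
        if [ "$line" = "alpha" ]; then
            found="yes"   # changes only the subshell's copy of the variable
            exit 0        # ends only the subshell, not the calling shell
        fi
    done
    # Execution continues here even though "exit" ran inside the loop,
    # and the assignment made inside the loop is not visible.
    echo "after loop: found=$found"
}

demo_piped_loop
```

With default Bash settings this prints `after loop: found=no`, which is exactly the symptom: the loop matched and "exited", yet the code after the loop still ran and saw none of the loop's variables.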
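Craig had a reasonable alternative - redirect the output to a temporary file and feed the while loop from that file instead. Because the loop is then no longer part of a pipeline, it runs in the main script's own process. I was then able to implement my properly performing script: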
#!/bin/bash
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
MYVOL=$1
WARNTHRESH=$2
CRITTHRESH=$3
RET=$?
if [[ $RET -ne 0 ]]
then
    echo "query problem - No data received from host"
    exit $STATE_UNKNOWN
fi
vdf -h -P | grep -E '^/vmfs/volumes/' | awk '{ print $2 " " $3 " " $4 " " $5 " " $6 }' >/tmp/vmfslist.tmp
while read output
do
    DISKSIZE=$(echo $output | awk '{ print $1 }' )
    DISKUSED=$(echo $output | awk '{ print $2 }' )
    DISKAVAILABLE=$(echo $output | awk '{ print $3 }' )
    PERCENTINUSE=$(echo $output | awk '{ print $4 }' )
    VOLNAME=$(echo $output | awk '{ print $5 }' )
    CUTPERC=$(echo $PERCENTINUSE | cut -d'%' -f1 )
    if [ "/vmfs/volumes/$MYVOL" = $VOLNAME ]
    then
        if [ $CUTPERC -lt $WARNTHRESH ]
        then
            echo "OK - $PERCENTINUSE used | Volume=$MYVOL Size=$DISKSIZE Used=$DISKUSED Available=$DISKAVAILABLE PercentUsed=$PERCENTINUSE"
            exit $STATE_OK
        elif [ $CUTPERC -ge $CRITTHRESH ]
        then
            echo "CRITICAL - *$PERCENTINUSE used* | Volume=$MYVOL Size=$DISKSIZE Used=$DISKUSED Available=$DISKAVAILABLE PercentUsed=$PERCENTINUSE"
            exit $STATE_CRITICAL
        elif [ $CUTPERC -ge $WARNTHRESH ]
        then
            echo "WARNING - *$PERCENTINUSE used* | Volume=$MYVOL Size=$DISKSIZE Used=$DISKUSED Available=$DISKAVAILABLE PercentUsed=$PERCENTINUSE"
            exit $STATE_WARNING
        fi
    fi
done < /tmp/vmfslist.tmp
echo "No data returned. VMFS Unavailable?"
exit $STATE_CRITICAL
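As a variation on the temporary-file approach, the loop can also be fed from a here-document, which avoids leaving a file behind in /tmp while still keeping the loop out of a pipeline. This is only a sketch and not the script I deployed: `list_volumes` is a hypothetical stand-in for the `vdf | grep | awk` pipeline with made-up sample data, `check_volume` is an illustrative name, and the output is trimmed to the status text. I have not tested this form on an ESX host.

```shell
#!/bin/bash
# Sketch only. list_volumes is a hypothetical stand-in for the
# "vdf -h -P | grep ... | awk ..." pipeline; the sample data is made up.
list_volumes() {
    echo '500G 200G 300G 40% /vmfs/volumes/lun01'
    echo '500G 475G 25G 95% /vmfs/volumes/lun02'
}

# check_volume NAME WARN CRIT
# Prints one status line and returns a Nagios-style code (0, 1 or 2).
check_volume() {
    local myvol="$1" warn="$2" crit="$3"
    local size used avail pct volname n
    # The loop reads from a here-document rather than a pipe, so it runs
    # in the current shell and "return" leaves the function straight away.
    while read -r size used avail pct volname; do
        [ "$volname" = "/vmfs/volumes/$myvol" ] || continue
        n=${pct%\%}    # strip the trailing percent sign
        if [ "$n" -ge "$crit" ]; then
            echo "CRITICAL - $pct used"; return 2
        elif [ "$n" -ge "$warn" ]; then
            echo "WARNING - $pct used"; return 1
        else
            echo "OK - $pct used"; return 0
        fi
    done <<EOF
$(list_volumes)
EOF
    # Reached only when no line matched the requested volume.
    echo "No data returned. VMFS Unavailable?"
    return 2
}

check_volume lun01 80 90
```

In a real plugin the function's return value would be passed to `exit`, for example `check_volume "$1" "$2" "$3"; exit $?`, so that the monitoring system sees the status code.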
The lesson? There's no real substitute for knowing how stuff works... properly.