In my working environment, we make extensive use of ZenLB (now known as Zevenet) load balancers. In production systems, the back ends of an infrastructure, or the “real servers” behind the load balancers, can sometimes become unresponsive for whatever reason. A typical case that I see quite often involves clustered MS Exchange Client Access servers behind a load-balanced pool: IIS may lock up on one or more CASs, causing the connections coming in from clients to be held at LB level as “pending”.
This is fine, but in my experience, once the Zevenet LB racks up 1500+ pending connections on one of its farms, it quickly exhausts its available memory.
The following check is called by the Nagios NRPE agent installed locally on the LB (it’s just Debian 8 after all):
#!/bin/bash
# ZenLB Pending/Established Connection Tracking v1.0 - Dave Byrne
hour=$(date +%-H)  # current hour, no leading zero (GNU date)
pending=$(grep SYN_SENT /proc/net/nf_conntrack | grep -cE 'dport=(443|80) ')
established=$(grep ESTABLISHED /proc/net/nf_conntrack | grep -cE 'dport=(443|80) ')
if [ "$pending" -gt 5 ]; then
    printf "CRITICAL - Pending connections above threshold! Pending: %s -- Established: %s\n" "$pending" "$established"
    exit 2  # Nagios CRITICAL
elif [ "$established" -eq 0 ] && [ "$hour" -ge 8 ] && [ "$hour" -le 23 ]; then
    printf "CRITICAL - No established connections! Pending: %s -- Established: %s\n" "$pending" "$established"
    exit 2  # Nagios CRITICAL
else
    printf "OK - Pending connections at acceptable level. Pending: %s -- Established: %s\n" "$pending" "$established"
    exit 0  # Nagios OK
fi
The check will go CRITICAL if pending connections across ANY of the farms go above 5. It will also go CRITICAL if the number of established connections drops to 0 (probably bad). I have limited that second condition to a certain time frame (hours 8 through 23), as I appreciate that there may well be 0 established connections at 4am!
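If you want to tune the thresholds or try the logic without touching a live LB, the decision rules described above can be pulled out into a function. This is only a rough sketch: the function name evaluate_connections, its argument order, and the optional threshold arguments are my own assumptions for illustration and are not part of the original check. It returns the usual Nagios plugin codes (0 for OK, 2 for CRITICAL).

```shell
#!/bin/bash
# Sketch only: evaluate_connections and its arguments are hypothetical names,
# not part of the original check.
# Usage: evaluate_connections PENDING ESTABLISHED HOUR [MAX_PENDING] [START_HOUR] [END_HOUR]
evaluate_connections() {
    local pending="$1"
    local established="$2"
    local hour="$3"
    local max_pending="${4:-5}"   # default matches the threshold used in the check
    local start_hour="${5:-8}"    # start of the window in which 0 established is an error
    local end_hour="${6:-23}"     # end of that window (inclusive)

    if [ "$pending" -gt "$max_pending" ]; then
        printf 'CRITICAL - Pending connections above threshold! Pending: %s -- Established: %s\n' "$pending" "$established"
        return 2
    elif [ "$established" -eq 0 ] && [ "$hour" -ge "$start_hour" ] && [ "$hour" -le "$end_hour" ]; then
        printf 'CRITICAL - No established connections! Pending: %s -- Established: %s\n' "$pending" "$established"
        return 2
    fi

    printf 'OK - Pending connections at acceptable level. Pending: %s -- Established: %s\n' "$pending" "$established"
    return 0
}
```

For example, evaluate_connections 6 120 14 returns 2 because 6 pending connections is over the default threshold, while evaluate_connections 0 0 4 returns 0 because 4am falls outside the monitored window.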