Monday, November 18, 2024

Cisco UCS M5 Server Monitoring with Zabbix

I got a request from my manager recently, about using Zabbix to monitor Cisco servers.  

Specifically, someone had asked about whether it was possible to monitor the CRC errors on an adaptor.

Right now, the monitoring we are doing is coming from the operating systems and not at the hardware level. But we do use Zabbix to montor vCenter resources (hypervisors), using VMware templates, and we use Zabbix to "target monitor" certain virtual machines at the Linux OS level (Linux template) and at Layer 7 (app-specific templates).

Up to this point, our Zabbix monitoring has been, essentially, "load and forget" where we load the template, point Zabbix to a media webhook (i.e. Slack) and just monitor what comes in. We haven't really even done much extension of the templates, using everything "out of the box". Recently, we did add some new triggers on VMware monitoring, for CPU and Memory usage thresholds. We were considering adding some for CPU Ready as well.

But...this ask was to monitor Cisco servers, with our Zabbix monitoring system.

The first thing I did, was to check and see what templates for Cisco came "out of the box". I found two:

  1. Cisco UCS by SNMP
  2. Cisco UCS Manager by SNMP

I - incorrectly - assumed that #2, the Cisco UCS Manager by SNMP, was a template to interface with a Cisco UCS Manager. I learned a bit later, that it is actually a template to let Zabbix "be" or "emulate" a Cisco UCS Manager (as an alternative or replacement). 

First, I loaded the Cisco UCS by SNMP template. The template worked fine from what I could tell, but it didn't have any "network" related items (i.e. network adaptors).

As mentioned, after reading that Cisco UCS Manager was an extension or superset of Cisco UCS by SNMP, I went ahead and loaded that template on some selected hosts. We were pleased to start getting data flowing in from those hosts, and this time the items included in the template were adaptor metrics, but very basic metrics such as these shown below.

Adaptor/Ethernet metrics in Cisco UCS Manager Template

This was great. But we needed some esoteric statistics, such as crc errors on an adaptor. How do we find these? Are they available?

Well, it turns out that they indeed are available...in a MIB called:CISCO-UNIFIED-COMPUTING-ADAPTOR-MIB

Unfortunately, this MIB is not included in the CISCO-UCS-Manager template. So what to do now? Well, there are a couple of strategies...

  1. Add a new Discovery Rule to the (cloned) Cisco UCS Manager template. 
  2.  Create a new template for the adaptor mib using a tool called mib2zabbix.
I tried to do #1 first, but had issues because the discover rule needed an LLD Macro and I wasn't sure how, syntactically, to create the Discovery Rule properly. My attempts at doing so, failed to produce any results when I tested the rule.
 
I went to pursue #2, which led me down an interesting road. First, the mib2zabbix tool requires the net-snmp package to be installed. And on CentOS, this package alone will not work - you also have to install net-snmp-util package to get the utilities like snmptranslate that you need.

The first time I ran mib2zabbix, it produced a template that I "knew" was not correct. I didn't see any of the crc objects in the template at all.  I did some additional research, and found that for mib2zabbix to work correctly, there has to be a correct "mib search path". 

To create the search path, you create a ".snmp" folder in your home directory, and in that directory, you create an snmp.conf file. This file looked as follows for me to be able to run snmptranslate and mib2zabbix "properly".
 
mibdirs +/usr/share/snmp/mibs/cisco/v2
mibdirs +/usr/share/snmp/mibs/cisco/ucs-C-Series-mibs
mibdirs +/usr/share/snmp/mibs/cisco/ucs-mibs


No comments:

Cisco UCS M5 Server Monitoring with Zabbix

I got a request from my manager recently, about using Zabbix to monitor Cisco servers.   Specifically, someone had asked about whether it wa...