Background
BigPanda has made its way into the organization. I wasn't sure at first why, given that there's no shortage of Network Monitoring OSS / EMS systems in play.
Many vendors use their own EMS. VMware for example, uses VROPS (vRealize Operations Suite - now known as Aria Operations). So there is and has been a use case for consolidating this information from these disparate monitoring systems into a "Northbound" system.
So that's what BigPanda is, I guess. It was pitched as a Northbound system. It does not seem to be very mature, and it is simpler to use than most of them (based on limited inspection and reading). But the business case pitch is that it has an Artificial Intelligence rules engine that provides superior correlation, and if this is true, it could certainly make it a northbound system worthy of consideration.
So - that is why we stepped in to integrate Zabbix with BigPanda. We already have VROPS as our "authoritative" monitoring system for all things VMWare. Our team, which does use this VROPS, does not own and manage that platform (another group does). I believe they use it to monitor the vCenters, the hypervisors, and datastores. I don't think they're using it to monitor tenant workloads (virtual machines running on the hypervisors).
Our Zabbix platform, which we manage ourselves, is a "second layer of monitoring" behind VROPS. It manages only the VMWare Hypervisors along with some targeted specific virtual machines we run (load balancers, cloud management platform VMs, et al). The BigPanda team wanted to showcase the ability to correlate information from Zabbix and VROPS, so we volunteered to integrate the two systems.
Note: It is critical when integrating, that these integration steps be done in precisely this order!!!
Integration Steps
Setting up the Media Type
First, you need to "create" a Media Type - and this means Importing one, not creating one. There are two buttons when you click Media Type, "Create" and "Import". Because the Media Type has already been crafted, we will use "Import". The BigPanda Media Type, which is classified as a Webhook media type, is available for download, and you can find this (json) file at the following link: https://docs.bigpanda.io/docs/zabbix
When you import this webhook media type, you have the option to "Update Existing" or "Create New". The first time, of course, requires "Create New" but any subsequent updates to the webhook would utilize the "Update Existing" button.
After the media type has been created, everything will auto-populate. The Media Type tab will have a name (BigPanda in this case), a Type (Webhook), and a set of parameters. Most of these can be left alone, but four of them will need to be changed to literal values of macros (literal values for initial testing is recommended): BP_app_key, BP_endpoint, BP_token - and the zabbix url (which is at the bottom and out of view in the screenshot example below).
Setting Up the User Group
Next, you will create a User Group. The main reason for creating a (new) Big Panda user group, is that you can restrict the access of the Hosts that Big Panda has access to. If you wanted to allow Big Panda to have free roam access to all monitored hosts, then you probably could use one of the other host groups available. We wanted Big Panda to only receive alerts for specific hosts (hypervisors, test VMs, etc) so this was the justification for creating a new and separate Big Panda group. In the Host Permissions, we give this new user group Read access to these host groups.
Below is an example of what this group looks like.
Now, one thing worth looking at in this example, is the fact that the newly created User Group has Debug disabled. But there is a separate Debug Enabled group which does have Debug enabled, and any groups that we want to be debugged, can simply be slipped into this group. There will be more on debugging later. Another thing worth mentioning, is that we did NOT enable FrontEnd access for this user group. This is an integration outbound, and we don't expect a Big Panda user / group to be logging into the UI.
Setting Up the User
Next, we create the User. Users need to have a Media Type, and are
placed in User Groups which is why the Media Type and User Groups were
created BEFORE the user. Below, is an example of how the user is defined:
Notice that the user is mapped into the bigpandaservice User Group that we created in the previous step, which is why the User Group was pre-created in a previous step.
Now, after we establish the user fields, it is critically important to attach the User to the Media Type. Without this mapping, the alerts from Zabbix WILL NOT SEND!!!
After this Update button is hit, it is wise to verify and double-verify that this Media Type sticks - in our case, it did not and we had to remove the user and re-create it for some reason.
The final step in configuration is to create a Trigger Action on the Media Type. This is how that looks:
Next, you can click on Media Type, and select the "Test" button next to BigPanda. If you don't fill in the umpteen fields, and leave them as macros, with just the 4 fields we configured in the Media Type (endpoint, api key, token and zabbix url), the Test button "should" produce a 201 result, but you may get a json parse error because no actual data was sent. This is okay.
If the 201 exists, the Big Panda should receive the test alert. But this does not mean that the trigger is firing!!! The step to be taken after the Media Type "Test" button, is to generate an alert condition on the hosts that the Big Panda host group has access to, and make sure that Big Panda receives it!
Debugging & Troubleshooting
Troubleshooting requires making sure that all of these configuration steps were taken properly. This Webhook integration is all about mappings - users to user groups, users to media types, trigger definitions, host groups that are correct, etc.
When it comes to debugging, the debugging for a Webhook occurs within the Webhook!!!
The BigPanda Webhook, meaning the json file you imported, if you click on the Webhook you can see this json! In the screenshot below, notice the field called "script"...
If you were to click the "pencil" icon to the right, it will open up the entire webhook source code, which in this case is written in JavaScript.
Now, you will notice that the BigPanda Webhook is sending messages to the Zabbix log at Level 4. The problem is, most people shouldn't be using Level 4 in their Zabbix logging (in zabbix_server.cfg file). It is too voluminous and makes debugging impossible if you are watching or tailing the log looking for webhook-specific messages.
What I did, for testing and debugging, was to use a level that allows me to see the Webhook information without having to comb through a mountain of Zabbix debug information that you would normally see at Level 4 (Debug level). You will see in the screenshot below, that I commented out the "level 4" and replaced it with "level 2" - temporarily of course, until I could make sure that the Webhook was working properly. This example below, of course is just that: an example of how you can more simply debug the webhook. There are more lines in this code that I made these kinds of changes to, but the screenshot gives you an example of how it's done.
So hopefully that helps anyone wanting to get the BigPanda Webhook working in Zabbix, or for that matter, these steps should be helpful for any Webhook integration (i.e. Slack, Discord, et al).
No comments:
Post a Comment