Thursday, November 10, 2022

Zabbix log file parsing - IPtables packet drops

I recently had to create some iptables rules for a clustered system, restricting traffic from RabbitMQ, Percona and other services to just the nodes participating in the cluster.  The details of this could be a blog post in and of itself.

I set up logging on my IPTables drops, initially for debugging, and had planned on removing it. But once I got all the rules working perfectly, the amount of logging dropped to almost zero. The only things that get logged at that point are things you would want to know about. So I decided to leave the logging for dropped packets in place (if you do this, make sure your logs rotate and keep an eye on your disk space!).
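For reference, drop logging of this kind is usually done with a LOG rule placed just before the final DROP in each chain. The sketch below only prints the commands for review rather than applying them. The log prefixes are assumed to match the search strings used later in this post, and the rate limit values are illustrative assumptions, not necessarily the exact rules used here.

```shell
#!/bin/sh
# Print (do not apply) a LOG rule followed by the final DROP for each chain.
# Assumptions: these are the last rules appended to each chain, the prefixes
# match the Zabbix search strings, and 5/min with a burst of 10 is a sane limit.
print_log_rules() {
  for pair in "INPUT:IPtables Input Dropped" "OUTPUT:IPtables Output Dropped"; do
    chain=${pair%%:*}    # text before the first colon, e.g. INPUT
    prefix=${pair#*:}    # text after the first colon, e.g. IPtables Input Dropped
    printf 'iptables -A %s -m limit --limit 5/min --limit-burst 10 -j LOG --log-prefix "%s: " --log-level 4\n' "$chain" "$prefix"
    printf 'iptables -A %s -j DROP\n' "$chain"
  done
}

print_log_rules
```

Once the output looks right, it can be piped to sh as root. The rate limit is there so that a noisy source cannot flood the log.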

When you put up a firewall that governs not only inbound traffic but also outbound traffic, you learn a LOT about what your system is receiving and sending, and it is always interesting to see traffic being dropped. I had to continually investigate traffic that was being dropped (especially outbound), and whitelist those services once I discovered what they were. NTP, DNS, all the normal things, but also some services that don't come to mind easily.

Logging onto systems to see what traffic is being dropped is a pain. You might be interested and do it initially, but eventually you will get tired of having to follow an operational routine.

I decided to do something more proactive: send the dropped-packet log entries to Zabbix.

To do this, here are the steps:

1. Add 2 new items to the hosts that employed the IPTables rules.

  • Iptables Dropped Packet Inbound
  • Iptables Dropped Packet Outbound

2.  Configure your items 

Configure each item as shown in the screenshot below (the example is the Inbound drops item).

[Screenshot: Zabbix Log File Item]

First, and this is important, note the Type. It is "Zabbix agent (active)", not the default "Zabbix agent". A log file check requires active checks and will not work as a passive check. There are plenty of documents and blogs on the web that discuss the difference between Active and Passive.

With passive checks, the Zabbix server opens the TCP connection to the agent on port 10050 and fetches the measurements it wants. With active checks, the monitored node takes responsibility for opening the connection to the Zabbix server on port 10051. So passive is "over and back" (server to target and back to server), while active is "over" (target to server). There are firewall repercussions for sure: the connection is initiated from the opposite side, and the two approaches use different ports.

Note that the 2nd parameter of the key is the search string to look for in the log. This can be a regular expression, and if you use one, you can put it in quotation marks. I imagine a lot of people don't get this 2nd parameter right, which leads to troubleshooting. I also think many people don't realize that you don't need carets and such to get what you need. For instance, I initially had "^.*IPtables Input Dropped.*$" in this field, and realized later that I only needed "IPtables Input Dropped". You only need to slice and dice with complex regular expressions if you want the output to be trimmed up.
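To see why the anchors are unnecessary, here is a small sketch using grep -E on a few made-up log lines. The lines, host name and addresses are hypothetical, and grep's extended regular expressions stand in for Zabbix's own regular expression engine, which behaves the same way for patterns this simple. With a log path of /var/log/messages (an assumption, since your distribution may write kernel messages elsewhere), the item key would look like log[/var/log/messages,"IPtables Input Dropped"].

```shell
#!/bin/sh
# Hypothetical sample of what a log with iptables drops might contain.
sample_log() {
cat <<'EOF'
Nov 10 12:00:01 node1 kernel: IPtables Input Dropped: IN=eth0 OUT= SRC=10.0.0.5 DST=10.0.0.10 PROTO=TCP SPT=44321 DPT=5672
Nov 10 12:00:02 node1 kernel: IPtables Output Dropped: IN= OUT=eth0 SRC=10.0.0.10 DST=10.0.0.99 PROTO=UDP SPT=50000 DPT=123
Nov 10 12:00:03 node1 sshd[1234]: Accepted publickey for admin
Nov 10 12:00:04 node1 kernel: IPtables Input Dropped: IN=eth0 OUT= SRC=10.0.0.7 DST=10.0.0.10 PROTO=TCP SPT=51000 DPT=3306
EOF
}

# Count the sample lines that match the given extended regular expression.
count_matches() {
  sample_log | grep -E -c -- "$1"
}

echo "bare substring:   $(count_matches 'IPtables Input Dropped')"
echo "anchored pattern: $(count_matches '^.*IPtables Input Dropped.*$')"
```

Both patterns select the same two lines, because the expression is searched for anywhere in the line. The leading "^.*" and trailing ".*$" add nothing.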

The third thing to be concerned with is the Type of Information. This is often overlooked. It needs to be Log. Get this wrong, which is easy to do, and it won't work!

I went with an update interval of 1m (1 minute), and a history storage period of 30d (thirty days). This was to be conservative on space lest we get a ton of log activity.

Also note that the item is Disabled (Enabled unchecked). This is because we cannot actually turn it on yet! We first need to configure the agent for Active. THEN this item can be enabled, verified and tested.

3. Next, you need to configure your agent for Active.

I wasn't sure initially whether a Zabbix agent had to be one or the other (Active and Passive being mutually exclusive). It does not: the same agent can answer passive checks and run active checks at the same time. I had my agent initially set to Passive. In Passive mode, the Zabbix server reaches out on an interval and gathers up measurements. Passive is the default when you install a Zabbix agent.

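To configure your agent to be Active, the only thing I needed to do was to set ServerActive=x.x.x.x (the address of the Zabbix server) in the Active checks section of the zabbix_agentd.conf file in the /etc/zabbix directory of the monitored VM. Note that this is a different parameter from Server=, which controls passive checks.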

Don't forget to restart the zabbix-agent service!
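Before restarting, it can help to confirm what the agent will actually read. The sketch below is a hypothetical helper, not part of Zabbix: it prints the ServerActive and Hostname values from a config file, demonstrated here against a throwaway sample (192.0.2.10 and node1 are placeholders). Hostname matters because active checks are looked up by the agent's host name, which has to match the host name configured in the Zabbix frontend.

```shell
#!/bin/sh
# Print the active-check settings found in an agent config file.
# Commented-out lines are ignored; the last occurrence of a setting wins.
# An unset Hostname is not necessarily an error: the agent then derives
# its name from the system hostname.
show_active_settings() {
  conf=$1
  for name in ServerActive Hostname; do
    value=$(sed -n "s/^[[:space:]]*$name=//p" "$conf" | tail -n 1)
    if [ -n "$value" ]; then
      echo "$name=$value"
    else
      echo "$name is not set"
    fi
  done
}

# Demo against a temporary sample file rather than the real config.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Passive checks
Server=192.0.2.10
# Active checks
ServerActive=192.0.2.10
Hostname=node1
EOF
show_active_settings "$sample"
rm -f "$sample"
```

On the monitored VM you would point it at /etc/zabbix/zabbix_agentd.conf, then restart the agent, for example with systemctl restart zabbix-agent on a systemd-based distribution.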

4. Next, you need to make sure the firewalls on both sides are allowing Zabbix traffic

On the monitored node (running the Zabbix agent), I was running iptables, not FirewallD. So I had to add an iptables rule allowing outbound traffic to port 10051 on the Zabbix server.

iptables -A OUTPUT -p tcp -d ${ZABBIX} -m tcp --dport 10051 -j ACCEPT

On the Zabbix server itself, which happens to be running FirewallD, we simply added port 10051 to the /etc/firewalld/services/zabbix.xml file:

<?xml version="1.0" encoding="utf-8"?>
<service>
  <short>Zabbix</short>
  <description>Allow services for Zabbix server and agent</description>
  <port protocol="tcp" port="10050"/>
  <port protocol="tcp" port="10051"/>
</service>

But you are not done when you add this! The firewall still has to pick up the change. You can restart the service or, if you don't want a gap in coverage, "reload" the rules on a running firewall with:

# firewall-cmd --reload

which prints "success" if the firewall reloads properly with the newly added modifications.

5. Now it is time to go back and enable your items!

Go back to the Zabbix GUI, select Configuration-->Hosts, and choose the host(s) to which you added your 2 new items (IPtables-Input-Dropped, IPtables-Output-Dropped). Select Items on each one of these hosts, choose the item and tick the Enabled checkbox.

6. Check for Data

Make sure you wait a bit, because the interval time we set is 1 minute. 

After a reasonable length of time (I suggest 5 minutes), go into the Zabbix GUI and select Monitoring-->Latest Data. Enter one of the hosts in the Hosts field to filter the items down to that particular host.

Find the two items, which are probably at the last page (the end), in a section called "other", since manually added Items tend to be grouped in the "other" category. 

On the right hand side, you should see "History". When you click History, your log file entries show up in the GUI!
