Thursday, June 6, 2019

The Network Problem From Hell - Fixed - Circuitous Routing



Life is easy when you use a single network interface adaptor. But once you start using multiple adaptors, things get complicated, because packets can take multiple paths.

One situation most network engineers want to avoid is traffic that arrives through door #1 (e.g. NIC 1) while the replies leave through door #2. Fixing this, though, requires some more advanced network techniques and tricks: separate routing tables per NIC, and corresponding rules that direct packets to those separate routing tables.

So, I had this problem where an OpenStack-managed virtual machine stopped working because it could not reach OpenStack itself, which was running on the SAME machine that the virtual machine was running on. It was driving me insane.

I thought the problem might be iptables on the host machine. I disabled those. Nope. 

I thought the problem might be OpenVSwitch. I moved the cable to a new NIC, and changed the bridge the virtual machine was using. Nope.

Compounding the problem, the OpenStack host could ping the virtual machine, but the virtual machine could not ping the host. Why would it work one way, and not the other?

The Virtual Machine could ping the internet. It could ping the IP of the OpenStack router. It could ping the router that the host was connected to.

OpenStack uses Linux network namespaces, and in our case was using the Neutron OpenVSwitch Agent. An examination of these showed that the networking was configured just as it appeared in the Horizon Dashboard "Network Topology" view.

One thing worth mentioning is that the bridge mappings for provider networks are in the ml2_conf.ini file and the openvswitch_agent.ini file. But the EXTERNAL OpenStack networks use a bridge based on a parameter setting in the l3_agent.ini file! So if the l3_agent.ini file has a bridge setting of, say, "br-ex" for external networks, and you don't have that bridge correspondingly configured in the other files, OpenStack will tell you, when you create the external network, that it cannot reach the external network. We ran into this when trying to create different external networks on different bridges to solve the problem.
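To catch that kind of mismatch early, a small consistency check can help. This is only a sketch: it assumes the external bridge is set with the `external_network_bridge` option in the `[DEFAULT]` section of l3_agent.ini and that the mappings live under `bridge_mappings` in the `[ovs]` section of the agent file. Check the option names against your own release. The sample file contents and the `physnet1` name are made up for illustration.

```python
import configparser


def mapped_bridges(agent_ini_text):
    """Return the set of bridge names listed in [ovs] bridge_mappings."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(agent_ini_text)
    raw = cfg.get("ovs", "bridge_mappings", fallback="")
    bridges = set()
    for pair in raw.split(","):
        # Each entry is expected to look like "physnet:bridge".
        if ":" in pair:
            _physnet, bridge = pair.split(":", 1)
            bridges.add(bridge.strip())
    return bridges


def external_bridge(l3_ini_text):
    """Return the external bridge named in l3_agent.ini, or '' if unset."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(l3_ini_text)
    return cfg.get("DEFAULT", "external_network_bridge", fallback="").strip()


def missing_external_bridge(l3_ini_text, agent_ini_text):
    """Return the external bridge name if it has no mapping, else None."""
    bridge = external_bridge(l3_ini_text)
    if bridge and bridge not in mapped_bridges(agent_ini_text):
        return bridge
    return None


# Hypothetical file contents, for illustration only.
L3_AGENT_INI = "[DEFAULT]\nexternal_network_bridge = br-ex\n"
OVS_AGENT_INI = "[ovs]\nbridge_mappings = physnet1:br-compute\n"

if __name__ == "__main__":
    problem = missing_external_bridge(L3_AGENT_INI, OVS_AGENT_INI)
    if problem:
        print(f"{problem} is set in l3_agent.ini but has no bridge mapping")
```

In practice you would read the real files from disk and pass their text in; the functions take strings so the check is easy to try out without touching a live host.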

At my wits' end, I finally called over one of the more advanced networking guys in the company, and we began troubleshooting with tcpdump. We finally realized that when the VM pinged the OpenStack host, the ICMP echo requests were arriving on the expected NIC (em1 below), but no replies were going out on em1. When we changed tcpdump to use the "any" interface, we saw no replies at all. Clearly the host was dropping the packets. But iptables was flushed! WHO was dropping the packets, and HOW? (To be honest, we still aren't sure about this - more research required on that.) But we did figure out that the problem was a "circuitous routing" problem.

We figured maybe reverse path filtering was causing the issue. So we disabled that in the kernel. Didn't fix it.  
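One thing to double-check when ruling out reverse path filtering: on Linux the kernel applies the larger of the `all` value and the per-interface value, so setting only one of them to 0 may not actually turn it off. Here is a rough sketch for reading the current settings. It assumes the standard `/proc/sys/net/ipv4/conf` layout, and the helper names are mine, not from any library.

```python
from pathlib import Path

# Meaning of the rp_filter values as documented for the Linux kernel.
RP_FILTER_MODES = {0: "off", 1: "strict", 2: "loose"}


def read_rp_filter(conf_dir="/proc/sys/net/ipv4/conf"):
    """Map each interface directory under conf_dir to its rp_filter mode."""
    settings = {}
    for entry in sorted(Path(conf_dir).glob("*/rp_filter")):
        value = int(entry.read_text().strip())
        settings[entry.parent.name] = RP_FILTER_MODES.get(value, "unknown")
    return settings


def effective_rp_filter(all_value, iface_value):
    """The kernel uses the max of conf/all and conf/<iface> for an interface."""
    return max(all_value, iface_value)


if __name__ == "__main__":
    for name, mode in read_rp_filter().items():
        print(f"{name}: {mode}")
```

For example, `effective_rp_filter(0, 1)` is 1, meaning strict filtering is still in force on that interface even though `all` is 0.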

Finally, we realized what was happening. The VM sends all of its packets through the external network bridge, which was attached to a 172.22.0.0/24 network. The packet went to the router, which routed it out its 172.20.0.0/24 port and on to the host machine. But because the host machine had two NICs, one on each of those networks, it did not send replies back the same way the requests came in. It sent the replies out its em2 NIC, which was bridged to br-compute. And it was HERE that the packets were getting dropped. Since that NIC is managed by OpenVSwitch, we believe a loop-prevention flow rule in OpenVSwitch, or perhaps Spanning Tree Protocol, caused the packets to get dropped.

Illustration of the Circuitous Routing Issue & How it was solved

The final solution was to put in a host route, so that any packet to that particular VM would be sent out of the host, upstream to the router, and back in through the appropriate 172.22.0.0/24 port on the host/bridge, to the OpenStack router, where it would be NATed back to the 192.168.100.19 IP of the virtual machine.
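Why a host route wins can be shown with a toy longest-prefix-match lookup. This is a simplified model, not how the kernel is implemented. The VM's floating IP (172.22.0.50) and the upstream router address (172.20.0.1) are hypothetical, since the post does not give them, and the sketch assumes em1 sits on 172.20.0.0/24 and em2 on 172.22.0.0/24.

```python
import ipaddress


def pick_route(routes, destination):
    """Return the most specific route that contains destination, or None."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r["prefix"])]
    return max(
        candidates,
        key=lambda r: ipaddress.ip_network(r["prefix"]).prefixlen,
        default=None,
    )


VM_FLOATING_IP = "172.22.0.50"  # hypothetical address, not from the post
UPSTREAM_ROUTER = "172.20.0.1"  # hypothetical address, not from the post

# Directly connected networks on the host (interface placement is assumed).
routes = [
    {"prefix": "172.20.0.0/24", "dev": "em1", "via": None},
    {"prefix": "172.22.0.0/24", "dev": "em2", "via": None},
]

# Before the fix: the reply matches the connected /24 and leaves on em2.
before = pick_route(routes, VM_FLOATING_IP)

# The fix: a /32 host route that sends replies back out em1 via the router.
routes.append({"prefix": VM_FLOATING_IP + "/32", "dev": "em1", "via": UPSTREAM_ROUTER})
after = pick_route(routes, VM_FLOATING_IP)

print("before:", before["dev"], "after:", after["dev"])
```

The /32 is more specific than the connected /24, so it takes over for that one address only and leaves the rest of the 172.22.0.0/24 traffic alone. On a Linux host the equivalent would be something along the lines of `ip route add <vm-ip>/32 via <router-ip> dev em1`, with your own addresses filled in.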

Firewalls and Routing. These two seem to be where most problems occur in networking.
