
Friday, August 7, 2020

LACP not working on ESXi HOST - unless you use a Distributed vSwitch

Today we were trying to configure two NICs on an ESXi host in an active-active state, so that they would participate in a LAG using LACP, with one NIC connected to one Top of Rack (ToR) switch and the other connected to a separate ToR switch.

It didn't work.

There was no way to "bond" the two NICs (as you would typically do in Linux). ESXi only supported NIC Teaming. Perhaps only the most advanced networking folks realize that NIC Teaming is not the same as NIC Bonding (we won't get into the weeds on that), and that neither NIC Teaming nor NIC Bonding is the same as Link Aggregation.

So after configuring NIC Teaming and enabling the second NIC on vSwitch0, poof! We lost connectivity.

Why? Well, the ESXi standard vSwitch speaks Cisco Discovery Protocol (CDP) for neighbor discovery, but it does not speak LACP, which the switch-side LAG requires. So without LACP negotiation there is no effective LAG, and the switch gets confused.
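For reference, here is roughly what we did on the standard vSwitch side, sketched with esxcli (the vSwitch0 / vmnic0 / vmnic1 names are just the typical defaults, not necessarily what you have):

    esxcli network vswitch standard uplink add --uplink-name=vmnic1 --vswitch-name=vSwitch0
    esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --active-uplinks=vmnic0,vmnic1

Notice there is no LACP knob anywhere in that policy. If the two ToR ports are configured as an LACP LAG, the bundle never comes up, which matches the connectivity loss above. If you must stay on a standard vSwitch, the switch side generally has to be a static, non-LACP configuration instead.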

Finally, we read that in order to use LACP, you needed to use a vDS - the VMware Distributed Switch.

Huh? Another product? To do something we could do on a Linux box with no problems whatsoever?

It turns out that to run a vDS, you need to run vCenter Server. So they put the Distributed Switch in vCenter Server?

Doesn't that come at a performance cost? Just so they can charge licensing?

I was not impressed that I needed to use vCenter Server just to put two NICs on a box into a Link Aggregation Group.

Friday, November 15, 2019

Layer 2 Networking Configuration in Linux

I have not had a tremendous amount of exposure to Layer 2 Networking in Linux, or in general.

The SD-WAN product at my previous startup company had a Layer 2 Gateway that essentially allowed corporations to join LAN segments over a wide area network. So people sitting in an office in, say, Seattle, could be "theoretically" sitting next to colleagues in an office in, say, Atlanta, all on the same LAN segment. How the product did this is a separate discussion, since it involved taking Ethernet frames and transporting / tunneling them across the internet (albeit in a very creative and very fast way, thanks to link aggregation, UDP acceleration, multiple channels for delivering the tunneled packets, et al.).

I only scratched the surface in terms of understanding the nuances of L2 with this. For example, I learned quite a bit about Bridging (from a Linux perspective). I learned a bit about Spanning Tree Protocol as well, and BPDUs.
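For what it is worth, the Linux side of that bridging is simple to reproduce. A rough sketch with iproute2, where br0 and eth0 are placeholder names:

    ip link add name br0 type bridge stp_state 1   # create a bridge with STP enabled
    ip link set eth0 master br0                    # enslave a NIC to the bridge
    ip link set br0 up
    bridge link show                               # list bridge ports and their STP state

With STP enabled, the bridge participates in the spanning tree by exchanging BPDUs with whatever it is plugged into.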

I had heard about protocols like LLDP (Link Layer Discovery Protocol), and LACP (Link Aggregation Control Protocol), but since I was not dealing with commercial switches and things, I had no need for enabling, configuring, tuning or analyzing these protocols.

But I am in an environment now where these things start to matter a bit more. We run OpenStack hosts that connect to large Juniper switches. These Linux servers use Link Aggregation and Bonding, and as such are configured to use LACP, sending LACPDUs to the switches.
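A minimal sketch of that kind of bond with iproute2 (the eth0 / eth1 / bond0 names are placeholders; in practice this lives in the distro's persistent network configuration rather than ad-hoc commands):

    ip link add bond0 type bond mode 802.3ad miimon 100 lacp_rate fast
    ip link set eth0 down && ip link set eth0 master bond0
    ip link set eth1 down && ip link set eth1 master bond0
    ip link set bond0 up

    # Verify LACP negotiation with the switch (partner MAC, aggregator IDs, port states)
    cat /proc/net/bonding/bond0

Mode 802.3ad is the LACP mode; the switch ports on the other end have to be configured as an LACP-enabled LAG for the bond to come up properly.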

LLDP is also enabled. With LLDP, devices advertise their information to directly-connected peers/neighbors. I found a good link that describes how to enable LLDP on Linux.
https://community.mellanox.com/s/article/howto-enable-lldp-on-linux-servers-for-link-discovery
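The short version, if you are using lldpad (eth0 is a placeholder interface name):

    systemctl enable --now lldpad                  # start the LLDP agent
    lldptool set-lldp -i eth0 adminStatus=rxtx     # transmit and receive LLDP frames on eth0
    lldptool -t -n -i eth0                         # dump the neighbor (switch) TLVs it has learned

That last command is handy for confirming which switch and port a server NIC is actually cabled to.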

This Juniper document does a pretty good job of discussing LACP.
Understanding LAG and LACP
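For context, the switch side of one of these LAGs looks roughly like this in Junos (the ae0 and xe-0/0/1 / xe-0/0/2 names are just placeholders):

    set chassis aggregated-devices ethernet device-count 2
    set interfaces xe-0/0/1 ether-options 802.3ad ae0
    set interfaces xe-0/0/2 ether-options 802.3ad ae0
    set interfaces ae0 aggregated-ether-options lacp active
    set interfaces ae0 aggregated-ether-options lacp periodic fast

With "lacp active" on the switch and mode 802.3ad on the Linux bond, both ends exchange LACPDUs and the aggregate comes up.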






