Friday, January 13, 2023

Debugging Dropped Packets on NSX-T E-NVDS

 

Inside the hypervisor, we have the following nics:

The servers have physical nics as follows:

  • vmnic0 – 1G nic, Intel X550 – Unused
  • vmnic1 – 1G nic, Intel X550 - Unused
  • vmnic2 - 10G nic, SFP+, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up
  • vmnic3 - 10G nic, SFP+, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up
  • vmnic4 - 10G nic, SFP+, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up
  • vmnic5 - 10G nic, SFP+, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up

The nics connect to the upstream switches (Aristas), and they connect virtually to the virtual switches (discussed right below):

 

Inside Hypervisor (Host 5 in this specific case):

Distributed vSwitch

                Physical Nic Side: vmnic2 and vmnic4

                Virtual Side: vmk0 (VLAN 3850) and vmk1 (VLAN 3853)

 

NSX-T Switch (E-NVDS)

                Physical NIC side: vmnic3 and vmnic5 à this is the nic that gets hit when we run the load tests

                Virtual Side: 50+ individual segments that VMs connect to, and get assigned a port

 

Now, in my previous email, I dumped the stats for the physical NIC – meaning, from the “NIC Itself” from the ESXi OS operating system.

 

But, it is wise also, to take a look at the stats of the physical nic from the perspective of the virtual switch! Remember, vmnic5 is a port on the virtual switch!

 

So first, we need to figure out what port we need to look at:
net-stats -l

PortNum          Type SubType SwitchName       MACAddress         ClientName

2214592527          4       0 DvsPortset-0     40:a6:b7:51:56:e9  vmnic3

2214592529          4       0 DvsPortset-0     40:a6:b7:51:1b:9d  vmnic5 à here we go, port 2214592529 on switch DvsPortset-0 is the port of interest

67108885            3       0 DvsPortset-0     00:50:56:65:96:e4  vmk10

67108886            3       0 DvsPortset-0     00:50:56:65:80:84  vmk11

67108887            3       0 DvsPortset-0     00:50:56:66:58:98  vmk50

67108888            0       0 DvsPortset-0     02:50:56:56:44:52  vdr-vdrPort

67108889            5       9 DvsPortset-0     00:50:56:8a:09:15  DEV-ISC1-Vanilla3a.eth0

67108890            5       9 DvsPortset-0     00:50:56:8a:aa:3f  DEV-ISC1-Vanilla3a.eth1

67108891            5       9 DvsPortset-0     00:50:56:8a:9d:b1  DEV-ISC1-Vanilla3a.eth2

67108892            5       9 DvsPortset-0     00:50:56:8a:d9:65  DEV-ISC1-Vanilla3a.eth3

67108893            5       9 DvsPortset-0     00:50:56:8a:fc:75  DEV-ISC1-Vanilla3b.eth0

67108894            5       9 DvsPortset-0     00:50:56:8a:7d:cd  DEV-ISC1-Vanilla3b.eth1

67108895            5       9 DvsPortset-0     00:50:56:8a:d4:d8  DEV-ISC1-Vanilla3b.eth2

67108896            5       9 DvsPortset-0     00:50:56:8a:67:6f  DEV-ISC1-Vanilla3b.eth3

67108901            5       9 DvsPortset-0     00:50:56:8a:32:1c  DEV-MSC1-Vanilla3b.eth0

67108902            5       9 DvsPortset-0     00:50:56:8a:e6:2b  DEV-MSC1-Vanilla3b.eth1

67108903            5       9 DvsPortset-0     00:50:56:8a:cc:eb  DEV-MSC1-Vanilla3b.eth2

67108904            5       9 DvsPortset-0     00:50:56:8a:7a:83  DEV-MSC1-Vanilla3b.eth3

67108905            5       9 DvsPortset-0     00:50:56:8a:63:55  DEV-MSC1-Vanilla3a.eth3

67108906            5       9 DvsPortset-0     00:50:56:8a:40:9c  DEV-MSC1-Vanilla3a.eth2

67108907            5       9 DvsPortset-0     00:50:56:8a:57:8f  DEV-MSC1-Vanilla3a.eth1

67108908            5       9 DvsPortset-0     00:50:56:8a:5b:6d  DEV-MSC1-Vanilla3a.eth0

 

/net/portsets/DvsPortset-0/ports/2214592529/> cat stats

packet stats {

   pktsTx:10109633317

   pktsTxMulticast:291909

   pktsTxBroadcast:244088

   pktsRx:10547989949 à total packets RECEIVED on vmnic5’s port on the virtual switch

   pktsRxMulticast:243731083

   pktsRxBroadcast:141910804

   droppedTx:228

   droppedRx:439933 à This is a lot more than the 3,717 Rx Missed errors, and probably accounts for why MetaSwitch sees more drops than we saw up to this point!

}

 

So – we have TWO things now to examine here.

  • Is the Receive Side Scaling configured properly and working? 
    • We configured it, but…we need to make sure it is working and working properly.
    • We don’t see all of the queues getting packets. Each Rx Queue should be getting its own CPU.
  • Once packets get into the Ring Buffer and passed through to the VM (poll mode driver picks the packets up off the Ring), they hit the virtual switch.
    • And the switch is dropping some packets.
    • Virtual switches are software. As such, they need to be tuned to stretch their capability to keep up with what legacy hardware switches can do.
      • The NSX-T switch is a powerful switch, but is also a newer virtual switch, more bleeding edge in terms of technology. 
      • I wonder if we are running the latest greatest version of this switch, and if that could help us here.

 

Now, I looked even deeper into the E-NVDS switch. I went into vsish shell, and started examining any and all statistics that are captured by that networking stack.

 

Since we are concerned with receives, I looked at the InputStats specifically. I noticed there are several filters – which, I presume is tied to a VMWare Packet Filtering flow, analogous to Netfilter in Linux, or perhaps Berkeley Packet Filter. But, I have no documentation whatsoever on this, and can’t find any, so I did my best to “back into” what I was seeing.

 

I see the following filters that packets can traverse – traceflow might be packet capture but not sure aside of that.

·         ens-slowpath-input

·         traceflow-Uplink-Input:0x43110ae01630
·         vdl2-uplink-in:0x431e78801dd0
·         UplinkDoSwLRO@vmkernel#nover
·         VdrUplinkInput

 

If we go down into the filters and print the stats out, most of the stats seem to line up (started=passed, etc) except this one, which has drops in it:

/net/portsets/DvsPortset-0/ports/2214592529/inputFilters/vdl2-uplink-in/> cat stats
packet stats {
   pktsIn:31879020
   pktsOut:24269629
   pktsDropped:7609391
}

/net/portsets/DvsPortset-0/ports/2214592527/inputFilters/vdl2-uplink-in/> cat stats
packet stats {
   pktsIn:24817038
   pktsOut:17952829
   pktsDropped:6864209
}
 

That seems like a lot of dropped packets to me (a LOT more than those Rx Missed errors), so this looks like something we need to work with VMWare on because if I understand these stats properly, this suggests an issue on the virtual switch more than the adaptor itself.

 

Another thing I saw, poking around, was this interesting looking WRONG_VNIC on passthrough status on vmnic3 and vmnic5, the two nics being used in the test here. I think we should maybe ask VMWare about this and run this down also.

 

/net/portsets/DvsPortset-0/ports/2214592527/> cat status
port {
   port index:15
   vnic index:0xffffffff
   portCfg:
   dvPortId:4dfdff37-e435-4ba4-bbff-56f36bcc0779
   clientName:vmnic3
   clientType: 4 -> Physical NIC
   clientSubType: 0 -> NONE
   world leader:0
   flags: 0x460a3 -> IN_USE ENABLED UPLINK DVS_PORT DISPATCH_STATS_IN DISPATCH_STATS_OUT DISPATCH_STATS CONNECTED
   Impl customized blocked flags:0x00000000
   Passthru status: 0x1 -> WRONG_VNIC
   fixed Hw Id:40:a6:b7:51:56:e9:
   ethFRP:frame routing {
      requested:filter {
         flags:0x00000000
         unicastAddr:00:00:00:00:00:00:
         numMulticastAddresses:0
         multicastAddresses:
         LADRF:[0]: 0x0
         [1]: 0x0
      }
      accepted:filter {
         flags:0x00000000
         unicastAddr:00:00:00:00:00:00:
         numMulticastAddresses:0
         multicastAddresses:
         LADRF:[0]: 0x0
         [1]: 0x0
      }
   }
   filter supported features: 0 -> NONE
   filter properties: 0 -> NONE
   rx mode: 0 -> INLINE
   tune mode: 2 -> invalid
   fastpath switch ID:0x00000000
   fastpath port ID:0x00000004
}

No comments:

NUMA on VM a Hyperthread-Enabled Server

This could be a long post, because things like NUMA can get complicated. For background, we are running servers - hypervisors - that have 24...