Tuesday, June 7, 2022

VMware Network Debugging - Trex Load Generation and Ring Buffer Overflow

We began running Trex Traffic Generator tests, sending load to a couple of virtual machines running on vSphere-managed ESXi hypervisors, and ran into some major problems.

First, the Trex Traffic Generator:

  • CentOS 7 virtual machine
  • 3 ports
    • eth0 used for ssh connectivity and to run Trex and Trex Console (with screen utility)
    • eth1 for sending traffic (Trex will put this port into DPDK-mode so OS cannot see it)
    • eth2 for sending traffic (Trex will put this port into DPDK-mode so OS cannot see it)
  • 4 cores 
    • the VM actually has 6, but two are used for running OS and Trex Admin
    • Traffic Tests utilize 4 cores

Next, the Device(s) Under Test (DUT):

  1. Juniper vSRX, a router VM (runs Junos, which is built on FreeBSD, a Berkeley Unix descendant)
  2. Standard CentOS 7 virtual machine
     

We ran the stateless imix test, at 20% and 100% line utilization.

During the test, the Trex console stats showed the Trex VM at 80-90% core usage and 20-25% line utilization, sending 4Gbps per port (8Gbps total) to the DUT virtual machines.

On the receiving side, the router was processing only about 1/6 to 1/4 of the packets sent by Trex. The CentOS 7 VM did no better: it could not receive more than about 3.5Gbps.

So what is happening? This led us to a deep dive into the VMware statistics.

By logging into the ESXi host that the receiving VM was running on, we could first find out which virtual switch and port the VM interface was assigned to, by running:

# net-stats -l

This produces a list, like this:

PortNum          Type SubType SwitchName       MACAddress         ClientName
50331650            4       0 DvsPortset-0     40:a6:b7:51:18:60  vmnic4
50331652            4       0 DvsPortset-0     40:a6:b7:51:1e:fc  vmnic2
50331654            3       0 DvsPortset-0     40:a6:b7:51:1e:fc  vmk0
50331655            3       0 DvsPortset-0     00:50:56:63:75:bd  vmk1
50331663            5       9 DvsPortset-0     00:50:56:8a:af:c1  P6NPNFVNDPKVMA.eth1
50331664            5       9 DvsPortset-0     00:50:56:8a:cc:74  P6NPNFVNDPKVMA.eth2
50331669            5       9 DvsPortset-0     00:50:56:8a:e3:df  P6NPNFVNRIV0009.eth0
67108866            4       0 DvsPortset-1     40:a6:b7:51:1e:fd  vmnic3
67108868            4       0 DvsPortset-1     40:a6:b7:51:18:61  vmnic5
67108870            3       0 DvsPortset-1     00:50:56:67:c5:b4  vmk10
67108871            3       0 DvsPortset-1     00:50:56:65:2d:92  vmk11
67108873            3       0 DvsPortset-1     00:50:56:6d:ce:0b  vmk50
67108884            5       9 DvsPortset-1     00:50:56:8a:80:3c  P6NPNFVNDPKVMA.eth0
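
Picking the right port number out of that list by eye gets tedious on a busy host. Here is a minimal Python sketch of the lookup, assuming the column layout shown above (PortNum first, SwitchName fourth, ClientName last). The helper name find_ports is made up for this post and is not part of any VMware tooling; SAMPLE is just a few rows copied from the listing above.

```python
SAMPLE = """\
PortNum          Type SubType SwitchName       MACAddress         ClientName
50331663            5       9 DvsPortset-0     00:50:56:8a:af:c1  P6NPNFVNDPKVMA.eth1
50331669            5       9 DvsPortset-0     00:50:56:8a:e3:df  P6NPNFVNRIV0009.eth0
67108884            5       9 DvsPortset-1     00:50:56:8a:80:3c  P6NPNFVNDPKVMA.eth0
"""


def find_ports(net_stats_output, vm_name):
    """Return (client_name, switch_name, port_number) for every vNIC of vm_name."""
    matches = []
    for line in net_stats_output.splitlines():
        # Split into at most 6 fields so a client name with spaces stays whole.
        fields = line.split(None, 5)
        # Skip blank lines and the header row (its first field is not a number).
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        port, switch, client = fields[0], fields[3], fields[5].strip()
        if client.startswith(vm_name + "."):
            matches.append((client, switch, int(port)))
    return matches


for client, switch, port in find_ports(SAMPLE, "P6NPNFVNRIV0009"):
    # Print the vsish node that belongs to this vNIC.
    print(f"{client}: /net/portsets/{switch}/ports/{port}")
```

The printed path is the prefix used by the vsish commands in the rest of this post.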

A couple of nifty commands will show you the statistics for a given port:

# vsish -e get /net/portsets/DvsPortset-0/ports/50331669/clientStats
port client stats {
   pktsTxOK:115
   bytesTxOK:5582
   droppedTx:0
   pktsTsoTxOK:0
   bytesTsoTxOK:0
   droppedTsoTx:0
   pktsSwTsoTx:0
   droppedSwTsoTx:0
   pktsZerocopyTxOK:0
   droppedTxExceedMTU:0
   pktsRxOK:6595337433
   bytesRxOK:2357816614826
   droppedRx:2934191332 <-- lots of dropped packets
   pktsSwTsoRx:0
   droppedSwTsoRx:0
   actions:0
   uplinkRxPkts:0
   clonedRxPkts:0
   pksBilled:0
   droppedRxDueToPageAbsent:0
   droppedTxDueToPageAbsent:0
}
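
To put a number on "lots of dropped packets", here is a rough Python sketch that turns the two Rx counters into a drop percentage. It assumes that pktsRxOK does not already include the dropped packets, which is our reading of the counters and not something we have confirmed; the function names are our own. The "<--" notes are hand-written annotations, so the parser only keeps the first token after the colon.

```python
def parse_client_stats(text):
    """Parse 'name:value' lines from a clientStats dump into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        tokens = value.split()
        # Keep only the number; anything after it is a hand-written note.
        if sep and tokens and tokens[0].isdigit():
            stats[name] = int(tokens[0])
    return stats


def rx_drop_fraction(stats):
    """Fraction of packets offered to the port that were dropped on receive.

    Assumption: offered = pktsRxOK + droppedRx (drops are not counted in pktsRxOK).
    """
    offered = stats["pktsRxOK"] + stats["droppedRx"]
    return stats["droppedRx"] / offered if offered else 0.0


stats = parse_client_stats("""port client stats {
   pktsRxOK:6595337433
   droppedRx:2934191332 <-- lots of dropped packets
}""")
print(f"{rx_drop_fraction(stats):.1%} of offered packets were dropped")
```

With the counters above this works out to roughly 31%. Keep in mind these counters are cumulative, so the figure covers everything the port has seen, not just the latest test run.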

# vsish -e get /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxSummary
stats of a vmxnet3 vNIC rx queue {
   LRO pkts rx ok:0
   LRO bytes rx ok:0
   pkts rx ok:54707478
   bytes rx ok:19544123192
   unicast pkts rx ok:54707448
   unicast bytes rx ok:19544121392
   multicast pkts rx ok:0
   multicast bytes rx ok:0
   broadcast pkts rx ok:30
   broadcast bytes rx ok:1800
   running out of buffers:9325862
   pkts receive error:0
   1st ring size:4096 <-- this is a very large ring buffer size!
   2nd ring size:256
   # of times the 1st ring is full:9325862 <-- WHY packets are being dropped
   # of times the 2nd ring is full:0
   fail to map a rx buffer:0
   request to page in a buffer:0
   # of times rx queue is stopped:0
   failed when copying into the guest buffer:0
   # of pkts dropped due to large hdrs:0
   # of pkts dropped due to max number of SG limits:0
   pkts rx via data ring ok:0
   bytes rx via data ring ok:0
   Whether rx burst queuing is enabled:0
   current backend burst queue length:0
   maximum backend burst queue length so far:0
   aggregate number of times packets are requeued:0
   aggregate number of times packets are dropped by PktAgingList:0
   # of pkts dropped due to large inner (encap) hdrs:0
   number of times packets are dropped by burst queue:0
   number of packets delivered by burst queue:0
   number of packets dropped by packet steering:0
   number of packets dropped due to pkt length exceeds vNic mtu:0 <-- NOT the issue!
}
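
The rxSummary dump is long, and most of the counters are zero. A small Python sketch like the one below can pull out only the non-zero counters that look like trouble. The labels contain spaces and "#" characters, so it splits on the last colon of each line. The keyword list is our own guess at which labels matter, not an official classification, and the function names are hypothetical.

```python
TROUBLE_WORDS = ("running out", "full", "fail", "error", "dropped")


def parse_rx_summary(text):
    """Parse 'label:value' lines from an rxSummary dump into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        # Drop hand-written "<--" annotations before parsing.
        line = line.split("<--")[0].strip()
        label, sep, value = line.rpartition(":")
        if sep and value.strip().isdigit():
            counters[label.strip()] = int(value)
    return counters


def trouble_counters(counters):
    """Return the non-zero counters whose label hints at drops or exhaustion."""
    return {
        label: count
        for label, count in counters.items()
        if count and any(word in label for word in TROUBLE_WORDS)
    }


summary = parse_rx_summary("""stats of a vmxnet3 vNIC rx queue {
   pkts rx ok:54707478
   running out of buffers:9325862
   pkts receive error:0
   1st ring size:4096
   # of times the 1st ring is full:9325862 <-- WHY packets are being dropped
   # of times the 2nd ring is full:0
   number of packets dropped due to pkt length exceeds vNic mtu:0
}""")
for label, count in trouble_counters(summary).items():
    print(f"{label}: {count}")
```

On the sample above, only "running out of buffers" and "# of times the 1st ring is full" survive the filter, which matches what we picked out by hand.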

We also noticed that this VM had one Rx queue per vCPU (we made no extra changes to the settings of this particular CentOS 7 VM):

# vsish -e ls /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxqueues
0/
1/
2/
3/
4/
5/
6/
7/

Each queue can be dumped individually to check its ring buffer size (we did this, and they were all 4096):

# vsish -e get /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxqueues/1/status
status of a vmxnet3 vNIC rx queue {
   intr index:1
   stopped:0
   error code:0
   ring #1 size:4096 <-- if you use ethtool -G eth0 rx 4096 inside the VM it updates ALL queues
   ring #2 size:256
   data ring size:0
   next2Use in ring0:33
   next2Use in ring1:0
   next2Write:1569
}

# vsish -e get /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxqueues/7/status
status of a vmxnet3 vNIC rx queue {
   intr index:7
   stopped:0
   error code:0
   ring #1 size:4096 <-- if you use ethtool -G eth0 rx 4096 inside the VM it updates ALL queues
   ring #2 size:256
   data ring size:0
   next2Use in ring0:1923
   next2Use in ring1:0
   next2Write:3458
}
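
Dumping eight queues by hand is repetitive, so here is a small Python sketch that builds the per-queue commands and checks that every queue reports the same ring #1 size. It only generates the command strings and parses text you paste in; it does not run vsish itself. The helper names are ours, and the two sample dumps are trimmed copies of the output above.

```python
import re

PORT_PATH = "/net/portsets/DvsPortset-0/ports/50331669"


def queue_status_commands(port_path, queue_ids):
    """Build one 'vsish -e get' command per Rx queue."""
    return [
        f"vsish -e get {port_path}/vmxnet3/rxqueues/{q}/status" for q in queue_ids
    ]


def ring1_size(status_text):
    """Extract the ring #1 size from a queue status dump, or None if absent."""
    match = re.search(r"ring #1 size:(\d+)", status_text)
    return int(match.group(1)) if match else None


# Trimmed status dumps for queues 1 and 7, keyed by queue id.
dumps = {
    1: "status of a vmxnet3 vNIC rx queue {\n   intr index:1\n   ring #1 size:4096\n}",
    7: "status of a vmxnet3 vNIC rx queue {\n   intr index:7\n   ring #1 size:4096\n}",
}

for command in queue_status_commands(PORT_PATH, range(8)):
    print(command)

sizes = {queue: ring1_size(text) for queue, text in dumps.items()}
print("all queues share one ring #1 size:", len(set(sizes.values())) == 1)
```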

So, that is where we are. We see the problem. Now, to fix it - that might be a separate post altogether.


Some additional knowledge base articles and other sources on troubleshooting in VMware environments:

  • MTU Problem
    • https://kb.vmware.com/s/article/75213
  • Ring Buffer Problem
    • https://kb.vmware.com/s/article/2039495
    • https://vswitchzero.com/2017/09/26/vmxnet3-rx-ring-buffer-exhaustion-and-packet-loss/
