
Tuesday, June 7, 2022

VMware Network Debugging - T-Rex Load Generation and Ring Buffer Overflow

We began running T-Rex Traffic Generator tests, sending load to a couple of virtual machines running on vSphere-managed ESXi hypervisors, and ran into some major problems.

First, the T-Rex Traffic Generator:

  • CentOS 7 virtual machine
  • 3 ports
    • eth0 is used for ssh connectivity and for running T-Rex and the T-Rex console (in a screen session)
    • eth1 for sending traffic (T-Rex puts this port into DPDK mode, so the OS cannot see it)
    • eth2 for sending traffic (T-Rex puts this port into DPDK mode, so the OS cannot see it)
  • 4 traffic cores (see the config sketch below)
    • the VM actually has 6 vCPUs, but two are used for running the OS and the T-Rex admin threads
    • traffic tests utilize the remaining 4 cores
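
For context, the /etc/trex_cfg.yaml for a setup like this generally looks something like the sketch below. The PCI addresses, IP addresses, and thread IDs are illustrative placeholders, not values from our VM; the thread layout simply mirrors the core split described above (2 admin cores, 4 traffic cores):

# cat /etc/trex_cfg.yaml
- port_limit      : 2
  version         : 2
  interfaces      : ["0b:00.0", "13:00.0"]    # PCI addresses of eth1/eth2 (illustrative)
  port_info       :
      - ip         : 10.10.10.1
        default_gw : 10.10.10.2
      - ip         : 10.10.20.1
        default_gw : 10.10.20.2
  platform        :
      master_thread_id  : 0        # OS / T-Rex admin
      latency_thread_id : 1        # OS / T-Rex admin
      dual_if :
        - socket  : 0
          threads : [2,3,4,5]      # the 4 traffic cores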

Next, the Device(s) Under Test (DUT):

  1. Juniper vSRX, a virtual router/firewall VM (it runs Junos OS, which is FreeBSD-based under the hood)
  2. A standard CentOS 7 virtual machine
     

We ran the stateless imix test at 20% and 100% line utilization.
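
For reference, a run like this is typically kicked off in two steps: start the T-Rex server in interactive (stateless) mode, then drive it from the console. The install path below is illustrative; the flags are from the T-Rex docs, not copied from our lab:

# cd /opt/trex/<version>            <-- wherever T-Rex is installed (illustrative path)
# ./t-rex-64 -i -c 4                <-- interactive/stateless mode, 4 traffic cores

Then, from a second screen window:

# ./trex-console
trex> start -f stl/imix.py -m 20% -p 0 1    <-- repeat with -m 100% for the full-rate run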

We noticed that the T-Rex VM was running at 80-90% core usage during the test (per the T-Rex console stats), while achieving only 20-25% line utilization, sending about 4 Gbps per port (8 Gbps total) to the DUT virtual machines.

On the receiving side, the vSRX router was only processing about 1/4 to 1/6 of the packets sent by T-Rex, and the CentOS 7 VM could not receive more than about 3.5 Gbps.

So what is happening? This led us to a deep dive into the VMware statistics.

By logging into the ESXi host that the receiving VM was running on, we could first find out which virtual switch and port the VM's interface was assigned to by running:

# net-stats -l

This produces a list like this:

PortNum          Type SubType SwitchName       MACAddress         ClientName
50331650            4       0 DvsPortset-0     40:a6:b7:51:18:60  vmnic4
50331652            4       0 DvsPortset-0     40:a6:b7:51:1e:fc  vmnic2
50331654            3       0 DvsPortset-0     40:a6:b7:51:1e:fc  vmk0
50331655            3       0 DvsPortset-0     00:50:56:63:75:bd  vmk1
50331663            5       9 DvsPortset-0     00:50:56:8a:af:c1  P6NPNFVNDPKVMA.eth1
50331664            5       9 DvsPortset-0     00:50:56:8a:cc:74  P6NPNFVNDPKVMA.eth2
50331669            5       9 DvsPortset-0     00:50:56:8a:e3:df  P6NPNFVNRIV0009.eth0
67108866            4       0 DvsPortset-1     40:a6:b7:51:1e:fd  vmnic3
67108868            4       0 DvsPortset-1     40:a6:b7:51:18:61  vmnic5
67108870            3       0 DvsPortset-1     00:50:56:67:c5:b4  vmk10
67108871            3       0 DvsPortset-1     00:50:56:65:2d:92  vmk11
67108873            3       0 DvsPortset-1     00:50:56:6d:ce:0b  vmk50
67108884            5       9 DvsPortset-1     00:50:56:8a:80:3c  P6NPNFVNDPKVMA.eth0

A couple of nifty commands will show you the statistics:

# vsish -e get /net/portsets/DvsPortset-0/ports/50331669/clientStats
port client stats {
   pktsTxOK:115
   bytesTxOK:5582
   droppedTx:0
   pktsTsoTxOK:0
   bytesTsoTxOK:0
   droppedTsoTx:0
   pktsSwTsoTx:0
   droppedSwTsoTx:0
   pktsZerocopyTxOK:0
   droppedTxExceedMTU:0
   pktsRxOK:6595337433
   bytesRxOK:2357816614826
   droppedRx:2934191332 <-- lots of dropped packets
   pktsSwTsoRx:0
   droppedSwTsoRx:0
   actions:0
   uplinkRxPkts:0
   clonedRxPkts:0
   pksBilled:0
   droppedRxDueToPageAbsent:0
   droppedTxDueToPageAbsent:0
}

# vsish -e get /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxSummary
stats of a vmxnet3 vNIC rx queue {
   LRO pkts rx ok:0
   LRO bytes rx ok:0
   pkts rx ok:54707478
   bytes rx ok:19544123192
   unicast pkts rx ok:54707448
   unicast bytes rx ok:19544121392
   multicast pkts rx ok:0
   multicast bytes rx ok:0
   broadcast pkts rx ok:30
   broadcast bytes rx ok:1800
   running out of buffers:9325862
   pkts receive error:0
   1st ring size:4096 <-- this is a very large ring buffer size!
   2nd ring size:256
   # of times the 1st ring is full:9325862 <-- WHY packets are being dropped
   # of times the 2nd ring is full:0
   fail to map a rx buffer:0
   request to page in a buffer:0
   # of times rx queue is stopped:0
   failed when copying into the guest buffer:0
   # of pkts dropped due to large hdrs:0
   # of pkts dropped due to max number of SG limits:0
   pkts rx via data ring ok:0
   bytes rx via data ring ok:0
   Whether rx burst queuing is enabled:0
   current backend burst queue length:0
   maximum backend burst queue length so far:0
   aggregate number of times packets are requeued:0
   aggregate number of times packets are dropped by PktAgingList:0
   # of pkts dropped due to large inner (encap) hdrs:0
   number of times packets are dropped by burst queue:0
   number of packets delivered by burst queue:0
   number of packets dropped by packet steering:0
   number of packets dropped due to pkt length exceeds vNic mtu:0 <-- NOT the issue!
}
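
A quick way to confirm that the drops are happening live during a test (and not just stale counters) is to poll both counters from the ESXi shell. A rough sketch, using the same port number taken from net-stats -l above:

while true; do
    vsish -e get /net/portsets/DvsPortset-0/ports/50331669/clientStats | grep droppedRx
    vsish -e get /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxSummary | grep "ring is full"
    sleep 5
done

If both numbers climb together while the imix test is running, that points squarely at Rx ring exhaustion.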

We also noticed that this VM had one Rx queue per vCPU (no additional tuning had been applied to this specific CentOS 7 VM's settings):

# vsish -e ls /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxqueues
0/
1/
2/
3/
4/
5/
6/
7/

Each of the queues can be dumped individually to check its ring buffer size (we did this, and they were all 4096):

# vsish -e get /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxqueues/1/status
status of a vmxnet3 vNIC rx queue {
   intr index:1
   stopped:0
   error code:0
   ring #1 size:4096 <-- if you use ethtool -G eth0 rx 4096 inside the VM it updates ALL queues
   ring #2 size:256
   data ring size:0
   next2Use in ring0:33
   next2Use in ring1:0
   next2Write:1569
}

# vsish -e get /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxqueues/7/status
status of a vmxnet3 vNIC rx queue {
   intr index:7
   stopped:0
   error code:0
   ring #1 size:4096 <-- if you use ethtool -G eth0 rx 4096 inside the VM it updates ALL queues
   ring #2 size:256
   data ring size:0
   next2Use in ring0:1923
   next2Use in ring1:0
   next2Write:3458
}
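
Rather than dumping each queue by hand, a small loop in the ESXi shell will print the ring #1 size for every Rx queue on the port (same port number as above):

for q in $(vsish -e ls /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxqueues); do
    echo "rx queue ${q%/}:"
    vsish -e get /net/portsets/DvsPortset-0/ports/50331669/vmxnet3/rxqueues/${q}status | grep "ring #1 size"
done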

So, that is where we are. We see the problem. Now, to fix it - that might be a separate post altogether.
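
For completeness, the guest-side commands referenced in the annotations above look like this (assuming the interface is eth0 with the vmxnet3 driver):

# ethtool -g eth0              <-- show current and maximum Rx/Tx ring sizes
# ethtool -G eth0 rx 4096      <-- raise the Rx ring; as noted above, this applies to all Rx queues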


Some additional knowledge base sources of information on troubleshooting in VMware environments:

  • MTU Problem

            https://kb.vmware.com/s/article/75213

  • Ring Buffer Problem

            https://kb.vmware.com/s/article/2039495

            https://vswitchzero.com/2017/09/26/vmxnet3-rx-ring-buffer-exhaustion-and-packet-loss/

Tuesday, May 3, 2022

T-Rex Traffic Generator - Stateless vs Stateful

I am at the beginning of learning the T-Rex Traffic Generator. Cisco developed it initially, but it is now an open-source traffic generator. As with all traffic generators, there is a learning curve associated with it.

The first major question I had was about the modes that T-Rex can work in:

  • Stateless (STL)
  • Stateful (STF)
  • Advanced Stateful (ASTF)

There are two T-Rex doc pages that discuss these distinctions, but they are not written from a comparative perspective. I will list those links here.

Trex Website: Trex Stateless 

Trex Website: Trex Stateful

While these pages have good information, it was a discussion on Reddit that I found most useful:

Reddit Discussion: STF vs. STL vs. ASTF

In the event that this Reddit thread becomes archived, I will (re)post that discussion here:

----------------------------------------------------------------------------------------------------------------------------

Stateless STL - there is no IP stack so it can't communicate with another standard IP node. The framed packets are pre-built and just pumped out the NIC. Because there is no normal dynamic protocol stack STL mode is run between a TRex NIC pair where they just pass the framed packets between each other and track statistics.

Stateful [A]STF - there is an actual TCP stack running with some L7 support so the stream can communicate to a non t-rex node; or through a stateful firewall with NAT or load balancer etc.

More info and a quick comparison table is here - https://trex-tgn.cisco.com/trex/doc/trex_stateless.html#_stateful_vs_stateless

There is a fairly active community at https://groups.google.com/g/trex-tgn The developers are usually very responsive and will patch bugs usually within a day or two.

t-rex is an engineering tool that seems to be run by the developers and engineers so the documentation can be a little frustrating and the learning curve can be steep. It is however a flexible, powerful and extremely cost effective tool when compared to commercial equivalents.

Then making use of the API combined with your imagination you can also build things other than just stress testing hardware.

---------------------------------------------------------------------------------------------------------------------------- 
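
For a concrete sense of how the modes differ in practice, here is roughly how each one is launched. The flags are from the T-Rex docs; the profile name and duration are just illustrative examples:

# ./t-rex-64 -i                            <-- Stateless (STL): interactive mode, streams are pushed from trex-console
# ./t-rex-64 -f cap2/dns.yaml -d 60 -m 10  <-- Stateful (STF): replay a flow/cap profile for 60 seconds at a 10x multiplier
# ./t-rex-64 -i --astf                     <-- Advanced Stateful (ASTF): interactive mode with a real TCP stack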

