Wednesday, June 28, 2023

VMWare Storage - Hardware Acceleration Status

Today, we had a customer call in and tell us that they couldn't do Thick provisioning from a vCenter template. We went into vCenter (the GUI), and sure enough, we could only provision Thin virtual machines from it.

But apparently, on another vCenter cluster, they COULD provision Thick virtual machines, and there seemed to be no difference between the virtual machines. Note that we are using NFS storage, not block storage (iSCSI or Fibre Channel).

We went into vCenter, and lo and behold, we saw this situation...


NOTE: To get to this screen in VMWare's very cumbersome GUI, you have to click on the individual datastore, then click "Configure", and then a "Hardware Acceleration" tab appears.

So, what we have here, is one datastore that says "Not Supported" on a host, and another datastore in the same datastore cluster that says "Supported" on the exact same host. This sounds bad. This looks bad. Inconsistency. Looks like a problem.

So what IS hardware acceleration when it comes to storage? To find out, I located this KnowledgeBase article:

 Storage Hardware Acceleration

There is also a link for running storage HW acceleration on NAS devices:

Storage Hardware Acceleration on NAS Devices 

Both of the links above also list some additional related links on the left-hand side.

For each storage device and datastore, the vSphere Client displays the hardware acceleration support status.

The status values are Unknown, Supported, and Not Supported. The initial value is Unknown.

For block devices, the status changes to Supported after the host successfully performs the offload operation. If the offload operation fails, the status changes to Not Supported. The status remains Unknown if the device provides partial hardware acceleration support.

With NAS, the status becomes Supported when the storage can perform at least one hardware offload operation.

When storage devices do not support or provide partial support for the host operations, your host reverts to its native methods to perform unsupported operations.
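If you would rather not click through the GUI, the same status can be checked from the ESXi command line. A sketch, assuming SSH access to the host; note that the VAAI namespace below applies to block devices, while NFS acceleration instead depends on a vendor NAS plugin VIB:

```shell
# Hardware acceleration (VAAI) status per block device
esxcli storage core device vaai status get

# For NFS datastores, acceleration requires a vendor NAS VAAI plugin;
# check whether such a VIB is installed on the host
esxcli software vib list | grep -i vaai
```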

NFS = NAS, I am pretty darned sure. 

So this is classic VMWare confusion. They are using a "Status" field with values of Supported / Not Supported, when in fact Supported means "Working" and Not Supported means "Not Working", based (only) on the last offload operation attempted.

So. Apparently, if a failure on this offload operation occurs, this flag gets set to Not Supported, and guess what? That means you cannot do *any* Thick Provisioning.

When we contacted VMWare, they wanted us to re-load the storage plugin. Yikes. Stay Tuned....

VMWare also has some Best Practices for running iSCSI Storage, and the link to that is found at:

VMWare iSCSI Storage Best Practices

Wednesday, April 19, 2023

Colorizing Text in Linux

I went hunting today for a package that I had used to colorize text. There are tons of those out there, of course. But what if you want to filter the text and colorize it based on a set of rules?

There's probably a lot of stuff out there for that, too. Colord, for example, runs as a daemon in Linux (though it is aimed at device color management, not log text).

Another package, is grc, found at this GitHub site: https://github.com/garabik/grc

Use Case: 

I had a log that was printing information related to exchanges with different servers. I decided to color these so that messages from Server A were green, Server B were blue, etc. In this way, I could do really cool things like suppress messages from Server B (no colorization). Or, I could take Control Plane messages from, say, Server C, and highlight those Yellow.  

This came in very handy during a Demo, where people were watching the messages display in rapid succession on a large screen.
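A sketch of how that per-server rule set might look with grc's grcat. The filename and regex are hypothetical, and the rule syntax is modeled on the conf.* files in the grc repo (check those for the exact multi-rule layout):

```shell
# Hypothetical ruleset: color any line mentioning ServerA green.
# Add further regexp/colours pairs for ServerB, ServerC, etc.
mkdir -p ~/.grc
cat > ~/.grc/conf.serverlog <<'EOF'
regexp=ServerA
colours=green
EOF

# Colorize a live log through the ruleset (grcat ships with grc)
tail -f app.log | grcat conf.serverlog
```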

Monday, February 27, 2023

Hyperthreading vs Non-Hyperthreading on an ESXi Hypervisor

We started to notice that several VNF (Virtual Network Function) vendors were recommending to turn off (disable) Hyper-threading on hypervisors.  But why? They claimed it helped their performance. 

Throwing a switch and disabling this means that the number of logical cores exposed to users is cut in half. So a 24-core CPU presents 48 logical cores with Hyper-threading enabled, and only 24 with it disabled.

This post isn't meant to go into the depths of Hyper-threading itself. The question we had, was whether disabling it or enabling it, affected performance, and to what degree.

We ran a benchmark that consisted of three "layers":

  • Non-Hyperthreaded (24 cores) vs Hyperthreaded (48 cores)
  • Increasing vCPU of the Benchmark VM (increments of eight: 1,8,16,24)
  • Each test ran several Sysbench tests with increasing threads (1,2,4,8,16,32) 
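The innermost layer is easy to script. A sketch of the sweep, assuming sysbench 1.x and its CPU test (the run time is illustrative):

```shell
# Run the sysbench CPU benchmark at each thread count and keep the headline metric
for t in 1 2 4 8 16 32; do
  printf "threads=%-2s " "$t"
  sysbench cpu --threads="$t" --time=30 run | grep "events per second"
done
```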

The servers we ran on: Cisco M5 (512G RAM, 24 cores).

We collected the results in Excel and ran a Pivot Chart on them, and this is what we found (below).

VM with 1,8,16,24 vCPU running Sysbench with increasing threads
on a Hyperthread-disabled system (24) vs Hyperthread-enabled system (48)

It appears to me that Hyperthreading starts to look promising when two things happen:

  1. vCPU resources on the VM increase past a threshold of about 8 vCPU.
  2. an application is multi-threaded, and is launching 16 or more threads.

Notice that on an 8 vCPU virtual machine, the "magic number" is 8 threads. On a 16 vCPU virtual machine, you do not see hyperthreading become an advantage until 16 threads are launched. On a 24 vCPU system, we start to see hyperthreading become favorable at about 16 threads and higher.

BUT - if the threads are low, between 1 and about 8, the hyperthreading works against you.

Thursday, February 16, 2023

Morpheus API - pyMorpheus Python API Wrapper

I have been working on some API development in the Morpheus CMP tool.

The first thing I do when I need to use an API, is to see if there is a good API wrapper. I found this one API wrapper out on Github, called pyMorpheus.

With this wrapper, I was up and running in absolutely no time, making calls to the API, parsing JSON responses, etc.

The Use Case I am working on is a "re-conciliator" that will do two things:

  • Remove Orphaned VMs: find and delete (upon user confirmation) those VMs that have had their "rug pulled out" from Morpheus (deleted in vCenter but still sitting in Morpheus as an Instance)
  • Convert Certain Discovered VMs to Morpheus

This part sorta kinda worked. The call to https://<applianceurl>/servers/id/make-managed took a Discovered VM and converted it to an instance, with a "VMWare" logo on it.

But I was unable to set advanced attributes of the VMs - Instance Type, Layout, Plan, etc. and this made it only a partial success.

Maybe if we can get the API fixed up a bit, we can get this to work.

One issue is the "Cloud Sync". When we call the API, we do a cloud sync to find Discovered VMs. We do the same cloud sync to determine whether any of the VM's fields in Morpheus change state when someone deletes a VM in vCenter (such a state change gives us the indicator that the VM is, in fact, now an orphan). The Cloud Sync is an asynchronous call: you have to wait an indefinite amount of time to ensure that the results you are looking for in vCenter are reflected in Morpheus. It's basically polling, which is not an exact science. For this reason, the reconciliator tool needs to be run manually as an operations tool, as opposed to some kind of scheduled batch job.
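Because the sync is asynchronous, the reconciliator has to poll with a timeout rather than assume the results are ready. A minimal sketch of that pattern in shell; the helper name and the example check are mine, not part of pyMorpheus:

```shell
# Run a check command every INTERVAL seconds until it succeeds or TIMEOUT expires.
# Returns 0 if the check eventually passed, 1 if we gave up.
poll_until() {
  timeout=$1; interval=$2; shift 2
  end=$(( $(date +%s) + timeout ))
  while [ "$(date +%s)" -lt "$end" ]; do
    "$@" && return 0
    sleep "$interval"
  done
  return 1
}

# e.g. give the cloud sync up to 5 minutes, checking every 15 seconds:
# poll_until 300 15 check_vm_is_orphaned "$VM_ID"
```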


Tuesday, January 17, 2023

Trying to get RSS (Receive Side Scaling) to work on an Intel X710 NIC

 

Cisco M5 server, with 6 nics on it. The first two are 1G nics that are unused. 

The last 4, are:

  • vmnic2 - 10G nic, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up

  • vmnic3 - 10G nic, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up

  • vmnic4 - 10G nic, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up

  • vmnic5 - 10G nic, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up

Worth mentioning:

  • vmnic 2 and 4 are uplinks, using a standard Distributed Switch (virtual switch) for those uplinks.

  • vmnic 3 and 5 are connected to an N-VDS virtual switch (used with NSX-T) and don't have uplinks.

In ESXi (VMWare Hypervisor, v7.0), we have set the RSS values accordingly:

UPDATED: how we set the RSS Values!

First, make sure the RSS parameter is unset, because DRSS and RSS should not be set together:

> esxcli system module parameters set -m i40en -p RSS=""

Next, set the DRSS parameter. We are setting 4 Rx queues per relevant vmnic:

> esxcli system module parameters set -m i40en -p DRSS=4,4,4,4

Now we list the parameters to verify the settings took effect:

> esxcli system module parameters list -m i40en
Name           Type          Value    Description
-------------  ------------  -------  -----------
DRSS           array of int           Enable/disable the DefQueue RSS(default = 0 )
EEE            array of int           Energy Efficient Ethernet feature (EEE): 0 = disable, 1 = enable, (default = 1)
LLDP           array of int           Link Layer Discovery Protocol (LLDP) agent: 0 = disable, 1 = enable, (default = 1)
RSS            array of int  4,4,4,4  Enable/disable the NetQueue RSS( default = 1 )
RxITR          int                    Default RX interrupt interval (0..0xFFF), in microseconds (default = 50)
TxITR          int                    Default TX interrupt interval (0..0xFFF), in microseconds, (default = 100)
VMDQ           array of int           Number of Virtual Machine Device Queues: 0/1 = disable, 2-16 enable (default =8)
max_vfs        array of int           Maximum number of VFs to be enabled (0..128)
trust_all_vfs  array of int           Always set all VFs to trusted mode 0 = disable (default), other = enable

But, we are seeing this when we look at the individual adaptors in the ESXi kernel:

> vsish -e get /net/pNics/vmnic3/rxqueues/info
rx queues info {
   # queues supported:1
   # rss engines supported:0
   # filters supported:0
   # active filters:0
   # filters moved by load balancer:0
   RX filter classes: 0 -> No matching defined enum value found.
   Rx Queue features: 0 -> NONE
}

Nics 3 and 5, connected to the N-VDS virtual switch, show only a single Rx queue supported, even though the kernel module is configured properly.

> vsish -e get /net/pNics/vmnic2/rxqueues/info
rx queues info {
   # queues supported:9
   # rss engines supported:1
   # filters supported:512
   # active filters:0
   # filters moved by load balancer:0
   RX filter classes: 0x1f -> MAC VLAN VLAN_MAC VXLAN Geneve
   Rx Queue features: 0x482 -> Pair Dynamic GenericRSS
}

But Nics 2 and 4, which are connected to the standard distributed switch, have 9 Rx Queues configured properly.
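A quick way to compare all four nics at once, using the same vsish path as above (run from the ESXi shell):

```shell
# Summarize Rx queue and RSS engine support per 10G vmnic
for n in vmnic2 vmnic3 vmnic4 vmnic5; do
  echo "== $n =="
  vsish -e get /net/pNics/$n/rxqueues/info | grep -E "queues supported|rss engines"
done
```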

Is this related to the virtual switch we are connecting to (meaning we need to be looking at VMWare)? Or is this somehow related to the i40en driver being used (in which case we need to go to the server vendor, or to Intel, who makes the XL710 nic)?

Friday, January 13, 2023

Debugging Dropped Packets on NSX-T E-NVDS

 

Inside the hypervisor, we have the following nics:

The servers have physical nics as follows:

  • vmnic0 – 1G nic, Intel X550 – Unused
  • vmnic1 – 1G nic, Intel X550 - Unused
  • vmnic2 - 10G nic, SFP+, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up
  • vmnic3 - 10G nic, SFP+, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up
  • vmnic4 - 10G nic, SFP+, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up
  • vmnic5 - 10G nic, SFP+, Intel XL710, driver version 2.1.5.0 FW version 8.50, link state up

The nics connect to the upstream switches (Aristas), and they connect virtually to the virtual switches (discussed right below):

 

Inside Hypervisor (Host 5 in this specific case):

Distributed vSwitch

                Physical Nic Side: vmnic2 and vmnic4

                Virtual Side: vmk0 (VLAN 3850) and vmk1 (VLAN 3853)

 

NSX-T Switch (E-NVDS)

                Physical NIC side: vmnic3 and vmnic5 → these are the nics that get hit when we run the load tests

                Virtual Side: 50+ individual segments that VMs connect to, and get assigned a port

 

Now, in my previous email, I dumped the stats for the physical NIC, meaning from the "NIC itself", via the ESXi operating system.

 

But it is also wise to take a look at the stats of the physical nic from the perspective of the virtual switch! Remember, vmnic5 is a port on the virtual switch!

 

So first, we need to figure out what port we need to look at:
> net-stats -l

PortNum          Type SubType SwitchName       MACAddress         ClientName
2214592527          4       0 DvsPortset-0     40:a6:b7:51:56:e9  vmnic3
2214592529          4       0 DvsPortset-0     40:a6:b7:51:1b:9d  vmnic5  → here we go, port 2214592529 on switch DvsPortset-0 is the port of interest
67108885            3       0 DvsPortset-0     00:50:56:65:96:e4  vmk10
67108886            3       0 DvsPortset-0     00:50:56:65:80:84  vmk11
67108887            3       0 DvsPortset-0     00:50:56:66:58:98  vmk50
67108888            0       0 DvsPortset-0     02:50:56:56:44:52  vdr-vdrPort
67108889            5       9 DvsPortset-0     00:50:56:8a:09:15  DEV-ISC1-Vanilla3a.eth0
67108890            5       9 DvsPortset-0     00:50:56:8a:aa:3f  DEV-ISC1-Vanilla3a.eth1
67108891            5       9 DvsPortset-0     00:50:56:8a:9d:b1  DEV-ISC1-Vanilla3a.eth2
67108892            5       9 DvsPortset-0     00:50:56:8a:d9:65  DEV-ISC1-Vanilla3a.eth3
67108893            5       9 DvsPortset-0     00:50:56:8a:fc:75  DEV-ISC1-Vanilla3b.eth0
67108894            5       9 DvsPortset-0     00:50:56:8a:7d:cd  DEV-ISC1-Vanilla3b.eth1
67108895            5       9 DvsPortset-0     00:50:56:8a:d4:d8  DEV-ISC1-Vanilla3b.eth2
67108896            5       9 DvsPortset-0     00:50:56:8a:67:6f  DEV-ISC1-Vanilla3b.eth3
67108901            5       9 DvsPortset-0     00:50:56:8a:32:1c  DEV-MSC1-Vanilla3b.eth0
67108902            5       9 DvsPortset-0     00:50:56:8a:e6:2b  DEV-MSC1-Vanilla3b.eth1
67108903            5       9 DvsPortset-0     00:50:56:8a:cc:eb  DEV-MSC1-Vanilla3b.eth2
67108904            5       9 DvsPortset-0     00:50:56:8a:7a:83  DEV-MSC1-Vanilla3b.eth3
67108905            5       9 DvsPortset-0     00:50:56:8a:63:55  DEV-MSC1-Vanilla3a.eth3
67108906            5       9 DvsPortset-0     00:50:56:8a:40:9c  DEV-MSC1-Vanilla3a.eth2
67108907            5       9 DvsPortset-0     00:50:56:8a:57:8f  DEV-MSC1-Vanilla3a.eth1
67108908            5       9 DvsPortset-0     00:50:56:8a:5b:6d  DEV-MSC1-Vanilla3a.eth0

 

/net/portsets/DvsPortset-0/ports/2214592529/> cat stats
packet stats {
   pktsTx:10109633317
   pktsTxMulticast:291909
   pktsTxBroadcast:244088
   pktsRx:10547989949  → total packets RECEIVED on vmnic5's port on the virtual switch
   pktsRxMulticast:243731083
   pktsRxBroadcast:141910804
   droppedTx:228
   droppedRx:439933  → this is a lot more than the 3,717 Rx Missed errors, and probably accounts for why MetaSwitch sees more drops than we saw up to this point!
}

 

So we have TWO things now to examine here.

  • Is the Receive Side Scaling configured properly and working?
    • We configured it, but we need to make sure it is working, and working properly.
    • We don't see all of the queues getting packets. Each Rx queue should be getting its own CPU.
  • Once packets get into the Ring Buffer and are passed through to the VM (the poll mode driver picks the packets up off the Ring), they hit the virtual switch.
    • And the switch is dropping some packets.
    • Virtual switches are software. As such, they need to be tuned to stretch their capability to keep up with what legacy hardware switches can do.
      • The NSX-T switch is a powerful switch, but it is also a newer virtual switch, more bleeding edge in terms of technology.
      • I wonder if we are running the latest and greatest version of this switch, and if that could help us here.

 

Now, I looked even deeper into the E-NVDS switch. I went into vsish shell, and started examining any and all statistics that are captured by that networking stack.

 

Since we are concerned with receives, I looked at the InputStats specifically. I noticed there are several filters, which I presume are tied to a VMWare packet filtering flow, analogous to Netfilter in Linux, or perhaps the Berkeley Packet Filter. But I have no documentation whatsoever on this, and can't find any, so I did my best to "back into" what I was seeing.

 

I see the following filters that packets can traverse (traceflow might be related to packet capture, but I am not sure beyond that):

  • ens-slowpath-input
  • traceflow-Uplink-Input:0x43110ae01630
  • vdl2-uplink-in:0x431e78801dd0
  • UplinkDoSwLRO@vmkernel#nover
  • VdrUplinkInput

 

If we go down into the filters and print the stats out, most of the stats seem to line up (started=passed, etc.) except this one, which has drops in it:

/net/portsets/DvsPortset-0/ports/2214592529/inputFilters/vdl2-uplink-in/> cat stats
packet stats {
   pktsIn:31879020
   pktsOut:24269629
   pktsDropped:7609391
}

/net/portsets/DvsPortset-0/ports/2214592527/inputFilters/vdl2-uplink-in/> cat stats
packet stats {
   pktsIn:24817038
   pktsOut:17952829
   pktsDropped:6864209
}
 

That seems like a lot of dropped packets to me (a LOT more than those Rx Missed errors). This looks like something we need to work with VMWare on, because if I understand these stats properly, they suggest an issue on the virtual switch more than on the adaptor itself.
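For scale, here is the arithmetic on the counters shown above: the vmnic5 port-level drop rate is tiny in percentage terms, but the vdl2-uplink-in filter is dropping nearly a quarter of the packets entering it.

```shell
# Drop ratios computed from the stats dumps above
awk 'BEGIN {
  printf "port droppedRx: %.5f%% of pktsRx\n", 439933 / 10547989949 * 100
  printf "vdl2-uplink-in drops: %.1f%% of pktsIn\n", 7609391 / 31879020 * 100
}'
```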

 

Another thing I saw while poking around was this interesting-looking WRONG_VNIC passthrough status on vmnic3 and vmnic5, the two nics being used in the test. I think we should ask VMWare about this and run it down as well.

 

/net/portsets/DvsPortset-0/ports/2214592527/> cat status
port {
   port index:15
   vnic index:0xffffffff
   portCfg:
   dvPortId:4dfdff37-e435-4ba4-bbff-56f36bcc0779
   clientName:vmnic3
   clientType: 4 -> Physical NIC
   clientSubType: 0 -> NONE
   world leader:0
   flags: 0x460a3 -> IN_USE ENABLED UPLINK DVS_PORT DISPATCH_STATS_IN DISPATCH_STATS_OUT DISPATCH_STATS CONNECTED
   Impl customized blocked flags:0x00000000
   Passthru status: 0x1 -> WRONG_VNIC
   fixed Hw Id:40:a6:b7:51:56:e9:
   ethFRP:frame routing {
      requested:filter {
         flags:0x00000000
         unicastAddr:00:00:00:00:00:00:
         numMulticastAddresses:0
         multicastAddresses:
         LADRF:[0]: 0x0
         [1]: 0x0
      }
      accepted:filter {
         flags:0x00000000
         unicastAddr:00:00:00:00:00:00:
         numMulticastAddresses:0
         multicastAddresses:
         LADRF:[0]: 0x0
         [1]: 0x0
      }
   }
   filter supported features: 0 -> NONE
   filter properties: 0 -> NONE
   rx mode: 0 -> INLINE
   tune mode: 2 -> invalid
   fastpath switch ID:0x00000000
   fastpath port ID:0x00000004
}

Tuesday, January 10, 2023

VMWare NSX-T Testing - Dropped Packets

We have been doing some performance testing with a voice system.

In almost all cases, these tests are failing. They are failing for two reasons:

  1. Rx Missed counters on the physical adaptors of the hypervisors that are used to send the test traffic. These adaptors are connected to the E-NVDS virtual switch on one side, and to an upstream Arista data center switch on the other.
  2. Dropped Packets - mostly media (RTP UDP), with less than 2% of the drops being RTCP traffic (TCP).

Lately, I used the "Performance Best Practices for VMware vSphere 7.0" guide as a method for trying to improve the dropped packets we were seeing.

We attempted several things that were mentioned in this document:

 

  • ESXi NIC: enable Receive Side Scaling (RSS)
    • Actually, to be technical, we enabled DRSS (Default Queue RSS) rather than RSS (NetQueue RSS), which the i40en driver also supports for this Intel X710 adaptor.
  • LatencySensitivity=High - and we checked the "Reserve all Memory" checkbox
  • Interrupt Coalescing
    • Disabling it, to see what effect disabling it had
    • Setting it from its rate-based scheme (the default, rbc) to static, with 64 packets per interrupt

We didn't really see any noticeable improvement from the Receive Side Scaling or the Latency Sensitivity settings, which was a surprise, actually. We did see perhaps some minor improvement from the interrupt coalescing when we set it to static.
