Wednesday, January 29, 2025

Pinephone Pro - Booting an OS off SPI vs eMMC

I finally got a chance to pick the Pinephone Pro back up and play with it some more.

I was able to charge up the battery, boot the phone, and verify that Tow-Boot was installed on it properly. That was my first step. I believe I verified this by holding the volume down button during power-on and waiting for the light to turn aqua (note, it may have been volume up, I should check this for correctness).

Next, I rebooted the phone, and it booted into Manjaro, which is installed on the eMMC drive of the phone.

Next, I put a microSD card with postmarketOS on it into the microSD card slot and booted the phone. Apparently Tow-Boot uses the following boot order:

  1. SPI - more on this in a bit, I had to learn what this is
  2. microSD Card
  3. eMMC (which has Manjaro on it)

I didn't get a boot menu - though maybe a key sequence (volume up?) would give me one. It booted straight into postmarketOS.

I proceeded to experiment with postmarketOS and did a complete update of all of the packages on it.

Next, I wondered how I could "replace" the default Manjaro with postmarketOS, which was newer, such that the phone would boot postmarketOS from the eMMC, allowing me to recycle the microSD card for another OS distribution I could take a look at later.

It turns out that there is a postmarketOS "on-disk installer". It is called pmbootstrap.
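For reference, the rough shape of a pmbootstrap run looks something like this. This is a sketch from memory rather than the exact procedure I followed - in particular, how you point the flasher at the eMMC instead of a microSD card varies by device and pmbootstrap version, so check the postmarketOS wiki and pmbootstrap's built-in help first:

# pmbootstrap is a Python tool that runs on a separate Linux workstation
pip install --user pmbootstrap

# answer the interactive questions: device (pine64-pinephonepro), UI, username, etc.
pmbootstrap init

# build the rootfs for the selected device
pmbootstrap install

# flash the built rootfs to the phone; check the available options for
# targeting the eMMC vs. a microSD card before running this
pmbootstrap flasher --help
pmbootstrap flasher flash_rootfs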

THIS is where I had to learn about SPI, because there is a warning about overwriting your Tow-Boot installation if Tow-Boot was not installed on SPI.

so...what is SPI? (more search required)

SPI flash is a small non-volatile flash chip that the SoC talks to over the Serial Peripheral Interface (SPI) bus. It is commonly used in embedded systems to hold firmware or a bootloader, since it retains its contents when powered off.

The SPI bus itself is not a new concept, but on phones built around a System-on-a-Chip (SoC), a dedicated SPI flash chip is useful because it gives the bootloader a home that is independent of the eMMC and the microSD card.

so...how do you know if you even have SPI?

Answer: I had to figure out which version of Pinephone Pro I have. 

I finally learned that there is a Developer Edition of the Pinephone Pro and an Explorer Edition. The Explorer Edition supposedly has the SPI flash.

But what confused me is that the documentation said the phone supporting SPI had the Rockchip RK3399S SoC, and when I went into a terminal on the phone and ran "lscpu", it said I had an ARM Cortex-A53 chip.

so...now I am thoroughly confused.

Well, I finally learned that the Rockchip RK3399S SoC combines four Cortex-A53 cores with two Cortex-A72 cores.

hmmm, I did not see the A72 cores in the lscpu output - but it does look like I have the SPI flash.
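One way to see both core types, no matter how lscpu chooses to summarize them, is to look at the CPU part IDs in /proc/cpuinfo. This is a generic Linux check rather than output I captured on the phone; 0xd03 is the ARM part number for the Cortex-A53 and 0xd08 is the Cortex-A72, so an RK3399S should show four of one and two of the other:

# count the core types by their ARM part IDs (0xd03 = Cortex-A53, 0xd08 = Cortex-A72)
grep "CPU part" /proc/cpuinfo | sort | uniq -c

# or ask lscpu for one line per core instead of a summary
lscpu -e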

but, how do I know that Tow-Boot was installed on the SPI flash versus the eMMC? Because if I have this wrong and I overwrite the eMMC, there would be no bootloader left and I couldn't boot an OS at all.

I think the SPI flash is the mmcblk1 device, and /boot is on the mmcblk1p1 partition of that device - although I am not fully sure about that, since SPI NOR flash normally shows up in Linux as an MTD device (something like /dev/mtd0) rather than as an mmcblk device, and mmcblk1 may actually be the microSD card I booted from.

The Manjaro (previous) installation is definitely on the eMMC, which is the mmcblk2 device; it has two partitions on it, one of them holding the root filesystem.
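A quick way to sort out which device is which - again a generic sketch, not output from my phone, and the device names are assumptions - is to compare the block devices against the MTD devices, and to look for a Tow-Boot banner in the SPI flash if the kernel exposes it:

# list block devices with sizes and mountpoints; the eMMC and the microSD
# card are different sizes, which makes them easy to tell apart
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT

# SPI NOR flash, if exposed, shows up as an MTD device rather than mmcblk
cat /proc/mtd

# if an MTD device is present, look for Tow-Boot's signature in it
strings /dev/mtdblock0 | grep -i tow-boot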

Sunday, January 19, 2025

NUMA PreferHT VM setting on a Hyperthread-Enabled ESXi Hypervisor

This could be a long post, because things like NUMA can get complicated.

For background, we are running servers - hypervisors - that have 24 cores. There are two chips - wafers as I like to refer to them - each with 12 cores, giving a total of 24 physical cores.

When you enable hyperthreading, you get 48 cores, and this is what is presented to the operating system and CPU scheduler (somewhat - more on this later). But you don't get an effective doubling of capacity when you enable hyperthreading. What is really happening is that each of the 24 physical cores presents two hardware threads that share that core's execution resources, giving you 48 logical cores.

Worth mentioning also is that each logical core has a "sibling" - the other thread on the same physical core - and this matters from a scheduling perspective when you see things like CPU pinning used, because if you pin something to a specific core, its "sibling" cannot be freely used for something else. For example, with hyperthreading enabled, the logical cores pair up like:

0 | 1
2 | 3
4 | 5

... and so on, where each row is one physical core and the two numbers are its sibling logical cores. So if someone pinned a workload to core 4, core 5 is also "off the table" from a scheduling perspective, because the pinning effectively claims the underlying physical core, not just the logical one.
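As an illustration (this is a generic Linux view, not something pulled from the ESXi hosts in question, and the exact numbering convention varies by platform - some systems pair 0 with 1, others pair 0 with 24), you can see the sibling pairing on any hyperthreaded Linux box via sysfs:

# print each logical CPU and the sibling(s) that share its physical core
for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
    echo "$cpu: $(cat $cpu/topology/thread_siblings_list)"
done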

So with this background, we had a tenant who wanted to enable a "preferHT" setting. This setting can be applied to an entire hypervisor by setting the advanced option Numa.PreferHT=1, affecting all VMs deployed on it.

Or, one can selectively add this setting to a specific virtual machine by going into its Advanced Settings and configuring numa.vcpu.preferHT=TRUE.
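For reference, the two forms end up looking something like the lines below. The per-VM option lives in the VM's .vmx file (which is what the Advanced Settings UI edits), and the host-wide one is a VMkernel advanced option; the esxcli option path shown here is my best recollection, so verify it against your ESXi version before relying on it:

# per-VM: in the VM's .vmx file / Configuration Parameters
numa.vcpu.preferHT = "TRUE"

# host-wide: the Numa.PreferHT advanced option on the ESXi host
esxcli system settings advanced set -o /Numa/PreferHT -i 1
esxcli system settings advanced list -o /Numa/PreferHT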

In our case, it was the VM setting being requested - not the hypervisor setting. Now, this tenant is the "anchor tenant" on the platform, and their workloads are very latency-sensitive, so it was important to jump through this hoop when it was requested. First, we tested the setting by powering a VM off, adding the setting, and powering the VM back on. No problems with this. We then migrated the VM to another hypervisor and had no issues with that either. Aside from that, though, how do you know that the VM setting "took" - meaning that it was picked up and recognized?

It turns out that there are a couple of ways to do this:

1. esxtop

When you load esxtop, it shows the CPU view by default. If you hit the "m" key, it switches to a memory view, and from there hitting the "f" key brings up a list of fields - one of them is NUMA Statistics. By selecting this, you get a ton of interesting information about NUMA. The fields you are most interested in are the ones below (a quick capture example follows the list):

NHN (NUMA Home Node) - the current home node for the virtual machine or resource pool. In our case this was 0 or 1, since we had two NUMA nodes (there is usually one per physical CPU socket).

NMIG (NUMA Migrations) - the number of NUMA migrations between two snapshot samples.

NRMEM (NUMA Remote Memory) - the amount of remote memory allocated to the virtual machine, in MB.

NLMEM (NUMA Local Memory) - the amount of local memory allocated to the virtual machine, in MB.

N%L - the percentage of the VM's memory that is local to its home node. You want this to be 100%, but a number in the 90s is probably okay too, because it shows that most memory access is not traversing the NUMA interconnect, which adds latency.

GST_NDx (Guest Node x) - guest memory allocated for the VM on NUMA node x, where x is the node number.

MEMSZ (Memory Size) - the total amount of physical memory allocated to the virtual machine.
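If you want to capture these counters over time rather than watch them interactively, esxtop also has a batch mode. A rough sketch (the interval and sample count here are arbitrary):

# interactive: start esxtop, then
#   m -> switch to the memory view
#   f -> toggle fields and select NUMA Statistics
esxtop

# or capture samples non-interactively into a CSV for later review
esxtop -b -d 5 -n 12 > esxtop-numa.csv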

2. vmdumper command

I found this command on a blog post, which I will list in my sources at the end of this post. This useful command can show you a lot of interesting information about how NUMA is working under the hood: a logical-processor-to-NUMA-node map, how many home nodes are utilized for a given VM, and the assignment of NUMA clients to their respective NUMA nodes.

One of the examples covered in that blog post walks through the situation where a VM has 12 vCPUs on a 10-core system, and then shows what the placement would look like if the VM had 10 vCPUs instead.
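For the curious, the general shape of poking at this on a host looks something like the sketch below. vmdumper -l lists the running VMs along with their world IDs and configuration paths; the grep against vmware.log is my own shorthand (the path is a placeholder), since the NUMA client placement decisions are written there at power-on:

# list running VMs, their world IDs, and their configuration file paths
vmdumper -l

# the NUMA placement for a VM shows up in its vmware.log
# (placeholder path - use the cfgFile path reported by vmdumper -l)
grep -i "numaHost" /vmfs/volumes/<datastore>/<vm-name>/vmware.log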


Sources:

http://www.staroceans.org/ESXi_VMkernel_NUMA_Constructs.htm

https://frankdenneman.nl/2010/02/03/sizing-vms-and-numa-nodes/

https://frankdenneman.nl/2010/10/07/numa-hyperthreading-and-numa-preferht/

https://docs.pexip.com/server_design/vmware_numa_affinity.htm

https://docs.pexip.com/server_design/numa_best_practices.htm#hyperthreading

https://knowledge.broadcom.com/external/article?legacyId=2003582


 

Wednesday, January 8, 2025

MySQL Max Allowed Packet

I recently conducted an upgrade, and for the life of me I couldn't figure out why the application wouldn't initialize.

I checked MySQL - it seemed to be running fine. I logged into the database and checked the Percona cluster status; it looked fine.

I checked RabbitMQ, and it also seemed to be running fine.

In checking the application logs, I saw an exception about a query and the packet size being too big, and I thought this was strange - mainly because of the huge size of the packet.

Sure enough, after calling support, I was informed that I needed to change the MySQL configuration in my.cnf and add a directive in the [mysqld] section.

max_allowed_packet=128M

In terms of what this value should 'really' be, I was told that this is a normal setting on most installations.

Who knew? It's unusual to be adding new parameters on the fly like this to a clustered database. 

But, sure enough, after restarting the database (well, the whole VM actually because I had done updates), it came up just fine.
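For reference, you can check the value the server is actually running with, and max_allowed_packet can even be raised on the fly without a restart (this assumes a privileged MySQL login; the 128M value just mirrors what support recommended):

# check the current value (reported in bytes)
mysql -e "SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet';"

# raise it at runtime (128M = 134217728 bytes); this does not persist
# across a restart, so the my.cnf entry is still needed
mysql -e "SET GLOBAL max_allowed_packet = 134217728;"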
