Showing posts with label disk. Show all posts

Wednesday, September 18, 2024

Fixing Clustering and Disk Issues on an N+1 Morpheus CMP Cluster

I had performed an upgrade on Morpheus, which I thought was fairly successful. I had some issues doing this upgrade on CentOS 7 because it had been designated EOL and its repositories were archived, but I worked through that, and it seemed everyone was using the system just fine.

Today, however, I had someone contact me to tell me that they provisioned a virtual machine, but it was stuck in an incomplete "Provisioning" state (a state that has a blue icon with a rocketship in it). The VM was provisioned on vCenter and working, but the state in Morpheus never set to "Finalized".

I couldn't figure this out, so I went to the Morpheus help site and discovered that I myself had logged a ticket on this issue quite a while back. It turned out that, in that case, the state never flipped because the clustering wasn't working properly.

So I checked RabbitMQ. It looked fine.

I checked MySQL and Percona, and I suspected that perhaps the clustering wasn't working properly. In the process of restarting the VMs, one of the virtual machines wouldn't start. I had to do a bunch of advanced Percona troubleshooting to figure out that I needed to do a wsrep recovery of the last committed position before I could start the system and have it properly join the cluster.

The NEXT problem was that Zabbix was screeching about these Morpheus VMs using too much disk space. It turned out that the /var file system was 100% full - because of ElasticSearch. Fortunately I had an oversized /home directory, and was able to do an rsync of the elasticsearch directory over to /home and re-link it.
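The relocate-and-re-link step can be sketched roughly as follows. This is a minimal sketch, not the exact commands that were run: the function name relocate_dir is made up for illustration, it uses cp -a in place of the rsync mentioned above, and it assumes the service that owns the directory has been stopped first.

```shell
#!/bin/sh
# relocate_dir SRC DEST (hypothetical helper)
# Copies SRC to DEST, renames SRC to SRC.old, and leaves a symlink at SRC.
# Assumes whatever service writes to SRC has been stopped first.
relocate_dir() {
    src="$1"
    dest="$2"
    if [ ! -d "$src" ] || [ -L "$src" ]; then
        echo "not a real directory: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "destination already exists: $dest" >&2
        return 1
    fi
    # cp -a keeps ownership, modes, and timestamps; "$src/." includes dotfiles.
    mkdir -p "$dest" &&
        cp -a "$src/." "$dest/" &&
        mv "$src" "$src.old" &&
        ln -s "$dest" "$src"
}
```

A call might look like relocate_dir /var/lib/elasticsearch /home/elasticsearch, where both paths are assumptions about the layout. Space on /var is only freed once the .old copy is removed, which should wait until the service is confirmed healthy.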

But this gets to the topic of system administration with respect to disks.

First let's start with some KEY commands you MUST know:

>df -Th 

This command (df = disk free) shows how much space each mounted file system is using, in human-readable format (-h), along with the file system type (-T) and the mountpoint. This tells you NOTHING about the physical disks though!
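When the goal is to spot the file systems that are about to trigger an alert, the df output can be filtered by usage. The following is a small sketch under stated assumptions: over_threshold is a hypothetical helper name, it reads df -P style output on stdin (the POSIX format keeps one file system per line), and it does not handle mountpoints that contain spaces.

```shell
#!/bin/sh
# over_threshold PERCENT (hypothetical helper)
# Reads `df -P` output on stdin and prints "mountpoint usage" for every
# file system whose usage is at or above PERCENT.
over_threshold() {
    awk -v t="$1" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)          # strip the trailing percent sign
            if (use + 0 >= t + 0) print $6, $5
        }'
}

# Typical use:
#   df -P | over_threshold 88
```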

>lsblk -f

This command (lsblk = list block devices) gives you the physical disk, the mountpoint, the UUID, and any labels. It is a device-specific command and doesn't show you space consumption.

>fdisk -l

I don't really like this command that much because of the output formatting. But it does list disk partitions and related statistics.

Some other commands you can use are:

>sudo file -sL /dev/sda3

The -s flag enables reading of block or character special files, and -L enables following of symlinks.

>blkid /dev/sda3

Similar command to lsblk -f above.
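One practical use of blkid is finding which partitions are already LVM physical volumes. The sketch below filters plain blkid output for that; lvm_members is a hypothetical helper name, and it assumes the usual one-line-per-device output with a TYPE="LVM2_member" tag.

```shell
#!/bin/sh
# lvm_members (hypothetical helper)
# Reads plain `blkid` output on stdin and prints the devices whose
# TYPE tag is LVM2_member, i.e. LVM physical volumes.
lvm_members() {
    awk -F': ' '/ TYPE="LVM2_member"/ { print $1 }'
}

# Typical use (as root):
#   blkid | lvm_members
```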

Friday, October 28, 2022

Moving a LVM file system to a new disk in Linux

I had to dive back into Linux disk partitioning, file systems, and volumes when I got an alert from Zabbix that a cluster of 3 VMs was running out of space. As the alert said disk usage was greater than 88 percent, I grew concerned and took a look.

In the labs, we had 3 x CentOS7 Virtual Machines, each deployed with a 200G VMDK file.  But inside the VM, in the Linux OS, there were logical volumes (centos-root, centos-swap, centos-home) that were mounted as XFS file systems on a 30G partition. There was no separate volume for /var (centos-var). And /var was the main culprit of the disk space usage. 

The decision was made to put /var on a separate disk as a good practice, because the var file system was used to store large virtual machine images.

The following steps were taken to move the /var file system to the new disk:

1. Add new Disk in vCenter to VM - create new VMDK file (100G in this particular case)

2. If the disk is seen, a /dev/sdb will be present in the Linux OS of the virtual machine. We need to create a partition on it (/dev/sdb1).
 
# fdisk /dev/sdb

n is the option to create a new partition, then p to select primary. After that come a few questions that don't matter in this case (partition number, first and last cylinder); just accept the defaults.
This creates a Linux primary partition; use the t command to change the partition type to 8e (Linux LVM).
Then w writes everything to the disk and exits fdisk.
# fdisk -l /dev/sdb

Will return something like this:

Device Boot Start End Sectors Size Id Type
/dev/sdb1 2048 20971519 20969472 10G 8e Linux LVM
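The same partition can also be created without the interactive prompts. The sketch below swaps in parted for fdisk, so it is an alternative rather than what was done above. The run and partition_for_lvm names are hypothetical, and the script only prints the commands unless DRY_RUN=0 is set, because these commands destroy any existing partition table on the disk.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# partition_for_lvm DISK (hypothetical helper)
# Writes an MBR label, one partition spanning the disk, and the LVM flag
# (the parted equivalent of fdisk type 8e).
partition_for_lvm() {
    disk="$1"
    run parted -s "$disk" mklabel msdos &&
        run parted -s "$disk" mkpart primary 1MiB 100% &&
        run parted -s "$disk" set 1 lvm on
}

# Typical use (as root, after reviewing the dry-run output):
#   DRY_RUN=0 partition_for_lvm /dev/sdb
```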

3. Initialize the partition as an LVM physical volume (this does not create a partition; it marks the existing one for use by LVM)
# pvcreate /dev/sdb1

NOTE: to undo this later, remove the physical volume from the volume group with vgreduce first, then wipe it with pvremove!
# vgreduce centos /dev/sdb1
# pvremove /dev/sdb1

4. Display the volume group
# vgdisplay

--- Volume group ---
VG Name centos
[... more detail …]

5. Display the physical volumes in the volume group

# pvdisplay -C --separator '  |  ' -o pv_name,vg_name

6. Extend the volume group so it can contain the new disk (partition)

# vgextend centos /dev/sdb1

Running vgdisplay again will show info like this:
VG Size 29.75 GiB
PE Size 4.00 MiB
Total PE 7617
Alloc PE / Size 5058 / 19.75 GiB
Free PE / Size 2559 / 10 GiB
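The extent counts and sizes in that listing are related by simple multiplication: size = number of extents × PE size. A small sketch of the arithmetic, using a hypothetical helper name extents_to_mib and the numbers from the listing above:

```shell
#!/bin/sh
# extents_to_mib EXTENTS PE_SIZE_MIB (hypothetical helper)
# Prints the size in MiB covered by a number of physical extents.
extents_to_mib() {
    echo $(( $1 * $2 ))
}

extents_to_mib 2559 4    # free extents  -> 10236 MiB, just under 10 GiB
extents_to_mib 7617 4    # total extents -> 30468 MiB, about 29.75 GiB
```

This is why the free space reads as roughly 10 GiB rather than an exact round number: LVM hands out space in whole 4 MiB extents.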

7. Create new logical volume

NOTE: this command can be tricky. You either need to know extents and their semantics, or you can keep it simple and give the new volume all of the free space, such as:
# lvcreate -n var -l 100%FREE centos

8. Create the file system - NOTE that XFS is the preferred type here (it is the CentOS 7 default and what the existing volumes use), not ext4!
# mkfs -t xfs /dev/centos/var

9. Mount the new target var directory as newvar
# mkdir /mnt/newvar
# mount /dev/centos/var /mnt/newvar
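Steps 3 and 6 through 9 can be strung together in one script. This is a minimal sketch assuming the same names used above (centos, var, /dev/sdb1, /mnt/newvar); the run and add_lv_on_partition helpers are hypothetical, and nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# add_lv_on_partition PARTITION VG LV MOUNTPOINT (hypothetical helper)
# Stops at the first command that fails.
add_lv_on_partition() {
    part="$1"; vg="$2"; lv="$3"; mnt="$4"
    run pvcreate "$part" &&                      # step 3
        run vgextend "$vg" "$part" &&            # step 6
        run lvcreate -n "$lv" -l 100%FREE "$vg" &&   # step 7
        run mkfs -t xfs "/dev/$vg/$lv" &&        # step 8
        run mkdir -p "$mnt" &&                   # step 9
        run mount "/dev/$vg/$lv" "$mnt"
}

# Typical use (as root, after reviewing the dry-run output):
#   DRY_RUN=0 add_lv_on_partition /dev/sdb1 centos var /mnt/newvar
```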

10. Copy the files

NOTE: Lots of issues can occur during this, depending on what directory you are copying (i.e. var is notorious because of run and lock dirs).

I found this command to work (note that the /var/* glob skips any dotfiles sitting directly in /var):
# cp -apxv /var/* /mnt/newvar

Another one people seem to like is rsync, but the command below hung when I tried it:
# rsync -avHPSAX /var/ /mnt/newvar/

11. You can do a diff, or try to, to check whether the copy is sane:
# diff -r /var /mnt/newvar/
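A recursive diff of a live /var can be slow and noisy, so a lighter first check is to compare only the list of paths in each tree. The sketch below does that; list_tree and compare_trees are hypothetical helper names, and the check says nothing about file contents, ownership, or permissions.

```shell
#!/bin/sh
# list_tree DIR (hypothetical helper): sorted list of every path under DIR.
# -xdev keeps find on one file system, mirroring the -x flag given to cp.
list_tree() {
    (cd "$1" && find . -xdev | sort)
}

# compare_trees A B (hypothetical helper)
# Prints the paths that exist in only one tree; returns 0 when the lists match.
compare_trees() {
    a=$(mktemp) || return 2
    b=$(mktemp) || return 2
    list_tree "$1" > "$a"
    list_tree "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return "$rc"
}

# Typical use:
#   compare_trees /var /mnt/newvar
```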

12. Update fstab for reboot
/dev/mapper/centos-var /var xfs defaults 0 0

Note that we used the logical volume centos-var here, not centos (the volume group). LVM calls the volumes centos-swap, centos-home, etc.
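Since a duplicate or malformed fstab entry can cause trouble at boot, it can help to guard the edit. The following is a sketch only: add_fstab_entry is a hypothetical helper, it hard-codes the xfs defaults 0 0 fields from the line above, and it takes the fstab path as an argument so it can be tried on a copy first.

```shell
#!/bin/sh
# add_fstab_entry FSTAB DEVICE MOUNTPOINT (hypothetical helper)
# Appends an xfs entry unless a non-comment line for MOUNTPOINT already exists.
add_fstab_entry() {
    fstab="$1"; dev="$2"; mnt="$3"
    if awk -v m="$mnt" '$1 !~ /^#/ && $2 == m { found = 1 } END { exit !found }' "$fstab"; then
        echo "entry for $mnt already present in $fstab" >&2
        return 1
    fi
    printf '%s %s xfs defaults 0 0\n' "$dev" "$mnt" >> "$fstab"
}

# Typical use (as root, ideally on a copy of the file first):
#   add_fstab_entry /etc/fstab /dev/mapper/centos-var /var
```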

13. Rename the old /var on the root file system to keep it as a backup
# mv /var /varprev

14. Create a new, empty /var as the mountpoint and mount the new volume (using the fstab entry)
# mkdir /var
# mount /var

15. Use the df command to list all the mounts and confirm /var is on the new volume
# df -h | grep /dev/
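The same check can be scripted so it is easy to repeat after a reboot. This is a small sketch: mounted_from is a hypothetical helper that reads df -P output for a single path on stdin and succeeds only if the backing device matches the one expected.

```shell
#!/bin/sh
# mounted_from EXPECTED_DEVICE (hypothetical helper)
# Reads `df -P PATH` output on stdin; returns 0 if PATH is backed by
# EXPECTED_DEVICE, 1 otherwise.
mounted_from() {
    expected="$1"
    actual=$(awk 'NR == 2 { print $1 }')
    [ "$actual" = "$expected" ]
}

# Typical use:
#   df -P /var | mounted_from /dev/mapper/centos-var && echo "/var is on the new volume"
```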

16. Decide whether you want to remove the old var file system and reclaim that disk space.

NOTE: Do not do this until you’re damned sure the new one is working fine. I recommend rebooting the system, inspecting all services that need to be running, etc.  

The only thing left to consider is this: now that we have moved /var to a new 100G VMDK disk, what do we do about the 200G boot/swap/root disk that is only using a small fraction of its space? Well, shrinking disks is even MORE daunting, and is not the topic of this post. But, if I decide to reclaim some space, expect another post that documents how I tackled that effort (or attempted to).

For now, no more alerts about running out of space on a root file system is good news, and this VM can now run peacefully for quite a while.

Pinephone Pro (with Tow-Boot) - Installing a new OS on the eMMC

In my previous Pinephone Pro, I was describing how I was coming up to speed on the different storage mechanisms on the Pinephone Pro: SPI vs...