Showing posts with label XFS. Show all posts

Friday, August 18, 2023

The Linux XFS File System - How Resilient Is It?

We are using VMware datastores over NFS version 3. The storage was routed, which is never a good thing to do because, let's face it, if your VMs all lose their storage simultaneously, that constitutes a disaster. Depending on a router, which can lose its routing prefixes due to a maintenance or configuration problem, is architecturally deficient (a polite way of putting it). To solve this, you need to eliminate the routing hops by putting the storage on the same segment as the storage interface on the hypervisor.
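One quick way to tell whether a storage path is routed is to look at what `ip route get` prints for the NFS server address: a routed destination shows a `via <gateway>` hop, while an on-link destination does not. Below is a minimal sketch of that check. The function name and the 192.0.2.10 address are placeholders of mine, and a real ESXi host would need the equivalent check from its own tooling.

```shell
# Hypothetical helper: decide from one line of `ip route get <address>`
# output whether the destination is reached through a gateway.
route_is_routed() {
    # $1: first line of `ip route get` output
    case " $1 " in
        *" via "*) return 0 ;;   # a next-hop gateway is present: routed
        *)         return 1 ;;   # destination is on the local segment
    esac
}

# Example (192.0.2.10 stands in for the NFS server):
#   route_is_routed "$(ip route get 192.0.2.10 | head -n 1)" \
#       && echo "WARNING: storage traffic crosses a router"
```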

So, after our storage routers went AWOL due to a maintenance event, I noticed some VMs came back and appeared to be fine. They had rebooted and were at a login prompt.  Other VMs, however, did not come back, and had some nasty things printing on the console (you could not log into these VMs).


What we noticed was that any Linux virtual machine with an XFS file system on boot or root (/boot or /) was unrecoverable. VMs that were using ext3 or ext4 seemed to be able to recover and start running their services, although some were still echoing messages to the console.

There is a lesson here: the file system matters when it comes to resiliency in a virtualized environment.
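If you want to know which of your VMs fall into which camp before the next outage, the file system type for / and /boot can be read from /proc/mounts. This is a rough sketch; the function name is mine, and it assumes mount points without spaces in their paths.

```shell
# Hypothetical helper: print the file system type for a mount point,
# reading /proc/mounts-style lines (device, mount point, type, ...) on stdin.
# The last matching line wins, since later mounts cover earlier ones.
fstype_of() {
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }'
}

# Example:
#   for mp in / /boot; do
#       echo "$mp: $(fstype_of "$mp" < /proc/mounts)"
#   done
```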

I did some searching around for discussions on file system types, and of course there are many. This one in particular, I found interesting:  ext4-vs-xfs-vs-btrfs-vs-zfs-for-nas


Friday, October 28, 2022

Moving a LVM file system to a new disk in Linux

I had to dive back into Linux disk partitioning, file systems, and volumes when I got an alert from Zabbix that a cluster of 3 VMs was running out of space. As the alert from Zabbix said disk usage was greater than 88 percent, I grew concerned and took a look.
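To confirm an alert like this from inside the VM, the used percentage can be pulled out of `df -P`. The sketch below is my own illustration, not the Zabbix check itself; the function names are made up, and 88 is simply the figure from the alert.

```shell
# Hypothetical helper: print the used percentage (without the % sign)
# from `df -P <mount point>` output supplied on stdin.
used_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds when usage is strictly above the threshold.
over_threshold() {
    # $1: used percentage, $2: threshold
    [ "$1" -gt "$2" ]
}

# Example:
#   pct=$(df -P / | used_pct)
#   over_threshold "$pct" 88 && echo "/ is at ${pct}% used"
```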

In the labs, we had 3 x CentOS 7 virtual machines, each deployed with a 200G VMDK file. But inside each VM, in the Linux OS, there were logical volumes (centos-root, centos-swap, centos-home) that were mounted as XFS file systems on a 30G partition. There was no separate volume for /var (centos-var), and /var was the main culprit of the disk space usage.

The decision was made to put /var on a separate disk as a good practice, because the var file system was used to store large virtual machine images.

The following steps were taken to move the /var file system to the new disk:

1. Add new Disk in vCenter to VM - create new VMDK file (100G in this particular case)

2. If the disk is seen, a /dev/sdb will be present in the Linux OS of the virtual machine. We need to create a partition on it (/dev/sdb1).
 
# fdisk /dev/sdb

In fdisk, n creates a new partition and p selects primary. The remaining prompts (partition number, first and last sector) do not matter for this case, so just accept the defaults.
This creates a Linux primary partition; use the t command to change the partition type to 8e (Linux LVM).
Then w writes everything to the disk and exits fdisk.
# fdisk -l /dev/sdb

Will return something like this:

Device     Boot Start      End  Sectors Size Id Type
/dev/sdb1        2048 20971519 20969472  10G 8e Linux LVM

3. Initialize the partition as an LVM physical volume (this writes LVM metadata to it; the partition itself was created in step 2)
# pvcreate /dev/sdb1

NOTE: to back this out later, remove the physical volume from the volume group with vgreduce first, then wipe it with pvremove!
# vgreduce centos /dev/sdb1
# pvremove /dev/sdb1

4. Display the volume group
# vgdisplay

--- Volume group ---
VG Name centos
[... more detail …]

5. Display the physical volumes in the volume group

# pvdisplay -C --separator '  |  ' -o pv_name,vg_name

6. Extend the volume group so it can contain the new disk (partition)

# vgextend centos /dev/sdb1

Running vgdisplay again will then show info like this:
VG Size 29.75 GiB
PE Size 4.00 MiB
Total PE 7617
Alloc PE / Size 5058 / 19.75 GiB
Free PE / Size 2559 / 10 GiB

7. Create new logical volume

NOTE: this command can be tricky. You either need to understand extents and their semantics, or you can keep it simple, such as:
# lvcreate -n var -l 100%FREE centos
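For anyone who does want to reason in extents: lvcreate's -l option takes a number of extents (or a percentage such as 100%FREE), while -L takes a size. An extent count converts to a size by multiplying by the PE size that vgdisplay reports. Here is a small sketch of that arithmetic using the sample numbers from step 6; the helper name is hypothetical.

```shell
# Hypothetical helper: convert a number of physical extents to GiB.
extents_to_gib() {
    # $1: number of extents, $2: PE size in MiB
    awk -v n="$1" -v pe="$2" 'BEGIN { printf "%.2f\n", n * pe / 1024 }'
}

extents_to_gib 2559 4   # free extents in the sample: prints 10.00
extents_to_gib 7617 4   # total extents in the sample: prints 29.75
```

So in that sample, -l 100%FREE would hand all 2559 free extents to the new logical volume.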

8. Create file system - NOTE that XFS is the preferred type, not ext4!
# mkfs -t xfs /dev/centos/var

9. Mount the new var volume at a temporary mount point (/mnt/newvar)
# mkdir /mnt/newvar
# mount /dev/centos/var /mnt/newvar

10. Copy the files

NOTE: Lots of issues can occur during this, depending on what directory you are copying (e.g. var is notorious because of its run and lock dirs).

I found this command to work:
# cp -apxv /var/* /mnt/newvar

Another one people seem to like is the rsync command, but the one I attempted below hung:
# rsync -avHPSAX /var/ /mnt/newvar/

11. You can do a diff, or try to, to sanity-check the copy:
# diff -r /var /mnt/newvar/

12. Update fstab for reboot
/dev/mapper/centos-var /var xfs defaults 0 0

Note that we used the logical volume centos-var here, not centos (the volume group). LVM calls the volumes centos-swap, centos-home, etc.

13. Move the old /var out of the way on the old root file system
# mv /var /varprev

14. Create a new, empty /var as the mount point and mount the new volume on it (mount reads the entry from fstab)
# mkdir /var
# mount /var

15. Use the df command to list all the mounts
# df -h | grep /dev/

16. Decide whether you want to remove the old var file system and reclaim that disk space.

NOTE: Do not do this until you’re damned sure the new one is working fine. I recommend rebooting the system, inspecting all services that need to be running, etc.  
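As one small part of that sanity check, you can confirm after the reboot that /var really is being served by the new logical volume and not by a leftover directory on the root file system. This is a minimal sketch, assuming findmnt (from util-linux) is available; the function name is mine, and the expected device and type are the ones from the fstab entry in step 12.

```shell
# Hypothetical helper: check that /var is mounted from the new logical volume.
var_on_new_lv() {
    # $1: output of `findmnt -n -o SOURCE,FSTYPE /var`
    set -- $1                      # split into source and type
    [ "$1" = "/dev/mapper/centos-var" ] && [ "$2" = "xfs" ]
}

# Example:
#   var_on_new_lv "$(findmnt -n -o SOURCE,FSTYPE /var)" \
#       && echo "/var is on the new volume" \
#       || echo "/var is NOT on centos-var, keep /varprev"
```

It only proves the mount is in place; the services still need to be inspected by hand.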

The only thing left to consider is that, after moving /var to a new 100G VMDK disk, we now have a 200G boot/swap/root disk that is only using a small fraction of its space. What do we do about that? Well, shrinking disks is even MORE daunting, and is not the topic of this post. But if I decide to reclaim some space, expect another post that documents how I tackled that effort (or attempted to).

For now, no more alerts about running out of space on a root file system is good news, and this VM can now run peacefully for quite a while.

SLAs using Zabbix in a VMware Environment

Zabbix 7 introduced some better support for SLAs. It also had better support for VMware. VMware, of course now owned by Broadcom, has prio...