Introduction to HA clusters in XCP-ng

What is HA and why cluster nodes together?

High Availability (HA) in XCP-ng builds on combining multiple servers (nodes) into a single pool that is managed as one unit. This makes the infrastructure resilient to single-node failures: virtual machines (VMs) from a failed node can be brought back up on the remaining nodes. A minimum of three nodes is needed to achieve full redundancy and to prevent split-brain problems.
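
The three-node recommendation follows from simple majority arithmetic. The sketch below is a simplified majority-vote model, not XCP-ng's actual HA logic; the function name has_majority is hypothetical and only serves the illustration.

```python
def has_majority(visible_nodes: int, total_nodes: int) -> bool:
    """Return True if a partition seeing `visible_nodes` holds a strict majority."""
    return visible_nodes > total_nodes / 2


# Two-node pool split in half: each side sees only itself (1 of 2).
print(has_majority(1, 2))  # False: neither side can safely take over

# Three-node pool split 2 + 1: only the larger side holds a majority.
print(has_majority(2, 3))  # True
print(has_majority(1, 3))  # False
```

With two nodes, a network split leaves both sides without a majority, so neither can tell whether the other has really failed. With three nodes, at most one side can hold a majority.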

Application of HA clusters:

  • Redundancy and fault tolerance - If one node fails, its VMs can be restarted on the other nodes.
  • Uninterrupted updates - With the live migration function, it is possible to update the software or hardware of one of the nodes without interrupting services.
  • Dynamic load management - VM workloads can be moved between nodes as needed.

HA cluster requirements:

  • Shared storage - All nodes must have access to the same storage so that VMs can be migrated quickly.
  • XCP-ng version compatibility - All nodes in the cluster must run on the same version of the operating system and have the same patches installed.
  • Similar hardware configuration - Nodes should have similar hardware parameters to avoid compatibility problems.
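
The first two requirements can be checked mechanically before joining hosts. The following is a minimal sketch that works on hand-entered data rather than querying XCP-ng; the Host class, its fields and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    version: str                      # XCP-ng release, e.g. "8.2.1"
    patches: frozenset = frozenset()  # installed update identifiers
    storage: frozenset = frozenset()  # storage repositories the host can reach


def pool_problems(hosts):
    """Return a list of problems; an empty list means the hosts look compatible."""
    if not hosts:
        return ["no hosts given"]
    problems = []
    reference = hosts[0]
    for host in hosts[1:]:
        if host.version != reference.version:
            problems.append(f"{host.name}: version {host.version} != {reference.version}")
        if host.patches != reference.patches:
            problems.append(f"{host.name}: patch set differs from {reference.name}")
    # Storage only counts as shared if every host can reach it.
    shared = frozenset.intersection(*(h.storage for h in hosts))
    if not shared:
        problems.append("no storage repository is reachable from every host")
    return problems


hosts = [
    Host("node1", "8.2.1", frozenset({"update-a"}), frozenset({"nfs-shared", "local-1"})),
    Host("node2", "8.2.1", frozenset(), frozenset({"nfs-shared", "local-2"})),
]
for problem in pool_problems(hosts):
    print(problem)
```

In this made-up example both nodes reach the same shared storage and run the same version, but node2 is missing an update, so the check reports a differing patch set.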

Example configuration of a two-node cluster

Although full HA functionality requires a minimum of three nodes, you can build a cluster on two nodes. Here are the steps:

  1. Pooling nodes:

    • In Xen Orchestra, go to the "Pools" section and add a new node to an existing pool.
    • Make sure both nodes are configured with access to shared storage.
  2. Enabling Maintenance mode:

    • To update or shut down one of the nodes, enable Maintenance mode first. All of its VMs will be automatically moved to the other node.
  3. VM migration:

    • During live migration, the VM's RAM is transferred between nodes while the VM keeps running, so services are not interrupted.
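
Maintenance mode only works if the remaining node has enough free memory for every VM it has to take over. The sketch below is a simple greedy placement written for illustration, not the algorithm XCP-ng or Xen Orchestra actually uses; the function name and the VM sizes are assumptions.

```python
def plan_evacuation(vms, free_memory):
    """Assign each VM to the remaining host with the most free memory.

    vms: dict of VM name -> memory in GiB
    free_memory: dict of remaining host name -> free memory in GiB
    Returns a dict of VM name -> target host, or raises if a VM does not fit.
    """
    if not free_memory:
        raise RuntimeError("no remaining hosts to evacuate to")
    free = dict(free_memory)  # work on a copy, leave the caller's dict untouched
    plan = {}
    # Place the largest VMs first so they are not blocked by small ones.
    for vm, needed in sorted(vms.items(), key=lambda item: item[1], reverse=True):
        target = max(free, key=free.get)
        if free[target] < needed:
            raise RuntimeError(f"not enough free memory to move {vm}")
        plan[vm] = target
        free[target] -= needed
    return plan


# Two-node cluster: node1 goes into maintenance, node2 must take everything.
plan = plan_evacuation({"web": 4, "db": 16, "mail": 8}, {"node2": 32})
print(plan)
```

In a two-node cluster this means each node has to be sized to run the full workload alone, otherwise the evacuation cannot complete.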

Disadvantages and limitations of HA:

  • Loss of RAM contents - If a node fails suddenly, the data held in the RAM of its VMs is lost.
  • Problems with the master - A cluster in XCP-ng has a single master that coordinates operations. If the master fails and HA is not enabled, another node has to be promoted to master manually.
  • Local storage - If a VM uses local storage, you will not be able to migrate it to another node.
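
The effect of these limitations on a sudden node failure can be sketched as a small classification. The dictionary keys and VM names below are illustrative only and do not correspond to any XCP-ng data structure.

```python
def after_host_failure(vms, failed_host):
    """Split the VMs of a failed host into restartable and stranded ones.

    vms: list of dicts with keys "name", "host" and "storage"
         ("shared" or "local"); these field names are assumptions.
    """
    restartable, stranded = [], []
    for vm in vms:
        if vm["host"] != failed_host:
            continue  # VMs on surviving hosts keep running
        if vm["storage"] == "shared":
            restartable.append(vm["name"])  # disks survive, RAM contents do not
        else:
            stranded.append(vm["name"])     # disks sit on the failed host
    return restartable, stranded


vms = [
    {"name": "web", "host": "node1", "storage": "shared"},
    {"name": "cache", "host": "node1", "storage": "local"},
    {"name": "db", "host": "node2", "storage": "shared"},
]
restartable, stranded = after_host_failure(vms, "node1")
print("can be restarted elsewhere:", restartable)
print("unavailable until node1 returns:", stranded)
```

Even the restartable VMs boot from their disks again, so anything that existed only in memory at the moment of the failure is gone.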

Useful functionalities:

  • Rolling Pool Reboot - Automatically restarts the hosts in a cluster one after another, migrating VMs between them.
  • Rolling Pool Update - Updates XCP-ng on the nodes one by one without interrupting VM operations.
  • Smart Reboot - A feature that writes the RAM state of the VMs to disk before the host reboots. It requires caution with slower disks.
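
The caution about slower disks is easy to quantify with a back-of-the-envelope estimate. The sketch below assumes the whole RAM of the VMs is written at a constant sustained throughput; the function name and the figures are example values, not measurements.

```python
def suspend_seconds(ram_gib: float, write_mib_per_s: float) -> float:
    """Rough time to write `ram_gib` GiB of VM memory at the given disk throughput."""
    return ram_gib * 1024 / write_mib_per_s


# 64 GiB of VM memory on a disk sustaining 200 MiB/s versus 2000 MiB/s
print(round(suspend_seconds(64, 200)))   # 328
print(round(suspend_seconds(64, 2000)))  # 33
```

Under these assumptions the slower disk needs more than five minutes just to save the memory, and reading it back after the reboot takes additional time.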

Summary

HA clusters in XCP-ng significantly improve the reliability of virtualization infrastructure. When properly configured, they provide flexibility and the ability to operate without downtime. For full automation and redundancy, it is recommended to use at least three nodes and shared storage.
