Achieving High Availability for Ceph Cluster Management Nodes

Management High Availability Overview

To ensure robust administration capabilities, a Ceph cluster requires multiple management nodes. Relying on a single management node creates a single point of failure; if that node goes down, cluster administration becomes impossible. Administrative access is generally divided into two categories: web dashboard management, which interacts with the manager (mgr) service, and client-based command-line management, which relies on the monitor (mon) service.
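As a quick way to see the dashboard side of this split, the active manager can report the service endpoints it exposes. The command below is standard; the URL in the sample output is purely illustrative, and the hostname and port will reflect your own environment:

ceph mgr services
# Illustrative output (hostname and port are assumptions):
# {
#     "dashboard": "https://node1:8443/"
# }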

Manager Daemon High Availability

The Ceph Manager daemon operates in an active/standby configuration. Only one manager instance is active at a time, serving the web dashboard and metrics, while the remaining instances stay on standby. If the active instance fails, one of the standby daemons is automatically promoted to active.
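Manager high availability therefore depends on having more than one mgr daemon deployed. If the cluster was set up with ceph-deploy (common for Nautilus releases), additional standby managers can be created as sketched below; this assumes ceph-deploy is in use and reuses the node-alpha and node-beta hostnames from later in this article:

ceph-deploy mgr create node-alpha node-beta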

You can verify the current status and active manager using the cluster status command:

ceph -s

The output indicates which manager is active and which are on standby. The dashboard is served by the active manager: opening it via the active node's address works directly, while requests to a standby node's address are typically redirected to the active node's hostname. If the active daemon crashes, a standby takes over the active role to maintain service continuity.
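The manager section of the status output looks roughly like the line below (node1 is an illustrative hostname for the current active manager). You can also deliberately fail the active daemon to confirm that a standby is promoted:

ceph -s | grep mgr
#     mgr: node1(active), standbys: node-alpha, node-beta

# Mark the active manager as failed; one of the standbys takes over.
ceph mgr fail node1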

Client-Side Management High Availability

Command-line management goes through the Ceph Monitor daemons, so the nodes that already host a `mon` component are the natural candidates for additional management nodes. To take on that role, each target node must have the common Ceph utilities installed, hold a valid configuration file, and hold the administrative keyring.

Step 1: Configure Software Repositories

Prepare the package manager on all intended management nodes by adding the Ceph repository. The following configuration example uses a specific mirror for the Ceph packages:

cat > /etc/yum.repos.d/ceph.repo <<'EOF'
[ceph]
name=Ceph packages
baseurl=https://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-nautilus/el7/x86_64/
enabled=1
gpgcheck=0

[ceph-noarch]
name=Ceph noarch packages
baseurl=https://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-nautilus/el7/noarch/
enabled=1
gpgcheck=0
EOF
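With the repository file in place, refreshing the yum metadata cache is optional but avoids installing from stale package lists:

sudo yum clean all
sudo yum makecache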

Step 2: Install Common Utilities

Install the `ceph-common` package on all monitor nodes to provide the necessary CLI tools. The EPEL repository is required to satisfy dependencies:

sudo yum install -y epel-release
sudo yum install -y ceph-common
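A quick check that the CLI tools landed correctly is to print the client version on each node:

ceph --version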

Step 3: Distribute Administrative Keys

Copy the administrative keyring from the initial deployment node (e.g., the first monitor) to the other monitor nodes to grant them management privileges:

scp /etc/ceph/ceph.client.admin.keyring root@node-alpha:/etc/ceph/
scp /etc/ceph/ceph.client.admin.keyring root@node-beta:/etc/ceph/
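The keyring must be readable by the user running the administrative commands. If you manage the cluster as root, the copied file is usable as-is; optionally, you can tighten its permissions on the remote nodes:

ssh root@node-alpha chmod 600 /etc/ceph/ceph.client.admin.keyring
ssh root@node-beta chmod 600 /etc/ceph/ceph.client.admin.keyring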

Step 4: Deploy Configuration Files

Ensure every management node has a valid `ceph.conf` file. Although each monitor keeps its own data and internal configuration under its monitor data directory, the CLI tools read the cluster configuration from `/etc/ceph/ceph.conf`, so copy the main configuration file there:

scp /etc/ceph/ceph.conf root@node-alpha:/etc/ceph/
scp /etc/ceph/ceph.conf root@node-beta:/etc/ceph/

Once these steps are complete, the additional nodes are fully configured as high-availability management nodes. You can verify the setup by executing administrative commands from any of these nodes:

ceph -s
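If everything is in place, other monitor-level commands also work from these nodes; for example, checking overall health and monitor membership:

ceph health
ceph mon stat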
