Initial Environment Setup
System Environment: CentOS 7.3 (Kernel 3.10.0-514.el7.x86_64)
Hardware Requirements:
- Two server nodes
- Minimum two disks per node - one for OS, one for storage
- Nodes named: server01, server02
IP Configuration:
- 192.168.238.129 - server01
- 192.168.238.130 - server02
- Disk Configuration
After OS installation, partition the secondary disk on both nodes:
[root@server01 ~]# fdisk /dev/sdb
# Create partition using fdisk interactive commands
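When provisioning several nodes, the interactive dialogue can be scripted. This is a sketch, assuming a single primary partition spanning the empty data disk; the answer sequence is n (new), p (primary), 1 (partition number), two empty lines (default start/end sectors), and w (write).

```shell
# Emit the fdisk answer sequence for one primary partition covering the disk.
fdisk_answers() {
    printf 'n\np\n1\n\n\nw\n'
}
# Destructive -- pipe into fdisk only against the empty data disk:
# fdisk_answers | fdisk /dev/sdb
```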
1.1 Format and Mount the Storage Disk
Execute the following on both nodes. This example assumes the brick will reside on /dev/sdb1:
# Format with XFS filesystem
[root@server01 ~]# mkfs.xfs -i size=512 /dev/sdb1
# Create mount point and mount the partition
[root@server01 ~]# mkdir -p /data/brick1 && mount /dev/sdb1 /data/brick1
# Add to fstab for persistent mounting
[root@server01 ~]# echo "/dev/sdb1 /data/brick1 xfs defaults 0 0" >> /etc/fstab
# Verify mounting
[root@server01 ~]# mount -a && mount
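If the setup above is wrapped in a script, a blind `echo >> /etc/fstab` will duplicate the entry on re-runs. A small helper (hypothetical, not part of GlusterFS) makes the append idempotent:

```shell
# Append an fstab entry only when its device is not already listed.
add_fstab_entry() {
    local entry="$1" fstab="${2:-/etc/fstab}"
    local dev="${entry%% *}"                      # first field = device
    grep -q "^${dev}[[:space:]]" "$fstab" || echo "$entry" >> "$fstab"
}
# add_fstab_entry "/dev/sdb1 /data/brick1 xfs defaults 0 0"
```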
- Install GlusterFS
# Install the GlusterFS repository
[root@server01 ~]# yum install centos-release-gluster
# Install GlusterFS server packages
[root@server01 ~]# yum install glusterfs-server
- Configure Firewall
For testing environments, it is recommended to disable the firewall:
# Temporarily stop firewall
[root@server01 ~]# systemctl stop firewalld
# Permanently disable firewall
[root@server01 ~]# systemctl disable firewalld
Note: For production environments, configure appropriate firewall rules instead of disabling the firewall.
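For production, a hedged sketch of the firewalld rules instead of disabling it; the port numbers are assumptions based on common GlusterFS defaults of this era (24007-24008 for the management daemon, one TCP port per brick starting at 49152):

```shell
# Emit the firewall-cmd invocations for review before running them as root.
gluster_firewall_cmds() {
    cat <<'EOF'
firewall-cmd --permanent --add-port=24007-24008/tcp
firewall-cmd --permanent --add-port=49152-49251/tcp
firewall-cmd --reload
EOF
}
# Review the generated commands, then:  gluster_firewall_cmds | sh
```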
- Start GlusterFS Service
# Start GlusterFS daemon
[root@server01 ~]# systemctl start glusterd
# Enable auto-start on boot
[root@server01 ~]# systemctl enable glusterd
- Configure Trusted Storage Pool
First, ensure host resolution is configured in /etc/hosts or DNS. This example uses /etc/hosts:
[root@server01 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.238.129 server01
192.168.238.130 server02
192.168.238.133 server03
192.168.238.132 server04
192.168.238.134 server05
Add the second node to the trusted pool from the first node:
[root@server01 ~]# gluster peer probe server02
peer probe: success.
Verify peer status:
[root@server01 ~]# gluster peer status
Number of Peers: 1
Hostname: server02
Uuid: 61fe987a-99ff-419d-8018-90603ea16fe7
State: Peer in Cluster (Connected)
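When scripting cluster setup, it helps to confirm that every probed peer is actually connected. A sketch that counts connected peers in `gluster peer status` output (format as shown above):

```shell
# Count lines reporting a connected peer on stdin.
connected_peers() {
    grep -c 'State: Peer in Cluster (Connected)'
}
# Usage: gluster peer status | connected_peers
```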
To remove a node from the cluster:
[root@server01 ~]# gluster peer detach server02
- Create a GlusterFS Volume
Create the brick directories on both nodes:
[root@server01 ~]# mkdir -p /data/brick1
[root@server02 ~]# mkdir -p /data/brick1
Create a replicated volume:
[root@server01 ~]# gluster volume create myvolume replica 2 transport tcp server01:/data/brick1 server02:/data/brick1
volume create: myvolume: success: please start the volume to access data
Start the volume:
[root@server01 ~]# gluster volume start myvolume
volume start: myvolume: success
View volume information:
[root@server01 ~]# gluster volume info
Volume Name: myvolume
Type: Replicate
Volume ID: bc637d83-0273-4373-9d00-d794a3a3d2e7
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: server01:/data/brick1
Brick2: server02:/data/brick1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
Check volume status:
[root@server02 ~]# gluster volume status
Status of volume: myvolume
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick server01:/data/brick1                  49152     N/A        Y       10774
Brick server02:/data/brick1                  N/A       N/A        N/A     N/A
Self-heal Daemon on localhost                N/A       N/A        Y       998
Self-heal Daemon on server01                 N/A       N/A        Y       10794
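For monitoring, the status output can be scanned for bricks that are not online. A sketch assuming the column layout shown above (brick, TCP port, RDMA port, Online, Pid):

```shell
# Print the path of every brick whose Online column is not "Y".
offline_bricks() {
    awk '$1 == "Brick" && $5 != "Y" {print $2}'
}
# Usage: gluster volume status | offline_bricks
```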
- Volume Expansion
To expand a volume (add new bricks), follow these steps:
Prerequisites: Ensure new nodes have the same setup as existing nodes, including /etc/hosts entries.
7.1 Probe New Nodes
[root@server01 ~]# gluster peer probe server03
peer probe: success.
[root@server01 ~]# gluster peer probe server04
peer probe: success.
7.2 Add New Bricks
[root@server01 ~]# gluster volume add-brick myvolume server03:/data/brick1 server04:/data/brick1
volume add-brick: success
7.3 Verify Volume Information
[root@server01 ~]# gluster volume info
Volume Name: myvolume
Type: Distributed-Replicate
Volume ID: 09363405-1c7c-4eb1-b815-b97822c1f274
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: server01:/data/brick1
Brick2: server02:/data/brick1
Brick3: server03:/data/brick1
Brick4: server04:/data/brick1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
7.4 Rebalance the Volume
After expanding, run rebalance to distribute new data to new bricks:
[root@server01 ~]# gluster volume rebalance myvolume start
volume rebalance: myvolume: success: Rebalance on myvolume has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: c9d052e8-2b6c-40b0-8d77-52290bcdb61
- Volume Shrinking
To remove bricks from a volume, use remove-brick. Note that `force` removes the bricks immediately without migrating their data, which can cause data loss; for an online shrink that preserves data, use `start`, monitor with `status`, and finish with `commit`.
[root@server01 ~]# gluster volume remove-brick myvolume server03:/data/brick1 server04:/data/brick1 force
Removing brick(s) can result in data loss. Do you want to Continue? (y/n)
y
volume remove-brick commit force: success
Monitor a migrating remove-brick operation (one started with `start` rather than `force`):
[root@server01 ~]# gluster volume remove-brick myvolume server03:/data/brick1 server04:/data/brick1 status
- Rebalancing Operations
9.1 Fix Layout Only
This redistributes new file writes to new bricks without migrating existing data:
[root@server01 ~]# gluster volume rebalance myvolume fix-layout start
volume rebalance: myvolume: success: Rebalance on myvolume has been started successfully.
[root@server01 ~]# gluster volume rebalance myvolume status
Node       status     run time in h:m:s
---------  ---------  -----------------
localhost  completed  0:0:0
server02   completed  0:0:0
server03   completed  0:0:0
server04   completed  0:0:0
volume rebalance: myvolume: success
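Scripts that need to wait for a rebalance to finish can test the status output for any node still running or failed. A sketch, assuming the status text format shown above:

```shell
# Succeed (exit 0) only if no node reports "in progress" or "failed" on stdin.
rebalance_done() {
    ! grep -Eq 'in progress|failed'
}
# Usage: gluster volume rebalance myvolume status | rebalance_done && echo done
```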
9.2 Fix Layout and Migrate Data
This redistributes the layout and migrates existing data to the new bricks:
[root@server01 ~]# gluster volume rebalance myvolume start
volume rebalance: myvolume: success: Rebalance on myvolume has been started successfully.
Adding force migrates files even when the destination brick has less free space than the source:
[root@server01 ~]# gluster volume rebalance myvolume start force
volume rebalance: myvolume: success
9.3 Stop Rebalance
[root@server01 ~]# gluster volume rebalance myvolume stop
- Disk Failure Handling
Method 1: Using Local Spare Disk
Assume server02's /dev/sdb1 has failed:
Step 1: Prepare new disk on failed node
[root@server02 ~]# mkfs.xfs -i size=512 /dev/sdd1
[root@server02 ~]# mkdir -p /data/sdd1
[root@server02 ~]# mount /dev/sdd1 /data/sdd1
[root@server02 ~]# echo "/dev/sdd1 /data/sdd1 xfs defaults 0 0" >> /etc/fstab
Step 2: Get extended attributes from healthy node
[root@server01 ~]# getfattr -d -m . -e hex /data/brick1/
Step 3: Mount volume and trigger self-heal
[root@server02 ~]# mount -t glusterfs server02:/myvolume /mnt
[root@server02 ~]# mkdir /mnt/tempdir
[root@server02 ~]# rmdir /mnt/tempdir
[root@server02 ~]# setfattr -n trusted.non-existent-key -v abc /mnt
[root@server02 ~]# setfattr -x trusted.non-existent-key /mnt
Step 4: Check current status
[root@server01 ~]# gluster volume heal myvolume info
Step 5: Replace failed brick with new disk
[root@server02 ~]# gluster volume replace-brick myvolume server02:/data/brick1 server02:/data/sdd1/brick commit force
volume replace-brick: success: replace-brick commit force operation successful
Method 2: Cross-Host Migration
When a disk fails and you want to use a different host's disk:
Step 1: Add new node to trusted pool
[root@server01 ~]# gluster peer probe server05
peer probe: success.
Step 2: Prepare disk on new node
[root@server05 ~]# mkdir -p /data/brick1 && mount /dev/sdb1 /data/brick1
[root@server05 ~]# echo "/dev/sdb1 /data/brick1 xfs defaults 0 0" >> /etc/fstab
[root@server05 ~]# mount -a && mount
Step 3: Replace brick across hosts
[root@server01 ~]# gluster volume replace-brick myvolume server02:/data/brick1 server05:/data/brick1 commit force
volume replace-brick: success: replace-brick commit force operation successful
Step 4: Later, when original disk is replaced, migrate data back
[root@server01 ~]# gluster volume replace-brick myvolume server05:/data/brick1 server02:/data/brick1 commit force
volume replace-brick: success: replace-brick commit force operation successful
- Volume Operations
11.1 Stop Volume
[root@server01 ~]# gluster volume stop myvolume
Stopping volume will make its data inaccessible. Do you want to continue? (y/n)
y
Stopping volume myvolume has been successful
11.2 Delete Volume
[root@server01 ~]# gluster volume delete myvolume
Deleting volume will erase all information about the volume. Do you want to continue? (y/n)
y
Deleting volume myvolume has been successful
- Self-Heal Operations
GlusterFS's self-heal daemon runs automatically every 10 minutes to synchronize replica bricks.
12.1 Trigger Self-Heal on Files Requiring Healing
# gluster volume heal myvolume
Heal operation on volume myvolume has been successful
12.2 Trigger Full Self-Heal
# gluster volume heal myvolume full
Heal operation on volume myvolume has been successful
12.3 View Files Needing Healing
# gluster volume heal myvolume info
Brick server01:/data/brick1
Number of entries: 0
Brick server02:/data/brick1
Number of entries: 101
/file1.txt
/file2.txt
...
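To get a single number for alerting, the pending entries across all bricks can be summed from the heal info output shown above:

```shell
# Sum every "Number of entries" line from `gluster volume heal <vol> info`.
pending_heals() {
    awk -F': ' '/^Number of entries:/ {sum += $2} END {print sum + 0}'
}
# Usage: gluster volume heal myvolume info | pending_heals
```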
12.4 View Self-Healed Files
# gluster volume heal myvolume info healed
12.5 View Failed Healing Files
# gluster volume heal myvolume info failed
12.6 View Split-Brain Files
# gluster volume heal myvolume info split-brain
- Mounting GlusterFS Volumes
13.1 Native GlusterFS Mount
[root@server01 ~]# mkdir -p /mnt/glusterfs
[root@server01 ~]# mount -t glusterfs server01:/myvolume /mnt/glusterfs
13.2 NFS Mount
GlusterFS's built-in NFS server supports NFSv3 only, and the volume option nfs.disable must be off (the volume info above shows it set to on):
[root@server01 ~]# mount -t nfs -o vers=3,mountproto=tcp server01:/myvolume /mnt/nfs
- GlusterFS Volume Types
14.1 Distributed Volume
Files are distributed across bricks using a hash algorithm; there is no redundancy:
# gluster volume create gv1 server01:/data/brick1 server02:/data/brick1 force
volume create: gv1: success: please start the volume to access data
14.2 Replicated Volume
Similar to RAID 1, provides high availability:
# gluster volume create gv2 replica 2 server01:/data/brick1 server02:/data/brick1 force
volume create: gv2: success: please start the volume to access data
14.3 Striped Volume
Similar to RAID 0, provides better performance for large files:
# gluster volume create gv3 stripe 2 server01:/data/brick1 server02:/data/brick1 force
volume create: gv3: success: please start the volume to access data
14.4 Distributed-Replicated Volume
Most commonly used in production environments:
# gluster volume create gv4 replica 2 server01:/data/brick1 server02:/data/brick1 server03:/data/brick1 server04:/data/brick1 force
volume create: gv4: success: please start the volume to access data
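With `replica N`, consecutive bricks in the create command form one replica set, so adjacent bricks must sit on different hosts or both copies land on the same machine. A hypothetical helper that prints the resulting pairing for a brick list:

```shell
# Group a brick list into replica sets the way `volume create replica N` does.
show_replica_sets() {
    local replica="$1"; shift
    local i=0
    for brick in "$@"; do
        echo "replica set $((i / replica + 1)): $brick"
        i=$((i + 1))
    done
}
# show_replica_sets 2 server01:/data/brick1 server02:/data/brick1 \
#                     server03:/data/brick1 server04:/data/brick1
```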
- Troubleshooting
Issue 1: Peer Shows "Disconnected" State
[root@server01 ~]# gluster peer status
Number of Peers: 1
Hostname: server02
Uuid: 61fe987a-99ff-419d-8018-90603ea16fe7
State: Peer in Cluster (Disconnected)
Solution: Check firewall status and /etc/hosts configuration. Ensure firewall is disabled or appropriate rules are configured.
Issue 2: Volume Start Fails with Extended Attribute Error
[root@server01 ~]# gluster volume start myvolume
volume start: myvolume: failed: Failed to get extended attribute trusted.glusterfs.volume-id for brick dir /data/brick1. Reason : No data available
Solution:
[root@server01 ~]# gluster volume delete myvolume
# Clean up extended attributes
[root@server01 ~]# setfattr -x trusted.glusterfs.volume-id /data/brick1
[root@server01 ~]# setfattr -x trusted.gfid /data/brick1
[root@server01 ~]# rm -rf /data/brick1/.glusterfs
# Recreate volume
[root@server01 ~]# gluster volume create myvolume replica 2 server01:/data/brick1 server02:/data/brick1
volume create: myvolume: success: please start the volume to access data
[root@server01 ~]# gluster volume start myvolume
volume start: myvolume: success
Issue 3: "Path Already Part of a Volume" Error
This occurs when trying to reuse a brick that was previously part of a volume:
[root@server01 ~]# gluster volume add-brick myvolume server03:/data/brick1
volume add-brick: failed: Pre Validation failed on server03. /data/brick1 is already part of a volume
Solution:
# On the affected node
[root@server03 ~]# setfattr -x trusted.glusterfs.volume-id /data/brick1
[root@server03 ~]# setfattr -x trusted.gfid /data/brick1
[root@server03 ~]# rm -rf /data/brick1/.glusterfs
# Restart glusterd
[root@server03 ~]# systemctl restart glusterd
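The cleanup above recurs in Issues 2 and 3, so it can be consolidated into one reusable function (destructive: it wipes GlusterFS metadata from the given brick path):

```shell
# Remove GlusterFS volume metadata so the path can be reused as a brick.
clean_brick() {
    local brick="$1"
    setfattr -x trusted.glusterfs.volume-id "$brick" 2>/dev/null || true
    setfattr -x trusted.gfid "$brick" 2>/dev/null || true
    rm -rf "${brick:?}/.glusterfs"   # :? guards against an empty argument
}
# clean_brick /data/brick1 && systemctl restart glusterd
```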
- Important Notes
- File System: XFS is recommended. GlusterFS works best with files of at least 16KB (optimal around 128KB). Structured data (SQL databases) is not supported.
- Hardware: GlusterFS is designed for commodity hardware. A basic cluster can run on two servers with 2 CPUs, 4GB RAM each, and 1Gbps network.
- Mixing Hardware: Nodes do not need identical hardware configurations.
- DNS: Proper DNS (forward and reverse) and NTP are essential for production deployments.
- Data Balance: After every volume expansion, run a rebalance operation to distribute data to the new bricks.