GlusterFS 8 on FreeBSD 13

About two years ago I wrote a guide for the really old GlusterFS 3.11 version that was available back then on FreeBSD 12.0. Recently I noticed that the GlusterFS version in FreeBSD Ports (and packages) is now finally up-to-date with the upstream GlusterFS releases.



This guide will show you how to create a GlusterFS 8 distributed filesystem on the latest FreeBSD 13. At the moment of writing this article FreeBSD 13 is at the RC1 stage, but it should be released within a month.

In the earlier guide I created a dispersed volume with redundancy comparable to RAID6, but spread across 6 nodes instead of disks, which means that 2 of the 6 nodes could crash and GlusterFS would still work without a problem. Today I will show you a more minimal approach: a three-node setup with a volume that stores data only on node0 and node1, while node2 acts as an arbiter and does not hold any data. The arbiter greatly reduces the risk of split-brain because instead of a vulnerable two-node cluster we have three nodes, so even if one of them fails we still have 2 of 3 votes.

I will not repeat all of the ‘initial’ steps needed to prepare these three FreeBSD hosts, as they are already described in the older article on this topic – GlusterFS Cluster on FreeBSD with Ansible and GNU Parallel. Here I will focus on the GlusterFS commands that need to be executed to achieve our goal.

We will use several prompts in this guide to show which commands will be executed on which nodes.

[ALL] # command that will be executed on all node0/node1/node2 nodes
[node0] # command that will be executed on node0 only

GlusterFS


We have three nodes in our lab.

  • node0 - 10.0.10.200 - DATA NODE 'A'
  • node1 - 10.0.10.201 - DATA NODE 'B'
  • node2 - 10.0.10.202 - ARBITER NODE
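
These node names must resolve on every host. If you do not have DNS set up for them, a minimal sketch is to append the lab addresses above to /etc/hosts on each node:

[ALL] # cat >> /etc/hosts << EOF
10.0.10.200 node0
10.0.10.201 node1
10.0.10.202 node2
EOF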

Install GlusterFS, then enable and start the glusterd service.

[ALL] # pkg install glusterfs

[ALL] # sysrc glusterd_enable=YES
glusterd_enable: -> YES

[ALL] # service glusterd start
Starting glusterd.
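
As a quick optional check, you can verify that the management daemon is up and listening; glusterd uses TCP port 24007 by default, so it should show up in the sockstat(1) listing:

[ALL] # sockstat -4l | grep gluster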



Enable and mount the /proc filesystem and create the directories needed for the GlusterFS bricks.

[ALL] # grep procfs /etc/fstab
proc /proc procfs rw 0 0
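
If that line is not in your /etc/fstab yet, appending it is enough; for example:

[ALL] # echo 'proc /proc procfs rw 0 0' >> /etc/fstab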

[ALL] # mount /proc

[ALL] # mkdir -p /bricks/data/{01,02,03,04}



Now connect these nodes into one cluster and create the GlusterFS volume.

[node0] # gluster peer status
Number of Peers: 0

[node0] # gluster peer probe node1
peer probe: success

[node0] # gluster peer probe node2
peer probe: success

[node0] # gluster peer status
Number of Peers: 2

Hostname: node1
Uuid: b5bc1602-a7bb-4f62-8149-98ca97be1784
State: Peer in Cluster (Connected)

Hostname: node2
Uuid: 2bfa0c71-04b4-4660-8a5c-373efc5da15c
State: Peer in Cluster (Connected)
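
An alternative view is gluster pool list, which also includes the local node, so all three UUIDs should be listed as Connected:

[node0] # gluster pool list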

[node0] # gluster volume create data \
replica 2 \
arbiter 1 \
node0:/bricks/data/01 \
node1:/bricks/data/01 \
node2:/bricks/data/01 \
node0:/bricks/data/02 \
node1:/bricks/data/02 \
node2:/bricks/data/02 \
node0:/bricks/data/03 \
node1:/bricks/data/03 \
node2:/bricks/data/03 \
node0:/bricks/data/04 \
node1:/bricks/data/04 \
node2:/bricks/data/04 \
force

volume create: data: success: please start the volume to access data

[node0] # gluster volume start data
volume start: data: success

[node0] # gluster volume info

Volume Name: data
Type: Distributed-Replicate
Volume ID: f73d57ea-6f10-4840-86e7-f8178540e948
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (2 + 1) = 12
Transport-type: tcp
Bricks:
Brick1: node0:/bricks/data/01
Brick2: node1:/bricks/data/01
Brick3: node2:/bricks/data/01 (arbiter)
Brick4: node0:/bricks/data/02
Brick5: node1:/bricks/data/02
Brick6: node2:/bricks/data/02 (arbiter)
Brick7: node0:/bricks/data/03
Brick8: node1:/bricks/data/03
Brick9: node2:/bricks/data/03 (arbiter)
Brick10: node0:/bricks/data/04
Brick11: node1:/bricks/data/04
Brick12: node2:/bricks/data/04 (arbiter)
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

[node0] # gluster volume status
Status of volume: data
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick node0:/bricks/data/01 49152 0 Y 4595
Brick node1:/bricks/data/01 49152 0 Y 1022
Brick node2:/bricks/data/01 49152 0 Y 3356
Brick node0:/bricks/data/02 49153 0 Y 4597
Brick node1:/bricks/data/02 49153 0 Y 1024
Brick node2:/bricks/data/02 49153 0 Y 3358
Brick node0:/bricks/data/03 49154 0 Y 4599
Brick node1:/bricks/data/03 49154 0 Y 1026
Brick node2:/bricks/data/03 49154 0 Y 3360
Brick node0:/bricks/data/04 49155 0 Y 4601
Brick node1:/bricks/data/04 49155 0 Y 1028
Brick node2:/bricks/data/04 49155 0 Y 3362
Self-heal Daemon on localhost N/A N/A Y 4604
Self-heal Daemon on node1 N/A N/A Y 1031
Self-heal Daemon on node2 N/A N/A Y 3365

Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks

[node0] # ps aux | grep -e gluster -e RSS | cut -d ' ' -f 1-27
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 4585 5.0 0.7 48264 21296 - Rs 21:14 2:46.12 /usr/local/sbin/glusterd --pid-file=/var/run/glusterd.pid (glusterfsd)
root 4604 5.0 0.6 61196 18976 - Rs 21:15 2:33.10 /usr/local/sbin/glusterfs -s localhost --volfile-id shd/data
root 4595 4.0 0.7 62108 22524 - Rs 21:15 2:21.21 /usr/local/sbin/glusterfsd -s node0 --volfile-id data.node0.bricks-data-01
root 4597 4.0 0.7 61716 21084 - Rs 21:15 2:19.32 /usr/local/sbin/glusterfsd -s node0 --volfile-id data.node0.bricks-data-02
root 4599 4.0 0.7 61716 21112 - Rs 21:15 2:17.58 /usr/local/sbin/glusterfsd -s node0 --volfile-id data.node0.bricks-data-03
root 4601 3.0 0.7 61716 21064 - Rs 21:15 2:20.37 /usr/local/sbin/glusterfsd -s node0 --volfile-id data.node0.bricks-data-04
root 4784 0.0 0.0 432 244 2 R+ 22:20 0:00.00 grep
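
Since this is a replicated volume with an arbiter, the self-heal status command is worth remembering; on a healthy volume it should list no entries, and after a data node outage it shows which files still need to be synchronized:

[node0] # gluster volume heal data info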


The GlusterFS data volume is now created and started. You can mount it and use it the way you like.

[node2] # mkdir /data

[node2] # kldload fusefs

[node2] # mount_glusterfs node0:/data /data

[node2] # echo $?
0

[node2] # df -h /data
Filesystem Size Used Avail Capacity Mounted on
/dev/fuse 123G 2.5G 121G 2% /data
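
A quick write test confirms that the client can actually create files on the volume; this is purely optional:

[node2] # touch /data/test && ls -l /data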



Voila! Mounted and ready to serve.
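
To have the fusefs module loaded and the volume mounted automatically at boot, something along these lines should work; treat the fstab line as a sketch – the mountprog, late and failok options come from mount(8) and fstab(5), but I have not verified this exact entry:

[node2] # sysrc kld_list+=fusefs

[node2] # cat >> /etc/fstab << EOF
node0:/data /data fusefs rw,mountprog=/usr/local/sbin/mount_glusterfs,late,failok 0 0
EOF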

Tuning


GlusterFS comes without any tuning applied, so here are some settings I suggest as a starting point.

[node0] # gluster volume set data client.event-threads 8
[node0] # gluster volume set data cluster.lookup-optimize on
[node0] # gluster volume set data cluster.readdir-optimize on
[node0] # gluster volume set data features.cache-invalidation on
[node0] # gluster volume set data group metadata-cache
[node0] # gluster volume set data network.inode-lru-limit 200000
[node0] # gluster volume set data performance.cache-invalidation on
[node0] # gluster volume set data performance.cache-refresh-timeout 10
[node0] # gluster volume set data performance.cache-size 1GB
[node0] # gluster volume set data performance.io-thread-count 16
[node0] # gluster volume set data performance.parallel-readdir on
[node0] # gluster volume set data performance.stat-prefetch on
[node0] # gluster volume set data performance.write-behind-trickling-writes on
[node0] # gluster volume set data performance.write-behind-window-size 100MB
[node0] # gluster volume set data server.event-threads 8
[node0] # gluster volume set data server.outstanding-rpc-limit 256
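
You can review the effective settings afterwards with the volume get command and filter for whatever you are interested in, for example:

[node0] # gluster volume get data all | grep -e cache -e threads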


That is all in this rather short guide.

Treat it as an addendum to the original GlusterFS article linked earlier.

EOF
