With a little inspiration from the 200TB Glusterfs Odroid HC-2 Build posted to /r/DataHoarder/ a while back, and a whole lot of bonus.ly dollars from work, I have finally completed my 3 node HC-1 cluster build and am writing to share my experiences with anyone else interested in checking out single board computing for themselves. Unlike the massive amount of storage in the Odroid HC-2 build, I am using the Odroid HC-1. The main difference is that the HC-1 only fits a 2.5" drive, while the HC-2 takes a full size 3.5" HDD. The purpose of this build is not a NAS, but the clustering software itself: primarily Docker Swarm backed by Glusterfs for highly available containers and mounted volumes.
Parts List
Here is the complete parts list with prices in US dollars. There are a couple of ways you could go with powering this, so I have broken power out into separate lists, along with the tools and other things I bought to make the build a little better but that are not necessary for it to work. All of it was purchased from Amazon, so prices may vary a bit.
note
The 5V 20A power supply I first purchased for this project ended up not being able to deliver enough power to keep the Odroids running under anything above average load. It could power all three, but during stress tests, even ones that just downloaded a large file or used around 50% CPU, an Odroid would crash. There were also times when an Odroid would not seem to get enough power to boot and would sit in a reboot loop for quite a while, or never fully come back up. I do not recommend that model for this project. I have since replaced it with Power option #3 and am very happy with the results. This lab grade DC power supply (1.5-15V, 30A) is way overkill for anyone in their right mind, but it will serve as a great tool in my homelab and will be able to run a lot more devices in the future as I expand my SBC collection. I have adjusted it up to around 5.3-5.4 volts and the Odroids seem to run much better with a bit of a boost. I have since confirmed with a digital multimeter that a steady ~5.3 volts comes out of the 5.5mm barrel plugs.
Odroid
Part | Amount | Price | Total |
---|---|---|---|
Odroid HC-1 | 3 | $59.95 | $179.85 |
32GB MicroSD | 3 | $11.49 | $34.47 |
240GB SSD | 3 | $47.99 | $143.97 |
80mm Fan | 1 | $15.95 | $15.95 |
6 pack Cat 6 Ethernet Cable 1 ft | 1 | $8.90 | $8.90 |
Total | | | $383.14 |
Power option #1 - Standard Odroid power supplies
Part | Amount | Price | Total |
---|---|---|---|
5V 4A Power Supply US Plug | 3 | $15.95 | $47.85 |
Total | | | $47.85 |
Power option #2 - What I did.
Part | Amount | Price | Total |
---|---|---|---|
9 Port 40A Power Splitter | 1 | $58.00 | $58.00 |
20 pack 10 inch 2.1 x 5.5mm Male DC Power Pigtail Connectors | 1 | $9.99 | $9.99 |
Wire Crimping Tool for Powerpole | 1 | $34.99 | $34.99 |
15 Amp Unassembled Red/Black Anderson Powerpole Connectors | 1 | $16.47 | $16.47 |
Heat Gun | 1 | $23.94 | $23.94 |
Heat Shrink Tubing | 1 | $8.99 | $8.99 |
10 pack 4 Amp Two Prong Blade Plug-in ATC Fuses | 1 | $6.30 | $6.30 |
Total | | | $158.68 |
Power option #3 - More power!
Part | Amount | Price | Total |
---|---|---|---|
DC Power Supply 1.5-15V 30A | 1 | $149.99 | $149.99 |
12 awg Copper Banana Plug | 1 | $10.49 | $10.49 |
Total | | | $160.48 |
Putting it together
I started with one Odroid and added to it over a few months. I had been tinkering with a couple of Raspberry Pi 3B+s and having fun, and decided I wanted something a bit more powerful for running tests, and possibly to use as a logging and analytics back end down the road when I get bored with it and just make it do one thing for a while. One of the first things you'll notice moving from a Pi to these Odroids is that they have different power needs than most smaller boards, since they need extra power to run the SATA connected drive. The casing on the HC-1 acts as both a stackable case and a heat sink, so scaling out is easy. This is one nice benefit over some of the other boards out there, which also require a case.
Power
The 5V 20A power supply I went with should, on paper, be enough to power 5 HC-1s at load, which is good enough for what I have planned. I used a Powerpole power splitter for the first time ever and am pretty happy with how it turned out. I can easily make a new power cable, add a 4 Amp fuse, and add another node to the cluster. Be sure to use the 15 Amp connectors; the 30 Amp connectors that came with the crimper are too big for these small wires.
As noted above, I ended up replacing this with a much better power supply.
The 5V 4A power brick you can order with the HC-1 works just fine, and it's what I used while evaluating the first board, but I don't want to manage 3 to 5 of those bricks in the already filling up power strip on the bottom shelf of the rack. That is why I went with a solution that will scale with the number of nodes I add, and then some.
From failed attempts at powering up the HC-1 from USB power bricks, I discovered that even if you buy a USB power supply that advertises 12 Amps, it is still most likely limited to 2.5 Amps per USB port. That is great for a Pi, but it will not run the HC-1. And as covered in the note above, the 5V 20A supply ended up not working well, either.
Cooling
I went with a 5V USB powered fan and tested it plugged into the front of one of the HC-1s, where it worked great, but I opted to take a spare connector and plug it directly into the power splitter. This means it runs at full speed at all times, but so far that hasn't been a problem. Things stay very cool and the fan barely makes any noise, even at full speed. I've been happy with every Noctua fan I've bought.
OS
So far, I've been able to get Ubuntu 16.04 and 18.04 running from images found on the Odroid site. I tested Armbian, but wasn't able to get a shell on the box afterwards; I'm not sure whether ssh is enabled by default, but I'm certain this was power related and not Armbian related. I also tested Arch, without success so far. I'm happy with Ubuntu 18.04 right now because Docker Swarm seems to be working as it should with armhf 32-bit applications. I was getting weird kernel errors when I tested Docker Swarm on Ubuntu 16.04 and was never able to get Swarm services started, though I was able to start containers in standalone Docker mode.
I've since switched to using Armbian Ubuntu Bionic (18.04) as the base for all three and it's working great. Everything works out of the box so far (Glusterfs and Docker Swarm).
Example error output for Ubuntu 16.04 in Swarm mode:
"subnet sandbox join failed for "10.0.0.0/24": error creating vxlan interface: operation not supported"
docker service ps --no-trunc ubuntu_ubuntu
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
jip55rydfymv3leidoxfc2fiy ubuntu_ubuntu.1 armv7/armhf-ubuntu:latest@sha256:fc32949ab8547c9400bb804004aa3d36fd53d2fedba064a9594a173a6ed4a3b6 ninja.dojo.io Ready Rejected 1 second ago "subnet sandbox join failed for "10.0.0.0/24": error creating vxlan interface: operation not supported"
But things seem to work great in 18.04, so that's what I'm sticking with for now. I've had mixed results with Docker Swarm so far, but standalone Docker seems to work fine. Glusterfs has been working very well on both 16.04 and 18.04, and it kept working just fine through an upgrade from 16.04 to 18.04 without error; so did Docker. Elasticsearch 5.0.1 and 6.3.1 both tested well on the Odroid. It was able to keep up with 3 packetbeat clients hitting it with network traffic, as well as Kibana and Grafana querying it on the front end. I started putting that together in an Ansible role, arm-elasticsearch.
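I never fully tracked down the 16.04 issue, but if you hit the same error, one quick thing to check (this is only my guess at the root cause) is whether the running kernel can actually create vxlan interfaces, since Swarm's overlay networking depends on them:
# Try loading the vxlan module and confirm it shows up
sudo modprobe vxlan
lsmod | grep vxlan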
Installation
With an image in hand, installation is pretty simple. Insert the micro SD card (here it shows up as /dev/mmcblk0) and use dd to copy the image onto it.
sudo dd if=ubuntu-18.04-4.14-minimal-odroid-xu4-20180531.img of=/dev/mmcblk0 bs=4M status=progress
1577058304 bytes (1.6 GB, 1.5 GiB) copied, 1 s, 1.6 GB/s
499+1 records in
499+1 records out
2094006272 bytes (2.1 GB, 2.0 GiB) copied, 85.9404 s, 24.4 MB/s
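The SD card won't always show up as /dev/mmcblk0 (it depends on the card reader), so before running dd it's worth double checking which block device is the card, something like:
# List block devices with sizes so the 32GB card is easy to spot
lsblk -o NAME,SIZE,MODEL
# After dd finishes, flush writes before pulling the card
sync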
Plug it in and connect an Ethernet cable. You'll want a DHCP server running to give the new box an IP address. Ping 'odroid' until you see it come online (see the loop below), then ssh in to update the hostname, add a user, etc. I'm leaving DHCP enabled and just relying on a hostname.
Defaults:
- user = root
- pass = odroid
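With those defaults, something like this will wait for the node to answer and then drop you into a shell. This assumes your DHCP server registers the 'odroid' hostname in local DNS; otherwise grab the IP from its lease table.
# Poll until the new board responds, then ssh in as root
until ping -c 1 odroid > /dev/null 2>&1; do sleep 5; done
ssh root@odroid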
Update /etc/hosts and /etc/hostname, e.g. for the host 'ninja':
root@ninja:~# cat /etc/hostname
ninja
root@ninja:~# cat /etc/hosts
127.0.0.1 ninja
Add a user, where user == your user
root@ninja:~# adduser user
If you want ansible to have passwordless sudo, where user == your user
root@ninja:~# cat /etc/sudoers.d/user
user ALL=(ALL) NOPASSWD: ALL
Upgrade, install python, and reboot. Do this with all three nodes.
apt update
apt upgrade
apt install python
reboot
If you want passwordless ssh access, add an ssh key to each host.
ssh-copy-id ninja
ssh-copy-id oroku
ssh-copy-id venus
Now that each node has an identity, a user with sudo and ssh access, and python installed, they are ready for Ansible provisioning.
Glusterfs
One of the main purposes of this build is to test Glusterfs as an alternative to mounting persistent Docker volumes from an NFS share. Gluster seems to scale well enough and should keep up as Docker Swarm nodes are added. An NFS server is a valid solution for data persistence, but it leaves a single point of failure: if the NFS server goes down, it doesn't matter whether my Swarm cluster has 3 nodes or 30, I have just lost access to any data stored on it. Replicating storage across all Swarm nodes means a Docker service could fail on node 00 and automatically be brought back up on node 01, where the data has already been replicated; the volume remounts the directories from the gluster client and the service starts back up with minimal downtime. With enough nodes and replication, you can potentially lose a hardware node or two and keep running normally while you replace the failed hardware.
Once I had the second Odroid, I started testing Glusterfs across two machines with 1 brick each and a replication factor of 2. Now that I have a third node, I've been testing a dispersed volume, but have been experiencing a lot of gluster-client connection issues. I'm not sure what to chalk that up to just yet; I have a lot more learning to do. So, currently, I am running a replicated volume across the three nodes with a replica count of 3, meaning any file I write to or delete from the SSD on node 00 is also written to or deleted from the SSDs on nodes 01 and 02. If I start a mysql Docker service with a volume mounted on this share, it should work on whichever node it starts on.
While Gluster isn't bad at all to set up manually, there is an Ansible gluster_volume module that makes setup way easier! I followed Jeff Geerling's post on a Simple GlusterFS Setup with Ansible to get started, and added some tasks to handle disk partitioning and formatting for me.
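For a sense of what the playbook automates, the manual equivalent is only a handful of commands, run from one node once the bricks are formatted and mounted on all three (a rough sketch using my hostnames):
# From ninja: add the other two nodes to the trusted pool
gluster peer probe venus
gluster peer probe oroku
# Create and start a 3-way replicated volume across the bricks
gluster volume create g1 replica 3 ninja:/bricks/brick1 venus:/bricks/brick1 oroku:/bricks/brick1 force
gluster volume start g1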
Each node has the following:
- python installed
- ansible user with ssh key access
- ansible user has sudo access
Starting with an inventory.ini file to define the hosts.
inventory.ini
[gluster]
ninja
venus
oroku
Test a connection
ansible -i inventory.ini all -m ping
oroku | SUCCESS => {
"changed": false,
"ping": "pong"
}
ninja | SUCCESS => {
"changed": false,
"ping": "pong"
}
venus | SUCCESS => {
"changed": false,
"ping": "pong"
}
Create a playbook that will configure as many nodes as I add to this inventory.
First, define a couple of variables.
playbook.yml
---
- hosts: gluster
become: true
vars:
gluster_mount_dir: /mnt/g1
gluster_brick_dir: /bricks/brick1
gluster_brick_name: g1
With hosts defined and variables set, add tasks to partition the SSD. Parted is installed and used to create a primary partition on /dev/sda, which is where the SSD shows up. The micro SD card shows up as /dev/mmcblk0.
tasks:
- name: Install parted
package:
name: parted
state: present
tags:
- setup
- name: Create Primary partition
parted:
device: /dev/sda
number: 1
state: present
tags:
- setup
An ext4 file system is then created,
- name: Create a ext4 filesystem on /dev/sda1
filesystem:
fstype: ext4
dev: /dev/sda1
tags:
- setup
As are the Gluster brick and mount directories.
- name: Ensure Gluster brick and mount directories exist.
file:
path: "{{ item }}"
state: directory
mode: 0775
with_items:
- "{{ gluster_brick_dir }}"
- "{{ gluster_mount_dir }}"
Which are then mounted and added to the /etc/fstab file.
- name: Mount "{{ gluster_brick_dir }}"
mount:
path: "{{ gluster_brick_dir }}"
src: /dev/sda1
fstype: ext4
state: present
tags:
- setup
The real magic happens with the gluster_volume module. It will create the defined brick and reach out to any and all other nodes defined in the inventory file to add them to the cluster and start replication. force: yes is only used because this is happening on the /dev/sda partition, which gluster assumes is the root partition, but in this case it is not.
- name: Configure Gluster volume.
gluster_volume:
state: present
name: "{{ gluster_brick_name }}"
brick: "{{ gluster_brick_dir }}"
replicas: "{{ groups.gluster | length }}"
cluster: "{{ groups.gluster | join(',') }}"
host: "{{ inventory_hostname }}"
force: yes
run_once: true
Finally, mount the volume with the gluster-client.
- name: Ensure Gluster volume is mounted.
mount:
name: "{{ gluster_mount_dir }}"
src: "{{ inventory_hostname }}:/{{ gluster_brick_name }}"
fstype: glusterfs
opts: "defaults,_netdev,log-level=WARNING,log-file=/var/log/gluster.log"
state: mounted
Run it with:
ansible-playbook -i inventory.ini playbook.yml
PLAY [gluster] ********************************************************************************************************
TASK [Gathering Facts] ************************************************************************************************
ok: [oroku]
ok: [ninjara]
ok: [venus]
...
...
...
TASK [Configure Gluster volume.] **************************************************************************************
ok: [ninjara]
TASK [Ensure Gluster volume is mounted.] ******************************************************************************
ok: [oroku]
ok: [venus]
ok: [ninjara]
PLAY RECAP ************************************************************************************************************
ninjara : ok=9 changed=0 unreachable=0 failed=0
oroku : ok=8 changed=0 unreachable=0 failed=0
venus : ok=8 changed=0 unreachable=0 failed=0
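Since the partitioning and formatting tasks are tagged, they can also be run on their own later, for example against just a new node before adding it to the volume (newnode here is a placeholder hostname):
# Run only the setup-tagged tasks against one host
ansible-playbook -i inventory.ini playbook.yml --tags setup --limit newnode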
Verify peers with:
root@oroku:~# gluster peer status
Number of Peers: 2
Hostname: ninjara
Uuid: 19c60517-d584-4cf2-9dcb-fedca9463507
State: Peer in Cluster (Connected)
Hostname: venus
Uuid: f582eb76-97f0-483d-9a31-91c272915940
State: Peer in Cluster (Connected)
Verify volume with:
root@oroku:~# gluster volume status
Status of volume: g1
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick ninja:/bricks/brick1 49152 0 Y 1689
Brick venus:/bricks/brick1 49152 0 Y 1019
Brick oroku:/bricks/brick1 49152 0 Y 10326
NFS Server on localhost N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A Y 10317
NFS Server on ninjara N/A N/A N N/A
Self-heal Daemon on ninjara N/A N/A Y 1680
NFS Server on venus N/A N/A N N/A
Self-heal Daemon on venus N/A N/A Y 973
Task Status of Volume g1
------------------------------------------------------------------------------
There are no active volume tasks
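Before layering Docker on top, a quick way to convince yourself replication is working is to write a file through one node's mount and read it back from another:
# On ninja
echo "hello from ninja" > /mnt/g1/hello.txt
# On venus (or oroku)
cat /mnt/g1/hello.txt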
Docker Swarm
Setting up Docker Swarm is easy enough, so I won't go into too much detail and will do a lazy setup for this example.
On each of the nodes, download the docker-install script and run it.
apt install curl
curl -fsSL get.docker.com -o get-docker.sh
sh get-docker.sh
Add your user to the docker group if you want to run docker without sudo (log out and back in for the group change to take effect), where user == your user
sudo usermod -aG docker user
Init the Swarm
root@oroku:~# docker swarm init
Check to see what the manager join token is.
root@oroku:~# docker swarm join-token manager
Run the command it spits back at you on the other two nodes.
root@ninja:~# docker swarm join --token <manager-token-goes-here> 192.168.123.123:2377
root@venus:~# docker swarm join --token <manager-token-goes-here> 192.168.123.123:2377
Verify with docker node ls that the Swarm is up and running with 3 managers.
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
ukqyy69598inihrq29kf3preo ninja.dojo.io Ready Active Reachable 18.06.0-ce
csa4ykcq1nszafenlewvq5gj0 oroku.dojo.io Ready Active Leader 18.06.0-ce
a2uu7sljay4q7rueez6thxj6d * venus.dojo.io Ready Active Reachable 18.06.0-ce
Docker Swarm is now up and ready to start running services! Now, to do a quick test and see if data volumes will persist across the nodes.
Mysql test
Where /mnt/g1 is the mounted gluster share on every box.
mysql-stack.yml
version: '3'
services:
mysql:
image: hypriot/rpi-mysql
environment:
- MYSQL_ROOT_PASSWORD=root
- MYSQL_DATABASE=test
- MYSQL_USER=test
- MYSQL_PASSWORD=test
volumes:
- /mnt/g1/mysql:/var/lib/mysql/
ports:
- 3306:3306
Deployed with:
mkdir -p /mnt/g1/mysql
root@oroku:~# docker stack deploy -c mysql-stack.yml mysql
Creating network mysql_default
Creating service mysql_mysql
Running docker stack ps shows that it's running on host venus.
root@oroku:~# docker stack ps mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
ul55v78dgef7 mysql_mysql.1 hypriot/rpi-mysql:latest venus.dojo.io Running Running 21 seconds ago
SSH to host venus and open a mysql shell on the container itself to generate a bit of data.
root@venus:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a2c7d29790ab hypriot/rpi-mysql:latest "/entrypoint.sh mysq…" About a minute ago Up About a minute 3306/tcp mysql_mysql.1.21an16vybjn7ipve11krmrv8b
root@venus:~# docker exec -it a2c7d29790ab mysql -u'root' -p'root'
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.60-0+deb7u1 (Debian)
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
Create a database
mysql> create database odroid;
Query OK, 1 row affected (0.07 sec)
Create a table
mysql> CREATE TABLE IF NOT EXISTS `odroid`.`nodes` (
`id` INT AUTO_INCREMENT ,
`host` VARCHAR(15) NOT NULL ,
`model` VARCHAR(15) NOT NULL ,
PRIMARY KEY (`id`));
mysql> describe nodes;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| host | varchar(15) | NO | | NULL | |
| model | varchar(15) | NO | | NULL | |
+-------+-------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)
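Optionally, insert a row or two so there is actual data, not just schema, to look for after the failover (the values here are made up):
mysql> INSERT INTO `odroid`.`nodes` (`host`, `model`) VALUES ('ninja', 'HC-1'), ('venus', 'HC-1');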
Gluster is replicating the new mysql database to all 3 nodes.
oroku
root@oroku:/home/wgill# ls -list /mnt/g1/mysql/odroid/
total 9
11527446623504689574 9 -rw-rw---- 1 999 docker 8618 Aug 21 06:17 nodes.frm
13231204873164849180 1 -rw-rw---- 1 999 docker 65 Aug 21 06:07 db.opt
venus
root@venus:~# ls -list /mnt/g1/mysql/odroid/
total 9
11527446623504689574 9 -rw-rw---- 1 999 docker 8618 Aug 21 06:17 nodes.frm
13231204873164849180 1 -rw-rw---- 1 999 docker 65 Aug 21 06:07 db.opt
ninja
root@ninja:~# ls -list /mnt/g1/mysql/odroid/
total 9
11527446623504689574 9 -rw-rw---- 1 999 docker 8618 Aug 21 06:17 nodes.frm
13231204873164849180 1 -rw-rw---- 1 999 docker 65 Aug 21 06:07 db.opt
To simulate a soft fail-over, I'll set the node the service is running on to drain and see if it migrates to another node and starts back up.
root@oroku:/home/wgill# docker node update --availability drain venus.dojo.io
venus.dojo.io
docker node ls shows that the node is now set to drain.
root@oroku:/home/wgill# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
12fbmbm4hoebavv9z5ihxhetp ninja.dojo.io Ready Active Reachable 18.06.0-ce
csa4ykcq1nszafenlewvq5gj0 * oroku.dojo.io Ready Active Reachable 18.06.0-ce
a2uu7sljay4q7rueez6thxj6d venus.dojo.io Ready Drain Leader 18.06.0-ce
The mysql service has moved to node ninja
root@oroku:/home/wgill# docker stack ps mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
j9eqoykbdix5 mysql_mysql.1 hypriot/rpi-mysql:latest ninja.dojo.io Running Running 29 seconds ago
21an16vybjn7 \_ mysql_mysql.1 hypriot/rpi-mysql:latest venus.dojo.io Shutdown Shutdown about a minute ago
Verify that the data has persisted by finding the container now running on the ninja node.
root@ninja:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
dd031910ccc4 hypriot/rpi-mysql:latest "/entrypoint.sh mysq…" 2 minutes ago Up About a minute 3306/tcp mysql_mysql.1.j9eqoykbdix5ub13n7gm0ywfz
Open a mysql shell
root@ninja:~# docker exec -it dd031910ccc4 mysql -u'root' -p'root'
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.60-0+deb7u1 (Debian)
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
Check that the table still exists.
mysql> use odroid;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> describe nodes;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| host | varchar(15) | NO | | NULL | |
| model | varchar(15) | NO | | NULL | |
+-------+-------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)
Woot! Good enough for a first test for me. Now the fun can begin!
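Once done testing, the drained node can be put back into rotation; the service will stay where it is until something causes another reschedule:
# Make venus schedulable again
docker node update --availability active venus.dojo.io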
Final Thoughts
I like these little boards and am excited to see what I can make and test on this cluster setup. My main gripe after using these for a month or so now would be power issues. Even with the power brick that comes from Hardkernel, these things don't always like to boot; sometimes it takes a few reboots before one will come back up. This was on previously working installs of Ubuntu 16.04 and 18.04. I have also been noticing it with the 20 amp power supply I had them plugged into: at times it seems it still isn't enough to power all 3, and 1 or 2 will fail to boot after a full cluster restart (tested by unplugging the power supply from the power strip and plugging it back in, cycling all three nodes). When they are up and running, however, they have been working well enough to get plenty of tests run. Gluster has been working great! Docker and Docker Swarm have been a little flaky, but it all still seems very promising.