Raspberry Pi Docker Swarm cluster 8 months later

My Swarm cluster has now been up and running since December of 2019 with only a few problems. Below are the problems I ran into and the changes I made after 8 months of 24/7 usage:

  1.  loss of 2 micro SD cards
  2.  replaced R Pi 3 with R Pi 4
  3.  added Swarmpit as a cluster monitor
  4.  added Pi-hole
  5.  added openHAB

About a week ago I had a few issues that caused some of the services hosted on the cluster to become unavailable.  This led to me having to do a hard reboot on 3 different Pis.  During this I lost the SD card in the Pi 3, which was the leader of the swarm cluster and hosted Portainer.  The other Pi that lost its card was the 3rd node in the cluster.  The third Pi I had to hard reboot came back fine.  I had already been experimenting with another Pi 4, booting it from USB without any issues, so I decided to move the cluster over to SSDs.  I had to rebuild the two devices that lost SD cards before I could move them over to SSDs.  I then had to remove all the nodes from the Swarm cluster, rebuild the cluster, and re-install Portainer.  I wrote a post listing the exact steps I used to move to SSDs, which can be found HERE.  In reality I am still using the SD card to host the /boot directory, and all the other files are located on the USB SSD drive.  There are now ways to move these R Pis over to straight USB boot, but that would require a firmware change and I am not going to do that right now.  The cluster has been running this way for a few weeks and I have seen increased speed and better functionality.
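If you want to script that kind of rebuild, here is a rough sketch in Python of the order of Docker CLI commands involved.  It is not the exact procedure I ran: the host names, the IP address, and the `rebuild_plan` helper are made up for illustration, and the join token is a placeholder.  It only builds the command lists and does not execute anything.  Re-installing Portainer is left out because that depends on how you deploy it.

```python
from typing import Dict, List

SWARM_PORT = 2377  # default Swarm cluster management port


def rebuild_plan(manager: str, manager_ip: str, workers: List[str],
                 token: str = "<worker-join-token>") -> Dict[str, List[List[str]]]:
    """Build (but do not run) the docker commands for each host, in order.

    All host names and the token are hypothetical placeholders.
    """
    plan: Dict[str, List[List[str]]] = {}

    # Every node first leaves the old swarm.
    for host in [manager, *workers]:
        plan[host] = [["docker", "swarm", "leave", "--force"]]

    # The new manager creates a fresh swarm, then prints a worker join token.
    plan[manager] += [
        ["docker", "swarm", "init", "--advertise-addr", manager_ip],
        ["docker", "swarm", "join-token", "-q", "worker"],
    ]

    # Each worker joins the new swarm with that token (placeholder here).
    for host in workers:
        plan[host].append(
            ["docker", "swarm", "join", "--token", token,
             f"{manager_ip}:{SWARM_PORT}"]
        )

    # Finally, list the nodes on the manager to confirm everything rejoined.
    plan[manager].append(["docker", "node", "ls"])
    return plan
```

Calling `rebuild_plan("pi4-a", "192.168.1.10", ["pi4-b", "pi4-c"])` returns a dictionary keyed by host, so you can print the commands and run them by hand on each Pi rather than trusting a script with a cluster that is already in a bad state.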

I also decided that I would replace the R Pi3 with an R Pi4 just to improve overall function and performance.  The R Pi3 had lost its SD card, so there was nothing holding me back.  The R Pi3 also had performance issues, such as sluggish terminal access, and was just slower overall than the Pi4.  Logically it just made sense, and if I still had the R Pi3 I wouldn’t be able to have Swarmpit installed alongside Portainer.

Then I started wanting to track how the nodes of the cluster were performing and using resources.  I was hoping Portainer had what I was looking for, but it didn’t.  I have Nagios monitoring temperature, drive space, and memory usage, but I still wanted something that presented a little better.  So after some research I found Swarmpit.  It has some of the same features, but it also shows node usage over time, the cluster resources all together, and then breaks them out by node.  Swarmpit uses multiple containers to make it work: the app, an agent, and two db containers.  The app has to run on the leader, so Portainer and Swarmpit are the only things running on the leader node.  I like being able to bring this up and view a simple interface that shows the last hour along with current usage.  Below are a few pics of the Swarmpit interface.

Finally I added a few more containers to get more functionality out of the cluster.  The first one is Pi-hole, which is used to black hole advertisements and unwanted DNS lookups.  It has a great interface and allows you to search historic data, but it will only display the last 24 hours on the initial dashboard you see when launching the web GUI.  You simply point it to the DNS servers you want it to send requests to and then hand out Pi-hole as the DNS server in your DHCP server.  You will be surprised by how much garbage this container will keep off your network.  The other container I added was openHAB.  I already had this running on an R Pi2, and that device had an SD card that failed, so moving it over just made sense.  I have used this for some basic automation, and after a little configuration it took over everything I had without problems.  Below are some screenshots of the two interfaces.
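As an aside, the black-holing idea is simple enough to show in a few lines.  The toy sketch below, in Python, mimics the decision a DNS sinkhole makes for each lookup: blocked names get an unroutable answer, and everything else is forwarded upstream.  This is not Pi-hole's actual code.  The `resolve` function, the blocklist entry, and the stand-in upstream lookup are all invented for illustration, and answering with 0.0.0.0 is an assumption about how the blocking is configured.

```python
from typing import Callable, Set

SINKHOLE_IP = "0.0.0.0"  # unroutable answer handed back for blocked names (assumed)


def resolve(domain: str, blocklist: Set[str],
            upstream: Callable[[str], str]) -> str:
    """Answer blocked domains with the sinkhole address, forward the rest."""
    # Normalise case and strip a trailing dot so "ADS.Example.com." still matches.
    name = domain.lower().rstrip(".")
    if name in blocklist:
        return SINKHOLE_IP
    return upstream(name)


def fake_upstream(name: str) -> str:
    """Hypothetical stand-in for the real upstream DNS servers."""
    return "203.0.113.7"  # address from a documentation-only range
```

With a blocklist of `{"ads.example.com"}`, a lookup for that name never leaves your network, while a lookup for `example.org` is passed to the upstream function as usual.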

This cluster is ever-changing and I am slowly adding additional containers to it.  I am thinking about adding an x86_64 machine and another R Pi4, so I will write another post once I make some more changes.  Feel free to ask questions or just comment.

