Keywords: Full guide to setting up a Kubernetes Stacked ETCD High Availability control plane cluster, with diagrams and references.
Setup K8 HA – Part 5

Step 25 - Check Worker Node

It’s worth noting that checking statuses and overall cluster health as you go along is quite important, especially if you’re not very experienced yet: it helps you catch issues earlier, narrow them down, and identify the likely root cause.

Ergo, check the worker node along with the cluster’s overall health status:
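A minimal health check, assuming kubectl is already configured on one of the control plane nodes (for example via /etc/kubernetes/admin.conf):

 kubectl get nodes -o wide    # all nodes should report Ready
 kubectl get pods -A          # system pods should be Running
 kubectl cluster-info         # API server and CoreDNS endpoints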

If you’ve made it all the way here with your nodes in a “Ready” status, you deserve a pat on the back.

Typically one should have more than a single worker node, so I’m going to expand my cluster by adding two more worker nodes.

The same way we bootstrapped our first worker node also applies to the second, third, fourth … and so on, as sketched below.
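A hedged sketch of the join step on each additional worker; the token and CA hash below are placeholders for the values printed by your own kubeadm init output, and the VIP is the one used throughout this guide:

 # run on each new worker node (placeholder values)
 kubeadm join 192.168.1.161:6443 \
   --token <your-token> \
   --discovery-token-ca-cert-hash sha256:<your-ca-cert-hash>

If the original token has expired, regenerate the full command on a control plane node with “kubeadm token create --print-join-command”.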

My current node/infra count:

Awesome!! … all my nodes are in “Ready” state.

If you’re wondering how my worker nodes are all 35s old, that’s because I’m simply using Ansible to run the join in parallel (that is not a requirement, though), roughly as shown below.
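A hedged example of such a parallel run; the inventory group name “workers” is a placeholder for your own inventory, and the join command is the same placeholder one shown above:

 ansible workers -b -m shell -a "kubeadm join 192.168.1.161:6443 --token <your-token> --discovery-token-ca-cert-hash sha256:<your-ca-cert-hash>"

Ansible runs the command against all hosts in the group concurrently (5 forks by default), which is why the worker nodes end up with nearly identical ages.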

Step 26 - Keepalived

Last but certainly not least, install and configure Keepalived to manage our VIP, and remove the temporary one we set at the beginning of this guide.

In my case I have a more sophisticated setup where I containerized Keepalived and deployed it into Kubernetes with node affinity so it runs only on control plane nodes (just giving you ideas).

Installing and configuring Keepalived directly on the hosts is a rather simple task, so we’ll stick to that for now; it provides similar functionality anyway.

With this in mind, if you’re interested in digging deeper into load-balancing techniques, check out the official site:

Official Keepalived Loadbalancing techniques

On all three control plane nodes:
yum install keepalived
On node1: (Control Plane Node 1) vim /etc/keepalived/keepalived.conf

cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
  vrrp_version 2
  vrrp_garp_master_delay 1
  vrrp_garp_master_refresh 60
  script_user root
  enable_script_security
}

vrrp_script chk_script {
  script "/usr/bin/curl --silent --max-time 2 --insecure https://127.0.0.1:6443/ -o /dev/null"
  interval 1 # check every 1 second
  fall 2     # require 2 failures to mark the node as failed
  rise 2     # require 2 successes to mark it healthy again
}

vrrp_instance lb-vips {
    state BACKUP
    interface eth0
    virtual_router_id 206
    priority 100
    advert_int 1
    nopreempt # prevent fail-back
    track_script {
      chk_script
    }
    authentication {
        auth_type PASS
        auth_pass password
    }
    virtual_ipaddress {
        192.168.1.161/32 dev eth0
    }
}
EOF
On node 2: (Control Plane Node 2) vim /etc/keepalived/keepalived.conf

cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
  vrrp_version 2
  vrrp_garp_master_delay 1
  vrrp_garp_master_refresh 60
  script_user root
  enable_script_security
}

vrrp_script chk_script {
  script "/usr/bin/curl --silent --max-time 2 --insecure https://127.0.0.1:6443/ -o /dev/null"
  interval 1 # check every 1 second
  fall 2     # require 2 failures to mark the node as failed
  rise 2     # require 2 successes to mark it healthy again
}

vrrp_instance lb-vips {
    state BACKUP
    interface eth0
    virtual_router_id 206
    priority 100
    advert_int 1
    nopreempt # prevent fail-back
    track_script {
      chk_script
    }
    authentication {
        auth_type PASS
        auth_pass password
    }
    virtual_ipaddress {
        192.168.1.161/32 dev eth0
    }
}
EOF

On node3: (Control Plane Node 3) vim /etc/keepalived/keepalived.conf

cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
  vrrp_version 2
  vrrp_garp_master_delay 1
  vrrp_garp_master_refresh 60
  script_user root
  enable_script_security
}

vrrp_script chk_script {
  script "/usr/bin/curl --silent --max-time 2 --insecure https://127.0.0.1:6443/ -o /dev/null"
  interval 1 # check every 1 second
  fall 2     # require 2 failures to mark the node as failed
  rise 2     # require 2 successes to mark it healthy again
}

vrrp_instance lb-vips {
    state BACKUP
    interface eth0
    virtual_router_id 206
    priority 100
    advert_int 1
    nopreempt # prevent fail-back
    track_script {
      chk_script
    }
    authentication {
        auth_type PASS
        auth_pass password
    }
    virtual_ipaddress {
        192.168.1.161/32 dev eth0
    }
}
EOF

Check pod statuses – in my containerized setup, checking my Keepalived pods:
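If you went the containerized route, a hedged check might look like this (the namespace and name depend entirely on how you deployed it):

 kubectl get pods -n kube-system -o wide | grep -i keepalived

If you installed Keepalived with yum as shown above, there are no pods to check; we’ll verify the daemon and the VIP after starting the service in the next step.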

Step 27 (and Final) - Remove the Temporary VIP

Now that we have Keepalived installed and configured, it’s time to remove the temporary VIP as we fire up Keepalived, starting with the second master node so that the initial VIP assignment is handled correctly.

Remove the temporary IP added in Part 3 accordingly, and start the keepalived service.

 ip addr del 192.168.1.161/32 dev eth0
Start keepalived:
 systemctl start keepalived
Enable keepalived:
 systemctl enable keepalived
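A quick sanity check once Keepalived is running on all three control plane nodes; exactly one node should hold the VIP, and the API server should answer through it:

 systemctl status keepalived
 ip addr show eth0 | grep 192.168.1.161        # should appear on exactly one control plane node
 curl -k https://192.168.1.161:6443/version    # /version is anonymously readable by default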

CONGRATULATIONS MISSION FINALLY ACCOMPLISHED!

At this point, you should now have a fully functional Kubernetes HA cluster with three control plane nodes providing high availability and failover!

Test time - Test High Availability and Failover

To test:

  1. Shut down one of the control plane nodes (the one that currently has the VIP assigned) and make sure kubectl still works on one of the other control plane nodes. You can run something like “kubectl get pods”.

  2. Turn the control plane node you shut down back on, and wait for it to finish syncing and re-joining the cluster.

Repeat steps 1 and 2 for the other two control plane nodes; kubectl commands should continue to function regardless of which control plane node is down. A sketch of the checks is below.
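A hedged sketch of the failover checks, assuming the VIP and interface used throughout this guide:

 # on each control plane node: see which one currently holds the VIP
 ip addr show eth0 | grep 192.168.1.161

 # from a surviving control plane node (or any machine whose kubeconfig points at the VIP)
 kubectl get nodes
 kubectl get pods -A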

That’s it ..

See you in the next one! Cheers
