Kubernetes Cluster on CentOS 8 Home Lab

I currently have three Dell T110 servers, each with a single-socket CPU (4-8 cores) and 16GB of memory. Although this isn’t an ideal setup, I don’t plan on running anything crazy. This is mostly about staying on top of skills.

Note: I used Tecmint articles for assistance in this build.

Hardware

  • 3 Dell T110 Servers
    • Server A: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2x500GB Raid 1, 16G Mem, 2x1G eNet
    • Server B: Intel(R) Xeon(R) CPU X3470 @ 2.93GHz, 2x500GB Raid 1, 16G Mem, 2x1G eNet
    • Server C: Intel(R) Xeon(R) CPU X3440 @ 2.53GHz, 2x500GB Raid 1, 16G Mem, 2x1G eNet
  • Synology DS418+
    • NFS, iSCSI, SMB
  • Netgear GS308E
    • Managed switch: VLAN, Aggregation

Getting Started

  1. Three servers running CentOS 8. One will be the master and the other two will be worker nodes.
  2. You should have at least 2 CPUs and 2GB of RAM or more per machine. You can get by with less, but your application performance will suffer.
  3. Internet connectivity on all your nodes, since we will be fetching the Kubernetes and Docker packages from their repositories. You will also need to make sure the DNF package manager is installed and can fetch packages remotely.
  4. All your nodes should be able to connect to one another, on either a private or public network, whichever is available.
  5. You will also need access to an account with sudo or root privileges.

Special Note: Make sure that each machine has a unique MAC address and a different product UUID. You can verify them with the following:

# ip link
1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp3s0f0: mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:1a brd ff:ff:ff:ff:ff:ff

# cat /sys/class/dmi/id/product_uuid
4c4c4544-0037-3310-8054-xxxxxxxxxx
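With only three boxes you can eyeball these, but if you script your builds it’s handy to automate the check. A small sketch (the find_duplicates helper and the ssh loop are my own; adjust the hostnames and access method to your environment):

```shell
#!/bin/sh
# Print any value that appears more than once on stdin.
# Feed it one MAC address or product UUID per line.
find_duplicates() {
    sort | uniq -d
}

# Example collection loop (assumes passwordless ssh to each node):
# for h in lab-t110-4a lab-t110-4b lab-t110-4c; do
#     ssh "$h" cat /sys/class/dmi/id/product_uuid
# done | find_duplicates
```

An empty result means every value is unique.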

Logical Architecture

Original image from kubernetes.io
  • Cluster: A collection of hosts (servers) that aggregates their available resources (RAM, CPU, disk, and devices) into a usable pool.
  • Master: The collection of components that make up the control plane of Kubernetes. These components are used for all cluster decisions, including scheduling and responding to cluster events.
  • Node: A single host, which may be a physical or virtual machine. Each node runs kube-proxy and a kubelet, which are considered part of the cluster.
  • Namespace: A logical cluster or environment. It is a widely used method for scoping access or dividing a cluster.

Okay, Let’s Get Some Prerequisites out of the Way!

On the CentOS 8 master we need to set the hostname and update the name entries in the /etc/hosts file.

# hostnamectl set-hostname master-node
# cat <<EOF>> /etc/hosts
10.0.1.171 master-node lab-t110-4a
10.0.1.172 node-1 worker-node-1 lab-t110-4b
10.0.1.173 node-2 worker-node-2 lab-t110-4c
EOF
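One caveat: that heredoc appends blindly, so running it twice leaves duplicate entries in /etc/hosts. A guarded sketch (the marker comment and function name are my own convention) only appends when the entries are missing:

```shell
#!/bin/sh
# Append the cluster entries to a hosts file only if our marker
# line is not already present, so re-runs stay idempotent.
add_cluster_hosts() {
    hosts_file=$1
    if ! grep -q '# k8s-lab nodes' "$hosts_file"; then
        cat <<'EOF' >> "$hosts_file"
# k8s-lab nodes
10.0.1.171 master-node lab-t110-4a
10.0.1.172 node-1 worker-node-1 lab-t110-4b
10.0.1.173 node-2 worker-node-2 lab-t110-4c
EOF
    fi
}

# On a real node: add_cluster_hosts /etc/hosts
```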

Now let’s check to see if we can ping the worker nodes:

# ping worker-node-1
PING node-1 (10.0.1.172) 56(84) bytes of data.
64 bytes from node-1 (10.0.1.172): icmp_seq=1 ttl=64 time=0.425 ms
64 bytes from node-1 (10.0.1.172): icmp_seq=2 ttl=64 time=0.293 ms
^C
--- node-1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 14ms
rtt min/avg/max/mdev = 0.293/0.359/0.425/0.066 ms

# ping worker-node-2
PING node-2 (10.0.1.173) 56(84) bytes of data.
64 bytes from node-2 (10.0.1.173): icmp_seq=1 ttl=64 time=0.480 ms
64 bytes from node-2 (10.0.1.173): icmp_seq=2 ttl=64 time=0.261 ms
^C
--- node-2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 2ms
rtt min/avg/max/mdev = 0.261/0.370/0.480/0.111 ms

Check what SELinux is set to. If it is not disabled, disable it.

# sestatus
SELinux status: disabled

To disable
# setenforce 0

To make it survive reboots, enter:
# sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux 
# reboot  (Only if it was not already disabled)
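One thing to watch: the sed above only matches SELINUX=enforcing, so a box that was set to permissive slips through unchanged. A slightly more forgiving sketch (function form and name are my own, so you can try it on a copy of the file first):

```shell
#!/bin/sh
# Force SELINUX=disabled whether the current value is
# enforcing or permissive. Pass the config file path.
disable_selinux_config() {
    sed -i -e 's/^SELINUX=enforcing/SELINUX=disabled/' \
           -e 's/^SELINUX=permissive/SELINUX=disabled/' "$1"
}

# On a real node (/etc/sysconfig/selinux is a symlink to this file):
# disable_selinux_config /etc/selinux/config
```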

If you have the firewall running you will need to open some ports. You can check if it is running with the following:

# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabl>
Active: active (running) since Mon 2020-09-14 08:32:37 EDT; 2s a>
Docs: man:firewalld(1)
Main PID: 2966 (firewalld)
Tasks: 2 (limit: 101284)
Memory: 30.8M
CGroup: /system.slice/firewalld.service
└─2966 /usr/libexec/platform-python -s /usr/sbin/firewal>

Kubernetes Ports

# firewall-cmd --permanent --add-port=6443/tcp 
# firewall-cmd --permanent --add-port=2379-2380/tcp
# firewall-cmd --permanent --add-port=10250/tcp 
# firewall-cmd --permanent --add-port=10251/tcp 
# firewall-cmd --permanent --add-port=10252/tcp 
# firewall-cmd --permanent --add-port=10255/tcp 
# firewall-cmd --reload
# modprobe br_netfilter 
# echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables
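Note that both the modprobe and the echo are lost on reboot. To persist them, you can drop config files into /etc/modules-load.d and /etc/sysctl.d (the k8s.conf filename is my own choice; the sketch writes into a directory you pass in, so it can be tested safely):

```shell
#!/bin/sh
# Persist the bridge netfilter sysctls across reboots by writing
# a drop-in file into the given directory (normally /etc/sysctl.d).
write_k8s_sysctl() {
    cat <<'EOF' > "$1/k8s.conf"
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
}

# On a real node:
# echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
# write_k8s_sysctl /etc/sysctl.d && sysctl --system
```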

Now we can install Docker CE

We need to add the repo first:

# dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo

We also need to install the containerd.io package. containerd is a daemon that manages the complete container lifecycle on its host system, from image transfer and storage to container execution and supervision, low-level storage, network attachments, and beyond.

# dnf install https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.6-3.3.el7.x86_64.rpm -y

Docker CE Stable - x86_64 40 kB/s | 3.5 kB 00:00
containerd.io-1.2.6-3.3.el7.x86_64. 20 MB/s | 26 MB 00:01
Dependencies resolved.
Package Arch Version Repository Size
Upgrading:
containerd.io x86_64 1.2.6-3.3.el7 @commandline 26 M
Transaction Summary
Upgrade 1 Package
Total size: 26 M
Is this ok [y/N]: y
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Running scriptlet: containerd.io-1.2.6-3.3.el7.x86_64 1/1
Upgrading : containerd.io-1.2.6-3.3.el7.x86_64 1/2
Running scriptlet: containerd.io-1.2.6-3.3.el7.x86_64 1/2
Running scriptlet: containerd.io-1.2.0-3.el7.x86_64 2/2
Cleanup : containerd.io-1.2.0-3.el7.x86_64 2/2
Running scriptlet: containerd.io-1.2.0-3.el7.x86_64 2/2
Verifying : containerd.io-1.2.6-3.3.el7.x86_64 1/2
Verifying : containerd.io-1.2.0-3.el7.x86_64 2/2
Upgraded:
containerd.io-1.2.6-3.3.el7.x86_64
Complete!

# dnf install docker-ce

Let’s start Docker:

# systemctl enable docker 
# systemctl start docker
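Heads-up: later on, kubeadm’s preflight checks will warn that Docker is using the “cgroupfs” cgroup driver when “systemd” is recommended. One common fix, sketched here (the helper name is mine; confirm the file location on your system and restart Docker afterwards), is to set the driver in /etc/docker/daemon.json:

```shell
#!/bin/sh
# Write a daemon.json that switches Docker to the systemd cgroup
# driver. Takes a target path so it can be tried on a scratch file.
write_docker_daemon_json() {
    cat <<'EOF' > "$1"
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# On a real node:
# write_docker_daemon_json /etc/docker/daemon.json
# systemctl restart docker
```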

We can now install Kubernetes (kubeadm)

The Kubernetes packages are not in the default repositories, so we need to add the repo first:

# cat <<EOF > /etc/yum.repos.d/kubernetes.repo 
[kubernetes] 
name=Kubernetes 
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64 
enabled=1 
gpgcheck=1 
repo_gpgcheck=1 
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg 
EOF
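An optional tweak borrowed from the kubeadm install docs: add an exclude line so a routine dnf update cannot pull in a new Kubernetes version behind your back, then pass --disableexcludes=kubernetes when you deliberately install or upgrade. A sketch (the helper name is mine):

```shell
#!/bin/sh
# Variant of the repo file that pins kubelet/kubeadm/kubectl
# against accidental upgrades. Takes a target path for testing.
write_k8s_repo() {
    cat <<'EOF' > "$1"
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF
}

# On a real node:
# write_k8s_repo /etc/yum.repos.d/kubernetes.repo
# dnf install -y kubeadm --disableexcludes=kubernetes
```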

Kubeadm helps you bootstrap a minimum viable Kubernetes cluster that conforms to best practices. With kubeadm, your cluster should pass the Kubernetes Conformance tests.

Kubeadm also supports other cluster lifecycle functions, such as upgrades, downgrades, and managing bootstrap tokens. Kubeadm is also integration-friendly with other orchestration tools like Ansible and Terraform.

With the package repo now ready, you can go ahead and install the kubeadm package.

# dnf install kubeadm -y
Kubernetes 750 B/s | 454 B 00:00
Kubernetes 16 kB/s | 1.8 kB 00:00
Importing GPG key 0xA7317B0F:
Userid : "Google Cloud Packages Automatic Signing Key gc-team@google.com"
Fingerprint: D0BC 747F D8CA F711 7500 D6FA 3746 C208 A731 7B0F
From : https://packages.cloud.google.com/yum/doc/yum-key.gpg
Importing GPG key 0xBA07F4FB:
Userid : "Google Cloud Packages Automatic Signing Key gc-team@google.com"
Fingerprint: 54A6 47F9 048D 5688 D7DA 2ABE 6A03 0B21 BA07 F4FB
From : https://packages.cloud.google.com/yum/doc/yum-key.gpg
Kubernetes 8.4 kB/s | 975 B 00:00
Importing GPG key 0x3E1BA8D5:
Userid : "Google Cloud Packages RPM Signing Key gc-team@google.com"
Fingerprint: 3749 E1BA 95A8 6CE0 5454 6ED2 F09C 394C 3E1B A8D5
From : https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
Kubernetes 114 kB/s | 103 kB 00:00
Last metadata expiration check: 0:00:01 ago on Mon 14 Sep 2020 08:46:22 AM EDT.
Dependencies resolved.
Package Arch Version Repository Size
Installing:
kubeadm x86_64 1.19.1-0 kubernetes 8.3 M
Installing dependencies:
conntrack-tools x86_64 1.4.4-10.el8 BaseOS 204 k
cri-tools x86_64 1.13.0-0 kubernetes 5.1 M
kubectl x86_64 1.19.1-0 kubernetes 9.0 M
kubelet x86_64 1.19.1-0 kubernetes 19 M
kubernetes-cni x86_64 0.8.7-0 kubernetes 19 M
libnetfilter_cthelper x86_64 1.0.0-15.el8 BaseOS 24 k
libnetfilter_cttimeout x86_64 1.0.0-11.el8 BaseOS 24 k
libnetfilter_queue x86_64 1.0.2-11.el8 BaseOS 30 k
Transaction Summary
Install 9 Packages
Total download size: 61 M
Installed size: 260 M
Downloading Packages:
(1/9): libnetfilter_cthelper-1.0.0- 79 kB/s | 24 kB 00:00
(2/9): libnetfilter_cttimeout-1.0.0 79 kB/s | 24 kB 00:00
(3/9): libnetfilter_queue-1.0.2-11. 159 kB/s | 30 kB 00:00
(4/9): conntrack-tools-1.4.4-10.el8 373 kB/s | 204 kB 00:00
(5/9): 14bfe6e75a9efc8eca3f638eb22c 8.0 MB/s | 5.1 MB 00:00
(6/9): c70d28093e36905c8163d22b6bbf 15 MB/s | 19 MB 00:01
(7/9): 8c7681cd4f2e6d08354ea6735e5a 4.3 MB/s | 8.3 MB 00:01
(8/9): 1797a730663f51270c890b8868d5 4.2 MB/s | 9.0 MB 00:02
(9/9): db7cb5cb0b3f6875f54d10f02e62 15 MB/s | 19 MB 00:01
Total 17 MB/s | 61 MB 00:03
warning: /var/cache/dnf/kubernetes-33343725abd9cbdc/packages/14bfe6e75a9efc8eca3f638eb22c7e2ce759c67f95b43b16fae4ebabde1549f3-cri-tools-1.13.0-0.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 3e1ba8d5: NOKEY
Kubernetes 17 kB/s | 1.8 kB 00:00
Importing GPG key 0xA7317B0F:
Userid : "Google Cloud Packages Automatic Signing Key gc-team@google.com"
Fingerprint: D0BC 747F D8CA F711 7500 D6FA 3746 C208 A731 7B0F
From : https://packages.cloud.google.com/yum/doc/yum-key.gpg
Key imported successfully
Importing GPG key 0xBA07F4FB:
Userid : "Google Cloud Packages Automatic Signing Key gc-team@google.com"
Fingerprint: 54A6 47F9 048D 5688 D7DA 2ABE 6A03 0B21 BA07 F4FB
From : https://packages.cloud.google.com/yum/doc/yum-key.gpg
Key imported successfully
Kubernetes 8.4 kB/s | 975 B 00:00
Importing GPG key 0x3E1BA8D5:
Userid : "Google Cloud Packages RPM Signing Key gc-team@google.com"
Fingerprint: 3749 E1BA 95A8 6CE0 5454 6ED2 F09C 394C 3E1B A8D5
From : https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
Key imported successfully
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : kubectl-1.19.1-0.x86_64 1/9
Installing : cri-tools-1.13.0-0.x86_64 2/9
Installing : libnetfilter_queue-1.0.2-11.el8.x86_64 3/9
Running scriptlet: libnetfilter_queue-1.0.2-11.el8.x86_64 3/9
Installing : libnetfilter_cttimeout-1.0.0-11.el8.x86_ 4/9
Running scriptlet: libnetfilter_cttimeout-1.0.0-11.el8.x86_ 4/9
Installing : libnetfilter_cthelper-1.0.0-15.el8.x86_6 5/9
Running scriptlet: libnetfilter_cthelper-1.0.0-15.el8.x86_6 5/9
Installing : conntrack-tools-1.4.4-10.el8.x86_64 6/9
Running scriptlet: conntrack-tools-1.4.4-10.el8.x86_64 6/9
Installing : kubernetes-cni-0.8.7-0.x86_64 7/9
Installing : kubelet-1.19.1-0.x86_64 8/9
Installing : kubeadm-1.19.1-0.x86_64 9/9
Running scriptlet: kubeadm-1.19.1-0.x86_64 9/9
Verifying : conntrack-tools-1.4.4-10.el8.x86_64 1/9
Verifying : libnetfilter_cthelper-1.0.0-15.el8.x86_6 2/9
Verifying : libnetfilter_cttimeout-1.0.0-11.el8.x86_ 3/9
Verifying : libnetfilter_queue-1.0.2-11.el8.x86_64 4/9
Verifying : cri-tools-1.13.0-0.x86_64 5/9
Verifying : kubeadm-1.19.1-0.x86_64 6/9
Verifying : kubectl-1.19.1-0.x86_64 7/9
Verifying : kubelet-1.19.1-0.x86_64 8/9
Verifying : kubernetes-cni-0.8.7-0.x86_64 9/9
Installed:
conntrack-tools-1.4.4-10.el8.x86_64
cri-tools-1.13.0-0.x86_64
kubeadm-1.19.1-0.x86_64
kubectl-1.19.1-0.x86_64
kubelet-1.19.1-0.x86_64
kubernetes-cni-0.8.7-0.x86_64
libnetfilter_cthelper-1.0.0-15.el8.x86_64
libnetfilter_cttimeout-1.0.0-11.el8.x86_64
libnetfilter_queue-1.0.2-11.el8.x86_64
Complete!

Now we can enable and start kubelet:

# systemctl enable kubelet 
# systemctl start kubelet

Now we need to create a control-plane master

The Kubernetes master, which acts as the control plane for the cluster, runs a few critical services necessary for the cluster. As such, the initialization process performs a series of prechecks to ensure that the machine is ready to run Kubernetes. These prechecks expose warnings and exit on errors. kubeadm init then downloads and installs the cluster control plane components.

Now it’s time to initialize the Kubernetes master, but before that you must disable swap in order to run the “kubeadm init” command.

# swapoff -a
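swapoff -a only lasts until the next reboot, and kubelet will refuse to start if swap comes back. To make the change permanent, comment out the swap entry in /etc/fstab as well. A sketch (function form and name are my own, so you can try it on a copy first):

```shell
#!/bin/sh
# Comment out any uncommented swap entries in an fstab-style file.
disable_swap_in_fstab() {
    sed -i 's/^\([^#].*[[:space:]]swap[[:space:]].*\)$/#\1/' "$1"
}

# On a real node: swapoff -a && disable_swap_in_fstab /etc/fstab
```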

Initializing the Kubernetes master is a completely automated process controlled by the “kubeadm init” command, as shown.

# kubeadm init

W0914 10:55:29.369030 11582 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.19.1
[preflight] Running pre-flight checks
[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master-node] and IPs [10.96.0.1 10.0.1.171]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost master-node] and IPs [10.0.1.171 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master-node] and IPs [10.0.1.171 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 26.512735 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.19" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master-node as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master-node as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 9iucxl.9i23i12arybmui5e
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.0.1.171:6443 --token 9iucxl.9i23i12arybmui5e \
--discovery-token-ca-cert-hash sha256:d2d25a0c3df687979c4a2d35f3316a83ae20ecb4de94dc1b1b11351741ec6f77

We need to copy the command below since we will need it later on for the worker nodes. Make sure to remove the “\”; it will sometimes cause errors:

kubeadm join 10.0.1.171:6443 --token 9iucxl.9i23i12arybmui5e \
--discovery-token-ca-cert-hash sha256:d2d25a0c3df687979c4a2d35f3316a83ae20ecb4de94dc1b1b11351741ec6f77

Will become

kubeadm join 10.0.1.171:6443 --token 9iucxl.9i23i12arybmui5e --discovery-token-ca-cert-hash sha256:d2d25a0c3df687979c4a2d35f3316a83ae20ecb4de94dc1b1b11351741ec6f77
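You can also let sed do the flattening instead of hand-editing, and if you come back later than the default 24-hour token lifetime, the master can mint a fresh join command for you. Both are shown below (the flatten_continuations helper is my own):

```shell
#!/bin/sh
# Join backslash-continued lines into a single command line.
flatten_continuations() {
    sed -e ':a' -e 'N' -e '$!ba' -e 's/[[:space:]]*\\\n[[:space:]]*/ /g'
}

# On the master, print a brand-new join command (live cluster only):
# kubeadm token create --print-join-command
```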

Once Kubernetes has initialized successfully, you must enable your user to start using the cluster. In our scenario, we will be using the root user. You can also start the cluster as a sudo-enabled user, as shown.

# mkdir -p $HOME/.kube 
# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 
# chown $(id -u):$(id -g) $HOME/.kube/config

As a sudo enabled user:

$ mkdir -p $HOME/.kube 
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

Validate that the kubectl command is activated.

[root@master-node ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master-node NotReady master 12m v1.19.1

The master is not ready yet because we have not set up a Pod network.

# export kubever=$(kubectl version | base64 | tr -d '\n') 
# kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$kubever"
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.apps/weave-net created

Now let’s check if the master is ready:

# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master-node NotReady master 18m v1.19.1

It may take a few minutes, so be patient.

# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master-node Ready master 19m v1.19.1
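Beyond node status, it’s worth confirming that nothing in kube-system is stuck. The kubectl call needs the live cluster, so the filter below is written as a standalone function you can pipe any “kubectl get pods” output through (the helper name is mine):

```shell
#!/bin/sh
# Print pods whose STATUS column (field 3) is not Running or
# Completed. Expects "kubectl get pods" output with its header.
unhealthy_pods() {
    awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print }'
}

# On the master: kubectl get pods -n kube-system | unhealthy_pods
```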

Add Worker Nodes to the Cluster

As before, we need to make some preparations on node-1 and node-2.

# hostnamectl set-hostname node-1
# cat <<EOF>> /etc/hosts
10.0.1.171 master-node lab-t110-4a
10.0.1.172 node-1 worker-node-1 lab-t110-4b
10.0.1.173 node-2 worker-node-2 lab-t110-4c
EOF

# hostnamectl set-hostname node-2
# cat <<EOF>> /etc/hosts
10.0.1.171 master-node lab-t110-4a
10.0.1.172 node-1 worker-node-1 lab-t110-4b
10.0.1.173 node-2 worker-node-2 lab-t110-4c
EOF

Now let’s check to see if we can ping the nodes from each server, including the master:

# ping node-1
# ping node-2

Again, we need to disable SELinux and update some firewall rules:

# setenforce 0 
# sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux 
# firewall-cmd --permanent --add-port=6783/tcp 
# firewall-cmd --permanent --add-port=10250/tcp 
# firewall-cmd --permanent --add-port=10255/tcp 
# firewall-cmd --permanent --add-port=30000-32767/tcp 
# firewall-cmd  --reload 
# echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables

We can now set up Docker CE on the nodes. These steps are the same as on the master, so I will not show the outputs in this section.

# dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
# dnf install https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.6-3.3.el7.x86_64.rpm -y
# dnf install docker-ce -y
# systemctl enable docker 
# systemctl start docker

Install Kubernetes on worker nodes

# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF

# swapoff -a
# dnf install kubeadm -y
# systemctl enable kubelet 
# systemctl start kubelet

On each worker node, join the cluster:

# kubeadm join 10.0.1.171:6443 --token 9iucxl.9i23i12arybmui5e --discovery-token-ca-cert-hash sha256:d2d25a0c3df687979c4a2d35f3316a83ae20ecb4de94dc1b1b11351741ec6f77

[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster…
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap…
This node has joined the cluster:
Certificate signing request was sent to apiserver and a response was received.
The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

# kubeadm join 10.0.1.171:6443 --token 9iucxl.9i23i12arybmui5e --discovery-token-ca-cert-hash sha256:d2d25a0c3df687979c4a2d35f3316a83ae20ecb4de94dc1b1b11351741ec6f77

[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster…
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap…
This node has joined the cluster:
Certificate signing request was sent to apiserver and a response was received.
The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Made a mistake in the config?

You can undo a kubeadm init or join by issuing a reset. You can then re-issue the init or join command.

[root@worker-node-1 ~]# kubeadm reset
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[preflight] Running pre-flight checks
...

We can now check the nodes within our cluster.

As suggested on the last line of the join output, go back to your master node and verify that worker-node-1 and worker-node-2 have joined the cluster using the following command.

[root@master-node ~]# kubectl get nodes

NAME STATUS ROLES AGE VERSION
master-node Ready master 20h v1.19.1
worker-node-1 NotReady 70s v1.19.1
worker-node-2 Ready 35s v1.19.1

Troubleshooting

Problem: Worker Node NotReady

[root@master-node ~]# kubectl get nodes 
NAME STATUS ROLES AGE VERSION
master-node Ready master 21h v1.19.1
worker-node-1 NotReady 66m v1.19.1
worker-node-2 Ready 66m v1.19.1

You can see here that one of the nodes is not starting. It turned out that the “cni” was not getting configured. So how did I figure that out? Let’s look at this systematically:

On Master Node run command:

[root@master-node ~]# kubectl get nodes 
NAME STATUS ROLES AGE VERSION
master-node Ready master 21h v1.19.1
worker-node-1 NotReady 66m v1.19.1
worker-node-2 Ready 66m v1.19.1

Okay, so we know our issue has something to do with worker-node-1, but node-2 is fine, so what happened? The setup should have been the same, but for whatever reason something was different. Let’s figure out what the problem is first.

  1. Log in to worker-node-1
  2. Restart kubelet and check status
    1. systemctl restart kubelet
    2. systemctl status kubelet
[root@worker-node-1 ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled>
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Tue 2020-09-15 08:17:04 EDT; 20mi>
Docs: https://kubernetes.io/docs/
Main PID: 21206 (kubelet)
Tasks: 19 (limit: 101284)
Memory: 59.9M
CGroup: /system.slice/kubelet.service
└─21206 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kub>

Okay, so kubelet is running fine; nothing wrong here so far. Let’s continue…
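Status alone doesn’t say why a node is unhappy; the kubelet journal usually does. Since journal output is just text, here is a small filter for the lines that tend to matter (the helper name is mine; the journalctl invocation assumes a systemd host like CentOS 8):

```shell
#!/bin/sh
# Pull error-ish lines (errors, failures, CNI complaints)
# out of kubelet log text read from stdin.
kubelet_errors() {
    grep -iE 'error|fail|cni' || true
}

# On the worker:
# journalctl -u kubelet --no-pager --since "30 min ago" | kubelet_errors
```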

  1. Log on to the master node and enter:
    1. [root@master-node ~]# kubectl describe node worker-node-1

You’re going to see a lot of information. We are looking for anything that is impeding this node from going Ready! Within the output below I will add notes for diagnosis.

Name: worker-node-1
Roles:
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=worker-node-1
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 15 Sep 2020 07:10:32 -0400
<<<Everything is good until we get here; okay, let’s keep looking to find the issue>>>
Taints: node.kubernetes.io/not-ready:NoExecute                                  
node.kubernetes.io/not-ready:NoSchedule                           
Unschedulable: false
Lease:
HolderIdentity: worker-node-1
AcquireTime:
RenewTime: Tue, 15 Sep 2020 07:48:41 -0400
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
<<<FALSE is fine; this means that there is no pressure on this resource>>>
MemoryPressure False Tue, 15 Sep 2020 07:47:13 -0400 Tue, 15 Sep 2020 07:42:11 -0400 KubeletHasSufficientMemory kubelet has sufficient memory available   
DiskPressure False Tue, 15 Sep 2020 07:47:13 -0400 Tue, 15 Sep 2020 07:42:11 -0400 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 15 Sep 2020 07:47:13 -0400 Tue, 15 Sep 2020 07:42:11 -0400 KubeletHasSufficientPID kubelet has sufficient PID available

<<<Here is the culprit: something didn't get configured or installed properly; on to the next section>>>
Ready False Tue, 15 Sep 2020 07:47:13 -0400 Tue, 15 Sep 2020 07:42:11 -0400 KubeletNotReady runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
InternalIP: 10.0.1.172
Hostname: worker-node-1
Capacity:
cpu: 8
ephemeral-storage: 20027216Ki
hugepages-2Mi: 0
memory: 16239224Ki
pods: 110
Allocatable:
cpu: 8
ephemeral-storage: 18457082236
hugepages-2Mi: 0
memory: 16136824Ki
pods: 110
System Info:
Machine ID: a16736b44ac547e49019d9376f8e88b3
System UUID: 4c4c4544-0035-3310-8054-b6c04f595231
Boot ID: be95d903-db08-45fa-b8c8-acce31e620a3
Kernel Version: 4.18.0-193.14.2.el8_2.x86_64
OS Image: CentOS Linux 8 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.12
Kubelet Version: v1.19.1
Kube-Proxy Version: v1.19.1
Non-terminated Pods: (2 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system kube-proxy-9txvs 0 (0%) 0 (0%) 0 (0%) 0 (0%) 38m
kube-system weave-net-bckhv 100m (1%) 0 (0%) 200Mi (1%) 0 (0%) 38m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 100m (1%) 0 (0%)
memory 200Mi (1%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 38m kubelet, worker-node-1 Starting kubelet.
Normal NodeHasSufficientMemory 38m (x2 over 38m) kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 38m (x2 over 38m) kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 38m (x2 over 38m) kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasSufficientPID
Normal NodeAllocatableEnforced 38m kubelet, worker-node-1 Updated Node Allocatable limit across pods
Normal Starting 38m kube-proxy, worker-node-1 Starting kube-proxy.
Normal Starting 33m kubelet, worker-node-1 Starting kubelet.
Normal NodeHasSufficientMemory 33m kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 33m kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 33m kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasSufficientPID
Normal NodeAllocatableEnforced 33m kubelet, worker-node-1 Updated Node Allocatable limit across pods
Normal NodeAllocatableEnforced 22m kubelet, worker-node-1 Updated Node Allocatable limit across pods
Normal Starting 22m kubelet, worker-node-1 Starting kubelet.
Normal NodeHasNoDiskPressure 22m kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 22m kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasSufficientPID
Normal NodeHasSufficientMemory 22m kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasSufficientMemory
Normal Starting 6m37s kubelet, worker-node-1 Starting kubelet.
Normal NodeHasSufficientMemory 6m37s (x2 over 6m37s) kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 6m37s (x2 over 6m37s) kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 6m37s (x2 over 6m37s) kubelet, worker-node-1 Node worker-node-1 status is now: NodeHasSufficientPID
Normal NodeAllocatableEnforced 6m37s kubelet, worker-node-1 Updated Node Allocatable limit across pods
Warning Rebooted 6m37s kubelet, worker-node-1 Node worker-node-1 has been rebooted, boot id: be95d903-db08-45fa-b8c8-acce31e620a3
Normal NodeNotReady 6m37s kubelet, worker-node-1 Node worker-node-1 status is now: NodeNotReady
Normal Starting 6m31s kube-proxy, worker-node-1 Starting kube-proxy.

OK, now we know what the issue is; we just need to figure out what it means.

I found the answer by searching the error on the web and came across reports of a missing CNI config. CNI is the network plugin layer the kubelet relies on, and for some reason its config didn’t get installed. I tried reinstalling Docker, but the config file still was not there. Normally it should be in /etc/cni/net.d:

[root@worker-node-2 net.d]# ls
10-flannel.conflist 10-weave.conflist

After doing some digging I found that flannel did not get deployed properly. This was even after I backed the installs out and did them again. Obviously something got missed, even though the config was scripted. I was able to track down someone on GitHub who had the same issue. They had an older version of “flannel” than mine, and they needed 1.16. My version was fine, but I decided to run the install just to make sure; it can’t hurt anything at this point. I ran the update on the master node.

[root@master-node ~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml

Once I ran this, I restarted kubelet for good measure and voilà, now the node is ready.

[root@master-node ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master-node Ready master 22h v1.19.1
worker-node-1 Ready 131m v1.19.1
worker-node-2 Ready 131m v1.19.1

I know something weird happened; I just wish I knew what. At some point I will bring the OS back to a fresh install and try again to see if I run into this issue once more. For now my lab systems are running and I will leave them alone.