Getting started with Kubernetes


NOTE: I wrote this blog in my freshman year, and since then there have been some changes in naming conventions, such as "control plane node" and the deprecation of the "replication controller". Kindly refer to the official website for the latest details.

What is Kubernetes?

Kubernetes is a container-orchestration system that was open-sourced by Google in 2014. In simple terms, it makes it easier for us to manage containers by automating various tasks. If you are not familiar with containers, check out my other blog.

Why use Kubernetes?

A container-orchestration engine automates deploying, scaling, and managing containerized applications on a group of servers. As I mentioned above, Kubernetes makes it easier for us to manage containers and helps ensure that there is no downtime. To give you an example: suppose one of the containers you are running goes down. It won't take much effort to restart it manually. But if a large number of containers go down, wouldn't it be easier if the system handled the issue automatically? Kubernetes can do this for us. Its features include scheduling, scaling, load balancing, fault tolerance, deployments, automated rollouts, rollbacks, and more.

Kubernetes Architecture


Brief

Kubernetes architecture consists of a master node which manages the worker nodes. Worker nodes are nothing but virtual machines or physical servers running within a data center. They expose the underlying network and storage resources to the application. All these nodes join together to form a cluster, providing fault tolerance and replication. Worker nodes were previously called minions.

Master Node (Control plane)


The master node is responsible for managing the whole cluster. It monitors the health of the worker nodes and holds information about the members of the cluster as well as their configuration. For example, if a worker node fails, the master node moves its load to another healthy worker node. The Kubernetes master is responsible for scheduling, provisioning, controlling, and exposing the API to clients. It coordinates activities inside the cluster and communicates with the worker nodes to keep Kubernetes and the applications running.

Components of the Master Node

API server

The gatekeeper for the entire cluster: CRUD operations on the cluster go through this API. The API server configures API objects such as pods, services, replication controllers, and deployments, and exposes an API for almost every operation. How do we interact with this API? Using a tool called kubectl (short for "kube control"), which talks to the API server to perform any operation we issue from the command line. In most cases, the master node does not run application containers; it manages the worker nodes and makes sure the cluster of worker nodes is running healthily and successfully.

Scheduler

It is responsible for physically scheduling Pods across multiple nodes. Depending upon the constraints specified in the configuration file, the scheduler places these Pods accordingly. For example, you might specify that a Pod needs 1 CPU core, 10 GB of memory, an SSD disk, etc. Once this manifest is passed to the API server, the scheduler looks for nodes that meet these criteria and schedules the Pods accordingly.
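
As a sketch, such constraints can be expressed in the pod spec through resource requests and a node selector (the names and label below are illustrative, not from a real cluster):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: constrained-pod        # hypothetical name
spec:
  nodeSelector:
    disktype: ssd              # schedule only on nodes labeled disktype=ssd
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: "1"             # 1 CPU core
          memory: "10Gi"       # roughly 10 GB of memory
```

The scheduler filters out nodes that cannot satisfy these requests and picks one of the remaining candidates.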

Controller Manager

There are four controllers behind the controller manager.

  • Node Controller

  • Replication Controller

  • Endpoints Controller

  • Service Account & Token Controller

Together, these controllers are responsible for the overall health of the cluster. They ensure that nodes are up and running all the time, and that the correct number of Pods is running as mentioned in the spec file.
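
For instance, the replication controller keeps the pod count at the replicas value declared in the spec. A minimal sketch (names are illustrative; as noted at the top of this post, ReplicationController has since been deprecated in favour of ReplicaSet/Deployment):

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx-rc         # hypothetical name
spec:
  replicas: 3            # the controller keeps exactly 3 pods running
  selector:
    app: nginx
  template:              # pod template used to create replacements
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
```

If one of the three pods dies, the controller notices the mismatch between desired and actual state and creates a replacement.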

etcd

A lightweight, distributed key-value database: the central store for the current cluster state at any point in time. Any component of Kubernetes can query etcd to understand the state of the cluster, so it is the single source of truth for all the nodes, components, and masters that form the Kubernetes cluster.

Worker Node

It is basically any VM or physical server where containers are deployed. Every node in a Kubernetes cluster must run a container runtime such as Docker or rkt (formerly Rocket).


Components of the Worker Node

Kubelet

The primary node agent that runs on each worker node inside the cluster. Its primary objective is to look at the pod spec submitted to the API server on the Kubernetes master and ensure that the containers described in that pod spec are running and healthy. In case the kubelet notices any issue with a pod running on its worker node, it tries to restart the pod on the same node. If the fault is with the worker node itself, the Kubernetes master detects the node failure and recreates the pod on another healthy node. This only happens if the pod is controlled by a ReplicaSet or Replication Controller (which ensures that the specified number of pods is running at any time). If neither of them is behind the pod, the pod dies and cannot be recreated anywhere. So it is advised to run pods through a Deployment or ReplicaSet rather than directly.
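
Following that advice, a minimal sketch of a Deployment wrapping a pod (names are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment   # hypothetical name
spec:
  replicas: 2              # the ReplicaSet behind this Deployment recreates lost pods
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
```

Because a ReplicaSet stands behind these pods, a pod lost to a node failure is recreated on a healthy node instead of disappearing for good.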

Kube-Proxy

Responsible for maintaining the network configuration. It maintains the distributed network across all the nodes, pods, and containers, and also exposes services to the outside world. It is the core networking component of Kubernetes. kube-proxy feeds information about which pods are on its node to iptables, the Linux firewall that can route traffic. When a new pod is launched, kube-proxy changes the iptables rules to make sure the pod is routable within the cluster.
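
A Service is the object kube-proxy programs those rules for. A minimal NodePort sketch (the service name is illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc          # hypothetical name
spec:
  type: NodePort           # exposes the service on a port of every node
  selector:
    app: nginx             # traffic is routed to pods carrying this label
  ports:
    - port: 80             # cluster-internal service port
      targetPort: 80       # container port inside the pod
```

kube-proxy on every node translates traffic arriving at the allocated node port into connections to one of the matching pods.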

Pods

The scheduling unit in Kubernetes. Just as the virtualization world has the virtual machine, the Kubernetes world has the Pod. Each Pod consists of one or more containers. There are scenarios where you need to run two or more dependent containers together, where one container helps the other. With Pods, we can deploy such dependent containers together: the Pod acts as a wrapper around them, and we interact with and manage containers through Pods.
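
A sketch of a pod wrapping two dependent containers, a web server plus a helper sidecar (both names and the sidecar's command are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar     # hypothetical name
spec:
  containers:
    - name: web
      image: nginx
      ports:
        - containerPort: 80
    - name: log-agent        # hypothetical helper container
      image: busybox
      command: ["sh", "-c", "tail -f /dev/null"]  # placeholder for a real sidecar process
```

Both containers are scheduled together, started together, and share the pod's network and storage context.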

Containers

Containers are Runtime Environments for containerized applications. We run container applications inside the containers. These containers reside inside Pods. Containers are designed to run Micro-services. For more detailed information, check out my blog on Docker.

Complete Architecture


Installation

Play-with-k8s

Imagine that you want to quickly test something on a Kubernetes cluster, but one is not readily available and you don't want to set up a cluster yourself. Play-with-k8s provides a Kubernetes playground, similar to play-with-docker. A GitHub or Docker account is required. https://labs.play-with-k8s.com

Minikube

Use it if you want to install Kubernetes on a system with limited resources. Minikube is an all-in-one setup, i.e. there is no separate master and worker node; the same system acts as both the master and a worker node. It is well suited for testing purposes. https://kubernetes.io/docs/setup/learning-environment/minikube/

Kubeadm

This is closer to a real production setup. Using the kubeadm tool, we can set up a multi-node Kubernetes cluster. It is very popular: you can run multiple VMs on your machine and configure the Kubernetes master and node components on them. If your machine has limited resources but you still want to use kubeadm, you can use cloud-based VMs instead. https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

Cloud Platforms

Various cloud services are available to run and manage Kubernetes. You define the number of nodes in the cluster, the CPU and RAM configurations, etc., and the cloud provider manages those resources. Examples include Civo, GCE, AWS, Azure, CloudStack, etc. For more information, check: https://kubernetes.io/docs/concepts/cluster-administration/cloud-providers/

More about Pods

Every node inside a Kubernetes cluster has its own unique IP address, known as the node IP address. In Kubernetes there is an additional IP address, the pod IP address: once we deploy a pod on a worker node, the pod gets its own IP address. Containers in a pod communicate with the outside world through the pod's network namespace, and all the containers inside a pod operate within that same network namespace. This means all the containers in a pod share the same IP address, that of the pod. Each container can still be identified uniquely by its port. Note: containers within the same pod not only share the same IP address, but also share access to the same volumes, cgroup limits, and even the same IPC namespace.

Pod Networking

How do Pods communicate with one another?

  • Inter-Pod communication: All the Pod IP addresses are fully routable on the Pod Network inside the Kubernetes cluster.


    How do containers communicate in the same pod?

  • Intra-Pod Communication: Containers use shared Local Host interface. All the containers can communicate with each other’s port on local host.

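
Since containers in one pod share the network namespace, one container can reach another simply over localhost, distinguished only by port. A sketch (names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: localhost-demo       # hypothetical name
spec:
  containers:
    - name: web
      image: nginx           # listens on port 80
    - name: probe            # hypothetical companion container
      image: busybox
      # reaches the nginx container via the shared localhost interface
      command: ["sh", "-c", "while true; do wget -qO- http://localhost:80 >/dev/null; sleep 5; done"]
```

No service or pod IP is needed for this traffic; the two containers behave like processes on the same host.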

Pod Lifecycle

  • Define the pod configuration inside a manifest file (explained ahead) in YAML/JSON and submit the manifest to the API server on the Kubernetes master.

  • The pod then gets scheduled onto a worker node inside the cluster and enters the Pending state. During this state, the node downloads all the container images and starts the containers. The pod stays in Pending until all containers are up and running.

  • It then enters the Running state. When its purpose is achieved, it gets shut down and its state changes to Succeeded.

  • Failed state: the pod fails for some particular reason, for example while still in the Pending state. If a pod dies, you cannot bring it back; you can only replace it with a new pod.
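
The Succeeded path above can be sketched with a pod whose container runs to completion (names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oneshot-pod          # hypothetical name
spec:
  restartPolicy: Never       # do not restart the container after it exits
  containers:
    - name: hello
      image: busybox
      command: ["echo", "done"]   # exits immediately with status 0
```

Once the container exits successfully, the pod's phase moves from Running to Succeeded; had the command exited non-zero, the pod would end up in the Failed state instead.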


Pod Manifest file

apiVersion: v1
kind: Pod
metadata:
    name: nginx-pod
    labels:
        app: nginx
        tier: dev
spec:
    containers:
      - name: nginx-container
        image: nginx

We can define Kubernetes objects in two formats: YAML and JSON. Most Kubernetes objects consist of four top-level required fields:

apiVersion

It defines the version of the Kubernetes API you’re using to create this object.

  • v1: The object is part of the first stable release of the Kubernetes API, which includes core objects such as Pod, ReplicationController, and Service.

  • apps/v1: Includes functionality related to running apps in Kubernetes.

  • batch/v1: Consists of objects related to batch processing and job-like tasks.
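
For example, a minimal batch/v1 Job sketch (the name is illustrative; the command computes pi as a stand-in workload):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-job               # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never   # Jobs require Never or OnFailure
      containers:
        - name: pi
          image: perl
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(100)"]
```

The Job controller runs the pod to completion and records how many completions have succeeded.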

kind

It defines the type of object being created.

metadata

Data that helps uniquely identify the object, including a name string, UID, and optional namespace.

spec

The precise format of the object spec is different for every Kubernetes object and contains nested fields specific to that object. For more information, check out: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.15/ The above file will create one instance of an nginx container inside your Kubernetes cluster.

DEMO

In this demo we will use the manifest file nginx-pod.yaml described above.

  • Deploy the pod from nginx-pod.yaml
$ kubectl create -f nginx-pod.yaml
pod/nginx-pod created
  • To list all the pods
$ kubectl get pods 
NAME        READY   STATUS    RESTARTS   AGE
nginx-pod   1/1     Running   0          9m25s
  • Every pod has a unique IP address
$ kubectl get pod nginx-pod -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP           NODE       NOMINATED NODE   READINESS GATES
nginx-pod   1/1     Running   0          11m   172.17.0.7   minikube   <none>           <none>
  • Get pod configuration in YAML format
$ kubectl get pod nginx-pod -o yaml
  • Get pod configuration in JSON format
$ kubectl get pod nginx-pod -o json
  • Display details of the pod, including the list of all events from the time the pod was sent to the node up to its current status
$ kubectl describe pod nginx-pod
  • Check if the pod is accessible: verify the connectivity from the master node to the pod using the pod’s IP address
$ ping 172.17.0.7

  • Expose the pod using NodePort service

$ kubectl expose pod nginx-pod --type=NodePort --port=80
service/nginx-pod exposed
  • Here you can see the NodePort
$ kubectl describe svc nginx-pod
Name:                     nginx-pod
Namespace:                default
Labels:                   app=nginx
                          tier=dev
Annotations:              <none>
Selector:                 app=nginx,tier=dev
Type:                     NodePort
IP:                       10.101.229.154
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  30843/TCP
Endpoints:                172.17.0.7:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
  • Now let’s get inside the pod and execute some commands
$ kubectl exec -it nginx-pod -- /bin/sh
# ls
bin  boot  dev etc  home  lib lib64  media  mnt  opt proc  root  run  sbin  srv  sys  tmp  usr  var
# hostname
nginx-pod
# exit
  • Delete the Pod
$ kubectl delete pod nginx-pod
pod "nginx-pod" deleted