Beginner's Guide to Kubernetes

Written By
Arun Mathew Kurian

Understanding Kubernetes

The objective of this article is to provide an introduction and a general understanding about Kubernetes. We will look into architectural concepts, advantages of using Kubernetes, how to set up Kubernetes, and how to run commands on a local machine using Minikube.

What is Kubernetes?

Before we dive deep into Kubernetes, let's first understand the concept of containers. Containerization is the process of packaging code and dependencies of an application into a single image, and running it in a single computing environment called the container.

Docker revolutionized containerization by packaging software into a virtual container that can run on any environment. With containers, one can deploy applications onto multiple platforms like AWS, GCP, or Digital Ocean. 

But challenges arise when the number of containers increases. It becomes difficult to deploy and maintain hundreds of containers across many servers or VM instances. This led to the evolution of the container orchestration software.

An orchestrator’s role is to coordinate a set of container workloads across a series of servers or nodes. This includes tasks like making sure the containers have the right resources, keeping the correct number of containers available at all times, rolling out deployments without downtime, and more. Kubernetes is the most popular container orchestrator.

Kubernetes, also known as k8s, was released in 2015 by Google and is now maintained by an open-source community (of which Google is a part). Kubernetes provides a set of APIs and command-line interfaces to manage containers deployed across servers. It automates the deployment, scaling, and management of containers across clusters of hosts. This makes it a popular choice for hosting microservice-based systems, as it addresses many concerns of microservice implementations, such as configuration management, service discovery, and job scheduling.

Apart from the core Kubernetes implementation, there are many versions or flavors of Kubernetes, provided by various cloud services and distributions, that conform to the Kubernetes spec.

Kubernetes Architecture

Let’s look at the components of the Kubernetes architecture.

Kubectl

We communicate with Kubernetes through its API. The popular tool that is used for this is called Kubectl (pronounced cube control or cube CTL).

Node

A single server in a Kubernetes cluster is known as a node. Workloads are assigned from the control plane (or master) to the nodes. The various components of a node server are:

  1. Container Runtime: The container runtime runs and manages the application containers. The runtime is usually Docker, although Kubernetes supports other container runtimes like containerd.
  2. Kubelet: Kubelet is an agent that runs on each node and communicates with the Kubernetes control plane (master). Kubelet also makes sure that the pods (the basic unit of deployment) running on a node are healthy and running, based on a YAML specification called the PodSpec.
  3. Kube-proxy: A proxy service that runs on each node server, performing tasks like forwarding requests to the correct containers and making services available to other components.

Control Plane

The control plane, also known as the master, is in charge of managing the Kubernetes cluster. It is a group of components, each of which does a single job. These components make the decisions that apply to the whole cluster, like scheduling, and they detect and respond to cluster events. The various components of the control plane are:

  1. etcd: An important concern of a microservice architecture is that configuration data should be kept isolated from the code yet remain accessible. etcd is the Kubernetes component that serves this role: a persistent, lightweight, distributed key-value store that holds the configuration data representing the state of the cluster.
  2. API server: As mentioned, Kubernetes exposes a set of APIs. These APIs are served as JSON over HTTP by the control-plane component known as the API server. The API server provides the interface to Kubernetes: it updates the state of API objects in etcd, allowing clients to configure workloads and containers across nodes.
  3. Scheduler: The scheduler is the component of the control plane that assigns unassigned workloads (pods) to the nodes in the cluster. The scheduler tracks the state of the nodes and ensures that the workload is evenly distributed.
  4. Controller manager: Controllers are components that work to move the actual state of the cluster toward the user’s desired state. They create, update, and delete resources in the cluster. These controllers include the ReplicationController, the Jobs controller, and the DaemonSet controller. The controller manager is the component that runs all of these controllers.

Kubernetes Objects

Kubernetes provides layers of abstraction over the containers to provide mechanisms that deploy, maintain, and scale applications. Users interact with the primitives provided by these Kubernetes objects instead of communicating with the containers directly. Let’s take a look at the different types of objects available in Kubernetes.

Pods

A pod is made up of one or more containers running on a node, and it is the basic unit of deployment. Containers are not deployed directly in Kubernetes; they are deployed as pods. A pod is usually one or more containers that are controlled as a single application. Each pod is assigned a unique IP address, known as the Pod IP. Containers inside the same pod can address each other on localhost, but a container has to use the Pod IP to reach a container in a different pod. Various operations can be performed on pods by the controllers managed by the controller manager.
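As a sketch, a minimal pod manifest might look like the following (the name my-pod is an arbitrary example):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod        # arbitrary example name
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80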

Replica Set

When working with Kubernetes, we may have to replicate our pods and manage them for scaling purposes. For example, we may decide that we want three instances of our API running at all times. This is achieved using ReplicaSets. A ReplicaSet maintains the number of replicas of a pod declared by the user, ensuring that a stable set of copies is always available.
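A hedged sketch of a ReplicaSet manifest for the three-instance example above might look like this (the names are illustrative):

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-replicaset   # illustrative name
spec:
  replicas: 3           # keep three copies of the pod running
  selector:
    matchLabels:
      app: nginx        # manage pods carrying this label
  template:             # pod template used to create the replicas
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
```

In practice you rarely create ReplicaSets directly; a Deployment (covered below) manages them for you.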

Services

A set of pods that work together is known as a Service in Kubernetes. The pods are identified by a semantic tag called a Label. The service discovery component assigns an IP address and DNS name to the service, and load-balances traffic to the pods that match the selector label. Service discovery can be based on environment variables or on Kubernetes DNS. An example of this is backend pods grouped into a service, with requests from the frontend load-balanced among them.
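For the backend example above, a minimal Service manifest might look like the following sketch (the names backend-service and the app: backend label are assumptions for illustration):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-service   # illustrative name; gives the pods a stable DNS entry
spec:
  selector:
    app: backend          # traffic is load-balanced across pods with this label
  ports:
  - protocol: TCP
    port: 80              # port exposed by the service
    targetPort: 8080      # port the backend containers listen on
```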

Deployments

Deployments are one of the most important objects in Kubernetes. A Deployment changes the actual state of the objects to the user’s desired state at a controlled rate. Deployments can adjust replica sets, change application versions, and so on by changing the configuration of the cluster.

There are two ways to deploy pods, by using the command line and by using a YAML file. We will look into this in detail in a later section.

Volumes

By default, storage in Kubernetes is ephemeral, provided by the container filesystem: a pod restart wipes out all the data in its containers. This is a problem for applications that require persistent storage.

A volume provides persistent storage for the data in the pods. Volumes can be mounted at a specific path within containers, by defining them in the pod configuration. The same volume can be shared between all containers in the same pod.
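As a sketch, a pod manifest can declare a volume and mount it into a container at a specific path. The example below assumes a PersistentVolumeClaim named my-claim already exists in the cluster; both names are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-volume
spec:
  containers:
  - name: app
    image: nginx:1.14.2
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html   # path inside the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-claim   # assumes this claim was created beforehand
```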

StatefulSets

One of the main challenges faced by a container orchestrator is the preservation of state. In the case of a pod restart, or when the application is scaled up or down, the state may need to be redistributed. The ordering of instances also matters. An example of a stateful workload is a database. The StatefulSet controller is used to manage stateful applications; it enforces properties of uniqueness and ordering among the instances of a pod.
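A hedged sketch of a StatefulSet for a database might look like this (the names db and the postgres image tag are assumptions for illustration). The pods get stable, ordered identities such as db-0, db-1, db-2:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db       # a headless Service giving each pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: postgres
        image: postgres:13   # illustrative database image
```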

DaemonSets

Usually, the location where pods run is determined by the scheduler. But some pods need to run on every single node of the cluster, for use cases like log collection and storage services. This kind of pod scheduling is implemented by a feature called a DaemonSet.
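For the log-collection use case, a DaemonSet manifest might be sketched as follows (the name log-collector and the fluentd image are illustrative assumptions; any per-node agent image would work the same way):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector   # one copy of this pod runs on every node
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: agent
        image: fluentd    # illustrative log-collection agent image
```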

Kubernetes Local Install

Because Kubernetes defines a set of APIs and expected behavior, there are multiple implementations, one of which is a tool called minikube. Minikube allows you to run a Kubernetes cluster on your local machine, which makes it easy to spin up complex services or even test your deployments locally.  Before we start the installation, we can quickly look at some of the other options available.

Docker Desktop: If you are already using Docker Desktop, Kubernetes can be easily enabled in it by clicking the ‘Enable Kubernetes’ checkbox in the Kubernetes tab.

MicroK8s: MicroK8s is developed by Canonical. Although it was originally made for Ubuntu, it now supports various other Linux distros and recently added support for Windows and macOS.

Running Kubernetes In Browser: If you don't want to install Kubernetes on your local system, you can learn it in a browser. There are two options: Play with K8s and Katacoda.

Minikube

Minikube is the version we will be running. Minikube runs a single-node Kubernetes cluster in a virtual machine.

Steps to Install Minikube

  1. As a first step, make sure that Docker is properly installed on your system, as Docker is used for creating, managing, and controlling the containers.
  2. Next, we need a virtual machine or hypervisor on our system, such as VirtualBox, HyperKit, or KVM.

             For macOS

            Install VirtualBox for Mac using Homebrew

brew install --cask virtualbox

             For Linux

sudo apt install virtualbox virtualbox-ext-pack

3. The next step is to make sure that the CLI tool Kubectl is installed.

For macOS

brew install kubectl

For Linux

Download the latest version

curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl

Make the kubectl binary executable.

chmod +x ./kubectl

Move the binary into your PATH:

sudo mv ./kubectl /usr/local/bin/kubectl

4. Now let's install Minikube

For macOS

curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-darwin-amd64 \
  && chmod +x minikube \
  && sudo mv minikube /usr/local/bin/

For Linux

curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 \
  && chmod +x minikube

Add Minikube as executable to the path

sudo mkdir -p /usr/local/bin/
sudo install minikube /usr/local/bin/

If everything went well, you can now start Minikube with the command

minikube start

Running Kubernetes Commands

Let us start with checking the version of Kubectl

kubectl version

We can see that the command displays the versions of both client and server.

There are two ways to deploy a pod in Kubernetes:

  1. Via Commands
  2. Via YAML file

Via Commands

We will be making use of the Nginx image to build our pod.

Let's start by creating a deployment named my-nginx from the Nginx image.

kubectl create deployment my-nginx --image nginx


You can see that we have created a deployment object named my-nginx. If the image is present on the local machine it will be used; otherwise, the image will be pulled from the remote registry.

Now let us take a look at the pod that got created using the command

kubectl get pods


We can see that one pod is created for the deployment. 

Let's look at the other objects that are created, using the command

kubectl get all


We can see all the objects that were created: Kubectl has created a Deployment object and a ReplicaSet along with the pod. What happens under the hood is that when we run the create command, a Deployment is created. This creates a ReplicaSet, which in turn creates the pods. By default, the ReplicaSet creates a single pod.

Now let's scale the number of replicas to 3

kubectl scale deployment my-nginx --replicas 3


We can see that our deployment got scaled. 

Let’s take a look at the objects again.

kubectl get all

Here we can see that two more pods were added, and the desired state of our deployment has changed to 3 pods.

Now let’s see what happens if one of these pods gets deleted. Let's delete the first pod using the following command

kubectl delete pod/my-nginx-6f68f94c7b-2dkg2


We can see that the specified pod got deleted. Let's see how this changed our deployment.

kubectl get all

Even though the pod we specified got deleted, we can see a new pod got automatically created. 

This demonstrates that if a pod is destroyed in the cluster, the ReplicaSet will create a new one to bring the deployment back to the desired state.

Let's finish by deleting the created deployment

kubectl delete deployment my-nginx

Via YAML file

Using commands is fine while we are learning and exploring Kubernetes, but in production the more effective way to create resources is via a configuration file. YAML files can be written for different resource types like Pod, Service, Deployment, etc. For brevity, let's focus on a YAML file for a Deployment.

We will configure the desired state of the Kubernetes deployment as a YAML file. The Kubectl will then create a deployment based on this file.

A sample YAML file is:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

There are four parts to this manifest:

  1. apiVersion: The version of the Kubernetes API for the object kind we specified.
  2. kind: The Kubernetes resource type we want to create. In this case, it is Deployment.
  3. metadata: The metadata differs between resource types. In our case, we specify the name of our deployment and the label for the pods.
  4. spec: Various specifications of the resource, such as the number of replicas, the container image, and the selector labels, are given in the spec section.

Now let's create the deployment from the file using the apply command

kubectl apply -f nginx.yaml

We can see all the objects created

Great! We can see that this is the same as the deployment we created using the create command.

kubectl apply -f ./<foldername>

can be used to run all the YAML files in a folder.

Conclusion

Kubernetes is an orchestrator that helps scale our applications by managing containers. Its main advantage is that it does not limit us to a single cloud or platform. Many platforms provide Kubernetes-first support, and there are many options to select from, whether from cloud vendors or distributions like Docker Enterprise, Rancher, OpenShift, Canonical, and VMware PKS. Since everyone supports it, Kubernetes has the widest adoption and biggest community among the various container orchestrators.

It should be noted that not all solutions require container orchestration. Orchestration is designed to automate the changes in scaled applications. Single-server applications that do not have a very high rate of change do not normally require this kind of orchestration. Instead, they can make use of the abstractions their cloud platform provides out of the box.

I hope this article was helpful in understanding the basic concepts of Kubernetes. There are many more capabilities in Kubernetes. Do check out the official documentation, and play with various available commands to fully leverage the system.



