Introduction to Kubernetes

Archie To

This article is meant for complete newcomers to Kubernetes (as I was a month ago). I will try to explain Kubernetes and some of its basic concepts in the simplest terms possible. However, I assume that you already know the basics of Docker containers. If you don’t, I recommend checking out our Docker basics article.

What is Kubernetes?

Kubernetes, also known as K8s, is an open-source platform, originally developed by Google, for automating the deployment, scaling and management of containerized applications. Think about it this way: if you have multiple apps running as containers on a server, you’d probably want things like:

  • Resource allocation: Each app should have a limit on how much CPU and memory it can consume from the server.
  • Network and load balancing: Each app can be accessed by other apps or by the public (over a browser). If an app receives a large number of requests, they should be distributed properly so the app does not crash.
  • Storage: Each app should have a certain amount of storage mounted from an actual storage device.
  • Availability, updates and rollbacks: If something goes wrong with an app, there should be an equivalent instance to replace it. Apps should be easy to update and roll back without affecting other apps.
  • Secrets and configurations: Each app should have its own secrets and configurations, stored and managed securely and separately from other apps.

Kubernetes helps you with all of the issues above and many more.

Kubernetes cluster architecture

There are two main components of a cluster: the control plane and worker nodes.

  • The control plane (sometimes referred to as the control plane node(s) or master node(s)) is a set of components that together act as the brain of the cluster. It makes global decisions about the cluster, such as scheduling Pods and managing worker nodes.
  • Worker nodes: Physical or virtual machines that run your containers.

Control plane components

1. kube-apiserver

As the name implies, this is a REST API server that manages resources on your cluster in response to the requests it receives. For example, you can create a Pod by sending a POST request to the appropriate endpoint. kubectl is a command-line client for interacting with this server.
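
To make the POST example more concrete, here is a minimal sketch in Python that only builds such a request (method, path and JSON body) without sending it. The path follows the layout of the core v1 API. The helper name build_create_pod_request, the Pod name hello and the image nginx:1.25 are assumptions made for illustration, and a real call would also need the server address and credentials.

```python
import json


def build_create_pod_request(namespace, name, image):
    """Return (method, path, body) for creating a single-container Pod."""
    # Pods live under the core v1 API, scoped to a namespace.
    path = f"/api/v1/namespaces/{namespace}/pods"
    body = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {"containers": [{"name": name, "image": image}]},
    }
    return "POST", path, json.dumps(body)


# Hypothetical values, chosen only for the example.
method, path, body = build_create_pod_request("default", "hello", "nginx:1.25")
print(method, path)
print(body)
```

In practice you rarely build this by hand: kubectl (or a client library) constructs and sends the equivalent request for you.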

2. etcd

etcd is a consistent and highly available key-value store that holds all data about your cluster. So if you create a Pod, the information about that Pod is stored here. That is how K8s knows about the existence of your Pod.

3. kube-scheduler

kube-scheduler watches for newly created Pods that have no node assigned, then finds a suitable node to run each one. Factors taken into account include: resource requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, and deadlines. For example, if your Pod has a container that requests 100 MiB of memory and 0.5 CPU cores, kube-scheduler will find a node with enough free capacity to accommodate that.
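
The resource part of that decision can be sketched as a simple filter. This is a toy model, not the real scheduler: it only checks free CPU and memory against a Pod's resource requests, and ignores the other factors listed above as well as the scoring step the real scheduler uses to rank feasible nodes. The Node class, the node names and the numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu_millicores: int  # 1000 millicores = 1 CPU core
    free_memory_mib: int


def feasible_nodes(nodes, cpu_request_millicores, memory_request_mib):
    """Return the names of nodes with enough free CPU and memory for the Pod."""
    return [
        node.name
        for node in nodes
        if node.free_cpu_millicores >= cpu_request_millicores
        and node.free_memory_mib >= memory_request_mib
    ]


# Hypothetical cluster state.
nodes = [
    Node("node-a", free_cpu_millicores=250, free_memory_mib=512),
    Node("node-b", free_cpu_millicores=1000, free_memory_mib=64),
    Node("node-c", free_cpu_millicores=2000, free_memory_mib=4096),
]

# A Pod requesting 0.5 CPU cores (500 millicores) and 100 MiB of memory.
candidates = feasible_nodes(nodes, cpu_request_millicores=500, memory_request_mib=100)
print(candidates)  # ['node-c']
```

Here node-a is short on CPU and node-b is short on memory, so only node-c remains as a candidate.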

4. kube-controller-manager

kube-controller-manager runs controller processes. Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process. Some of these controllers include:

  • Node controller: Responsible for noticing and responding when nodes go down.
  • Job controller: Watches for Job objects that represent one-off tasks, then creates Pods to run those tasks to completion.
  • ServiceAccount controller: Creates default ServiceAccounts for new namespaces.

And there are many more.
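
What these controllers have in common is a loop that compares the desired state with the actual state and fixes any difference. Below is a rough sketch of that idea, loosely modelled on the ServiceAccount controller described above. It works on plain in-memory data rather than a real cluster, and the function name reconcile_service_accounts and the namespace names are hypothetical.

```python
def reconcile_service_accounts(namespaces, service_accounts):
    """Ensure every namespace has a 'default' ServiceAccount.

    `service_accounts` maps a namespace name to the set of account names in it.
    Returns the namespaces in which an account had to be created.
    """
    created = []
    for namespace in namespaces:
        accounts = service_accounts.setdefault(namespace, set())
        if "default" not in accounts:
            accounts.add("default")
            created.append(namespace)
    return created


# Hypothetical state: "staging" was just created and has no accounts yet.
namespaces = ["default", "staging"]
service_accounts = {"default": {"default"}}

first_pass = reconcile_service_accounts(namespaces, service_accounts)
second_pass = reconcile_service_accounts(namespaces, service_accounts)
print(first_pass)   # ['staging']
print(second_pass)  # []
```

The second pass changes nothing because the actual state already matches the desired state. That is what makes it safe to run such a loop over and over.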

Node components

1. kubelet

An agent that runs on each node in the cluster. It makes sure that containers are running in a Pod.

The kubelet takes a set of PodSpecs that are provided through various mechanisms and ensures that the containers described in those PodSpecs are running and healthy. The kubelet doesn’t manage containers which were not created by Kubernetes.

2. kube-proxy

kube-proxy is a network proxy that runs on each node in your cluster, implementing part of the Kubernetes Service concept.

kube-proxy maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster.

3. Container runtime

A fundamental component that empowers Kubernetes to run containers effectively. It is responsible for managing the execution and lifecycle of containers within the Kubernetes environment.

Kubernetes supports container runtimes such as containerd, CRI-O, and any other implementation of the Kubernetes CRI (Container Runtime Interface).

Pods

Pods are the smallest deployable units in Kubernetes, and apps are usually deployed as Pods. A Pod includes one primary container and, often, additional secondary (sidecar) containers that support the primary one. For example, if you deploy a Django app as a Pod, you might have the main container running the Django app, an extra container running Nginx to serve static files, and possibly a container running Redis for caching. Each Pod has its own unique IP address within the cluster, which is shared by all of its containers.
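
As a rough illustration, the Django example could be described by a Pod manifest like the one below, written here as a Python dictionary and printed as JSON. The field names follow the Pod v1 schema, but the Pod name, labels, ports and the image example/django-app:1.0 are placeholders I made up for this sketch.

```python
import json

# A two-container Pod: the main app plus an Nginx sidecar for static files.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "django-app", "labels": {"app": "django"}},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "example/django-app:1.0",  # hypothetical image
                "ports": [{"containerPort": 8000}],
            },
            {
                "name": "static",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
            },
        ]
    },
}

print(json.dumps(pod, indent=2))
```

kubectl accepts manifests in JSON as well as YAML, so the printed output could be saved to a file and passed to kubectl apply -f.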

Deployments

Deployments provide declarative updates for Pods and ReplicaSets. They allow you to describe the desired state of your application, such as which images to use and the number of replicas, and the Deployment controller changes the actual state to the desired state at a controlled rate. Deployments are useful for rolling out updates, rolling back to previous versions, and scaling applications.
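
Here is a small sketch of what such a desired state might look like, again as a Python dictionary printed as JSON. The structure follows the apps/v1 Deployment schema. The helper make_deployment, the name web and the image example/web:1.0 are assumptions for the example only.

```python
import json


def make_deployment(name, image, replicas):
    """Build a minimal Deployment manifest for a single-container app."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the Pod template below.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical app: three replicas of example/web:1.0.
deployment = make_deployment("web", "example/web:1.0", replicas=3)
print(json.dumps(deployment, indent=2))
```

Changing the image or the replica count in this description and applying it again is what triggers a rolling update or a scale-up. You describe the target, and the Deployment controller works out the steps.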

Services

A Service is an abstraction that exposes a set of Pods as a network service with a stable IP address. Services allow other Pods to reach your Pods and, depending on the Service type, allow users to visit your app in the browser.
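
A Service usually finds "its" Pods through a label selector. The sketch below mimics that matching on plain dictionaries: a Pod is selected when its labels contain every key/value pair in the selector. The Pod names, IPs and labels are invented, and the real system tracks the result in separate endpoint objects rather than computing it on the fly like this.

```python
def select_endpoints(selector, pods):
    """Return the IPs of Pods whose labels include every selector key/value."""
    return [
        pod["ip"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Hypothetical Pods in the cluster.
pods = [
    {"name": "web-1", "ip": "10.0.0.4", "labels": {"app": "web"}},
    {"name": "web-2", "ip": "10.0.0.5", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1", "ip": "10.0.0.9", "labels": {"app": "db"}},
]

endpoints = select_endpoints({"app": "web"}, pods)
print(endpoints)  # ['10.0.0.4', '10.0.0.5']
```

Because clients talk to the Service rather than to individual Pod IPs, Pods can come and go without clients noticing.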

ConfigMaps

A ConfigMap stores non-confidential data in key-value pairs. Pods can consume ConfigMaps as environment variables, command-line arguments, or as configuration files in a volume.

A ConfigMap allows you to decouple environment-specific configuration from your container images, so that your applications are easily portable.
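
To show the decoupling idea, here is a toy sketch of a ConfigMap and of what consuming it as environment variables amounts to. The manifest fields follow the ConfigMap v1 schema. The name app-config, the keys and the helper env_from_config_map are hypothetical, and in a real cluster the injection is done for you when the container starts.

```python
# ConfigMap values are always strings, even for numbers.
config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "app-config"},
    "data": {"LOG_LEVEL": "debug", "CACHE_TTL": "300"},
}


def env_from_config_map(cm, base_env=None):
    """Return a container environment with the ConfigMap's keys added."""
    env = dict(base_env or {})
    env.update(cm["data"])
    return env


env = env_from_config_map(config_map, {"PATH": "/usr/bin"})
print(env["LOG_LEVEL"])       # debug
print(int(env["CACHE_TTL"]))  # 300
```

The same image can then run in another environment simply by pointing it at a ConfigMap with different values.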

PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs)

PVs define claimable storage, and PVCs are requests for that storage. PVCs consume PV resources. In essence, PVs provide the storage resources in a cluster, while PVCs allow Pods to request those resources without having to know the specifics of the underlying storage infrastructure.
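
The matching between claims and volumes can be pictured with the simplified sketch below. It picks the smallest unbound volume that is large enough and supports the requested access mode. The real binding logic also considers things like storage classes and dynamic provisioning, and every name and size here is made up.

```python
def bind_claim(claim, volumes):
    """Bind the claim to the smallest suitable unbound volume, if any."""
    candidates = [
        volume
        for volume in volumes
        if not volume["bound"]
        and volume["capacity_gib"] >= claim["request_gib"]
        and claim["access_mode"] in volume["access_modes"]
    ]
    if not candidates:
        return None  # the claim stays pending until a suitable volume appears
    chosen = min(candidates, key=lambda volume: volume["capacity_gib"])
    chosen["bound"] = True
    return chosen["name"]


# Hypothetical volumes offered by the cluster.
volumes = [
    {"name": "pv-small", "capacity_gib": 5, "access_modes": ["ReadWriteOnce"], "bound": False},
    {"name": "pv-large", "capacity_gib": 50, "access_modes": ["ReadWriteOnce"], "bound": False},
]

claim = {"name": "data", "request_gib": 10, "access_mode": "ReadWriteOnce"}
bound_to = bind_claim(claim, volumes)
print(bound_to)  # pv-large
```

Note that the claim only says "10 GiB, ReadWriteOnce". It never mentions which disk or storage system ends up serving it.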

Conclusion

In short, Kubernetes can be understood simply as a container orchestration tool. When you run multiple apps in containers, Kubernetes allows you to provide and manage the necessary resources for each of those apps. It is a core part of modern cloud infrastructure and is used by many big tech companies, including Google, Slack and Shopify.