In this blog post, we will look at how optimistic concurrency helps Kubernetes scale and accommodate more and more plugins. We will also walk through a few examples to understand it better.
A brief introduction to Kubernetes
Kubernetes is an orchestration tool that manages different applications for us. At the heart of a Kubernetes cluster are different types of controllers running together to accomplish a task. Kubernetes is based on a declarative model: we hand it a configuration describing a final state, and the different controllers act on it and try to reach that desired state.

Different parts of a Kubernetes cluster
Etcd provides persistence to the Kubernetes cluster. Kubernetes stores each and every piece of cluster state in it, and it is also used to provide service discovery.
The Kubernetes API server is an abstraction layer for talking to etcd; it is not a good idea to access etcd directly. It applies validations such as authentication and authorization to every request that comes into Kubernetes.
A controller is a simple control loop that reads state from etcd and reacts to it. The basic loop is:
Step 1: Watch or read for any changes.
Step 2: Change the state of an object in order to reach the desired state (it may be a Kubernetes object, like creating a pod, or an outside-world object, like creating an AWS S3 bucket).
Step 3: Update the status.
Step 4: Repeat from Step 1.
Examples of controllers are the kubelet, the Deployment controller, and custom cloud controllers.
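The loop above can be sketched in a few lines of Python. This is a toy simulation, not real Kubernetes code: the in-memory `store` dict stands in for etcd, and the `spec`/`status`/`resourceVersion` fields are simplified for illustration.

```python
# A toy controller loop: observe a difference between desired and actual
# state, act to remove it, and record the update. "store" stands in for etcd.

def reconcile(store):
    """One pass of the control loop: make actual state match desired state."""
    for name, obj in store.items():
        if obj["status"] != obj["spec"]:      # Step 1: observe a change needed
            obj["status"] = obj["spec"]       # Step 2: act to reach desired state
            obj["resourceVersion"] += 1       # Step 3: update status/version
    return store                              # Step 4: a real loop would repeat

store = {"pod-a": {"spec": "Running", "status": "Pending", "resourceVersion": 1}}
reconcile(store)
print(store["pod-a"]["status"])           # -> Running
print(store["pod-a"]["resourceVersion"])  # -> 2
```

A real controller runs this loop forever against the API server's watch stream, but the shape is the same: read, act, update, repeat.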
Kubernetes API server
It is not a good idea to talk to etcd directly, so we need some kind of middleman that can also provide additional facilities. The API server is the only component that talks to etcd.
A brief introduction to optimistic concurrency control
Let's imagine a scenario where everyone is trying to book a seat in a movie theatre in February. Corner seats are in great demand during Valentine's week (trust me, I have heard this somewhere). Preventing two couples from booking the same pair of seats at the same time is done through concurrency control. There are two ways to control concurrency in this scenario:
1. Pessimistic concurrency control
2. Optimistic concurrency control
We are going to discuss optimistic concurrency control here. Optimistic concurrency does not take precautions against collisions beforehand; it intervenes only when a collision is detected. This is done through versioning. Every document has a version, and if the version you read is still the current version when you write, there is no collision. But if the version has changed between your read and your write (someone else got there first), then bam: there is a collision. The store throws an error like "Version Mismatched", which is an indication that you need to retry the transaction.
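The version check can be sketched as follows. This is a minimal, hypothetical in-memory store (the `Store` class, `VersionMismatch` exception, and seat names are all made up for illustration), not a real database API:

```python
# Minimal sketch of optimistic concurrency: a write succeeds only if the
# version it was read at is still the store's current version.

class VersionMismatch(Exception):
    pass

class Store:
    def __init__(self, data):
        self.data = data
        self.version = 0

    def read(self):
        # Return a copy of the data plus the version it was read at.
        return dict(self.data), self.version

    def write(self, new_data, read_version):
        if read_version != self.version:    # someone else wrote first
            raise VersionMismatch("Version Mismatched")
        self.data = new_data
        self.version += 1                   # every successful write bumps the version

booking = Store({"seat": None})

# Couple 1 and Couple 2 both read at version 0.
data1, v1 = booking.read()
data2, v2 = booking.read()

data1["seat"] = "K11-K12"
booking.write(data1, v1)                    # succeeds; store is now at version 1

data2["seat"] = "K11-K12"
try:
    booking.write(data2, v2)                # stale read: collision detected
except VersionMismatch as e:
    print(e)                                # -> Version Mismatched
```

Note that no locks are taken at any point; the conflict is only detected at write time, and the loser simply retries.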
| Time of Booking | Couple 1 | Couple 2 |
|---|---|---|
| Time 1 | Starts booking | Starts booking |
| Time 2 | Reads booking information (version V0) | Reads booking information (version V0) |
| Time 3 | Changes information like V0.name and V0.age | Changes information like V0.name and V0.age |
| Time 4 | Writes information to the DB | Modifies V0.paymentMethod |
| Time 5 | Version of the booking row changes to V1 | Writes the record to the DB |
| Time 6 | | Since the version changed to V1, the write fails with a "Version Mismatched" error |
| Time 7 | | Restarts the transaction |
Benefits of optimistic concurrency
- It scales well under high request throughput, because no locks are held; as long as actual conflicts are rare, most writes succeed on the first try.
Disadvantages of Optimistic Concurrency
- Once a transaction fails, we need to restart the whole process again.
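That retry cost can be sketched concretely. This is a toy example (the `current` record, `increment_with_retry` helper, and the simulated interfering writer are all invented for illustration) showing that a conflict forces the entire read-modify-write cycle to start over:

```python
# Sketch of the retry cost: when an optimistic write fails, the whole
# read-modify-write cycle restarts from the beginning.

class VersionMismatch(Exception):
    pass

current = {"value": 0, "version": 0}

def write(new_value, read_version):
    if read_version != current["version"]:
        raise VersionMismatch()
    current["value"] = new_value
    current["version"] += 1

def increment_with_retry(max_retries=5):
    attempts = 0
    while True:
        attempts += 1
        value, version = current["value"], current["version"]     # read
        if attempts == 1:
            # Simulate another writer sneaking in between our read and write.
            write(current["value"] + 100, current["version"])
        try:
            write(value + 1, version)                             # write
            return attempts
        except VersionMismatch:
            if attempts >= max_retries:
                raise
            # conflict: restart the whole read-modify-write from scratch

attempts = increment_with_retry()
print(attempts)            # -> 2 (first attempt conflicted, second succeeded)
print(current["value"])    # -> 101
```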
How does a pod get launched when you submit a pod definition to Kubernetes?
For simplicity, we consider a very basic Kubernetes cluster that contains only a few controllers, such as the scheduler and the kubelet. Optimistic concurrency in Kubernetes is handled by a key called resourceVersion, which is present in the metadata section of each and every resource. The resourceVersion changes whenever we update a resource.
This is what happens when we create a pod, and how optimistic concurrency helps us do it in a scalable way:
Step 1: We run kubectl apply -f pod_manifest.yaml. kubectl, a Kubernetes client, talks to the Kubernetes API server, which creates an entry in etcd. A pod resource is created in etcd with, let's say, resourceVersion 1.
Step 2: Different controllers are running, and their job is to listen for changes (how they listen for events deserves a blog post of its own) and react to them.
Step 3: As soon as the pod manifest is created, the kubelet and the scheduler wake up and start their work. Both try to read the state so they can act on it.
Step 4: The scheduler sees that the pod spec has no nodeName, so it runs its business logic to pick an appropriate node for the pod. After choosing a node, it updates the pod spec with the node name. (Note that the resourceVersion is incremented after the update.)
Step 5: Meanwhile, the kubelet sees this too. After reading the node name from the spec, it checks whether that node name matches its own. If it does not match, the kubelet goes back to sleep. If it matches, the kubelet starts the container and updates the pod status (and hence the resourceVersion is incremented again).
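The steps above can be sketched as a toy walkthrough. The `pod` dict stands in for the record in etcd, and the `scheduler`/`kubelet` functions are simplified stand-ins for the real controllers (field names loosely mirror the real pod schema, but everything here is illustrative):

```python
# Toy walkthrough of steps 1-5: a pod record with its resourceVersion
# bumped on every update, and two controllers acting on it in turn.

pod = {
    "metadata": {"name": "web", "resourceVersion": 1},  # created via kubectl apply
    "spec": {"nodeName": None},
    "status": {"phase": "Pending"},
}

def scheduler(pod, chosen_node):
    # Scheduler: the pod has no nodeName yet, so assign one and bump the version.
    if pod["spec"]["nodeName"] is None:
        pod["spec"]["nodeName"] = chosen_node
        pod["metadata"]["resourceVersion"] += 1

def kubelet(pod, my_node):
    # Kubelet: only act if the pod was scheduled onto this node.
    if pod["spec"]["nodeName"] != my_node:
        return                              # not ours; go back to sleep
    pod["status"]["phase"] = "Running"      # "start the container"
    pod["metadata"]["resourceVersion"] += 1

scheduler(pod, chosen_node="node-1")
kubelet(pod, my_node="node-2")   # wrong node: does nothing
kubelet(pod, my_node="node-1")   # right node: starts the pod
print(pod["status"]["phase"])              # -> Running
print(pod["metadata"]["resourceVersion"])  # -> 3
```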
Seems easy, right? I have left out other controllers like kube-proxy and some low-level intricacies. Moreover, in a production environment, no one deploys a single pod directly; pods are managed by Deployment resources and the like, which increases the complexity manifold.
Role of optimistic concurrency in these controllers
All the controllers run in a distributed, independent manner, so we can think of Kubernetes as one very big distributed system. A concurrency problem can occur at any time. Let us take an example:
| Time | Scheduler (controller 1) | Kubelet (controller 2) |
|---|---|---|
| Time 1 | Pod manifest is submitted | Pod manifest is submitted |
| Time 2 | Reads pod information (resourceVersion V0) | Reads pod information (resourceVersion V0) |
| Time 3 | Changes fields like V0.spec.schedulerName and V0.spec.nodeName | Changes fields like V0.status.conditions.lastProbeTime |
| Time 4 | Contacts the API server to persist the change to etcd | Modifies V0.status.phase to "Running" |
| Time 5 | Write lands in etcd | |
| Time 6 | Pod's resourceVersion is now V1 | Contacts the API server to persist the change to etcd |
| Time 7 | | Since the version changed to V1, the API server throws a version-mismatch error |
| Time 8 | | Retries the whole controller loop |
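The timeline above can be simulated end to end. This is a toy model: the `etcd` dict, the `read`/`write` helpers standing in for the API server, and the `Conflict` exception are all invented for illustration, with the API server's check reduced to a single compare-and-bump on resourceVersion:

```python
# Simulation of the conflict timeline: scheduler and kubelet both read the
# pod at resourceVersion 0; the second writer gets a conflict and retries.
import copy

class Conflict(Exception):
    pass

etcd = {"pod": {"resourceVersion": 0,
                "spec": {"nodeName": None},
                "status": {"phase": "Pending"}}}

def read(key):
    return copy.deepcopy(etcd[key])

def write(key, obj):
    # The "API server" rejects the write if the resourceVersion is stale.
    if obj["resourceVersion"] != etcd[key]["resourceVersion"]:
        raise Conflict("version mismatch")
    obj["resourceVersion"] += 1
    etcd[key] = obj

# Both controllers read the pod at resourceVersion 0.
scheduler_view = read("pod")
kubelet_view = read("pod")

scheduler_view["spec"]["nodeName"] = "node-1"
write("pod", scheduler_view)        # succeeds; pod is now at version 1

kubelet_view["status"]["phase"] = "Running"
try:
    write("pod", kubelet_view)      # stale read: conflict
except Conflict:
    # The kubelet simply reruns its loop against the fresh state.
    kubelet_view = read("pod")
    kubelet_view["status"]["phase"] = "Running"
    write("pod", kubelet_view)

print(etcd["pod"]["spec"]["nodeName"])   # -> node-1
print(etcd["pod"]["status"]["phase"])    # -> Running
print(etcd["pod"]["resourceVersion"])    # -> 2
```

Neither controller ever held a lock or knew about the other; each just retried its own loop, which is exactly what makes this pattern compose across many independent controllers.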
As you can imagine, because controllers are distributed and independent, this pattern where multiple controllers read and write a single resource is very frequent. As we scale Kubernetes in terms of functionality, for example by introducing multiple custom resources into a cluster, this kind of conflict can happen at any time. Optimistic concurrency keeps handling it both simple (just retry) and scalable.
If you liked the article, please share and subscribe.