
A few basic things about ECS and Kubernetes




Basic information about AWS ECS and Kubernetes and their components.

AWS ECS Basic Information:
AWS ECS is the Docker-compatible container orchestration service from Amazon. It lets us run containerised applications on EC2 instances and scale both the containers and the instances. The architecture below gives a high-level view of ECS.




As shown above, an ECS cluster consists of tasks that run in Docker containers, container instances, and several other components.

Here are some AWS services commonly used with ECS:

Elastic Load Balancer: This component can route traffic to containers. Three kinds of load balancing are available: Application, Network, and Classic.

Elastic Block Store: This service provides persistent block storage for ECS tasks (workloads running in containers).

CloudWatch: This service collects metrics from ECS. Based on CloudWatch metrics, ECS services can be scaled up or down.
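For illustration, here is a minimal sketch of wiring CloudWatch-driven scaling to an ECS service through the Application Auto Scaling API using boto3. The cluster and service names (demo-cluster, demo-service) and the 60% CPU target are hypothetical placeholders, not values from this article.

import boto3

autoscaling = boto3.client("application-autoscaling")

# Register the service's desired task count as a scalable target.
autoscaling.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId="service/demo-cluster/demo-service",   # hypothetical names
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=1,
    MaxCapacity=10,
)

# Scale the service to keep average CPU utilisation around 60%,
# based on the CloudWatch metrics that ECS publishes.
autoscaling.put_scaling_policy(
    PolicyName="cpu-target-tracking",
    ServiceNamespace="ecs",
    ResourceId="service/demo-cluster/demo-service",
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
    },
)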

Virtual Private Cloud: An ECS cluster runs within a VPC. A VPC can have one or more subnets.

CloudTrail: This service can log ECS API calls. Details captured include the type of request made to Amazon ECS, the source IP address, user details, etc.

Elastic File System (EFS): An EFS file system can be mounted as a volume across the instances running in an ECS cluster, for example to collect logs in one place. Note: an Amazon EFS file system can only have mount targets in one VPC at a time.

ECS, which is provided by Amazon as a service, is composed of multiple built-in components which enable us to create clusters, tasks, and services:

State Engine: A container environment can consist of many EC2 container instances and containers. With hundreds or thousands of containers, it is necessary to keep track of the availability of instances to serve new requests based on CPU, memory, load balancing, and other characteristics. The state engine is designed to keep track of available hosts, running containers, and other functions of a cluster manager.

Schedulers: These components use information from the state engine to place containers on the optimal EC2 container instances. The batch job scheduler is used for tasks that run for a short period of time; the service scheduler is used for long-running applications and can automatically register new tasks with an ELB.

Cluster: This is a logical placement boundary for a set of EC2 container instances within an AWS region. A cluster can span multiple Availability Zones (AZs) and can be scaled up or down dynamically. An environment may have two clusters, for example one for production and one for test.

Tasks: A task is a unit of work. Task definitions, written in JSON, specify containers that should be co-located (on an EC2 container instance). Though tasks usually consist of a single container, they can also contain multiple containers.
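As a rough illustration, the task definition JSON can also be supplied as a Python dictionary through boto3. This is a minimal sketch only; the family name, image, and ports are hypothetical.

import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="web-task",                      # hypothetical family name
    networkMode="bridge",
    containerDefinitions=[
        {
            "name": "web",
            "image": "nginx:1.21",          # any container image
            "memory": 256,                  # hard memory limit in MiB
            "essential": True,
            "portMappings": [{"containerPort": 80, "hostPort": 0}],
        }
    ],
)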

Services: This component specifies how many tasks should be running across a given cluster. You can interact with services through the ECS API and use the service scheduler for task placement; a minimal sketch of creating a service follows the note below.

Note that ECS only manages ECS container workloads, resulting in vendor lock-in. There is no support for running containers on infrastructure outside of EC2, whether physical infrastructure or other clouds such as Google Cloud Platform and Microsoft Azure. The advantage, of course, is the ability to work with all the other AWS services like Elastic Load Balancers, CloudTrail, CloudWatch, etc.
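Continuing the task definition sketch above, this is a minimal, hypothetical example of creating a cluster and a long-running service with boto3; the names and desired count are placeholders.

import boto3

ecs = boto3.client("ecs")

ecs.create_cluster(clusterName="demo-cluster")

ecs.create_service(
    cluster="demo-cluster",
    serviceName="demo-service",
    taskDefinition="web-task",   # family (optionally family:revision)
    desiredCount=2,              # the service scheduler keeps 2 tasks running
)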

Kubernetes Basic Information:
Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.
Kubernetes was introduced in 2014 as an open-source version of Borg, Google's internal orchestrator. 2017 saw an increase in Kubernetes adoption by enterprises, and by 2018 it had become widely adopted across diverse businesses, from software developers to airline companies.
One of the reasons why Kubernetes gained popularity so fast is its open source architecture and an incredible number of manuals, articles and support provided by its loyal community.
There are a number of components associated with a Kubernetes cluster. 
The master node places container workloads in user pods on worker nodes or itself. 
The other components include the following:

Etcd: This component stores configuration data, which can be accessed by the Kubernetes master's API Server through a simple HTTP/JSON API.

API Server: This component is the management hub for the Kubernetes master node. It facilitates communication between the various components, thereby maintaining cluster health.

Controller Manager: This component ensures that the cluster's desired state matches the current state by scaling workloads up and down.

Scheduler: This component places the workload on the appropriate node.

Kubelet: This component receives pod specifications from the API Server and manages pods running in the host.

Worker Node(s): A node is a machine, physical or virtual, and it is where containers are hosted. Nodes were also known as minions in the past. A node can run multiple pods, depending on the instance configuration.

Cluster: A cluster is a set of nodes grouped together. This way, even if one node fails, your application is still accessible from the other nodes. Moreover, having multiple nodes helps in sharing the load as well.

Master Node: The master is another node with Kubernetes installed in it, and is configured as a Master. The master watches over the nodes in the cluster and is responsible for the actual orchestration of containers on the worker nodes.

Container Runtime Engine: The container runtime is the underlying software that is used to run containers. In our case it happens to be Docker.

Kubelet: Kubelet is the agent that runs on each node in the cluster. The agent is responsible for making sure that the containers are running on the nodes as expected.

Kube-Proxy: An additional component on the Node is the kube-proxy. It takes care of networking within Kubernetes.

The architecture below gives a high-level view of Kubernetes.



Containers within a pod can interact with each other in various ways:
Network: Containers can access any listening ports on containers within the same pod, even if those ports are not exposed outside the pod.
Shared Storage Volumes: Containers in the same pod can be given the same mounted storage volumes, which allows them to interact with the same files.
SharedProcessNamespace: Process namespace sharing can be enabled by setting shareProcessNamespace: true in the pod spec. This allows containers within the pod to interact with, and signal, one another's processes.
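As a minimal sketch of the last point, the official Kubernetes Python client can build a two-container pod with process namespace sharing enabled. The pod name, images, and namespace are hypothetical, and a local kubeconfig is assumed.

from kubernetes import client, config

config.load_kube_config()  # assumes a local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="shared-ns-demo"),
    spec=client.V1PodSpec(
        share_process_namespace=True,  # containers can see each other's processes
        containers=[
            client.V1Container(name="app", image="nginx:1.21"),
            client.V1Container(
                name="debug",
                image="busybox:1.36",
                command=["sleep", "3600"],
            ),
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)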

The following list provides some other common terms associated with Kubernetes:
Pods: Kubernetes deploys and schedules containers in groups called pods. Containers in a pod run on the same node and share resources such as filesystems, kernel namespaces, and an IP address.

Deployments: These building blocks can be used to create and manage a group of pods. Deployments can be used with a service tier for scaling horizontally or ensuring availability.
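For illustration, a minimal sketch of a 3-replica Deployment created with the Python client; the name, label, and image are hypothetical placeholders.

from kubernetes import client, config

config.load_kube_config()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(
                containers=[client.V1Container(name="web", image="nginx:1.21")]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)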

Services: An abstraction layer which provides network access to a dynamic, logical set of pods. These are endpoints that can be addressed by name and can be connected to pods using label selectors. The service will automatically round-robin requests between pods. Kubernetes will set up a DNS server for the cluster that watches for new services and allows them to be addressed by name. Services are the "external face" of your container workloads.

Labels: These are key-value pairs attached to objects. They can be used to search and update multiple objects as a single set. Ex: Labels are properties attached to each item, so you add properties to each item for its class, kind, and colour.

Selectors: Labels and selectors are a standard method of grouping things together. Ex: Let's say you have a set of different species. A user wants to be able to filter them based on different criteria (such as colour or type) using labels, and then retrieve the filtered items; this is where selectors come into the picture.
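A minimal sketch of that filtering idea with the Python client: list only the pods whose labels match a selector. The label keys and values (class, color) simply mirror the example above and are hypothetical.

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Retrieve only the pods whose labels match the selector string.
pods = v1.list_namespaced_pod(
    namespace="default",
    label_selector="class=mammal,color=brown",
)
for pod in pods.items:
    print(pod.metadata.name, pod.metadata.labels)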

Important Terminologies in Kubernetes:
Ingress: Ingress is actually NOT a type of service. Instead, it sits in front of multiple services and acts as a smart router or entry point into your cluster; it is defined under the Kubernetes networking APIs.

ReplicaSets: It is one of the Kubernetes controllers used to make sure that we have a specified number of pod replicas running (A controller in Kubernetes is what takes care of tasks to make sure the desired state of the cluster matches the observed state).

Secrets: Secrets are used to store sensitive information, like passwords or keys. They are similar to ConfigMaps, except that the values are stored in a base64-encoded format rather than plain text.
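A minimal sketch of creating a Secret with the Python client; plain values passed via string_data are stored base64-encoded by the API. The secret name and keys here are hypothetical examples only.

from kubernetes import client, config

config.load_kube_config()

secret = client.V1Secret(
    metadata=client.V1ObjectMeta(name="db-credentials"),
    string_data={"username": "app", "password": "s3cr3t"},  # example values only
)

client.CoreV1Api().create_namespaced_secret(namespace="default", body=secret)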

StatefulSets (different ways to deploy your application): A StatefulSet is also a controller, but unlike a Deployment it doesn't create a ReplicaSet; it creates the pods itself with a stable, unique naming convention. e.g. If you create a StatefulSet named counter, it will create a pod named counter-0, and for multiple replicas the names increment: counter-0, counter-1, counter-2, etc. Every replica of a StatefulSet has its own state, and each pod creates its own PVC (Persistent Volume Claim). So a StatefulSet with 3 replicas will create 3 pods, each with its own volume, for a total of 3 PVCs.

Services: A service creates an abstraction layer on top of a set of replica pods. You access the service rather than accessing the pods directly, so as pods come and go, you get uninterrupted, dynamic access to whatever replicas are up at that time.
Service Types (a minimal NodePort sketch follows this list):
ClusterIP: The service is exposed within the cluster using its own IP address and can be located via the cluster DNS using the service name.
NodePort: The service is exposed externally on a listening port on each node in the cluster.
LoadBalancer: The service is exposed via a load balancer created on a cloud platform. The cluster must be configured to work with a cloud provider in order to use this option.
ExternalName: The service does not proxy traffic to pods, but simply provides a DNS lookup for an external address. This allows components within the cluster to look up external resources in the same way they look up internal ones: through services.
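The sketch below shows a NodePort service fronting the pods labelled app=web from the Deployment sketch earlier; the service name and port numbers are arbitrary placeholders.

from kubernetes import client, config

config.load_kube_config()

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="web-svc"),
    spec=client.V1ServiceSpec(
        type="NodePort",
        selector={"app": "web"},           # label selector for the backing pods
        ports=[
            client.V1ServicePort(
                port=80,                    # cluster-internal service port
                target_port=80,             # container port
                node_port=30080,            # exposed on every node
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_service(namespace="default", body=service)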

ConfigMaps: ConfigMaps are used to pass configuration data in the form of key-value pairs in Kubernetes (they store configuration data in plain text). When a pod is created, the ConfigMap can be injected into the pod so that the key-value pairs are available as environment variables for the application hosted inside the container in the pod.
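A minimal sketch of that injection, assuming a hypothetical ConfigMap named app-config: every key becomes an environment variable in the container via envFrom.

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

config_map = client.V1ConfigMap(
    metadata=client.V1ObjectMeta(name="app-config"),
    data={"APP_MODE": "production", "LOG_LEVEL": "info"},  # example keys only
)
v1.create_namespaced_config_map(namespace="default", body=config_map)

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="configured-app"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="app",
                image="nginx:1.21",
                # Every key in the ConfigMap becomes an environment variable.
                env_from=[
                    client.V1EnvFromSource(
                        config_map_ref=client.V1ConfigMapEnvSource(name="app-config")
                    )
                ],
            )
        ]
    ),
)
v1.create_namespaced_pod(namespace="default", body=pod)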

Deployments (different ways to deploy your application): A Deployment is the easiest and most used resource for deploying your application. It is a Kubernetes controller that matches the current state of your cluster to the desired state described in the Deployment manifest. e.g. If you create a Deployment with 1 replica, it will check that the desired state of the ReplicaSet is 1 while the current state is 0, so it will create a ReplicaSet, which will in turn create the pod. If you create a Deployment named counter, it will create a ReplicaSet named counter-<replicaset-hash>, which will in turn create a pod named counter-<replicaset-hash>-<pod-id>. Deployments are usually used for stateless applications; however, you can save the state of a Deployment by attaching a Persistent Volume to it and making it stateful.

Containers: In our case these are Docker images which run as containers.

Persistent Volumes: Persistent Volumes form a cluster-wide pool of storage volumes, configured to be used by users deploying applications on the cluster. Users can request storage from this pool using Persistent Volume Claims. Kubernetes persistent volumes remain available outside of the pod lifecycle → this means that the volume will remain even after the pod is deleted.
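A minimal sketch of claiming storage from that pool with a PersistentVolumeClaim; the claim name, access mode, and 1Gi size are hypothetical.

from kubernetes import client, config

config.load_kube_config()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="data-claim"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        resources=client.V1ResourceRequirements(requests={"storage": "1Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc
)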

Volumes: These provide storage to containers that is outside the container and can therefore exist beyond the life of the container. Containers in a pod can share volumes allowing them each to interact with the same files.

Volume Types: Kubernetes supports several types of standard storage solutions such as NFS, glusterFS, Flocker, FibreChannel, CephFS, ScaleIO or public cloud solutions like AWS EBS, Azure Disk or File or Google’s Persistent Disk.

Security: This feature is used to run a container or pod with the least privileges required. A security context set at the container level overrides the pod-level settings.

Taints & Tolerations: Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this marks that the node should not accept any pods that do not tolerate the taints. Tolerations are applied to pods and allow (but do not require) the pods to be scheduled onto nodes with matching taints.
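As a rough sketch: taint a node, then give a pod a matching toleration so it may still be scheduled there. The node name, taint key/value, and images are hypothetical.

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Apply a taint to the node (the imperative equivalent of `kubectl taint`).
v1.patch_node(
    "worker-1",
    {"spec": {"taints": [{"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}]}},
)

# A pod that tolerates the taint above.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-job"),
    spec=client.V1PodSpec(
        containers=[client.V1Container(name="main", image="busybox:1.36",
                                       command=["sleep", "3600"])],
        tolerations=[
            client.V1Toleration(
                key="dedicated", operator="Equal", value="gpu", effect="NoSchedule"
            )
        ],
    ),
)
v1.create_namespaced_pod(namespace="default", body=pod)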

Namespaces: Used to isolate different sets of services/resources. They are a way to divide cluster resources between multiple users.

CronJobs: Execute jobs on a schedule: sets of batch jobs that need to perform a set of task(s) and then stop.

Federation: Kubernetes Federation gives you the ability to manage deployments and services across all the Kubernetes clusters located in different regions.

Load Balancing: A LoadBalancer service is the standard way to expose a service to the internet; it forwards incoming traffic to the relevant service.

DaemonSets (different ways to deploy your application): A DaemonSet is a controller that ensures that the pod runs on all the nodes of the cluster. If a node is added/removed from a cluster, DaemonSet automatically adds/deletes the pod. Ex: Monitoring Exporters, Logs Collection Daemon etc.

Jobs: Creates one or more pods to do work and ensures that they successfully finish.

NodeAffinity: To get pods scheduled onto specific nodes, Kubernetes provides nodeSelectors and nodeAffinity. Since node affinity identifies the nodes on which to place pods via labels, we first need to add a label to the required node.

Pod-Affinity and Anti-Affinity: Pod affinity and anti-affinity allow placing pods on nodes as a function of the labels of other pods. These Kubernetes features are useful in scenarios such as: an application that consists of multiple services, some of which may need to be co-located on the same node for performance reasons; or replicas of critical services that should not be placed onto the same node, to avoid loss in the event of node failure.
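A minimal sketch of the second scenario using pod anti-affinity: replicas labelled app=web must not land on the same node (spread by hostname). The label and image are hypothetical, and this is only a pod-spec fragment.

from kubernetes import client

anti_affinity = client.V1Affinity(
    pod_anti_affinity=client.V1PodAntiAffinity(
        required_during_scheduling_ignored_during_execution=[
            client.V1PodAffinityTerm(
                label_selector=client.V1LabelSelector(match_labels={"app": "web"}),
                topology_key="kubernetes.io/hostname",  # at most one such pod per node
            )
        ]
    )
)

pod_spec = client.V1PodSpec(
    affinity=anti_affinity,
    containers=[client.V1Container(name="web", image="nginx:1.21")],
)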

Egress: Egress rules provide a whitelist for traffic coming out of the pod. Egress rules specify a list of destinations under to, as well as one or more ports.

Network Policy (Security): A network policy is another namespaced Kubernetes object, just like pods, ReplicaSets, or services. You apply a network policy to selected pods; a network policy can be linked to one or more pods, and rules are defined within the policy.
Solutions that Support Network Policies:
• Calico
• Romana
• Weave, Cilium and Kube-router
Solutions that Do Not Support Network Policies: Flannel

Readiness Probes: Determines whether the container is ready to serve requests → Requests will not be forwarded to the container until the readiness probe succeeds.

Liveness Probes: Determines whether the container is running properly → When a liveness probe fails, the container will be shut down or restarted, depending on its RestartPolicy.
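A minimal sketch of both probes on a single container; the /ready and /healthz paths, port, and timings are hypothetical and would depend on the application.

from kubernetes import client

container = client.V1Container(
    name="web",
    image="nginx:1.21",
    # Traffic is withheld until this probe succeeds.
    readiness_probe=client.V1Probe(
        http_get=client.V1HTTPGetAction(path="/ready", port=80),
        initial_delay_seconds=5,
        period_seconds=10,
    ),
    # The container is restarted (per its RestartPolicy) when this probe fails.
    liveness_probe=client.V1Probe(
        http_get=client.V1HTTPGetAction(path="/healthz", port=80),
        initial_delay_seconds=15,
        period_seconds=20,
    ),
)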

Service Accounts: ServiceAccounts are used for allowing pods to interact with the Kubernetes API and for controlling what those pods have access to do using the API. 

Resource Request: The amount of resources a container needs to run → Kubernetes uses these values to determine whether or not a worker node has enough resources available to run a pod.

Resource Limit: The maximum resource usage of a container → The container runtime will try to prevent the container from exceeding this amount of resource usage.
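A minimal sketch of requests and limits set on a container; the CPU and memory values are arbitrary examples.

from kubernetes import client

container = client.V1Container(
    name="web",
    image="nginx:1.21",
    resources=client.V1ResourceRequirements(
        requests={"cpu": "250m", "memory": "128Mi"},  # used for scheduling decisions
        limits={"cpu": "500m", "memory": "256Mi"},    # enforced by the runtime
    ),
)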

Multi Container PODs: Multi-container pods are pods that have more than one container. 

Init Containers: Use them when you want to check, for example, that a database server is running before starting the usual container for the service.

Rolling Update: Gradually rolling out a change to a set of replica pods to avoid downtime.

Rollback Update: Reverting to a previous state after an update if any issues arise.


Common design patterns for multi-container pods:
Ambassador: An HAProxy ambassador container receives network traffic and forwards it to the main container. Ex: An ambassador container listens on a custom port and forwards the traffic to the main container's hard-coded port. A concrete example would be a ConfigMap storing the HAProxy config: HAProxy listens on port 80 and forwards the traffic to the main container, which is hard-coded to listen on a given port number.

Sidecar: A sidecar container enhances the main container in some way, adding functionality to it. Ex: A sidecar periodically syncs files in a webserver container's file system from a Git repository.

Adapter: An adapter container transforms the output of the main container. Ex: An Adapter container reads log output from the main container and transforms it.