Setting up ElasticSearch on Kubernetes using Helm Charts

In this assignment we will set up ElasticSearch on DigitalOcean's managed Kubernetes (K8s) service using Helm charts.


Pre-requisites:

i) A K8s cluster with at least 2 nodes (v1.14+)
In this assignment I have used a 3-node Kubernetes cluster on DigitalOcean.





ii) Helm v3+

In this assignment I have used Helm 3.7


Setting up and Configuring ElasticSearch.


To set up ElasticSearch, we will use the Helm package manager. I have used the official ElasticSearch Helm chart with some modifications to the config.



Step 1: Get the charts on your machine and create a namespace
Clone the charts from GitHub, move into the chart directory, and create the es namespace:

$> git clone https://github.com/engineerakki/picnic-elasticsearch-test.git

$> cd elasticsearch

$> kubectl create ns es





Step 2: Install the Helm Chart


Explanation and rationale behind the config options can be found later in the Questions and Answers section.
Once you have verified the values in the values.yaml file, you can install the chart with the following command:

$> helm install demo -n es .



This command installs the chart as a release named demo and brings up the ElasticSearch StatefulSet in our K8s cluster.
It takes about 2-3 minutes for the ElasticSearch cluster to come up.



Step 3: Test the Helm Chart.


We should now test the installed Helm chart as follows:

$> helm test -n es demo


Once helm test reports the Succeeded status, our ES cluster is ready to use.



Step 4: Cluster is Ready


As we have set the service type to LoadBalancer, we get a public IP for our ElasticSearch service (SVC).



We can check the status of the ElasticSearch cluster by sending a GET request to the /_cluster/health endpoint.
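
A minimal sketch of this check from the command line. It assumes that ES_HOST holds the public IP of the service (a placeholder, not a value from this setup) and that ElasticSearch listens on its default HTTP port 9200. The helper name es_cluster_status is made up for illustration; it only pulls the status field out of the JSON response.

```shell
# Extract the "status" field (green, yellow or red) from a
# /_cluster/health JSON response read on stdin.
es_cluster_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Example against a live cluster (ES_HOST is a placeholder):
#   curl -s "http://$ES_HOST:9200/_cluster/health" | es_cluster_status
```

A green status means all primary and replica shards are allocated; yellow means all primaries are allocated but at least one replica is not.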


We can also check the resources created by this installation:
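
One way to list these resources, sketched as a small helper. The function name list_es_resources and the KUBECTL override are my own additions for illustration; the resource types are the ones I would expect a StatefulSet-based chart with persistence to create.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# command without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# List the StatefulSet, pods, services and PVCs in the given namespace
# (defaults to the es namespace created in Step 1).
list_es_resources() {
  "$KUBECTL" get statefulset,pods,svc,pvc -n "${1:-es}"
}
```

Running list_es_resources es against the cluster prints one table per resource type.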












Questions and Answers.


A) Why deploy the cluster in a specific way, what is your rationale? Be prepared to justify and explain deployment configuration options.



There are multiple ways to install an ES cluster.
I chose Helm charts because:
i)   They are easy to use and maintain in the long run.
ii)  Rollbacks and upgrades are easy.
iii) Manifests and releases are version controlled.
iv)  Charts can be maintained in separate SCM repositories for easy CI/CD.
v)   Helm is a very common way to install K8s-based apps, with very good documentation and community support.


I also considered ECK, Elastic Cloud on Kubernetes (https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-installing-eck.html), to install ElasticSearch.



Rationale Behind important configurations:

I have changed the following options in the Helm chart's values.yaml file:


i) imageTag: "7.14.1"

I decided to use version "7.14.1" as it was the latest stable version of the ElasticSearch Docker image at the time of writing.



ii)  esJavaOpts: "-Xmx1g -Xms1g"

This option sets the JVM heap size for ElasticSearch.
-Xms sets the initial heap size and -Xmx the maximum. ElasticSearch can use up to this much heap, and the JVM's garbage collector has to free memory to stay within it. Setting both to the same value avoids heap resizing at runtime.
I have set it to 1g in my cluster to cut costs, but for a prod environment we should use bigger nodes with a bigger heap.
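
As a rough sizing aid, the common guidance is to give the heap no more than about half of the memory available to the node or container. The sketch below applies that rule to a memory limit given in MiB; the helper name es_java_opts and the 2048 MiB input are illustrative assumptions, not values taken from this setup.

```shell
# Print JVM options that give the heap half of the container memory
# limit (in MiB), with initial and maximum heap set to the same value.
es_java_opts() {
  heap_mb=$(( $1 / 2 ))
  echo "-Xmx${heap_mb}m -Xms${heap_mb}m"
}

es_java_opts 2048   # prints: -Xmx1024m -Xms1024m
```

With a 2Gi memory limit this gives a 1g heap, equivalent to the esJavaOpts value shown above.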

iii) resources: requests and limits


Setting requests and limits on CPU and memory ensures that our ElasticSearch installation cannot consume an unbounded amount of resources, which leaves room for other applications on the same Kubernetes node.
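
A sketch of what such a setting can look like when kept in a separate override file. The file name resources-override.yaml and all the numbers are illustrative assumptions, not the values used in this assignment; the resources.requests and resources.limits keys follow the layout of the chart's values.yaml.

```shell
# Write an illustrative override file for the chart's resources block.
cat > resources-override.yaml <<'EOF'
resources:
  requests:
    cpu: "500m"      # what the scheduler reserves for each pod
    memory: "2Gi"
  limits:
    cpu: "1000m"     # hard ceiling for each pod
    memory: "2Gi"
EOF
```

The file can then be passed to Helm on top of values.yaml:

$> helm upgrade demo -n es . -f resources-override.yaml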



iv) service.type: LoadBalancer

I wanted this ElasticSearch cluster to be accessible over the internet, so I set service.type to LoadBalancer. If we only need to access ElasticSearch from inside our K8s cluster, we can set it to ClusterIP.



v) readinessProbe

I had to troubleshoot my ElasticSearch deployment by adjusting initialDelaySeconds and timeoutSeconds in my values.yaml file. This was needed because the readiness probe was probably running before ElasticSearch had finished its init steps and startup.
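
For illustration, a probe override could look like the sketch below. The file name probe-override.yaml and both numbers are assumptions, not the values used in this assignment. Helm merges nested maps, so the other readinessProbe keys keep the values from values.yaml.

```shell
# Write an illustrative override that relaxes the readiness probe timing.
cat > probe-override.yaml <<'EOF'
readinessProbe:
  initialDelaySeconds: 60   # wait longer before the first probe
  timeoutSeconds: 10        # allow a slow health response
EOF
```

It is applied the same way as any other values file:

$> helm upgrade demo -n es . -f probe-override.yaml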



vi) clusterHealthCheckParams: "wait_for_status=yellow&timeout=2s"


While configuring ES, I ran into a fairly common issue of pods not reaching the ready state (https://github.com/elastic/helm-charts/issues/783), and changing clusterHealthCheckParams helped.
I think the chart's default check, which waits for green status, is too restrictive, and ES should just add a simple HTTP check as a readiness probe.
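
To see what these parameters do, the same query can be sent by hand. As far as I understand the chart, its readiness check appends clusterHealthCheckParams to the /_cluster/health URL. The helper below is a hypothetical sketch that only builds that URL; port 9200 is the default HTTP port.

```shell
# Build the cluster health URL for a host and a set of
# health check parameters.
health_check_url() {
  echo "http://$1:9200/_cluster/health?$2"
}

# Example from inside a pod, where ElasticSearch answers on localhost:
#   curl -s "$(health_check_url localhost 'wait_for_status=yellow&timeout=2s')"
```

The request waits up to 2 seconds for the cluster to reach at least yellow status. If it does not, the response reports timed_out as true.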


vii) persistence.enabled


The persistence property in values.yaml enables the creation of PVCs that are attached to the ElasticSearch pods. This is a very important setting, as the attached PVCs allow ElasticSearch to retain data even after a pod restarts or is rescheduled.




viii) replicas and minimumMasterNodes:


With 3 replicas of our ES pod, requests are spread across the pods instead of being clogged on a single one.
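
For minimumMasterNodes, the usual rule is a quorum of the master-eligible nodes, that is (nodes / 2) + 1 with integer division. The helper name master_quorum below is made up for illustration. Note that ElasticSearch 7.x manages the voting configuration itself and ignores the old minimum master nodes setting, so the value matters mainly for 6.x clusters.

```shell
# Quorum of master-eligible nodes: integer division by two, plus one.
master_quorum() {
  echo $(( $1 / 2 + 1 ))
}

master_quorum 3   # prints: 2
```

With 3 replicas the quorum is 2, so the cluster can lose one master-eligible pod and still elect a master.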


ix) antiAffinity: "hard"

Hard anti-affinity means that pods will only be scheduled if there are enough nodes for them, and that two ElasticSearch pods will never end up on the same node.
In this assignment setup, with 3 replicas on a 3-node cluster, each ElasticSearch pod runs on its own node.
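
A quick way to verify the placement is to count pods per node. The helper pods_per_node is a hypothetical sketch; it expects lines of the form "pod node" on stdin, which kubectl can produce with custom columns.

```shell
# Read "pod node" pairs from stdin and print how many pods run on
# each node, sorted by node name.
pods_per_node() {
  awk '{ count[$2]++ } END { for (n in count) print n, count[n] }' | sort
}

# Example against a live cluster:
#   kubectl get pods -n es --no-headers \
#     -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName | pods_per_node
```

With hard anti-affinity in effect, every node in the output should have a count of 1.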



B) Which metrics to look out for and why?

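  ii)  Check thread pools (GET /_cat/thread_pool): growing queues or rejected tasks show that the nodes cannot keep up with the load.
  iii) Check that the index buffer size is sufficient for the indexing load.
  iv)  Check heap usage: a heap that stays close to its limit leads to frequent garbage collection.
  v)   Check whether any other process, such as a backup, is running in parallel.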
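
As an example for the thread pool check, the sketch below filters the output of the _cat/thread_pool API down to the pools that have rejected work. The helper name rejected_pools is made up for illustration, and ES_HOST is a placeholder for the service IP.

```shell
# Read _cat/thread_pool rows (node_name name active queue rejected)
# from stdin and print node, pool and count for every pool that has
# rejected at least one task.
rejected_pools() {
  awk '$5 > 0 { print $1, $2, $5 }'
}

# Example against a live cluster:
#   curl -s "http://$ES_HOST:9200/_cat/thread_pool?h=node_name,name,active,queue,rejected" | rejected_pools
```

An empty result means no pool has rejected tasks. Heap usage per node can be read in a similar way from GET /_cat/nodes?v&h=name,heap.percent.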



C) What would the deploy, debug, backup, update/roll-back, maintenance processes be?


Deploy: Using Helm.
Debug:
  i)   Check Pod logs using kubectl.
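
A sketch of the kubectl calls this involves. The function names and the KUBECTL override are my own additions for illustration; the pod name is passed in by the caller, since it depends on the release.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Show the last 100 log lines of a pod (namespace defaults to es).
es_pod_logs() {
  "$KUBECTL" logs "$1" -n "${2:-es}" --tail=100
}

# Show the pod description, including scheduling and probe events.
es_pod_events() {
  "$KUBECTL" describe pod "$1" -n "${2:-es}"
}
```

The describe output is useful when a pod never becomes ready, because failed readiness probes and scheduling problems show up in its Events section.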
 

Backup: We can use a tool like Curator or Argo Workflows to create a backup workflow/cronjob that uses the Snapshot API (https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html).
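
A minimal sketch of the call such a job would make. It assumes a snapshot repository has already been registered with PUT /_snapshot/<repository>; the repository name my_backup, the snapshot naming scheme and the helper snapshot_url are all hypothetical.

```shell
# Build the Snapshot API URL for creating a snapshot named after a
# date in the given repository.
snapshot_url() {
  echo "http://$1:9200/_snapshot/$2/snapshot-$3?wait_for_completion=true"
}

# Example against a live cluster (ES_HOST is a placeholder):
#   curl -s -X PUT "$(snapshot_url "$ES_HOST" my_backup "$(date +%Y.%m.%d)")"
```

With wait_for_completion=true the request only returns once the snapshot has finished, which makes success or failure easy to detect in a cronjob.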


Update/Rollback/Maintenance:
I can use helm upgrade and helm rollback to handle these tasks.
The usual task is upgrading to a new release, which is easily done with helm upgrade. If the upgrade misbehaves, helm rollback returns the release to an earlier revision.
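
The commands involved can be sketched as follows, using the release name demo and the namespace es from the steps above. The function names and the HELM override are illustrative additions.

```shell
# HELM can be overridden (for example HELM=echo) to preview the
# commands without touching a cluster.
HELM="${HELM:-helm}"

# Apply the chart in the current directory with updated values.
upgrade_release() {
  "$HELM" upgrade demo . -n es -f values.yaml
}

# List the revisions of the release.
release_history() {
  "$HELM" history demo -n es
}

# Roll the release back to the given revision number.
rollback_release() {
  "$HELM" rollback demo "$1" -n es
}
```

A typical sequence is upgrade_release, then release_history to find the last good revision, then rollback_release with that revision number if something went wrong.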
