Autoscaling using Horizontal Pod Autoscaler and Cluster Autoscaler

Estimated Duration: 60 minutes

NOTE: You need to fulfill these requirements and have completed the AKS Basic Cluster exercise before starting this one.

For autoscaling you are given the following requirements:

  • Cluster autoscaler should be enabled on both the System and User mode nodepools
    • User pool should have a min count of 1 and a max count of 3
  • Since we are rapid-testing, we want the cluster autoscaler to:
    • Scan every 30 seconds for scale-up opportunities
    • Scale down when a node is not needed for 1 minute
    • Allow scale-down 1 minute after a scale-up

You are asked to complete the following tasks:

  1. Configure autoscaling for user nodepool based on the requirements above
  2. Validate pod autoscaling is working
  3. Validate cluster autoscaling is working

Configure Autoscaling on Nodepools

In prior steps we created a basic AKS cluster. You need to enable the cluster autoscaler. One of the requirements is to adjust the autoscaler settings, which we can do using the AKS Cluster Autoscale Profile.

Set the Environment Variables used in the lookups

INITIALS=abc
RG=aks-$INITIALS-rg
CLUSTER_NAME=aks-$INITIALS

First, get the nodepool name (this assumes the cluster currently has a single nodepool):

NODEPOOL_NAME=$(az aks nodepool list -g $RG --cluster-name $CLUSTER_NAME --query '[].name' -o tsv)

Based on the requirements, enable the cluster autoscaler on the node pool with a min count of 1 and a max count of 3:

az aks nodepool update --resource-group $RG --cluster-name $CLUSTER_NAME \
--name $NODEPOOL_NAME --enable-cluster-autoscaler --min-count 1 --max-count 3
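The requirements also call for the autoscaler on the System pool. If your cluster has separate System and User mode nodepools, one way to enable it on both is to loop over them. This is a sketch; it applies the same min/max to every pool, which matches the User pool requirement above but may need adjusting per pool:

```shell
# Sketch: enable the cluster autoscaler on every nodepool in the cluster,
# applying min 1 / max 3 to each (adjust per-pool if needed).
for pool in $(az aks nodepool list -g $RG --cluster-name $CLUSTER_NAME \
    --query '[].name' -o tsv); do
  az aks nodepool update --resource-group $RG --cluster-name $CLUSTER_NAME \
    --name $pool --enable-cluster-autoscaler --min-count 1 --max-count 3
done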

You can confirm the config was applied with the following command:

az aks nodepool show -g $RG --cluster-name $CLUSTER_NAME -n $NODEPOOL_NAME -o yaml \
| grep 'enableAutoScaling\|minCount\|maxCount'

enableAutoScaling: true
maxCount: 3
minCount: 1

Update Cluster Autoscale Profile

Update the cluster with the new autoscaler profile settings. This command updates the autoscaler profile to scan for scaling opportunities every 30 seconds, wait for 1 minute before scaling down unneeded nodes, and delay scale-down actions for 1 minute after adding new nodes.

az aks update -g $RG -n $CLUSTER_NAME \
--cluster-autoscaler-profile "scan-interval=30s,scale-down-unneeded-time=1m,scale-down-delay-after-add=1m"

You can verify your updates with the following command:

az aks show -g $RG -n $CLUSTER_NAME -o yaml --query autoScalerProfile

Deploy Horizontal Pod Autoscaler (HPA)

To test the Cluster Autoscaler, we first need an HPA. That requires a deployment and a service with resource requests and limits set, so that we can push the app pods beyond those limits. You can use any image for this, as long as you understand its resource usage characteristics well enough to set the HPA configuration appropriately.

  1. Confirm the helloworld deployment is running:

    kubectl get all -n helloworld
  2. In case it is not running, run the following commands to create a namespace and deploy the hello world application:

    kubectl create namespace helloworld
    kubectl apply -f manifests/aks-helloworld-basic.yaml -n helloworld
  3. Run the following command to verify the deployment and service have been created. Re-run the command until the pod shows a STATUS of Running and the service shows an EXTERNAL-IP value.

    kubectl get all -n helloworld
  4. Copy the external IP value or use this command to save it into a variable:

    EXTERNAL_IP=$(kubectl get svc aks-helloworld -n helloworld -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  5. Run a curl command to confirm the service is reachable at that address:

    curl -L http://$EXTERNAL_IP
  6. Deploy the Horizontal Pod Autoscaler

    kubectl apply -f manifests/hpa.yaml -n helloworld
  7. In the current terminal watch the status of the HPA and the pod count:

    kubectl get hpa,pods,nodes -n helloworld
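If manifests/hpa.yaml isn't available, a minimal HPA with the behavior used in this exercise might look like the sketch below. The 50% CPU target and 1–5 replica range are inferred from the sample HPA output shown later; adjust them to your image's characteristics:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aks-helloworld
  namespace: helloworld
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aks-helloworld
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```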

Test the Cluster Autoscaler

The Cluster Autoscaler looks for pods that are stuck in a "Pending" state because there aren't enough nodes to satisfy their requested resources. To test it, we can adjust the CPU request size and the max replicas in the HPA. Let's set the request and limit to 800m (0.8 cores). Under load, this should cause the HPA to create up to 5 pods, so we'll quickly spill over the current single node.

Run the command below to set the request and limits for the deployment to 800 millicores (0.8 cores):

kubectl patch deployment aks-helloworld -n helloworld \
-p '{"spec": {"template": {"spec": {"containers": [{"name": "aks-helloworld", "resources": {"requests": {"cpu": "800m"}, "limits": {"cpu": "800m"}}}]}}}}'

Check the number of nodes:

kubectl get nodes

If there are already 3 nodes running, one way to scale down temporarily is to run this command (note that it may take a few minutes for the node count to reduce):

az aks nodepool update --resource-group $RG --cluster-name $CLUSTER_NAME \
--name $NODEPOOL_NAME --update-cluster-autoscaler --min-count 1 --max-count 1

Once you confirm there is only one node ready, then run this command to revert the autoscaler settings to a max of 3 nodes:

az aks nodepool update --resource-group $RG --cluster-name $CLUSTER_NAME \
--name $NODEPOOL_NAME --update-cluster-autoscaler --min-count 1 --max-count 3

Confirm the settings have been reverted:

az aks nodepool show -g $RG --cluster-name $CLUSTER_NAME -n $NODEPOOL_NAME -o yaml | grep 'enableAutoScaling\|minCount\|maxCount'
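Rather than re-running `kubectl get nodes` by hand while waiting, a small helper can poll until the node count settles. This is a sketch; it assumes kubectl is pointed at this cluster:

```shell
# Sketch: poll until the cluster reports exactly the expected number of
# Ready nodes, then print the final count.
wait_for_node_count() {
  target=$1
  count=$(kubectl get nodes --no-headers | grep -c ' Ready ')
  while [ "$count" -ne "$target" ]; do
    echo "Ready nodes: $count (waiting for $target)"
    sleep 15
    count=$(kubectl get nodes --no-headers | grep -c ' Ready ')
  done
  echo "Ready nodes: $count"
}
```

For this step you would call `wait_for_node_count 1` and continue once it returns.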

On a separate terminal, run and shell into an Ubuntu pod:

kubectl run -it --rm ubuntu --image=ubuntu -- bash

Run the following in the pod to install ApacheBench:

apt update && apt install -y apache2-utils 

Run a load against the external IP of your "helloworld" service (keep the trailing / in the url below):

EXTERNAL_IP=<EXTERNAL_IP>
ab -t 240 -c 50 -n 200000 http://$EXTERNAL_IP/

You should get output like the following:

Benchmarking 10.140.0.5 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Completed 80000 requests
Completed 90000 requests
Completed 100000 requests
Finished 100000 requests

Next, list the HPA, pods, and nodes:

kubectl get hpa,pods,nodes -n helloworld

You should see the targets in the HPA output increasing. Once CPU utilization goes over the 50% target, the replica count should increase.

NAME                                                 REFERENCE                   TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/aks-helloworld   Deployment/aks-helloworld   0%/50%, 29%/50%   1         5         4          44h

NAME                                  READY   STATUS    RESTARTS   AGE
pod/aks-helloworld-67f4dc9d98-wnvwr   1/1     Running   0          14m
pod/aks-helloworld-67f4dc9d98-zr2rs   1/1     Running   0          13m
pod/aks-helloworld-bd8fc7bc6-gr8j4    0/1     Pending   0          6m21s
pod/aks-helloworld-bd8fc7bc6-zhvbk    1/1     Running   0          6m21s

Pods in the Pending state should eventually trigger the creation of new nodes. Check the status of a pending pod using:

kubectl get pod <pod-name> -o yaml -n helloworld

You should see a status with the error message below:

status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-11-22T18:20:44Z"
    message: '0/3 nodes are available: 1 node(s) had untolerated taint {ToBeDeletedByClusterAutoscaler:
      1732299539}, 2 Insufficient cpu. preemption: 0/3 nodes are available: 1 Preemption
      is not helpful for scheduling, 2 No preemption victims found for incoming pod.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending

Alternatively you may check the events in the namespace using:

kubectl events -n helloworld

And you should see a similar warning:

Warning   FailedScheduling               Pod/aks-helloworld-bd8fc7bc6-lvkrc       0/3 nodes are available: 1 node(s) had untolerated taint {ToBeDeletedByClusterAutoscaler: 1732299539}, 2 Insufficient cpu. preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod.

The pending pods should increase the provisioned nodes up to the maximum (3). If there are no new nodes allocated, check the node cpu requests and limits using:

kubectl describe nodes | grep -E -A 6 "Resource|Requests|Limits"

CPU limits may be over 100%, but the requests should be under 100% for all nodes:

  Resource           Requests     Limits
  --------           --------     ------
  cpu                1220m (64%)  1720m (90%)
  memory             540Mi (10%)  2786Mi (55%)
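To pull just the CPU request percentage for each node out of that output, a small awk filter can help. This is a sketch that assumes the `kubectl describe nodes` layout shown above:

```shell
# Sketch: print the CPU request percentage (the number in the first set of
# parentheses on each "cpu" line) from `kubectl describe nodes` output.
cpu_request_pct() {
  awk '/^ *cpu +[0-9]/ { gsub(/[()%]/, "", $3); print $3 }'
}
```

Usage: `kubectl describe nodes | cpu_request_pct`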

Depending on the node size, if the requests % is not high enough, you may need to increase the request/limit size of the deployment, e.g. to 1000m (1 core):

kubectl patch deployment aks-helloworld -n helloworld \
-p '{"spec": {"template": {"spec": {"containers": [{"name": "aks-helloworld", "resources": {"requests": {"cpu": "1000m"}, "limits": {"cpu": "1000m"}}}]}}}}'

If needed, rerun the load test in your ApacheBench terminal:

ab -t 240 -c 50 -n 200000 http://$EXTERNAL_IP/

Tuning

As you probably saw, CPU-based scaling can be a bit slow, and there are many factors you can tune, starting with the autoscaler profile settings adjusted earlier. To reduce the time it takes to add a node to the cluster, look at the scale-down mode, which lets you scale down by stopping rather than deleting nodes (deallocation mode). A new node then doesn't need to be provisioned from scratch; an existing deallocated node just needs to be started.
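For example, switching the user pool to deallocation mode looks like this (a sketch; not part of this exercise's required steps):

```shell
# Scale down by deallocating (stopping) nodes instead of deleting them, so
# a later scale-up can restart an existing node rather than provision a new one.
az aks nodepool update --resource-group $RG --cluster-name $CLUSTER_NAME \
  --name $NODEPOOL_NAME --scale-down-mode Deallocate
```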

Cleanup

Before moving on to the next lab, disable the cluster autoscaler:

az aks nodepool update --resource-group $RG --cluster-name $CLUSTER_NAME --name $NODEPOOL_NAME --disable-cluster-autoscaler