+++
title = "exploring and configuring scaling"
chapter = false
weight = 1
+++


:imagesdir: /images

== Scaling an OpenShift 4 Cluster

With OpenShift 4.0+, we now have the ability to dynamically scale the cluster size through OpenShift itself.

In this lab you will learn how to scale OpenShift 4 cluster both manually and automatically.
Applying autoscaling to an OpenShift Container Platform cluster involves deploying a ClusterAutoscaler and then deploying MachineAutoscalers for each Machine type in your cluster.

*Machine sets*
MachineSet resources are groups of machines. Machine sets are to machines as replica sets are to pods. If you need more machines or must scale them down, you change the replicas field on the machine set to meet your compute need.

The following custom resources add more capabilities to your cluster:

*Machine autoscaler*
The MachineAutoscaler resource automatically scales machines in a cloud. You can set the minimum and maximum scaling boundaries for nodes in a specified machine set, and the machine autoscaler maintains that range of nodes. The MachineAutoscaler object takes effect after a ClusterAutoscaler object exists. Both ClusterAutoscaler and MachineAutoscaler resources are made available by the ClusterAutoscalerOperator object.

*Cluster autoscaler*
This resource is based on the upstream cluster autoscaler project. In the OpenShift Container Platform implementation, it is integrated with the Machine API by extending the machine set API. You can set cluster-wide scaling limits for resources such as cores, nodes, memory, GPU, and so on. You can set the priority so that the cluster prioritizes pods so that new nodes are not brought online for less important pods. You can also set the scaling policy so that you can scale up nodes but not scale them down.

*Machine health check*
The MachineHealthCheck resource detects when a machine is unhealthy, deletes it, and, on supported platforms, makes a new machine.


=== Manual Cluster Scale Up/Down (Option 1)

Step 1:: Viewing components:

----
Login to the OpenShift web console

Select the admin console

Expand Compute

Select the *openshift-machine-api* project

Select nodes

Select machine sets

Select Machine autoscalers

----

You should see the list any existing scale compnents available for the platform:

image::ocp4-machinesets.png[image]


Step 2:: Create a Machine Set

Machine sets are single AZ constructs, for multi AZ deployments a machine set can be created in each AWS AZ.
Alternatively machine pools can be used.

----
Select Machine sets

Click on create machine set

Replace the YAML with the following

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  creationTimestamp: '2021-07-08T16:51:09Z'
  generation: 2
  name: worker
  namespace: openshift-machine-api
  resourceVersion: '550802'
  selfLink: >-
    /apis/machine.openshift.io/v1beta1/namespaces/openshift-machine-api/machinesets/worker
  uid: b1b84b30-e00c-11eb-97d5-0248f54f61c7
spec:
  replicas: 1
  selector:
    matchLabels:
      foo: bar
  template:
    metadata:
      labels:
        foo: bar
    spec:
      providerSpec:
        value:
          ami:
            id: ami-046fe691f52a953f9
          apiVersion: awsproviderconfig.openshift.io/v1beta1
          blockDevices:
            - ebs:
                iops: 0
                volumeSize: 120
                volumeType: gp2
          deviceIndex: 0
          instanceType: m4.large
          kind: AWSMachineProviderConfig
          placement:
            availabilityZone: us-wast-2a
            region: us-east-1

Click Save
----

This will create a machine set for a worker node in us-west-2

Step 3:: Create a machine autoscaler

----
Select machine auto scaler

Create machine autoscaler

Change the name to us-west-2-worker

Change the scale target ref name to worker 

click save
----

This creates a definition of minimum and maximun scaling of the machin set created in Step 2.

Step 4:: manual scaling

----
Select machine sets

Select a machine set

Click on actions 

Select edit count

Increase the count

Click save
----

image::ocp4-ms-count.png[image]

----
Click the Machines tab and see the allocated machines.

Click Overview tab, it will let you know when when the machines become ready.

Click Machine Sets under Machines on the left-hand side again, you will also see the status of the machines in the set:
----

image::ocp4-ms-count3.png[image]

*You may need to wait a few minutes before the machines become ready*

In the background additional EC2 instances are being provisioned and then registered and configured to participate in the cluster, so yours may still show 1/3.

You can also view this in the CLI:


=== Managing nodes via the CLI

Step 1:: list machine sets

----
oc get machinesets -n openshift-machine-api
----

This should return something like:
----
NAME                                   DESIRED   CURRENT   READY   AVAILABLE   AGE
cluster-3e5f-kqr2b-worker-us-east-2a   3         3         3       3           24h
cluster-3e5f-kqr2b-worker-us-east-2b   1         1         1       1           24h
cluster-3e5f-kqr2b-worker-us-east-2c   1         1         1       1           24h
----

Step 2:: list nodes

----
oc get nodes
----


Step 3:: scaling via CLI

You can alter the Machine Set count in several ways in the web UI console,
but you can also perform the same operation via the CLI by using the oc edit
command on the machineset in the openshift-machine-api project.

----
oc get machineset -n openshift-machine-api
----

----
NAME                                   DESIRED   CURRENT   READY   AVAILABLE   AGE
cluster-4c7b-lkw4d-worker-us-east-2a   1         1         1       1           6h17m
cluster-4c7b-lkw4d-worker-us-east-2b   1         1         1       1           6h17m
cluster-4c7b-lkw4d-worker-us-east-2c   1         1         1       1           6h17m
----

you can use oc edit machineset to open the YAML in vi and update it, you will need to collect the machine set name from oc get above.

----
oc edit machineset cluster-4c7b-lkw4d-worker-us-east-2a -n openshift-machine-api
----

- You will see the following text:

```
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  annotations:
    autoscaling.openshift.io/machineautoscaler: openshift-machine-api/autoscale-us-east-2a-ts7rr
    machine.openshift.io/cluster-api-autoscaler-node-group-max-size: "4"
    machine.openshift.io/cluster-api-autoscaler-node-group-min-size: "1"
  creationTimestamp: "2019-05-13T20:34:26Z"
  generation: 9
  labels:
    machine.openshift.io/cluster-api-cluster: cluster-3e5f-kqr2b
  name: cluster-3e5f-kqr2b-worker-us-east-2a
  namespace: openshift-machine-api
  resourceVersion: "446823"
  selfLink: /apis/machine.openshift.io/v1beta1/namespaces/openshift-machine-api/machinesets/cluster-3e5f-kqr2b-worker-us-east-2a
  uid: 80644a16-75be-11e9-bb7c-02f7ee4a116e
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: cluster-3e5f-kqr2b
      machine.openshift.io/cluster-api-machine-role: worker
      machine.openshift.io/cluster-api-machine-type: worker
      machine.openshift.io/cluster-api-machineset: cluster-3e5f-kqr2b-worker-us-east-2a
  template:
(...)
```


- The above is just an example of the entire machine set configuration. Update `replicas` from 1 to 2
- Save your changes simply save and quit from your editor.
OpenShift will now patch the configuration. You should see that your modified
machine set (depending on which one you edited) will be confirmed:

- Once that has been changed, you can view the outcome here:


----
oc get machinesets -n openshift-machine-api
----

```
NAME                                   DESIRED   CURRENT   READY   AVAILABLE   AGE
cluster-3e5f-kqr2b-worker-us-east-2a   3         3         3       3           24h
cluster-3e5f-kqr2b-worker-us-east-2b   1         1         1       1           24h
cluster-3e5f-kqr2b-worker-us-east-2c   1         1         1       1           24h
```

Again, before you move forward, return the `replica` count back to 1, using the same method as above.


=== Automatic Cluster Scale Up via CLI

You can edit exiting machine autoscalers in a similar way to how you modified machine sets above.


==== Define a MachineAutoScaler

- Configure a MachineAutoScaler - you'll need to fetch the following YAML file:

```
[~] $ wget https://raw.githubusercontent.com/RedHatWorkshops/openshiftv4-workshop/master/solutions/machine-autoscale-example.yaml

```
The file has the following contents:

```
kind: List
metadata: {}
apiVersion: v1
items:
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-<aws-region-az>-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: <clusterid>-worker-<aws-region-az>
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-<aws-region-az>-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: <clusterid>-worker-<aws-region-az>
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-<aws-region-az>-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: <clusterid>-worker-<aws-region-az>
```

- Check the MachineSets with the CLI, you noticed that they all had the format of:

```
<clusterid>-worker-<aws-region-az>

```

- MachineAutoscaler resources must be defined for each region-AZ that you want to
autoscale.
- Using the example output and MachineSets above, and selecting "us-east-2a"
as the region we're going to autoscale into, you would need to modify the YAML
file to look like the following:
- To ensure you make no mistakes, here is the command you can use to update the yaml

```
$ export CLUSTER_NAME=$(oc get machinesets -n openshift-machine-api | awk -F'-worker-' 'NR>1{print $1;exit;}')
$ export REGION_NAME=us-east-2a

$ sed -i s/\<aws-region-az\>/$REGION_NAME/g machine-autoscale-example.yaml
$ sed -i s/\<clusterid\>/$CLUSTER_NAME/g machine-autoscale-example.yaml
```

- Here is the working sample of an MachineAutoScaler:

```
[~] $ cat machine-autoscale-example.yaml
kind: List
metadata: {}
apiVersion: v1
items:
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-us-east-2a-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: cluster-4c7b-lkw4d-worker-us-east-2a
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-us-east-2a-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: cluster-4c7b-lkw4d-worker-us-east-2a
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-us-east-2a-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: cluster-4c7b-lkw4d-worker-us-east-2a
```

NOTE: If you aren't deployed into this region, or don't want to use us-east-2a, adapt the instructions to suit.

- **Make sure** that you properly modify both generateName and name.
  * Note which one has the <clusterid> and which one does not.
  * Note that generateName has a trailing hyphen.
  * You can specify the minimum and maximum quantity of nodes that are allowed
  to be created by adjusting the minReplicas and maxReplicas.

- You do not have to define a MachineAutoScaler for each MachineSet. But remember
that each MachineSet corresponds to an AWS region/AZ. So, without having multiple
MachineAutoScalers, you could end up with a cluster fully scaled out in a single
AZ. If that's what you're after, it's fine. However if AWS has a problem in that
AZ, you run the risk of losing a large portion of your cluster.

NOTE: You should probably choose a small-ish number for maxReplicas. The next lab
will autoscale the cluster up to that maximum. You're paying for the EC2 instances.


- Once the file has been modified appropriately, you can now create the autoscaler:

```
$ oc create -f machine-autoscale-example.yaml -n openshift-machine-api
```

==== Define a ClusterAutoscaler

- Define a ClusterAutoscaler, this configures some boundaries and behaviors for
how the cluster will autoscale. An example definition file can be found at:

```
https://raw.githubusercontent.com/RedHatWorkshops/openshiftv4-workshop/master/solutions/cluster-autoscaler.yaml
```

- This definition is set for a maximum of 20 workers, but we need to reduce that
with our labs to minimize the cost. Let's first download that file:

```
[~] $ wget https://raw.githubusercontent.com/RedHatWorkshops/openshiftv4-workshop/master/solutions/cluster-autoscaler.yaml

```

- Modify the max number of replicas:

```
$ sed -i s/20/10/g cluster-autoscaler.yaml
```

- Here is an example of ClusterAutoscaler yaml.

```
[~] $ cat machine-autoscale-example.yaml
apiVersion: "autoscaling.openshift.io/v1"
kind: "ClusterAutoscaler"
metadata:
  name: "default"
spec:
  resourceLimits:
    maxNodesTotal: 10
  scaleDown:
    enabled: true
    delayAfterAdd: 10s
    delayAfterDelete: 10s
    delayAfterFailure: 10s
```

- Create the ClusterAutoscaler with the following command:

```
$ oc create -f cluster-autoscaler.yaml
clusterautoscaler.autoscaling.openshift.io/default created
```

NOTE: The ClusterAutoscaler is not a namespaced resource -- it exists at the cluster scope.


==== Define a Job

The following example YAML file defines a Job:

https://raw.githubusercontent.com/openshift/training/master/assets/job-work-queue.yaml

It will produce a massive load that the cluster cannot handle, and will force the
autoscaler to take action (up to the maxReplicas defined in your ClusterAutoscaler YAML).

NOTE: If you did not scale down your machines earlier, you may have too much capacity to trigger an autoscaling event. Make sure you have no more than 3 total workers before continuing.

- Create a project to hold the resources for the Job, and switch into it:

```
$ oc adm new-project autoscale-example && oc project autoscale-example
Created project autoscale-example
Now using project "autoscale-example" on server "{{API_URL}}".
```

==== Open Grafana

- Go to OpenShift web console
- Click `Monitoring` --> `Dashboards`
- This will open a new browser tab for Grafana. You will also get a certificate
error similar to the first time you logged in.
- Click `Advance` when you see the SSL certificate error.
- Click `Process to ...` link
- Click `Log in with OpenShift`
- Click `htpasswd`
- Enter your provided admin username and password and click login
- CLick `Allow selected permissions`
- you will see the Grafana homepage.
Grafana is configured to use an OpenShift user and inherits permissions of that user for accessing cluster information.

- Click the dropdown on `Home` and choose `Kubernetes / Compute Resources / Cluster`. Leave this browser window open while you start the Job so that you can observe the CPU utilization of the cluster rise:

image::ocp4-grafana.png[image]

==== Force an Autoscaling Event

- Go to web terminal, create the Job:

```
$ oc create -n autoscale-example -f https://raw.githubusercontent.com/openshift/training/master/assets/job-work-queue.yaml
job.batch/work-queue-qncs2 created
```

- Check status of the Job. It will create a lot of Pods:

```
$ oc get pod -n autoscale-example
NAME                     READY     STATUS    RESTARTS   AGE
work-queue-qncs2-26x9c   0/1       Pending   0          33s
work-queue-qncs2-28h6r   0/1       Pending   0          33s
work-queue-qncs2-2tdz9   0/1       Pending   0          33s
work-queue-qncs2-526hl   0/1       Pending   0          33s
work-queue-qncs2-55nr7   0/1       Pending   0          33s
work-queue-qncs2-5d98k   0/1       Pending   0          33s
work-queue-qncs2-7pd5p   0/1       Pending   0          31s
work-queue-qncs2-8k76z   0/1       Pending   0          32s
(...)
```

- After a few moments, look at the list of Machines:

```
$ oc get machines -n openshift-machine-api
NAME                                          INSTANCE              STATE     TYPE        REGION      ZONE         AGE
beta-190305-1-79tf5-master-0                  i-080dea906d9750737   running   m4.xlarge   us-east-2   us-east-2a   26h
beta-190305-1-79tf5-master-1                  i-0bf5ad242be0e2ea1   running   m4.xlarge   us-east-2   us-east-2b   26h
beta-190305-1-79tf5-master-2                  i-00f13148743c13144   running   m4.xlarge   us-east-2   us-east-2c   26h
beta-190305-1-79tf5-worker-us-east-2a-8dvwq   i-06ea8662cf76c7591   running   m4.large    us-east-2   us-east-2a   2m7s  <--------
beta-190305-1-79tf5-worker-us-east-2a-9pzvg   i-0bf01b89256e7f39f   running   m4.large    us-east-2   us-east-2a   2m7s  <--------
beta-190305-1-79tf5-worker-us-east-2a-vvddp   i-0e649089d42751521   running   m4.large    us-east-2   us-east-2a   2m7s  <--------
beta-190305-1-79tf5-worker-us-east-2a-xx282   i-07b2111dff3c7bbdb   running   m4.large    us-east-2   us-east-2a   26h
beta-190305-1-79tf5-worker-us-east-2b-hjv9c   i-0562517168aadffe7   running   m4.large    us-east-2   us-east-2b   26h
beta-190305-1-79tf5-worker-us-east-2c-cdhth   i-09fbcd1c536f2a218   running   m4.large    us-east-2   us-east-2c   26h
```

- You should see a scaled-up cluster with three new additions as worker nodes in
us-east-2a, you can see the ones that have been auto-scaled from their age.

- Depending on when you run the command, your list may show all running workers,
or some pending. After the Job completes, which could take anywhere from a few
minutes to ten or more (depending on your ClusterAutoscaler size and your
MachineAutoScaler sizes), the cluster should scale down to the original count of
worker nodes. You can watch the output with the following (runs every 10s)-

```
$ watch -n10 'oc get machines -n openshift-machine-api'
```

- In Grafana, be sure to click the autoscale-example project in the graphs

image::granfana-autoscale.png[image]