Creating an AWS Auto Scaling Architecture with a monitoring dashboard

Introduction

In today’s world, auto scaling services have become essential for optimising costs and keeping applications highly available. The infrastructure you build should handle whatever traffic comes its way. Under-provisioning leads to downtime or slowdowns during peak periods, when your application is under heavy load. Over-provisioning for periods of high load wastes money when demand is low. A balance between the two needs to be found, and auto scaling is a good solution to this problem.

Auto scaling helps ensure that you always have the correct number of servers to handle the load of your application. Collections of servers or instances are grouped into Auto Scaling Groups. AWS allows you to specify the minimum number of servers in each Auto Scaling Group and ensures that you don’t go below this minimum. You can also specify the maximum number of servers that can be added to an Auto Scaling Group, and AWS ensures that you don’t go over this maximum. Finally, you can define a desired capacity: the number of servers you want to run at a given time. AWS Auto Scaling dynamically adjusts the number of running instances based on predefined metrics such as CPU load, RAM usage, network traffic or other custom metrics you set up. When these metrics cross a threshold you specify, scaling actions are triggered and AWS either adds instances to the Auto Scaling Group or removes them from it.

For example, a scaling policy based on CPU usage could be something like: “If the CPU usage is above 80% for 5 minutes, add 1 new server”. AWS Auto Scaling would remove the additional server(s) once CPU usage went back down. The benefit of this is that your application remains responsive even under high load, you only pay for what you use, and idling costs are reduced. Auto scaling also prevents servers from crashing under load, because unpredictable traffic spikes are absorbed by adding capacity.

If you’re interested in learning about the different approaches to auto scaling, check out this article that introduces vertical, horizontal and dynamic scaling.

In this article I will show you how to scale EC2 Instances horizontally using AWS EC2 Auto Scaling. To create a dynamic and responsive auto scaling environment, we’ll use CloudFormation, CloudWatch and EC2 Auto Scaling. AWS CloudFormation is an Infrastructure as Code tool we’ll use to automate building the auto scaling infrastructure. CloudWatch is a monitoring service from AWS that will provide the eyes and ears for triggering scaling operations based on real-time performance.

Setting up Auto Scaling Architecture with CloudFormation

For the purpose of this post, we’ll create an auto scaling group with a minimum size of 1 EC2 instance and a maximum size of 3 EC2 instances, but you can use whatever values make sense for your application. We’ll use a Load Balancer to distribute traffic evenly across the instances and CloudWatch to create a monitoring dashboard showing CPU and network utilisation for the instances. We will use CloudFormation parameters (similar to variables in programming) to define environment-specific information like the VPC and the type of EC2 instances we’re interested in. Using parameters makes the template reusable.

CloudFormation Parameters

This tutorial assumes that you have a VPC with at least two public subnets in your AWS account. Let’s create the Auto Scaling Group using CloudFormation. First, we define seven parameters. We’ll reference these parameters later in the CloudFormation template.

AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  InstanceType:
    Description: EC2 instance type
    Type: String
    Default: t3.micro
    AllowedValues:
      - t3.micro
      - t3.small
      - t3.medium
  KeyName:
    Description: Name of an existing EC2 key pair to allow SSH access to the instances
    Type: "AWS::EC2::KeyPair::KeyName"
  LatestAmiId:
    Description: The latest Amazon Linux 2 AMI from the Parameter Store
    Type: "AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>"
    Default: "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2"
  OperatorEmail:
    Description: The email address to notify when there are any scaling activities
    Type: String
  SSHLocation:
    Description: The IP address range that can be used to SSH to the EC2 instances
    Type: String
    MinLength: 9
    MaxLength: 18
    Default: 0.0.0.0/0
    ConstraintDescription: must be a valid IP CIDR range of the form x.x.x.x/x
  Subnets:
    Description: At least two public subnets in different Availability Zones in the selected VPC
    Type: "List<AWS::EC2::Subnet::Id>"
  VPC:
    Type: "AWS::EC2::VPC::Id"
    Description: A virtual private cloud that enables resources in public subnets to connect to the internet

InstanceType limits the types of instances we can add to our auto scaling group to t3.micro, t3.small and t3.medium. This feature can be useful to keep costs low or to prevent users from spinning up expensive instances.

KeyName defines the name of an SSH Key to allow SSH access to the EC2 instances that will be created.

LatestAmiId specifies the ID of the AMI to launch. Before we can launch an EC2 instance, we need to specify which Amazon Machine Image (AMI) to use. An AMI is an image that provides the information required to launch an instance.

OperatorEmail: An email address to send scaling notifications to.

SSHLocation: The IP addresses that can SSH into the EC2 instances.

Subnets: This parameter will store the IDs of the subnets we want to deploy EC2 instances to.

VPC: The VPC parameter stores the ID of the VPC to create the Auto Scaling Group in.
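One small hardening note before moving on: as written, SSHLocation only constrains the length of the value, so any string of the right length would be accepted. If you want CloudFormation to also reject values that aren’t CIDR-shaped, you could add an AllowedPattern to the parameter — a sketch (the regex checks the x.x.x.x/x shape only, not that each octet is at most 255):

```yaml
  SSHLocation:
    Description: The IP address range that can be used to SSH to the EC2 instances
    Type: String
    MinLength: 9
    MaxLength: 18
    Default: 0.0.0.0/0
    # Reject strings that are not CIDR-shaped; octet ranges are not validated
    AllowedPattern: '(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})'
    ConstraintDescription: must be a valid IP CIDR range of the form x.x.x.x/x
```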

In the next section, we’ll create the resources necessary to complete our architecture.

CloudFormation Resources

In this section, we’ll define the resources we need to set up an auto scaling architecture: two Security Groups, a Target Group, a Load Balancer and its Listener, a Launch Template, an SNS Notification Topic, and an EC2 Auto Scaling Group.

Resources:
  ELBSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: ELB Security Group
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0

  EC2SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: EC2 Security Group
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          SourceSecurityGroupId:
            Fn::GetAtt:
              - ELBSecurityGroup
              - GroupId
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: !Ref SSHLocation

  EC2TargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      HealthCheckIntervalSeconds: 30
      HealthCheckProtocol: HTTP
      HealthCheckTimeoutSeconds: 15
      HealthyThresholdCount: 5
      Matcher:
        HttpCode: 200
      Name: EC2TargetGroup
      Port: 80
      Protocol: HTTP
      TargetGroupAttributes:
        - Key: deregistration_delay.timeout_seconds
          Value: "20"
      UnhealthyThresholdCount: 3
      VpcId: !Ref VPC

  ALBListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref EC2TargetGroup
      LoadBalancerArn: !Ref ApplicationLoadBalancer
      Port: 80
      Protocol: HTTP

  ApplicationLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Scheme: internet-facing
      Subnets: !Ref Subnets
      SecurityGroups:
        - !GetAtt ELBSecurityGroup.GroupId

  LaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateName: !Sub ${AWS::StackName}-launch-template
      LaunchTemplateData:
        ImageId: !Ref LatestAmiId
        InstanceType: !Ref InstanceType
        KeyName: !Ref KeyName
        SecurityGroupIds:
          - !Ref EC2SecurityGroup
        UserData:
          Fn::Base64: !Sub |
            #!/bin/bash
            yum update -y
            yum install -y docker
            systemctl enable --now docker
            docker pull your_dockerhub_username/image
            # run detached so the bootstrap script is not blocked by the container
            docker run -d --rm -p 80:5000 your_dockerhub_username/image

  NotificationTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
        - Endpoint: !Ref OperatorEmail
          Protocol: email

  WebServerGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      LaunchTemplate:
        LaunchTemplateId: !Ref LaunchTemplate
        Version: !GetAtt LaunchTemplate.LatestVersionNumber
      MaxSize: "3"
      MinSize: "1"
      DesiredCapacity: "1"
      NotificationConfigurations:
        - TopicARN: !Ref NotificationTopic
          NotificationTypes:
            [
              "autoscaling:EC2_INSTANCE_LAUNCH",
              "autoscaling:EC2_INSTANCE_LAUNCH_ERROR",
              "autoscaling:EC2_INSTANCE_TERMINATE",
              "autoscaling:EC2_INSTANCE_TERMINATE_ERROR",
            ]
      TargetGroupARNs:
        - !Ref EC2TargetGroup
      VPCZoneIdentifier: !Ref Subnets

Let’s go through the contents of the template above. We first define the ELBSecurityGroup and EC2SecurityGroup security groups to allow incoming HTTP traffic to the Load Balancer, and HTTP and SSH traffic to the EC2 instances.

Next, we define a Target Group. In this context, a Target Group is the collection of EC2 instances the load balancer directs traffic to. The EC2TargetGroup resource performs HTTP health checks on port 80 every 30 seconds; it expects a successful response with HTTP status code 200, considers a target healthy after 5 consecutive successful checks and unhealthy after 3 consecutive failures. The target group resides in the VPC referenced by the VPC parameter we defined earlier, and its 20-second deregistration delay gives targets that are being removed time to finish in-flight requests.

The next section defines a Listener for the Load Balancer. The ALBListener acts as the entry point for web traffic on port 80 and forwards it to the healthy instances in the EC2TargetGroup.

After defining the listener, we define the ApplicationLoadBalancer and attach the ELBSecurityGroup we defined earlier to it. Next, we define a Launch Template. A Launch Template is a blueprint that defines the configuration for creating and launching EC2 instances. It serves as a reusable template containing all the necessary parameters, allowing you to consistently launch instances with the same configuration across deployments. The Launch Template we define here launches the type of EC2 instance we chose in the parameters at the top, attaches the instance to the EC2SecurityGroup we defined earlier and then runs a script to install Docker, pull an image and run it. Here, you’ll want to replace your_dockerhub_username with your own Docker Hub username and image with the name of a Docker image you want to use.

Next, we define an SNS notification topic and add a subscription to it using the email address we defined in the parameters earlier. We will use this to send emails whenever a scaling event happens.

The last resource defines the Auto Scaling Group called WebServerGroup and configures it to use the Launch Template we defined earlier. We configure the auto scaling group to have a desired capacity of 1 EC2 instance, a minimum of 1 and a maximum of 3 instances. Next, we configure the auto scaling group to send messages whenever an instance is created or destroyed and whenever these operations fail.
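One thing worth noting: the template sets the group’s size limits but doesn’t define a scaling policy, so the group won’t add or remove instances on its own. If you want the CPU-driven behaviour described in the introduction, a target tracking policy can be added to the Resources section — a sketch (the 70% target is an assumption; tune it for your workload):

```yaml
  CPUScalingPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref WebServerGroup
      PolicyType: TargetTrackingScaling
      TargetTrackingConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ASGAverageCPUUtilization
        # Assumed target; Auto Scaling adds/removes instances to stay near it
        TargetValue: 70
```

With a target tracking policy, Auto Scaling creates the underlying CloudWatch alarms for you and keeps the group’s average CPU utilisation near the target.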

The entire template should now look like this:

AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  InstanceType:
    Description: EC2 instance type
    Type: String
    Default: t3.micro
    AllowedValues:
      - t3.micro
      - t3.small
      - t3.medium

  KeyName:
    Description: Name of an existing EC2 key pair to allow SSH access to the instances
    Type: "AWS::EC2::KeyPair::KeyName"
  LatestAmiId:
    Description: The latest Amazon Linux 2 AMI from the Parameter Store
    Type: "AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>"
    Default: "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2"
  OperatorEmail:
    Description: The email address to notify when there are any scaling activities
    Type: String
  SSHLocation:
    Description: The IP address range that can be used to SSH to the EC2 instances
    Type: String
    MinLength: 9
    MaxLength: 18
    Default: 0.0.0.0/0
    ConstraintDescription: must be a valid IP CIDR range of the form x.x.x.x/x
  Subnets:
    Description: At least two public subnets in different Availability Zones in the selected VPC
    Type: "List<AWS::EC2::Subnet::Id>"
  VPC:
    Type: "AWS::EC2::VPC::Id"
    Description: A virtual private cloud that enables resources in public subnets to connect to the internet

Resources:
  ELBSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: ELB Security Group
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0

  EC2SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: EC2 Security Group
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          SourceSecurityGroupId:
            Fn::GetAtt:
              - ELBSecurityGroup
              - GroupId
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: !Ref SSHLocation

  EC2TargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      HealthCheckIntervalSeconds: 30
      HealthCheckProtocol: HTTP
      HealthCheckTimeoutSeconds: 15
      HealthyThresholdCount: 5
      Matcher:
        HttpCode: 200
      Name: EC2TargetGroup
      Port: 80
      Protocol: HTTP
      TargetGroupAttributes:
        - Key: deregistration_delay.timeout_seconds
          Value: "20"
      UnhealthyThresholdCount: 3
      VpcId: !Ref VPC

  ALBListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref EC2TargetGroup
      LoadBalancerArn: !Ref ApplicationLoadBalancer
      Port: 80
      Protocol: HTTP

  ApplicationLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Scheme: internet-facing
      Subnets: !Ref Subnets
      SecurityGroups:
        - !GetAtt ELBSecurityGroup.GroupId

  LaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateName: !Sub ${AWS::StackName}-launch-template
      LaunchTemplateData:
        ImageId: !Ref LatestAmiId
        InstanceType: !Ref InstanceType
        KeyName: !Ref KeyName
        SecurityGroupIds:
          - !Ref EC2SecurityGroup
        UserData:
          Fn::Base64: !Sub |
            #!/bin/bash
            yum update -y
            yum install -y docker
            systemctl enable --now docker
            docker pull your_dockerhub_username/image
            # run detached so the bootstrap script is not blocked by the container
            docker run -d --rm -p 80:5000 your_dockerhub_username/image

  NotificationTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
        - Endpoint: !Ref OperatorEmail
          Protocol: email

  WebServerGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      LaunchTemplate:
        LaunchTemplateId: !Ref LaunchTemplate
        Version: !GetAtt LaunchTemplate.LatestVersionNumber
      MaxSize: "3"
      MinSize: "1"
      DesiredCapacity: "1"
      NotificationConfigurations:
        - TopicARN: !Ref NotificationTopic
          NotificationTypes:
            [
              "autoscaling:EC2_INSTANCE_LAUNCH",
              "autoscaling:EC2_INSTANCE_LAUNCH_ERROR",
              "autoscaling:EC2_INSTANCE_TERMINATE",
              "autoscaling:EC2_INSTANCE_TERMINATE_ERROR",
            ]
      TargetGroupARNs:
        - !Ref EC2TargetGroup
      VPCZoneIdentifier: !Ref Subnets

Deploying this template in CloudFormation will create the resources defined in it. In the next section, we’ll implement monitoring with CloudWatch.

Implementing Monitoring with CloudWatch

Monitoring is an important aspect of deploying applications to the cloud: it allows you to track the usage, availability and performance of your cloud resources. AWS has a built-in monitoring tool called CloudWatch that can monitor and visualise metrics such as CPU utilisation, network traffic and other custom metrics you define.
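Dashboards aren’t the only way to use these metrics. For example, you could wire a CloudWatch alarm to the SNS topic from the template so the operator gets an email when CPU stays high — a sketch, reusing the NotificationTopic and WebServerGroup resources defined earlier (the 70% threshold and two 5-minute evaluation periods are assumptions):

```yaml
  HighCPUAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Email the operator when average CPU stays above 70%
      Namespace: AWS/EC2
      MetricName: CPUUtilization
      Dimensions:
        - Name: AutoScalingGroupName
          Value: !Ref WebServerGroup
      Statistic: Average
      Period: 300          # evaluate in 5-minute windows
      EvaluationPeriods: 2 # two consecutive breaches before alarming
      Threshold: 70
      ComparisonOperator: GreaterThanThreshold
      AlarmActions:
        - !Ref NotificationTopic
```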

CloudWatch Dashboard

CloudWatch Dashboards are customisable pages that allow you to view the state of different resources in one place. They allow you to visualise resource metrics using different widgets such as line graphs, gauges and charts. You can select the colour used for each metric and add custom annotations to the graphs. In this section, I’ll show you how to create a basic dashboard in the AWS Console that monitors CPU usage, the number of running instances and the number of HTTP requests handled by the load balancer.

Creating a dashboard in the console:

To create a dashboard, open the CloudWatch Console.

In the navigation pane, choose Dashboards and click Create Dashboard.

In the “Create New Dashboard” box, enter a name for the dashboard and select Create Dashboard.

In the next dialog box, you can choose widgets to add to the dashboard.

For this project, I will create a dashboard that uses gauges for the CPU and number of instances metrics and a line graph for the HTTP Requests.

CPU Utilisation

Add a gauge from the widget menu and, in the Source tab, configure it to use the Auto Scaling Group as its data source. In the metrics section, add the region and the name of the Auto Scaling Group you created with CloudFormation in the previous steps.

My full source configuration looks like this:

{
    "metrics": [
        [ "AWS/EC2", "CPUUtilization", "AutoScalingGroupName", "ELBStack-WebServerGroup-yKSOzDlcZMde", { "region": "eu-west-1" } ]
    ],
    "sparkline": true,
    "view": "gauge",
    "region": "eu-west-1",
    "period": 10,
    "stat": "Average",
    "stacked": true,
    "yAxis": {
        "left": {
            "min": 0,
            "max": 100
        }
    },
    "liveData": true,
    "singleValueFullPrecision": false,
    "annotations": {
        "horizontal": [
            {
                "color": "#ff7f0e",
                "label": "Max Utilisation",
                "value": 70,
                "fill": "below"
            },
            {
                "color": "#2ca02c",
                "label": "Normal Usage",
                "value": 30,
                "fill": "below"
            },
            {
                "color": "#d62728",
                "label": "High CPU Usage",
                "value": 70,
                "fill": "above"
            }
        ]
    },
    "title": "Average CPU Utilisation",
    "start": "-PT5M",
    "end": "P0D"
}

This creates a gauge that displays the average CPU utilisation of the auto scaling group. If the CPU usage is between 0% and 30%, the gauge will be in the green range. If the usage is between 30% and 70%, it will be in the orange range, and if the usage is over 70%, it will be in the red range. Here’s an example of how the CPU gauge should look when it’s in the red:

Running Instances Gauge

The process for creating a gauge that shows the number of running instances is similar to the one above. Add a gauge widget, select the GroupInServiceInstances metric as its source and use the auto scaling group you created earlier. This metric reports the number of running instances that are part of the auto scaling group.

You can adjust the gauge’s minimum and maximum values by adding min and max to the yAxis property. This gauge should look something like this when working correctly:
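For reference, a source configuration for this gauge might look like the following. The group name and region are placeholders from my stack — substitute your own — and the max of 3 is chosen to match the group’s MaxSize:

```json
{
    "metrics": [
        [ "AWS/AutoScaling", "GroupInServiceInstances", "AutoScalingGroupName", "ELBStack-WebServerGroup-yKSOzDlcZMde", { "region": "eu-west-1" } ]
    ],
    "view": "gauge",
    "region": "eu-west-1",
    "stat": "Average",
    "period": 60,
    "yAxis": {
        "left": {
            "min": 0,
            "max": 3
        }
    },
    "title": "Running Instances"
}
```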

HTTP Request Count Graph

The last widget we’ll add shows the number of HTTP requests the load balancer has handled in a given time period. Measuring the number of HTTP requests is useful for tracking traffic spikes, analysing traffic volume over time and detecting anomalies. For this widget, use a line graph and select the RequestCount metric from the Load Balancer:
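A source configuration for this widget might look like the sketch below. The LoadBalancer dimension value comes from your load balancer’s ARN suffix (the app/&lt;name&gt;/&lt;id&gt; portion), so the value here is a placeholder; RequestCount is summed rather than averaged because it counts events:

```json
{
    "metrics": [
        [ "AWS/ApplicationELB", "RequestCount", "LoadBalancer", "app/my-load-balancer/0123456789abcdef", { "region": "eu-west-1" } ]
    ],
    "view": "timeSeries",
    "stacked": false,
    "region": "eu-west-1",
    "stat": "Sum",
    "period": 60,
    "title": "Request Count"
}
```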

The graph should look something like this once the load balancer starts receiving traffic:

The Complete Dashboard

Once you have done all the steps above, you’ll have a single dashboard that shows CPU Usage, the number of running instances and the number of HTTP requests the load balancer has handled in one view.
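If you’d rather keep the dashboard in code alongside the rest of the stack, CloudWatch dashboards can also be defined in CloudFormation with an AWS::CloudWatch::Dashboard resource, whose DashboardBody is the same JSON you would build in the console. A minimal sketch with just the CPU gauge (the dashboard name and widget geometry are arbitrary choices):

```yaml
  MonitoringDashboard:
    Type: AWS::CloudWatch::Dashboard
    Properties:
      DashboardName: !Sub ${AWS::StackName}-dashboard
      DashboardBody: !Sub |
        {
          "widgets": [
            {
              "type": "metric",
              "x": 0, "y": 0, "width": 8, "height": 6,
              "properties": {
                "metrics": [
                  [ "AWS/EC2", "CPUUtilization", "AutoScalingGroupName", "${WebServerGroup}" ]
                ],
                "view": "gauge",
                "region": "${AWS::Region}",
                "stat": "Average",
                "period": 60,
                "yAxis": { "left": { "min": 0, "max": 100 } },
                "title": "Average CPU Utilisation"
              }
            }
          ]
        }
```

Because the body is built with !Sub, the auto scaling group name and region are filled in automatically when the stack is created.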

Conclusion

This article covered why auto scaling is important for keeping your workloads and applications online and cost efficient. It also covered how to automate the creation of an auto scaling architecture using CloudFormation, how to create reusable templates using CloudFormation Parameters and, lastly, how to add monitoring dashboards using CloudWatch. I hope you found it useful.