Automate alarm creation for Karpenter EC2 Nodes
- Paulo Srulevitch
- Aug 22
- 3 min read
Updated: Sep 1

Monitoring Karpenter-managed nodes can be tricky since they don't use auto-scaling groups. Our solution automates the creation and deletion of CloudWatch alarms for these individual EC2 instances using a smart, adaptable Terraform module. This ensures you have proactive insights and seamless integration with your EKS cluster, reducing manual effort and boosting reliability.
Introduction
Karpenter automatically launches just the right compute resources to handle your cluster's applications. It is designed to let you take full advantage of the cloud with fast and simple compute provisioning for Kubernetes clusters.
On EKS, Karpenter is a powerful replacement for the Cluster Autoscaler: it scales faster, offers more flexible configuration, and can consolidate nodes based on pod demand.
However, monitoring these nodes is a challenge. Karpenter launches nodes on its own, outside of any auto-scaling group or node group, and manages EC2 instances individually. That makes traditional, group-based monitoring and automation harder to apply.
Nonetheless, there is a way we can automate the process of alarm creation dynamically.
Let’s go through it.
Solution
This module integrates AWS EKS Karpenter autoscaling with automatic alarms, streamlining Kubernetes node provisioning and alerting on scaling events.
Use Cases & Value
Automatically scale AWS EC2 nodes based on workload demand.
Monitor Karpenter activity and set up CloudWatch alarms for scale-outs/ins.
Ideal for cost efficiency, reliability, and enhanced observability.
Prerequisites
Terraform ≥ 1.3
AWS Provider ≥ v5.x (note that the provider example below pins ≥ 6.3)
Existing EKS cluster with Karpenter installed or managed via Helm/terraform-aws-modules.
How It Works
Internals:
Attaches CloudWatch alarms to Karpenter EC2 nodes.
Triggers SNS or other notification channels on scaling events through EventBridge rules.
Manages the creation and deletion of alarms based on EC2 nodes' lifecycle.
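The lifecycle handling above can be sketched as an EventBridge rule that forwards EC2 instance state-change events to the alarm-managing Lambda. The resource names below are illustrative assumptions, not the module's actual internals:

```hcl
# Illustrative sketch: an EventBridge rule that invokes a Lambda whenever an
# EC2 instance starts or terminates, so the Lambda can create or delete the
# matching CloudWatch alarm. Resource names here are hypothetical.
resource "aws_cloudwatch_event_rule" "ec2_state_change" {
  name = "karpenter-node-state-change"
  event_pattern = jsonencode({
    source      = ["aws.ec2"]
    detail-type = ["EC2 Instance State-change Notification"]
    detail = {
      state = ["running", "terminated"]
    }
  })
}

resource "aws_cloudwatch_event_target" "alarm_manager" {
  rule = aws_cloudwatch_event_rule.ec2_state_change.name
  arn  = aws_lambda_function.alarm_manager.arn
}
```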
Benefits:
Proactive insight into node scale events
Seamless integration with existing EKS + Karpenter setup
Reduces manual alert creation and maintenance effort
Example usage with Terraform
To use this solution, you can apply all this configuration using the following example code:
module "karpenter_nodes_cpu_alarms" {
  source                = "git@github.com:teracloud-io/terraform_modules.git//eks/karpenter_node_auto_alarms"
  lambda_prefix         = "karpenter_nodes"
  provisioner_tag_key   = "karpenter.k8s.aws/ec2nodeclass"
  provisioner_tag_value = "apps"
  subscription_email    = "myemail@mycompany.com"
  alarm_period          = 900
  alarm_threshold       = 80
  alarm_description     = "This metric monitors Karpenter's EKS EC2 CPU utilization"
  alarm_metric_name     = "CPUUtilization"
}

Please adjust values as needed, such as subscription_email and provisioner_tag_key/provisioner_tag_value.
Keep in mind that the example provided tracks the “CPUUtilization” metric on EC2 nodes, but you can change this value to any metric available for an EC2 instance, such as “NetworkIn”. You can check the available metrics in the official documentation.
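For instance, swapping the metric only requires changing a couple of inputs. The values below are a hypothetical variation of the example above, not tested defaults:

```hcl
# Hypothetical variation: track inbound network traffic instead of CPU.
# NetworkIn is reported in bytes per period, so the threshold must be
# adjusted accordingly (the value below is an illustrative guess).
alarm_metric_name = "NetworkIn"
alarm_threshold   = 100000000  # ~100 MB per period
alarm_description = "This metric monitors Karpenter EKS EC2 network ingress"
```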
You also need to add the following snippet (for example, in a new provider.tf file) so the Terraform code applies correctly.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 6.3"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

Applying the previous code will generate resources similar to the ones in the following screenshots:


The SNS topic created will send an email to the defined account to confirm the subscription. Please make sure that you confirm the subscription for the email alerting to work properly.

Now, to test the deployment, launch a test EC2 instance and attach the selected tag key and value to it, making sure they match the ones defined in the Terraform code.
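If you prefer the CLI, a tagged test instance can be launched as follows. The AMI ID is a placeholder, and the tag key/value must match your provisioner_tag_key and provisioner_tag_value:

```shell
# Launch a test instance carrying the tag the module watches for.
# Replace the AMI ID and tag value with your own.
aws ec2 run-instances \
  --image-id ami-xxxxxxxxxxxxxxxxx \
  --instance-type t3.micro \
  --tag-specifications \
    'ResourceType=instance,Tags=[{Key=karpenter.k8s.aws/ec2nodeclass,Value=apps}]'
```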

You should now see a new CloudWatch alarm created, and receive an email alert for its OK status.

Check your inbox to confirm that you received the notification.

Finally, to test the deletion, manually terminate the EC2 test instance or node, and confirm that the CloudWatch alarm is deleted.
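From the CLI, the cleanup can be verified like this. The instance ID is a placeholder, and the alarm-name prefix assumes the alarms are named after the lambda_prefix from the example, which may differ in your setup:

```shell
# Terminate the test instance, then confirm its alarm disappears.
aws ec2 terminate-instances --instance-ids i-xxxxxxxxxxxxxxxxx
aws cloudwatch describe-alarms --alarm-name-prefix karpenter_nodes
```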

That’s it! You have now automated the creation and deletion of CloudWatch alarms for Karpenter EC2-managed nodes. Good luck!
FAQ
How is Karpenter different from the Cluster Autoscaler?
Karpenter is faster and more flexible because it launches nodes directly, while the Cluster Autoscaler typically works with auto-scaling groups and manages node groups.
Does this solution work for all EC2 instance types?
Yes, this solution works for any EC2 instance managed by Karpenter.
Is this an open-source solution?
Yes, our solution is open source and available on GitHub, benefiting from our collective expertise and diverse perspectives.
How does this help with monitoring unscheduled pods?
Karpenter automatically provisions nodes for unscheduled pods. By setting up these alarms, you get immediate visibility into the health of the newly provisioned nodes that are created to handle them.

Cloud Engineer
Joaquin San Román



