Automatically Creating CloudWatch Alarms for Dynamic Resources

[Figure: CloudWatch alarms automatic creation pipeline diagram]

In fast-moving environments, especially SaaS platforms, infrastructure is often created dynamically. New S3 buckets and CloudFront distributions may be provisioned on demand, driven by customer onboarding flows, automation pipelines, or application logic.

The operational problem usually appears later:

“How do we make sure every newly created resource is properly monitored?”

Manually creating CloudWatch alarms does not scale when resources are created dynamically. Missing alarms mean blind spots, and blind spots eventually turn into incidents.

In this article, I’ll walk through a practical solution that automatically creates CloudWatch alarms whenever a new S3 bucket or CloudFront distribution is created, using CloudTrail, EventBridge, Lambda, and SNS.

While the example focuses on S3 and CloudFront, the same approach can be extended to other AWS resource types whose creation calls are recorded by CloudTrail, by changing the event filters and the alarm logic.

The Core Idea

The idea is simple:

  1. CloudTrail records API calls such as CreateBucket and CreateDistribution
  2. EventBridge listens for those events
  3. Lambda reacts to resource creation events
  4. CloudWatch alarms are created automatically for the new resource

From that point on, every resource is monitored consistently, without manual effort.
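
The steps above amount to a lookup from a CloudTrail (eventSource, eventName) pair to an action. A minimal sketch of that idea follows. The handler names (`on_bucket_created`, `on_distribution_created`) and the returned strings are placeholders for illustration, not part of the Lambda function shown later:

```python
from typing import Callable, Dict, Tuple


# Hypothetical handlers; a real function would call CloudWatch here.
def on_bucket_created(detail: dict) -> str:
    bucket = detail["requestParameters"]["bucketName"]
    return f"create alarms for bucket {bucket}"


def on_distribution_created(detail: dict) -> str:
    dist_id = detail["responseElements"]["distribution"]["id"]
    return f"create alarms for distribution {dist_id}"


# (eventSource, eventName) -> handler. Supporting another resource type
# means adding an entry here and widening the EventBridge pattern.
HANDLERS: Dict[Tuple[str, str], Callable[[dict], str]] = {
    ("s3.amazonaws.com", "CreateBucket"): on_bucket_created,
    ("cloudfront.amazonaws.com", "CreateDistribution"): on_distribution_created,
    ("cloudfront.amazonaws.com", "CreateDistributionWithTags"): on_distribution_created,
}


def dispatch(detail: dict) -> str:
    """Route a CloudTrail detail record to its handler, if one is registered."""
    key = (detail["eventSource"], detail["eventName"])
    handler = HANDLERS.get(key)
    if handler is None:
        return f"ignored {key[1]}"
    return handler(detail)
```

Keeping the mapping in one table makes it easy to see which API calls are covered and which are silently ignored.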

Architecture Overview

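There is an important regional nuance in this setup: the Lambda function runs in eu-central-1 (Frankfurt), while CloudFront is a global service whose CloudTrail events are delivered in us-east-1 (N. Virginia).

Because an EventBridge rule cannot target a Lambda function in another region, CloudFront creation events cannot directly trigger a Lambda in Frankfurt. To bridge this gap, SNS is used as a cross-region delivery mechanism.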

High-level flow:

S3 (eu-central-1)
→ CloudTrail
→ EventBridge (eu-central-1)
→ Lambda (eu-central-1)

CloudFront (global / us-east-1)
→ CloudTrail
→ EventBridge (us-east-1)
→ SNS (us-east-1)
→ Lambda (eu-central-1)

Event Sources and Routing

This setup relies on CloudTrail events routed through EventBridge, with different handling for S3 and CloudFront due to regional behavior.

EventBridge Rule (Shared Logic)

The core EventBridge rule listens for resource creation and deletion API calls coming from CloudTrail:

{
  "source": [
    "aws.cloudfront",
    "aws.s3"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventName": [
      "CreateDistribution",
      "CreateDistributionWithTags",
      "CreateBucket",
      "DeleteBucket",
      "DeleteDistribution"
    ]
  }
}

Rule details:

This single pattern allows us to detect both the creation and the deletion of S3 buckets and CloudFront distributions, using CloudTrail as the source of truth.
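
For readers who prefer to create the rule programmatically, here is a minimal sketch using the EventBridge `put_rule` and `put_targets` calls. The function name `create_rule`, the target `Id` and the rule name in the usage note are assumptions for illustration. The client is passed in as a parameter so the snippet does not depend on any particular account or region:

```python
import json

# Same pattern as shown above, expressed as a Python dict.
EVENT_PATTERN = {
    "source": ["aws.cloudfront", "aws.s3"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventName": [
            "CreateDistribution",
            "CreateDistributionWithTags",
            "CreateBucket",
            "DeleteBucket",
            "DeleteDistribution",
        ]
    },
}


def create_rule(events_client, rule_name: str, target_arn: str) -> str:
    """Create (or update) the rule, attach one target, and return the rule ARN."""
    response = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EVENT_PATTERN),  # the API expects a JSON string
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "alarm-automation" is an arbitrary, hypothetical target id
        Targets=[{"Id": "alarm-automation", "Arn": target_arn}],
    )
    return response["RuleArn"]
```

A call could look like `create_rule(boto3.client("events", region_name="eu-central-1"), "resource-lifecycle-alarms", lambda_arn)`. When the target is a Lambda function, the function also needs a resource-based permission that allows events.amazonaws.com to invoke it.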

S3 Bucket Creation (eu-central-1)

Because S3 bucket creation is a regional operation, the CloudTrail events for buckets created in Frankfurt arrive in EventBridge in eu-central-1, where the rule can invoke the Lambda directly.

CloudFront Distribution Creation (Global / us-east-1)

Source: aws.cloudfront

Event names: CreateDistribution, CreateDistributionWithTags, DeleteDistribution

Region: CloudTrail events are delivered in us-east-1 (N. Virginia)

Because the Lambda runs in eu-central-1, CloudFront creation events cannot directly invoke it. To handle this cross-region case:

  1. An EventBridge rule with the following pattern is created in us-east-1:

     {
       "source": ["aws.cloudfront"],
       "detail-type": ["AWS API Call via CloudTrail"],
       "detail": {
         "eventName": ["CreateDistribution", "CreateDistributionWithTags", "DeleteDistribution"]
       }
     }

  2. The EventBridge rule targets an SNS topic in us-east-1
  3. The SNS topic invokes the Lambda in eu-central-1

This SNS hop acts as a reliable cross-region event bridge, keeping the alarm creation logic centralized.
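
The cross-region subscription itself can be sketched with two API calls: `add_permission` on the Lambda side and `subscribe` on the SNS side. The function name `subscribe_lambda_to_topic` and the statement id are assumptions for illustration. The SNS client is expected to be created for us-east-1 and the Lambda client for eu-central-1:

```python
def subscribe_lambda_to_topic(sns_client, lambda_client,
                              topic_arn: str, function_arn: str) -> str:
    """Let the SNS topic invoke the function, then subscribe the function to it."""
    # Resource-based permission: without it SNS deliveries are rejected.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId="allow-sns-cross-region",  # hypothetical statement id
        Action="lambda:InvokeFunction",
        Principal="sns.amazonaws.com",
        SourceArn=topic_arn,
    )
    # The subscription is created in the topic's region (us-east-1 here).
    response = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="lambda",
        Endpoint=function_arn,
    )
    return response["SubscriptionArn"]
```

The topic's access policy also has to allow events.amazonaws.com to publish, otherwise the EventBridge rule in us-east-1 cannot deliver to it.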

Lambda Function

The Lambda function handles both direct EventBridge events and SNS-wrapped events. It detects the source, extracts the CloudTrail detail, and creates the appropriate alarms.

import json
import logging
import os

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# CloudWatch client in the function's own region (eu-central-1 in this setup)
cloudwatch = boto3.client('cloudwatch')

# SNS topic that receives the alarm notifications
SNS_TOPIC_ARN = os.environ.get('SNS_TOPIC_ARN', '<YOUR ARN TOPIC>')


def lambda_handler(event, context):
    logger.info(f"Received event: {json.dumps(event)}")

    try:
        # SNS delivers the original EventBridge event as a JSON string
        if 'Records' in event and event['Records'][0].get('EventSource') == 'aws:sns':
            logger.info("Event received from SNS, parsing message")
            sns_message = json.loads(event['Records'][0]['Sns']['Message'])
            detail = sns_message['detail']
            logger.info(f"Parsed SNS message, detail: {json.dumps(detail)}")
        else:
            logger.info("Event received directly from EventBridge")
            detail = event['detail']

        event_source = detail['eventSource']
        event_name = detail['eventName']

        logger.info(f"Event source: {event_source}, Event name: {event_name}")

        # CloudTrail also records failed API calls; those carry an errorCode
        # and have no usable responseElements, so skip them.
        if detail.get('errorCode'):
            logger.warning(f"Skipping failed call {event_name}: {detail['errorCode']}")
            return {'statusCode': 200}

        # S3 Bucket Creation
        if event_source == 's3.amazonaws.com' and event_name == 'CreateBucket':
            bucket_name = detail['requestParameters']['bucketName']
            logger.info(f"S3 bucket creation detected: {bucket_name}")
            create_s3_alarms(bucket_name)
            logger.info(f"Successfully created alarms for S3 bucket: {bucket_name}")

        # S3 Bucket Deletion
        elif event_source == 's3.amazonaws.com' and event_name == 'DeleteBucket':
            bucket_name = detail['requestParameters']['bucketName']
            logger.info(f"S3 bucket deletion detected: {bucket_name}")
            delete_s3_alarms(bucket_name)
            logger.info(f"Successfully deleted alarms for S3 bucket: {bucket_name}")

        # CloudFront Distribution Creation
        elif event_source == 'cloudfront.amazonaws.com' and event_name in ['CreateDistribution', 'CreateDistributionWithTags']:
            distribution_id = detail['responseElements']['distribution']['id']
            logger.info(f"CloudFront distribution creation detected: {distribution_id}")
            create_cloudfront_alarms(distribution_id)
            logger.info(f"Successfully created alarms for CloudFront distribution: {distribution_id}")

        # CloudFront Distribution Deletion
        elif event_source == 'cloudfront.amazonaws.com' and event_name == 'DeleteDistribution':
            distribution_id = detail['requestParameters']['id']
            logger.info(f"CloudFront distribution deletion detected: {distribution_id}")
            delete_cloudfront_alarms(distribution_id)
            logger.info(f"Successfully deleted alarms for CloudFront distribution: {distribution_id}")

        else:
            logger.warning(f"Unhandled event - Source: {event_source}, Name: {event_name}")

        return {'statusCode': 200}

    except Exception as e:
        logger.error(f"Error processing event: {str(e)}", exc_info=True)
        raise


def create_s3_alarms(bucket_name):
    try:
        logger.info(f"Creating storage alarm for bucket: {bucket_name}")
        cloudwatch.put_metric_alarm(
            AlarmName=f'{bucket_name}-storage-alarm',
            AlarmDescription=f'S3 bucket {bucket_name} storage exceeded 1GB threshold',
            MetricName='BucketSizeBytes',
            Namespace='AWS/S3',
            Statistic='Average',
            Period=86400,  # BucketSizeBytes is a daily metric
            EvaluationPeriods=1,
            Threshold=1 * 1024 * 1024 * 1024,  # 1 GiB in bytes
            ComparisonOperator='GreaterThanThreshold',
            Dimensions=[
                {'Name': 'BucketName', 'Value': bucket_name},
                {'Name': 'StorageType', 'Value': 'StandardStorage'}
            ],
            AlarmActions=[SNS_TOPIC_ARN],
            TreatMissingData='notBreaching'
        )
        logger.info(f"Storage alarm created: {bucket_name}-storage-alarm")
    except Exception as e:
        logger.error(f"Failed to create S3 alarm for {bucket_name}: {str(e)}")
        raise


def delete_s3_alarms(bucket_name):
    try:
        alarm_name = f'{bucket_name}-storage-alarm'
        logger.info(f"Deleting storage alarm: {alarm_name}")

        cloudwatch.delete_alarms(AlarmNames=[alarm_name])
        logger.info(f"Successfully deleted alarm: {alarm_name}")
    except Exception as e:
        logger.error(f"Failed to delete S3 alarm for {bucket_name}: {str(e)}")
        raise

# CloudFront publishes its metrics only in us-east-1, so alarms on them are
# created there rather than in the function's own region.
cloudfront_cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

# Assumption: CLOUDFRONT_SNS_TOPIC_ARN is a hypothetical variable pointing at a
# topic in us-east-1, since an alarm is expected to notify a topic in its own
# region. It falls back to SNS_TOPIC_ARN when unset.
CLOUDFRONT_SNS_TOPIC_ARN = os.environ.get('CLOUDFRONT_SNS_TOPIC_ARN', SNS_TOPIC_ARN)


def create_cloudfront_alarms(distribution_id):
    # CloudFront metrics use both the distribution id and a fixed Region=Global dimension
    dimensions = [
        {'Name': 'DistributionId', 'Value': distribution_id},
        {'Name': 'Region', 'Value': 'Global'}
    ]
    try:
        logger.info(f"Creating high-requests alarm for distribution: {distribution_id}")
        cloudfront_cloudwatch.put_metric_alarm(
            AlarmName=f'high-requests-{distribution_id}',
            # Sum over a 60-second period, so the threshold is requests per minute
            AlarmDescription=f'CloudFront distribution {distribution_id} exceeded 1000 requests per minute',
            MetricName='Requests',
            Namespace='AWS/CloudFront',
            Statistic='Sum',
            Period=60,
            EvaluationPeriods=1,
            Threshold=1000,
            ComparisonOperator='GreaterThanThreshold',
            Dimensions=dimensions,
            AlarmActions=[CLOUDFRONT_SNS_TOPIC_ARN],
            TreatMissingData='notBreaching'
        )
        logger.info(f"High-requests alarm created: high-requests-{distribution_id}")

        logger.info(f"Creating 5xx-error-rate alarm for distribution: {distribution_id}")
        cloudfront_cloudwatch.put_metric_alarm(
            AlarmName=f'5xx-error-rate-{distribution_id}',
            AlarmDescription=f'CloudFront distribution {distribution_id} 5xx error rate exceeded 5%',
            MetricName='5xxErrorRate',
            Namespace='AWS/CloudFront',
            Statistic='Average',
            Period=300,
            EvaluationPeriods=1,
            Threshold=5,  # percent
            ComparisonOperator='GreaterThanThreshold',
            Dimensions=dimensions,
            AlarmActions=[CLOUDFRONT_SNS_TOPIC_ARN],
            TreatMissingData='notBreaching'
        )
        logger.info(f"5xx-error-rate alarm created: 5xx-error-rate-{distribution_id}")
    except Exception as e:
        logger.error(f"Failed to create CloudFront alarms for {distribution_id}: {str(e)}")
        raise


def delete_cloudfront_alarms(distribution_id):
    try:
        alarm_names = [
            f'high-requests-{distribution_id}',
            f'5xx-error-rate-{distribution_id}'
        ]

        logger.info(f"Deleting CloudFront alarms: {alarm_names}")
        cloudfront_cloudwatch.delete_alarms(AlarmNames=alarm_names)
        logger.info(f"Successfully deleted alarms for distribution: {distribution_id}")
    except Exception as e:
        logger.error(f"Failed to delete CloudFront alarms for {distribution_id}: {str(e)}")
        raise
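
The part of the handler most worth testing locally is the unwrapping of the two event shapes. Below is a minimal sketch that isolates that step. The helper name `extract_detail` and the sample payloads are assumptions for illustration and contain only the fields the handler actually reads:

```python
import json


def extract_detail(event: dict) -> dict:
    """Return the CloudTrail detail from a direct or an SNS-wrapped event."""
    records = event.get("Records")
    if records and records[0].get("EventSource") == "aws:sns":
        # SNS carries the original EventBridge event as a JSON string
        return json.loads(records[0]["Sns"]["Message"])["detail"]
    return event["detail"]


# Shape of an event delivered directly by EventBridge (S3 path)
direct_event = {
    "source": "aws.s3",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "s3.amazonaws.com",
        "eventName": "CreateBucket",
        "requestParameters": {"bucketName": "example-bucket"},
    },
}

# Shape of the same kind of event after the SNS hop (CloudFront path)
cloudfront_event = {
    "source": "aws.cloudfront",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "cloudfront.amazonaws.com",
        "eventName": "CreateDistribution",
        "responseElements": {"distribution": {"id": "EXAMPLEID"}},
    },
}
sns_event = {
    "Records": [
        {"EventSource": "aws:sns", "Sns": {"Message": json.dumps(cloudfront_event)}}
    ]
}
```

Both `extract_detail(direct_event)` and `extract_detail(sns_event)` return the inner detail dict, so everything after this step can stay identical for the two delivery paths.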

Alarm Creation Logic

S3 Storage Alarm

For each newly created S3 bucket, a storage alarm is created:

cloudwatch.put_metric_alarm(
    AlarmName=f'{bucket_name}-storage-alarm',
    MetricName='BucketSizeBytes',
    Namespace='AWS/S3',
    Statistic='Average',
    Period=86400,  # one day; BucketSizeBytes is a daily metric
    EvaluationPeriods=1,
    Threshold=1 * 1024 * 1024 * 1024,  # 1 GiB in bytes
    ComparisonOperator='GreaterThanThreshold',
    Dimensions=[
        {'Name': 'BucketName', 'Value': bucket_name},
        {'Name': 'StorageType', 'Value': 'StandardStorage'}
    ],
    AlarmActions=[SNS_TOPIC_ARN]
)
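
If different buckets need different limits, the threshold can be made a parameter. The following is a sketch under that assumption. The helper name `storage_alarm_params` and its `threshold_gib` argument are hypothetical, and the function only builds the keyword arguments without calling AWS:

```python
GIB = 1024 ** 3  # bytes in one GiB


def storage_alarm_params(bucket_name: str, topic_arn: str,
                         threshold_gib: float = 1.0) -> dict:
    """Build put_metric_alarm keyword arguments for a bucket storage alarm."""
    return {
        "AlarmName": f"{bucket_name}-storage-alarm",
        "AlarmDescription": (
            f"S3 bucket {bucket_name} storage exceeded {threshold_gib} GiB"
        ),
        "MetricName": "BucketSizeBytes",
        "Namespace": "AWS/S3",
        "Statistic": "Average",
        "Period": 86400,  # the metric is reported once per day
        "EvaluationPeriods": 1,
        "Threshold": threshold_gib * GIB,  # the metric is expressed in bytes
        "ComparisonOperator": "GreaterThanThreshold",
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket_name},
            {"Name": "StorageType", "Value": "StandardStorage"},
        ],
        "AlarmActions": [topic_arn],
        "TreatMissingData": "notBreaching",
    }
```

The result can be passed straight through, for example `cloudwatch.put_metric_alarm(**storage_alarm_params(bucket_name, SNS_TOPIC_ARN, threshold_gib=5))`.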

CloudFront Alarms

For each new distribution, two alarms are created:

High request rate: more than 1000 requests within a 60-second period

5xx error rate: above 5% on average over a 5-minute period

These alarms provide early signals for traffic spikes and origin failures.
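
Since the two CloudFront alarms differ only in a handful of values, they can also be described as data. This is a minimal sketch of that variant, with the values taken from the Lambda function above. The names `CLOUDFRONT_ALARM_SPECS` and `cloudfront_alarm_params` are hypothetical:

```python
from typing import List

# One entry per alarm; adding a third alarm means adding a third entry.
CLOUDFRONT_ALARM_SPECS = [
    {"prefix": "high-requests", "metric": "Requests",
     "statistic": "Sum", "period": 60, "threshold": 1000},
    {"prefix": "5xx-error-rate", "metric": "5xxErrorRate",
     "statistic": "Average", "period": 300, "threshold": 5},
]


def cloudfront_alarm_params(distribution_id: str, topic_arn: str) -> List[dict]:
    """Build put_metric_alarm keyword arguments for every alarm spec."""
    dimensions = [
        {"Name": "DistributionId", "Value": distribution_id},
        {"Name": "Region", "Value": "Global"},
    ]
    return [
        {
            "AlarmName": f"{spec['prefix']}-{distribution_id}",
            "MetricName": spec["metric"],
            "Namespace": "AWS/CloudFront",
            "Statistic": spec["statistic"],
            "Period": spec["period"],
            "EvaluationPeriods": 1,
            "Threshold": spec["threshold"],
            "ComparisonOperator": "GreaterThanThreshold",
            "Dimensions": dimensions,
            "AlarmActions": [topic_arn],
            "TreatMissingData": "notBreaching",
        }
        for spec in CLOUDFRONT_ALARM_SPECS
    ]
```

Each returned dict can be passed to `put_metric_alarm(**params)` on a CloudWatch client for us-east-1, which is where CloudFront publishes its metrics. The alarm names double as the list to hand to `delete_alarms` when the distribution is removed.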

Why This Approach Works Well

This pattern is especially effective in multi-tenant or self-service platforms.

Limitations and Considerations

These trade-offs are usually acceptable compared to the operational risk of missing alarms entirely.

Final Thoughts

Automated alarm creation is one of those things that feels unnecessary, until the first incident caused by an unmonitored resource.

By combining CloudTrail, EventBridge, Lambda, and SNS, you can enforce monitoring standards across dynamically created infrastructure with minimal ongoing effort.

If your platform creates AWS resources dynamically, this pattern is a strong foundation for keeping observability consistent and reliable.