Automatically Creating CloudWatch Alarms for Dynamic Resources

In fast-moving environments, especially SaaS platforms, infrastructure is often created dynamically. New S3 buckets and CloudFront distributions may be provisioned on demand, driven by customer onboarding flows, automation pipelines, or application logic.

The operational problem usually appears later:

“How do we make sure every newly created resource is properly monitored?”

Manually creating CloudWatch alarms does not scale when resources are created dynamically. Missing alarms mean blind spots, and blind spots eventually turn into incidents.

In this article, I’ll walk through a practical solution that automatically creates CloudWatch alarms whenever a new S3 bucket or CloudFront distribution is created, using:

AWS CloudTrail
Amazon EventBridge
AWS Lambda
Amazon SNS
Amazon CloudWatch

While the example focuses on S3 and CloudFront, the same approach can be extended to other AWS resource types that log their API calls to CloudTrail, by changing the event filters and the alarm logic.

The Core Idea

The idea is simple:

CloudTrail records API calls such as CreateBucket and CreateDistribution
EventBridge listens for those events
Lambda reacts to resource creation events
CloudWatch alarms are created automatically for the new resource

From that point on, every newly created resource is monitored consistently, without manual effort.

Architecture Overview

There is an important regional nuance in this setup.

The Lambda function runs in eu-central-1 (Frankfurt)
S3 bucket creation events are regional, so a bucket created in Frankfurt produces an event in eu-central-1 that EventBridge can route straight to the Lambda
CloudFront is a global service, and its CloudTrail events are delivered in us-east-1 (N. Virginia)

Because of this, CloudFront creation events cannot directly trigger a Lambda in Frankfurt. To bridge this gap, SNS is used as a cross-region delivery mechanism.

High-level flow:

S3 (eu-central-1)
  → CloudTrail
  → EventBridge (eu-central-1)
  → Lambda (eu-central-1)

CloudFront (global / us-east-1)
  → CloudTrail
  → EventBridge (us-east-1)
  → SNS (us-east-1)
  → Lambda (eu-central-1)

Event Sources and Routing

This setup relies on CloudTrail events routed through EventBridge, with different handling for S3 and CloudFront due to regional behavior.

EventBridge Rule (Shared Logic)

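The core EventBridge rule listens for resource creation and deletion API calls coming from CloudTrail. The deletion events are there so that alarms can be cleaned up when a resource goes away: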

{
  "source": [
    "aws.cloudfront",
    "aws.s3"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventName": [
      "CreateDistribution",
      "CreateDistributionWithTags",
      "CreateBucket",
      "DeleteBucket",
      "DeleteDistribution"
    ]
  }
}

Rule details:

Event bus: default
Description: <YOUR_DESCRIPTION>
Service principal that needs permission to invoke the target: events.amazonaws.com

This single pattern allows us to detect both the creation and the deletion of S3 buckets and CloudFront distributions, using CloudTrail as the source of truth.
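
As a rough illustration of how such a rule could be wired up from code, here is a minimal sketch. The function names, rule_name, and function_arn are placeholders introduced for this example, and the two clients are passed in so the Region stays an explicit choice of the caller. It uses the boto3 calls put_rule, put_targets, and add_permission.

```python
import json

# Event names taken from the rule pattern above.
EVENT_NAMES = [
    "CreateDistribution",
    "CreateDistributionWithTags",
    "CreateBucket",
    "DeleteBucket",
    "DeleteDistribution",
]


def build_event_pattern():
    """Return the EventBridge pattern shown above as a dict."""
    return {
        "source": ["aws.cloudfront", "aws.s3"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {"eventName": EVENT_NAMES},
    }


def create_rule(events_client, lambda_client, rule_name, function_arn):
    """Create the rule, let it invoke the function, and attach the target.

    rule_name and function_arn are placeholders supplied by the caller.
    """
    rule = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_event_pattern()),
        State="ENABLED",
        EventBusName="default",
    )
    # EventBridge needs a resource-based permission to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    events_client.put_targets(
        Rule=rule_name,
        EventBusName="default",
        Targets=[{"Id": "alarm-lambda", "Arn": function_arn}],
    )
    return rule["RuleArn"]
```

With boto3, the clients would typically be created as boto3.client('events', region_name='eu-central-1') and boto3.client('lambda', region_name='eu-central-1') for the S3 path.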

S3 Bucket Creation (eu-central-1)

Source: aws.s3
Event names: CreateBucket (and DeleteBucket for cleanup)
EventBridge rule is created in eu-central-1 (Frankfurt)
Target: Lambda function

Since S3 bucket creation is regional, events for buckets created in Frankfurt reach EventBridge in eu-central-1, which invokes the Lambda directly.

CloudFront Distribution Creation (Global / us-east-1)

Source: aws.cloudfront

Event names:

CreateDistribution
CreateDistributionWithTags
DeleteDistribution (for cleanup)

CloudTrail events are delivered in us-east-1 (N. Virginia)

Because the Lambda runs in eu-central-1, CloudFront creation events cannot directly invoke it. To handle this cross-region case:

An EventBridge rule with the following pattern is created in us-east-1:

{
  "source": ["aws.cloudfront"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventName": ["CreateDistribution", "CreateDistributionWithTags", "DeleteDistribution"]
  }
}

The EventBridge rule targets an SNS topic in us-east-1
The SNS topic invokes the Lambda in eu-central-1

This SNS hop acts as a reliable cross-region event bridge, keeping the alarm creation logic centralized.
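
To make the difference between the two delivery paths concrete, here is a minimal sketch of how the same CloudTrail detail looks when it arrives directly from EventBridge and when it arrives wrapped by SNS. The helper name extract_detail and the sample payloads are made up for illustration and trimmed to the fields the Lambda reads.

```python
import json


def extract_detail(event):
    """Return the CloudTrail detail from a direct or SNS-wrapped event."""
    records = event.get("Records")
    if records and records[0].get("EventSource") == "aws:sns":
        # SNS delivers the original EventBridge event as a JSON string.
        message = json.loads(records[0]["Sns"]["Message"])
        return message["detail"]
    return event["detail"]


# Trimmed, made-up sample: what EventBridge sends straight to the Lambda.
direct_event = {
    "source": "aws.s3",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "s3.amazonaws.com",
        "eventName": "CreateBucket",
        "requestParameters": {"bucketName": "example-bucket"},
    },
}

# A CloudFront event as EventBridge would emit it in us-east-1 ...
cloudfront_event = {
    "source": "aws.cloudfront",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "cloudfront.amazonaws.com",
        "eventName": "CreateDistribution",
        "responseElements": {"distribution": {"id": "EXAMPLEID"}},
    },
}

# ... and the same event after the SNS hop, as the Lambda receives it.
sns_event = {
    "Records": [
        {"EventSource": "aws:sns", "Sns": {"Message": json.dumps(cloudfront_event)}}
    ]
}
```

Calling extract_detail on either direct_event or sns_event yields a dictionary with the same shape, which is what lets a single handler serve both paths.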

Lambda Function

The Lambda function handles both direct EventBridge events and SNS-wrapped events. It detects the source, extracts the CloudTrail detail, and creates or deletes the appropriate alarms.

import boto3
import os
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

cloudwatch = boto3.client('cloudwatch')

SNS_TOPIC_ARN = os.environ.get('SNS_TOPIC_ARN', '<YOUR ARN TOPIC>')

def lambda_handler(event, context):
    logger.info(f"Received event: {json.dumps(event)}")
    
    try:
        if 'Records' in event and event['Records'][0]['EventSource'] == 'aws:sns':
            logger.info("Event received from SNS, parsing message")
            sns_message = json.loads(event['Records'][0]['Sns']['Message'])
            detail = sns_message['detail']
            logger.info(f"Parsed SNS message, detail: {json.dumps(detail)}")
        else:
            logger.info("Event received directly from EventBridge")
            detail = event['detail']
        
        event_source = detail['eventSource']
        event_name = detail['eventName']
        
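        logger.info(f"Event source: {event_source}, Event name: {event_name}")

        # CloudTrail also records failed API calls (for example a CreateBucket
        # rejected because the name is taken). Skip them so alarms are not
        # created or deleted for a request that did not succeed.
        if detail.get('errorCode'):
            logger.info(f"Skipping failed API call: {detail['errorCode']}")
            return {'statusCode': 200}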
        
        # S3 Bucket Creation
        if event_source == 's3.amazonaws.com' and event_name == 'CreateBucket':
            bucket_name = detail['requestParameters']['bucketName']
            logger.info(f"S3 bucket creation detected: {bucket_name}")
            create_s3_alarms(bucket_name)
            logger.info(f"Successfully created alarms for S3 bucket: {bucket_name}")
        
        # S3 Bucket Deletion
        elif event_source == 's3.amazonaws.com' and event_name == 'DeleteBucket':
            bucket_name = detail['requestParameters']['bucketName']
            logger.info(f"S3 bucket deletion detected: {bucket_name}")
            delete_s3_alarms(bucket_name)
            logger.info(f"Successfully deleted alarms for S3 bucket: {bucket_name}")
        
        # CloudFront Distribution Creation
        elif event_source == 'cloudfront.amazonaws.com' and event_name in ['CreateDistribution', 'CreateDistributionWithTags']:
            distribution_id = detail['responseElements']['distribution']['id']
            logger.info(f"CloudFront distribution creation detected: {distribution_id}")
            create_cloudfront_alarms(distribution_id)
            logger.info(f"Successfully created alarms for CloudFront distribution: {distribution_id}")
        
        # CloudFront Distribution Deletion
        elif event_source == 'cloudfront.amazonaws.com' and event_name == 'DeleteDistribution':
            distribution_id = detail['requestParameters']['id']
            logger.info(f"CloudFront distribution deletion detected: {distribution_id}")
            delete_cloudfront_alarms(distribution_id)
            logger.info(f"Successfully deleted alarms for CloudFront distribution: {distribution_id}")
        
        else:
            logger.warning(f"Unhandled event - Source: {event_source}, Name: {event_name}")
        
        return {'statusCode': 200}
    
    except Exception as e:
        logger.error(f"Error processing event: {str(e)}", exc_info=True)
        raise

def create_s3_alarms(bucket_name):
    try:
        logger.info(f"Creating storage alarm for bucket: {bucket_name}")
        cloudwatch.put_metric_alarm(
            AlarmName=f'{bucket_name}-storage-alarm',
            AlarmDescription=f'S3 bucket {bucket_name} storage exceeded 1GB threshold',
            MetricName='BucketSizeBytes',
            Namespace='AWS/S3',
            Statistic='Average',
            Period=86400,
            EvaluationPeriods=1,
            Threshold=1 * 1024 * 1024 * 1024,
            ComparisonOperator='GreaterThanThreshold',
            Dimensions=[
                {'Name': 'BucketName', 'Value': bucket_name},
                {'Name': 'StorageType', 'Value': 'StandardStorage'}
            ],
            AlarmActions=[SNS_TOPIC_ARN],
            TreatMissingData='notBreaching'
        )
        logger.info(f"Storage alarm created: {bucket_name}-storage-alarm")
    except Exception as e:
        logger.error(f"Failed to create S3 alarm for {bucket_name}: {str(e)}")
        raise

def delete_s3_alarms(bucket_name):
    try:
        alarm_name = f'{bucket_name}-storage-alarm'
        logger.info(f"Deleting storage alarm: {alarm_name}")
        
        cloudwatch.delete_alarms(AlarmNames=[alarm_name])
        logger.info(f"Successfully deleted alarm: {alarm_name}")
    except Exception as e:
        logger.error(f"Failed to delete S3 alarm for {bucket_name}: {str(e)}")
        raise

# CloudFront publishes its metrics to CloudWatch in us-east-1, so these alarms
# are managed through a client pinned to that Region rather than the Lambda's own.
cloudfront_cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

# Assumption: alarm actions should point at an SNS topic in the alarm's own
# Region, so a separate (hypothetical) variable can supply a us-east-1 topic.
CLOUDFRONT_SNS_TOPIC_ARN = os.environ.get('CLOUDFRONT_SNS_TOPIC_ARN', SNS_TOPIC_ARN)

def create_cloudfront_alarms(distribution_id):
    try:
        logger.info(f"Creating high-requests alarm for distribution: {distribution_id}")
        cloudfront_cloudwatch.put_metric_alarm(
            AlarmName=f'high-requests-{distribution_id}',
            AlarmDescription=f'CloudFront distribution {distribution_id} exceeded 1000 requests/minute',
            MetricName='Requests',
            Namespace='AWS/CloudFront',
            Statistic='Sum',
            Period=60,  # requests are summed over one minute
            EvaluationPeriods=1,
            Threshold=1000,
            ComparisonOperator='GreaterThanThreshold',
            Dimensions=[{'Name': 'DistributionId', 'Value': distribution_id}, {'Name': 'Region', 'Value': 'Global'}],
            AlarmActions=[CLOUDFRONT_SNS_TOPIC_ARN],
            TreatMissingData='notBreaching'
        )
        logger.info(f"High-requests alarm created: high-requests-{distribution_id}")

        logger.info(f"Creating 5xx-error-rate alarm for distribution: {distribution_id}")
        cloudfront_cloudwatch.put_metric_alarm(
            AlarmName=f'5xx-error-rate-{distribution_id}',
            AlarmDescription=f'CloudFront distribution {distribution_id} 5xx error rate exceeded 5%',
            MetricName='5xxErrorRate',
            Namespace='AWS/CloudFront',
            Statistic='Average',
            Period=300,
            EvaluationPeriods=1,
            Threshold=5,  # percent
            ComparisonOperator='GreaterThanThreshold',
            Dimensions=[{'Name': 'DistributionId', 'Value': distribution_id}, {'Name': 'Region', 'Value': 'Global'}],
            AlarmActions=[CLOUDFRONT_SNS_TOPIC_ARN],
            TreatMissingData='notBreaching'
        )
        logger.info(f"5xx-error-rate alarm created: 5xx-error-rate-{distribution_id}")
    except Exception as e:
        logger.error(f"Failed to create CloudFront alarms for {distribution_id}: {str(e)}")
        raise

def delete_cloudfront_alarms(distribution_id):
    try:
        alarm_names = [
            f'high-requests-{distribution_id}',
            f'5xx-error-rate-{distribution_id}'
        ]

        logger.info(f"Deleting CloudFront alarms: {alarm_names}")
        # Use the same us-east-1 client that created the alarms.
        cloudfront_cloudwatch.delete_alarms(AlarmNames=alarm_names)
        logger.info(f"Successfully deleted alarms for distribution: {distribution_id}")
    except Exception as e:
        logger.error(f"Failed to delete CloudFront alarms for {distribution_id}: {str(e)}")
        raise

Alarm Creation Logic

S3 Storage Alarm

For each newly created S3 bucket, a storage alarm is created:

Metric: BucketSizeBytes
Threshold: 1 GB
Period: 1 day

cloudwatch.put_metric_alarm(
    AlarmName=f'{bucket_name}-storage-alarm',
    MetricName='BucketSizeBytes',
    Namespace='AWS/S3',
    Statistic='Average',
    Period=86400,  # one day; S3 storage metrics are reported daily
    EvaluationPeriods=1,
    Threshold=1 * 1024 * 1024 * 1024,  # 1 GB expressed in bytes
    ComparisonOperator='GreaterThanThreshold',
    Dimensions=[
        {'Name': 'BucketName', 'Value': bucket_name},
        {'Name': 'StorageType', 'Value': 'StandardStorage'}
    ],
    AlarmActions=[SNS_TOPIC_ARN],
    TreatMissingData='notBreaching'
)

CloudFront Alarms

For each new distribution, two alarms are created:

High request rate

Metric: Requests
Threshold: 1000 requests per minute

5xx error rate

Metric: 5xxErrorRate
Threshold: 5%

These alarms provide early signals for traffic spikes and origin failures.
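
The two CloudFront alarms differ only in a handful of fields, which a data-driven sketch makes easy to see. The function names below are hypothetical, the values are copied from the Lambda code above, and the client is passed in as a parameter. Since CloudFront publishes its metrics to CloudWatch in us-east-1, the client would need to be created for that Region.

```python
def cloudfront_alarm_definitions(distribution_id, topic_arn):
    """Build put_metric_alarm keyword arguments for one distribution."""
    dimensions = [
        {"Name": "DistributionId", "Value": distribution_id},
        {"Name": "Region", "Value": "Global"},
    ]
    # Settings shared by both alarms.
    common = {
        "Namespace": "AWS/CloudFront",
        "EvaluationPeriods": 1,
        "ComparisonOperator": "GreaterThanThreshold",
        "Dimensions": dimensions,
        "AlarmActions": [topic_arn],
        "TreatMissingData": "notBreaching",
    }
    return [
        {
            **common,
            "AlarmName": f"high-requests-{distribution_id}",
            "MetricName": "Requests",
            "Statistic": "Sum",
            "Period": 60,  # requests summed over one minute
            "Threshold": 1000,
        },
        {
            **common,
            "AlarmName": f"5xx-error-rate-{distribution_id}",
            "MetricName": "5xxErrorRate",
            "Statistic": "Average",
            "Period": 300,
            "Threshold": 5,  # percent
        },
    ]


def put_cloudfront_alarms(cloudwatch_client, distribution_id, topic_arn):
    """Create both alarms using the supplied CloudWatch client."""
    for definition in cloudfront_alarm_definitions(distribution_id, topic_arn):
        cloudwatch_client.put_metric_alarm(**definition)
```

Keeping the definitions separate from the API call also makes them easy to inspect in a unit test without touching AWS.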

Why This Approach Works Well

Monitoring is automatically enforced for every resource
No reliance on naming conventions or manual steps
Centralized alarm logic
Scales naturally with dynamic infrastructure

This pattern is especially effective in multi-tenant or self-service platforms.
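
One way to keep that centralized logic easy to extend is a small lookup table keyed by CloudTrail event source and event name. The sketch below is an illustration, not part of the Lambda above; ID_EXTRACTORS and resource_id_for are hypothetical names, and the field paths are the ones the handler already reads.

```python
# Maps (eventSource, eventName) to a function that pulls the resource
# identifier out of the CloudTrail detail.
ID_EXTRACTORS = {
    ("s3.amazonaws.com", "CreateBucket"):
        lambda d: d["requestParameters"]["bucketName"],
    ("cloudfront.amazonaws.com", "CreateDistribution"):
        lambda d: d["responseElements"]["distribution"]["id"],
    ("cloudfront.amazonaws.com", "CreateDistributionWithTags"):
        lambda d: d["responseElements"]["distribution"]["id"],
}


def resource_id_for(detail):
    """Return the new resource's identifier, or None for unhandled events."""
    extractor = ID_EXTRACTORS.get((detail["eventSource"], detail["eventName"]))
    return extractor(detail) if extractor else None
```

Supporting another resource type then comes down to one more table entry, a matching alarm function, and an extra event name in the EventBridge pattern.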

Limitations and Considerations

CloudTrail delivery is not real-time; alarms may be created a few minutes after resource creation
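Thresholds are static and may need tuning per workload
CloudFront publishes its metrics to CloudWatch in us-east-1, so alarms on those metrics need to be created in that Region to receive data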

These trade-offs are usually acceptable compared to the operational risk of missing alarms entirely.
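
If static thresholds become a problem, one possible mitigation is a per-resource override map consulted before each alarm is created. The sketch below assumes a hypothetical ALARM_THRESHOLD_OVERRIDES environment variable holding JSON; the variable name, the metric keys, and the helper functions are illustrative only, while the defaults mirror the values used in the Lambda above.

```python
import json
import os

# Defaults mirror the thresholds used in the Lambda above.
DEFAULT_THRESHOLDS = {
    "s3_storage_bytes": 1 * 1024 * 1024 * 1024,
    "cloudfront_requests_per_minute": 1000,
    "cloudfront_5xx_error_rate_percent": 5,
}


def load_overrides(raw=None):
    """Parse overrides from a JSON string, or from the (hypothetical)
    ALARM_THRESHOLD_OVERRIDES environment variable when raw is None."""
    if raw is None:
        raw = os.environ.get("ALARM_THRESHOLD_OVERRIDES", "{}")
    return json.loads(raw)


def resolve_threshold(resource_id, metric_key, overrides):
    """Return the override for this resource if present, else the default."""
    per_resource = overrides.get(resource_id, {})
    return per_resource.get(metric_key, DEFAULT_THRESHOLDS[metric_key])
```

For example, an override of {"big-bucket": {"s3_storage_bytes": 5368709120}} would raise the storage threshold for that one bucket while every other bucket keeps the default.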

Final Thoughts

Automated alarm creation is one of those things that feels unnecessary until the first incident caused by an unmonitored resource.

By combining CloudTrail, EventBridge, Lambda, and SNS, you can enforce monitoring standards across dynamically created infrastructure with minimal ongoing effort.

If your platform creates AWS resources dynamically, this pattern is a strong foundation for keeping observability consistent and reliable.