Trim CloudWatch Observability Costs From Your AWS Bill
Understanding CloudWatch Pricing
AWS bills CloudWatch in three buckets that often appear together on the same line item:
- Metrics – standard (5‑minute) vs. detailed (1‑minute) monitoring, plus any custom metrics you publish.
- Alarms – each alarm is billed per month for every metric it watches; high‑resolution alarms cost more than standard ones.
- Logs – ingestion, storage, and optional Insights queries.
The pricing page shows a per‑metric‑type cost (e.g., $0.30 per metric per month for custom metrics) and a per‑GB cost for log ingestion ($0.50/GB) and storage ($0.03/GB‑month). Because the UI aggregates these, a sudden rise in the bill can look like a generic "CloudWatch" charge, making it hard to pinpoint the source.
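To see how the buckets add up, the rates quoted above can be plugged into a small estimator. This is only a sketch: it assumes the first-tier list prices mentioned in this article, leaves out alarms and Insights queries, and the workload numbers are hypothetical.

```python
# Rates quoted in this article (assumed first-tier list prices; verify for your Region).
CUSTOM_METRIC_PER_MONTH = 0.30   # USD per custom metric per month
LOG_INGEST_PER_GB = 0.50         # USD per GB ingested
LOG_STORAGE_PER_GB_MONTH = 0.03  # USD per GB-month stored


def estimate_monthly_cost(custom_metrics, ingested_gb, stored_gb):
    """Split an estimated CloudWatch bill into its metric and log components."""
    breakdown = {
        "metrics": custom_metrics * CUSTOM_METRIC_PER_MONTH,
        "log_ingestion": ingested_gb * LOG_INGEST_PER_GB,
        "log_storage": stored_gb * LOG_STORAGE_PER_GB_MONTH,
    }
    breakdown["total"] = sum(breakdown.values())
    # Round for display; billing itself is more granular.
    return {name: round(cost, 2) for name, cost in breakdown.items()}


# Hypothetical workload: 200 custom metrics, 100 GB ingested, 500 GB retained.
print(estimate_monthly_cost(200, 100, 500))
```

Breaking the estimate down this way makes it easier to tell whether metrics or logs dominate before you start tuning.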
Common Cost Traps
| Trap | Why it hurts | Quick check |
|---|---|---|
| Enabling detailed monitoring on every EC2 instance | Basic 5‑minute metrics are free, but 1‑minute detailed metrics are billed per metric, per instance. | `aws ec2 describe-instances --filters Name=monitoring-state,Values=enabled --query 'Reservations[].Instances[].InstanceId'` |
| Publishing high‑cardinality custom metrics (e.g., per‑user counters) | Each unique combination of metric name and dimension values is billed as a separate metric. | `aws cloudwatch list-metrics --namespace MyApp --query "Metrics[?Dimensions[?Name=='UserId']]"` |
| Log groups left at the default "never expire" retention | Storage grows forever at $0.03/GB‑month. | `aws logs describe-log-groups --query 'logGroups[?!retentionInDays].logGroupName'` |
| Subscription filters that forward every log line to Kinesis or Lambda | You pay CloudWatch ingestion first, then again for downstream delivery and compute. | `aws logs describe-subscription-filters --log-group-name /aws/lambda/my-func` |
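| Running CloudWatch Logs Insights queries over wide time ranges | A query scans every log event in the selected log groups and time range; pricing is $0.005 per GB scanned. | Review query history in the console or with `aws logs describe-queries`. Note that `--limit` on `aws logs start-query` caps returned rows, not data scanned. |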
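The high-cardinality trap is easier to spot when you count how many distinct values each dimension carries. The sketch below works on a list shaped like the `Metrics` array that CloudWatch ListMetrics returns. The sample data and the function name are hypothetical.

```python
from collections import defaultdict


def distinct_values_per_dimension(metrics):
    """Count the distinct values observed for each dimension name.

    `metrics` is a list shaped like the "Metrics" array returned by
    CloudWatch ListMetrics.
    """
    values = defaultdict(set)
    for metric in metrics:
        for dim in metric.get("Dimensions", []):
            values[dim["Name"]].add(dim["Value"])
    return {name: len(seen) for name, seen in values.items()}


# Hypothetical sample; in practice the list could come from boto3's
# cloudwatch.get_paginator("list_metrics").paginate(Namespace="MyApp").
sample_metrics = [
    {"MetricName": "Requests", "Dimensions": [
        {"Name": "UserId", "Value": "u1"}, {"Name": "Service", "Value": "api"}]},
    {"MetricName": "Requests", "Dimensions": [
        {"Name": "UserId", "Value": "u2"}, {"Name": "Service", "Value": "api"}]},
    {"MetricName": "Requests", "Dimensions": [
        {"Name": "UserId", "Value": "u3"}, {"Name": "Service", "Value": "api"}]},
]
print(distinct_values_per_dimension(sample_metrics))  # {'UserId': 3, 'Service': 1}
```

A dimension whose count grows with your user base or request volume is the likely source of a runaway custom-metric bill.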
Tuning Metrics and Alarms
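- Audit detailed monitoring

  ```bash
  # List instances with detailed monitoring
  aws ec2 describe-instances \
    --filters Name=monitoring-state,Values=enabled \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text
  ```

  For each instance, decide if 5‑minute granularity is sufficient. Switch back with:

  ```bash
  # unmonitor-instances disables detailed monitoring; monitor-instances enables it
  aws ec2 unmonitor-instances --instance-ids i-0123456789abcdef0
  ```

  (or use the console → Actions → Monitor and troubleshoot → Manage detailed monitoring).
- Consolidate custom metrics
  - Replace per‑entity counters with a single aggregate metric and keep dimensions low‑cardinality (service, environment). AWS charges per unique combination of metric name and dimension values, so a per‑entity dimension such as `UserId` still creates one billable metric per entity.
  - If you only need aggregates, push a single metric like `RequestsTotal` and compute per‑entity stats in your own dashboard.
- Prune stale alarms

  ```bash
  # List alarms stuck in INSUFFICIENT_DATA, which often point at deleted resources.
  # An alarm that is OK right now is usually healthy, so do not bulk-delete on that state.
  aws cloudwatch describe-alarms --state-value INSUFFICIENT_DATA \
    --query 'MetricAlarms[].AlarmName' --output text | tr '\t' '\n' |
  while read -r alarm; do
    echo "Candidate for deletion: $alarm"
    # After review, uncomment to delete:
    # aws cloudwatch delete-alarms --alarm-names "$alarm"
  done
  ```

  Keep alarms only for resources that matter (e.g., production services). Composite Alarms can combine multiple conditions into one notification, which cuts alert noise, though the underlying alarms are still billed.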
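One way to make alarm pruning safer is to shortlist only alarms that have sat in `INSUFFICIENT_DATA` for a long time, rather than acting on the current state alone. The sketch below filters records shaped like the `MetricAlarms` entries from DescribeAlarms. The 30-day threshold, the function name, and the sample alarms are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def stale_alarm_names(alarms, now, min_age_days=30):
    """Return names of alarms stuck in INSUFFICIENT_DATA for at least min_age_days."""
    cutoff = now - timedelta(days=min_age_days)
    return [
        alarm["AlarmName"]
        for alarm in alarms
        if alarm["StateValue"] == "INSUFFICIENT_DATA"
        and alarm["StateUpdatedTimestamp"] <= cutoff
    ]


# Hypothetical alarms; real records would come from describe_alarms()["MetricAlarms"].
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
sample_alarms = [
    {"AlarmName": "prod-cpu-high", "StateValue": "OK",
     "StateUpdatedTimestamp": now - timedelta(days=90)},
    {"AlarmName": "old-test-env", "StateValue": "INSUFFICIENT_DATA",
     "StateUpdatedTimestamp": now - timedelta(days=45)},
    {"AlarmName": "new-service", "StateValue": "INSUFFICIENT_DATA",
     "StateUpdatedTimestamp": now - timedelta(days=2)},
]
print(stale_alarm_names(sample_alarms, now))  # ['old-test-env']
```

Treat the result as a review list rather than a delete list; a healthy alarm on a quiet resource can also sit without data for a while.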
Optimizing Log Retention and Export
- Set appropriate retention

  ```bash
  # Example: set 30‑day retention for all app log groups
  for lg in $(aws logs describe-log-groups \
      --query 'logGroups[?contains(logGroupName, `app`)].logGroupName' \
      --output text); do
    aws logs put-retention-policy --log-group-name "$lg" --retention-in-days 30
  done
  ```

  Choose the shortest period that satisfies compliance. For audit logs, 90 days is common; for debug logs, 7 days may be enough.
- Archive cold logs to S3
  - Create a subscription filter that sends logs to an S3 bucket via a Lambda function.
  - After 30 days, move the archived objects to Glacier Deep Archive to cut storage to <$0.001/GB‑month. A one‑off in‑place copy changes the storage class; an S3 lifecycle rule can do the same automatically.

  ```bash
  # Copy objects onto themselves with a new storage class
  aws s3 cp s3://my-cloudwatch-archive/ s3://my-cloudwatch-archive/ \
    --recursive --storage-class DEEP_ARCHIVE
  ```
- Limit Logs Insights scans
  - Narrow the log groups and the `--start-time`/`--end-time` window of each query. Logs Insights bills by data scanned, and `--limit 1000` only caps the number of rows returned.
  - Schedule regular clean‑up of old log groups that are no longer needed.
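Because Logs Insights cost follows the amount of data scanned, the most direct control is the time window passed to the query. The helper below is a sketch that builds keyword arguments for boto3's `logs.start_query` with a bounded window. The helper name, the one-hour default, and the example log group are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def bounded_query_params(log_group, query, hours=1, now=None, limit=1000):
    """Build start_query arguments that scan only the last `hours` hours."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=hours)
    return {
        "logGroupName": log_group,
        "startTime": int(start.timestamp()),  # epoch seconds
        "endTime": int(now.timestamp()),      # epoch seconds
        "queryString": query,
        "limit": limit,  # caps returned rows only, not data scanned
    }


params = bounded_query_params(
    "/aws/lambda/my-func",
    "fields @timestamp, @message | filter @message like /ERROR/",
    hours=1,
    now=datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc),
)
print(params["endTime"] - params["startTime"])  # 3600
```

The resulting dictionary could then be passed as `boto3.client("logs").start_query(**params)`; widening `hours` widens the scan and the bill along with it.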
Automation and Ongoing Governance
- Tagging for cost allocation
  - Apply a `cost-center` tag to every CloudWatch resource that supports tags (for example, log groups and alarms).
  - Enable Cost Allocation Tags in the Billing console to see which teams drive observability spend.
- Scheduled compliance checks

  ```python
  # Lambda function (Python) that runs daily
  import boto3

  logs = boto3.client('logs')

  def lambda_handler(event, context):
      # Apply a 30-day retention policy to log groups that have none.
      # Paginate, because describe_log_groups returns a limited page per call.
      paginator = logs.get_paginator('describe_log_groups')
      for page in paginator.paginate():
          for g in page['logGroups']:
              if 'retentionInDays' not in g:
                  logs.put_retention_policy(
                      logGroupName=g['logGroupName'],
                      retentionInDays=30,
                  )
  ```

  Deploy via SAM or the console and trigger it with a scheduled EventBridge (formerly CloudWatch Events) rule using `rate(1 day)`.
- Use AWS Budgets alerts
  - Create a budget filtered to the CloudWatch service. In cost filters the service name is typically `AmazonCloudWatch`, not the `AWS/...` style used for metric namespaces.
  - Set an alert for when forecasted spend reaches 80 % of the monthly budget; route the notification to Slack or email.

  ```bash
  aws budgets create-budget --account-id 123456789012 \
    --budget file://budget.json
  ```

  (`budget.json` contains the service filter for CloudWatch.)
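For reference, the budget definition might look roughly like the structure below. This is a sketch, not a verified configuration: the budget name, the 200 USD limit, and the email address are hypothetical, and the `AmazonCloudWatch` service value is an assumption you should confirm against the service names shown in Cost Explorer for your account.

```python
import json

# Budget definition in the shape expected by the AWS Budgets CreateBudget API.
budget = {
    "BudgetName": "cloudwatch-monthly",                 # hypothetical name
    "BudgetLimit": {"Amount": "200", "Unit": "USD"},    # hypothetical limit
    "TimeUnit": "MONTHLY",
    "BudgetType": "COST",
    "CostFilters": {"Service": ["AmazonCloudWatch"]},   # assumed service name
}

# Alert when forecasted spend exceeds 80 % of the budget limit.
notifications = [
    {
        "Notification": {
            "NotificationType": "FORECASTED",
            "ComparisonOperator": "GREATER_THAN",
            "Threshold": 80,
            "ThresholdType": "PERCENTAGE",
        },
        "Subscribers": [
            {"SubscriptionType": "EMAIL", "Address": "team@example.com"}  # hypothetical
        ],
    }
]

budget_json = json.dumps(budget, indent=2)
notifications_json = json.dumps(notifications, indent=2)
print(budget_json)
```

Saving the two strings as `budget.json` and `notifications.json` would let you pass the second file to `create-budget` through `--notifications-with-subscribers file://notifications.json`.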
Quick Reference Checklist
- [ ] Disable detailed monitoring on non‑critical EC2/ELB resources.
- [ ] Consolidate custom metrics; delete unused ones.
- [ ] Delete or archive log groups older than the retention policy.
- [ ] Set explicit retention for every log group.
- [ ] Review and prune stale alarms, such as those that never leave `OK` or sit in `INSUFFICIENT_DATA`.
- [ ] Tag all observability resources for cost attribution.
- [ ] Enable a budget alert for the CloudWatch line item.
How CloudBudgetMaster helps – Our platform continuously scans your AWS account, flags detailed‑monitoring instances, orphaned log groups, and high‑cardinality custom metrics, and shows the exact dollar impact of each item so you can remediate with a single click.