Spot Instances 101: Safely Slash Compute Costs up to 90%

June 22, 2026·5 min read·CloudBudgetMaster

What Spot Instances Are and When They Make Sense

Spot instances are spare capacity that cloud providers sell at deep discounts (often 70-90%). The trade-off is that the provider can reclaim the capacity with little warning. They are ideal for workloads that are:

- Stateless or easily restartable (batch jobs, CI pipelines, data processing).
- Tolerant of interruption (machine learning training with checkpointing, image rendering farms).
- Behind a resilient architecture (auto-scaling groups, Kubernetes deployments, or queue-driven services).

If your application meets any of these criteria, you can start moving a portion of its baseline compute to Spot without sacrificing availability.

AWS Spot: From Request to Production

1. Choose the Right Launch Method

2. Set a Maximum Price (or use the default)

aws ec2 request-spot-instances \
  --instance-count 4 \
  --type "one-time" \
  --launch-specification file://spec.json \
  --spot-price "0.03"

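The --spot-price flag is optional; if omitted, the maximum price defaults to the On-Demand rate and you pay the current Spot price. Setting a lower ceiling caps what you pay per hour, at the cost of more interruptions when the Spot price rises above it.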
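The request above reads its launch settings from spec.json, which is not shown. A minimal sketch of such a file might look like the following; every value (AMI ID, instance type, key pair, security group, subnet) is a placeholder assumption to replace with your own, and SPEC_PATH is only an illustrative variable name.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal launch specification for request-spot-instances.
# All IDs below are placeholders, not real resources.
SPEC_PATH="${SPEC_PATH:-spec.json}"

cat > "$SPEC_PATH" <<'EOF'
{
  "ImageId": "ami-0123456789abcdef0",
  "InstanceType": "m5.large",
  "KeyName": "my-key-pair",
  "SecurityGroupIds": ["sg-0123456789abcdef0"],
  "SubnetId": "subnet-0123456789abcdef0"
}
EOF

echo "wrote $SPEC_PATH"
```

Keeping the specification in a file makes it easy to review and version alongside the rest of your infrastructure code.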

3. Handle Interruption Notices

AWS sends a two‑minute warning via the instance metadata URL http://169.254.169.254/latest/meta-data/spot/termination-time. A simple daemon can poll this endpoint and gracefully stop services or checkpoint state.

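# The endpoint returns 404 until a notice is issued; -f makes curl fail on 404
# (instances that enforce IMDSv2 also need a session token header)
while true; do
  curl -sf -o /dev/null http://169.254.169.254/latest/meta-data/spot/termination-time && break
  sleep 5
done
# trigger graceful shutdown here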

Integrate this script with your init system or container entrypoint.
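For instances that enforce IMDSv2, the poll needs a session token first. The sketch below wraps that in functions so the check and the cleanup hook can be swapped out; IMDS_URL, POLL_SECONDS, check_notice, on_interrupt, and watch_for_interruption are illustrative names, not part of any AWS tooling.

```shell
#!/usr/bin/env bash
# Sketch: wait for a Spot interruption notice, then run a cleanup hook.
IMDS_URL="${IMDS_URL:-http://169.254.169.254/latest}"
POLL_SECONDS="${POLL_SECONDS:-5}"

# Succeeds once the termination-time key exists (HTTP 200); fails otherwise.
check_notice() {
  local token
  # IMDSv2: fetch a short-lived session token before reading metadata.
  token=$(curl -sf -X PUT "$IMDS_URL/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  curl -sf -o /dev/null -H "X-aws-ec2-metadata-token: $token" \
    "$IMDS_URL/meta-data/spot/termination-time"
}

# Replace the body with your own drain or checkpoint logic.
on_interrupt() {
  echo "interruption notice received: start draining"
}

# Poll until a notice appears, then call the hook once.
watch_for_interruption() {
  until check_notice; do
    sleep "$POLL_SECONDS"
  done
  on_interrupt
}

# Start the watcher from your entrypoint, for example:
# watch_for_interruption &
```

Separating the check from the hook keeps the loop easy to exercise locally by overriding check_notice with a stub.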

4. Use Capacity Rebalancing (newer feature)

Keep --instance-interruption-behavior at terminate (the default) and enable Capacity Rebalancing on the ASG. When AWS signals an elevated interruption risk, the group proactively launches replacement Spot instances before the actual interruption, reducing churn.

aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-asg \
  --capacity-rebalance

5. Monitor Spot Utilization

GCP Preemptible VMs: Simplicity with Limits

GCP’s preemptible VMs are the equivalent of Spot, but they have a fixed maximum lifetime of 24 hours and a 30‑second termination notice.

1. Create with gcloud

gcloud compute instances create preemptible-worker \
  --machine-type n1-standard-4 \
  --preemptible \
  --image-family debian-11 \
  --image-project debian-cloud \
  --metadata shutdown-script-url=gs://my-scripts/handle-preempt.sh

The --preemptible flag applies the discount automatically.
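The shutdown script referenced in the command is not shown. As a rough sketch of what a handle-preempt.sh could contain, assuming GNU timeout is available on the image: it leaves a marker for the next VM and runs a stop command under a deadline that fits inside the 30-second notice. CHECKPOINT_DIR, DEADLINE_SECONDS, handle_preempt, and my-worker.service are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical handle-preempt.sh: everything must finish within the
# 30-second preemption notice, so the stop command gets a hard deadline.
CHECKPOINT_DIR="${CHECKPOINT_DIR:-/var/tmp/checkpoints}"
DEADLINE_SECONDS="${DEADLINE_SECONDS:-20}"

handle_preempt() {
  mkdir -p "$CHECKPOINT_DIR" || return 1
  # Leave a marker so the next VM knows the last run was cut short.
  date -u +%Y-%m-%dT%H:%M:%SZ > "$CHECKPOINT_DIR/preempted_at"
  # Run the caller's stop command, if any, but never past the deadline.
  if [ "$#" -gt 0 ]; then
    timeout "$DEADLINE_SECONDS" "$@"
  fi
}

# In the real script you would end with something like:
# handle_preempt systemctl stop my-worker.service
```

A deadline shorter than the notice period leaves a few seconds of slack for the marker write and for the OS to finish shutting down.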

2. Use Managed Instance Groups (MIG) for Auto‑Scaling

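gcloud compute instance-groups managed create preempt-mig \
  --template preemptible-template \
  --size 5

# Autoscaling is configured with a separate command
gcloud compute instance-groups managed set-autoscaling preempt-mig \
  --max-num-replicas 20 \
  --target-cpu-utilization 0.6

The MIG recreates preempted VMs automatically when capacity is available, while the autoscaler adjusts the group size up to the maximum of 20 based on CPU utilization.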

3. Capture the 30‑second Notice

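GCP exposes a preempted value in the instance metadata service; it reads FALSE until the VM is preempted, then TRUE. Poll it in a background process:

while true; do
  state=$(curl -s -H "Metadata-Flavor: Google" \
    http://metadata.google.internal/computeMetadata/v1/instance/preempted)
  # The endpoint always answers, so compare the value instead of the exit code
  [ "$state" = "TRUE" ] && break
  sleep 5
done
# graceful shutdown actions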

4. Cost Visibility

Enable Billing Export to BigQuery and sum cost where service.description = "Compute Engine" AND sku.description LIKE "%Preemptible%". This gives you the precise dollar amount spent on preemptible capacity.
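As a sketch of that query, the helper below builds the SQL for a given export table. The function name and the table name in the usage comment are hypothetical, and the 30-day window is an assumption added for illustration; the column names follow the standard billing export schema.

```shell
#!/usr/bin/env bash
# Sketch: build a standard-SQL query for preemptible Compute Engine spend.
# Pass in your own billing export table; nothing here calls BigQuery itself.
build_preemptible_query() {
  local table="$1"
  cat <<EOF
SELECT SUM(cost) AS preemptible_cost
FROM \`${table}\`
WHERE service.description = "Compute Engine"
  AND sku.description LIKE "%Preemptible%"
  AND usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
EOF
}

# Usage (requires the bq CLI and an existing export table):
# bq query --use_legacy_sql=false "$(build_preemptible_query my-project.billing.my_export_table)"
```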

Azure Spot VMs: Leveraging Eviction Policies

Azure Spot VMs let you choose an eviction policy that determines what happens when capacity is reclaimed.

1. Deploy via Azure CLI

az vm create \
  --resource-group rg-prod \
  --name spot-worker \
  --image Ubuntu2204 \
  --size Standard_D4s_v3 \
  --priority Spot \
  --eviction-policy Deallocate \
  --max-price -1   # -1 means pay the current Spot price

Deallocate stops the VM but keeps its disks (you continue to pay for disk storage), allowing a quick restart when capacity returns; Delete removes the VM and its disks entirely.

2. Use Scale Sets for Resilience

az vmss create \
  --resource-group rg-prod \
  --name spot-ss \
  --image Ubuntu2204 \
  --upgrade-policy-mode Automatic \
  --instance-count 3 \
  --priority Spot \
  --eviction-policy Deallocate \
  --max-price -1

The scale set automatically provisions replacement instances when evictions occur.

3. React to Eviction Events

Azure records evictions in the Azure Activity Log. Set up an Event Grid subscription that triggers an Azure Function to log the event and optionally spin up a fallback On-Demand VM. An illustrative event payload:

{
  "subject": "/subscriptions/<sub>/resourceGroups/rg-prod/providers/Microsoft.Compute/virtualMachineScaleSets/spot-ss/eviction",
  "eventType": "Microsoft.Resources.ResourceWriteSuccess",
  "data": { "status": "Succeeded" }
}
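The Event Grid route above reacts from outside the VM. As a complementary sketch, a process on the VM itself can watch Azure's Scheduled Events metadata endpoint, which lists a Preempt event shortly before a Spot eviction. The function names are illustrative, and the grep-based match is a deliberate simplification that avoids a JSON parser dependency.

```shell
#!/usr/bin/env bash
# Sketch: detect a Spot eviction from inside the VM via Scheduled Events.

# Fetch the current Scheduled Events document (only works on an Azure VM).
fetch_scheduled_events() {
  curl -sf -H "Metadata: true" \
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
}

# Reads the JSON document on stdin; succeeds if it lists a Preempt event.
has_preempt_event() {
  grep -Eq '"EventType"[[:space:]]*:[[:space:]]*"Preempt"'
}

# Usage on a VM:
# if fetch_scheduled_events | has_preempt_event; then
#   echo "eviction scheduled: checkpoint now"
# fi
```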

4. Budget Alerts

Create a budget with an alert threshold so you are notified before Spot spend runs past plan. The example below sets a 500-per-month cost budget and notifies finops@example.com at 80% of it:

az consumption budget create \
  --budget-name SpotAlert \
  --amount 500 \
  --time-grain Monthly \
  --category Cost \
  --notifications "{\"operator\":\"GreaterThan\",\"threshold\":80,\"contactEmails\":[\"finops@example.com\"]}"

Best‑Practice Checklist Across All Clouds

Calculating the Dollar Impact

  1. Export the last 30 days of compute spend to a CSV or BigQuery table.
  2. Filter rows where sku.description contains Spot, Preemptible, or Spot VM.
  3. Sum the cost column: that is what you actually paid for Spot usage. Price the same usage at On-Demand rates and subtract the Spot cost to get the amount you saved.
  4. Divide the savings by what total compute would have cost without Spot (actual total plus savings) to get the percentage reduction.
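One way to turn the exported figures into a savings number is sketched below. It assumes you already have three totals from your billing data: what you paid for Spot usage, what the same usage would cost at On-Demand rates, and your actual total compute bill. The function name and the sample figures are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: savings and percentage reduction from Spot usage.
# Arguments (inputs you compute from your billing export):
#   $1 spot_cost         what you actually paid for Spot usage
#   $2 on_demand_equiv   what the same usage would cost at On-Demand rates
#   $3 total_compute     your actual total compute bill (Spot included)
spot_savings() {
  awk -v spot="$1" -v od="$2" -v total="$3" 'BEGIN {
    saved = od - spot
    # Baseline is the bill you would have had with no Spot discount.
    printf "%.2f %.1f\n", saved, 100 * saved / (total + saved)
  }'
}

spot_savings 300 1000 1500   # prints: 700.00 31.8
```

With these sample figures, 300 spent on Spot replaces 1000 of On-Demand usage, so 700 is saved against a no-Spot baseline of 2200, a reduction of about 31.8%.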

How CloudBudgetMaster Helps

CloudBudgetMaster continuously scans your AWS, GCP, and Azure accounts, automatically flags Spot‑eligible workloads, and shows the exact dollar impact of moving them to Spot. It surfaces interruption‑risk metrics and recommends the optimal Spot‑On‑Demand mix, letting you act on savings without manual digging.
