AWS Burstable Instances: Risks and Alternatives for Production Environments

Burstable instances risk performance drops in production, so stable instance types are recommended for critical applications.

Published May 10, 2024
Last Modified May 17, 2024
AWS burstable instances, or T type instances, offer flexibility and cost efficiency, but they carry potential risks for production environments. In this blog post, we'll explore how T type instances operate, why they may not be the best choice for production environments, and alternative solutions.

T Type Instances and the CPU Credit Mechanism
AWS T type instances are an excellent solution for variable workloads because they accumulate "CPU credits" during periods of low CPU usage, which can be used when high CPU performance is required. Models like T3 and T3a have an hourly credit earning rate, and these credits can be consumed during periods of high performance demand. However, this mechanism presents some challenges in production environments.
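To make the credit mechanism concrete, here is a minimal sketch that tracks a credit balance hour by hour. It assumes the published t3.micro figures (2 vCPUs, 12 credits earned per hour, a maximum balance of 288 credits) and the rule that one CPU credit equals one vCPU running at 100% for one minute. The function name and the hourly granularity are illustrative simplifications, not how AWS accounts for credits internally.

```python
def simulate_credit_balance(hourly_utilization, vcpus=2, credits_per_hour=12,
                            max_balance=288, start_balance=0.0):
    """Return a list of (balance, throttled) tuples, one per simulated hour.

    Simplified model: credits are earned and spent once per hour.
    Defaults are assumed t3.micro values; adjust them for other sizes.
    """
    balance = start_balance
    history = []
    for utilization in hourly_utilization:
        # One credit = one vCPU at 100% for one minute.
        spent = utilization * vcpus * 60
        balance = balance + credits_per_hour - spent
        # A negative balance means the hour could not be fully paid for,
        # so the instance would have been held to baseline (Standard mode).
        throttled = balance < 0
        balance = max(0.0, min(balance, max_balance))
        history.append((round(balance, 2), throttled))
    return history


if __name__ == "__main__":
    # Start with a full balance, then run flat out for three hours.
    for hour, (balance, throttled) in enumerate(
            simulate_credit_balance([1.0, 1.0, 1.0], start_balance=288), start=1):
        print(f"hour {hour}: balance={balance}, throttled={throttled}")
```

In this model a full t3.micro balance does not survive a third consecutive hour at 100% CPU, which is the behavior the following sections are concerned with.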
Risks of Using Burstable Instances in Production Environments
Performance Drops and Instability
The credit mechanism can lead to unwanted performance fluctuations in production environments. When credits are exhausted, the instance automatically drops to the base performance level. Especially during high traffic periods, this can lead to significant performance drops and service disruptions.
Resource Limitations
Production environments often require continuous and predictable performance. Because the credit balance of a T type instance is capped, sustained heavy load can deplete it quickly, after which the application runs at baseline CPU and performs below expected standards.
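How quickly a balance runs out can be estimated with simple arithmetic. The sketch below assumes the same credit rule (one credit is one vCPU-minute at 100%) and uses t3.micro values as an example. The helper name is hypothetical, and the result is only an estimate because real workloads are rarely constant.

```python
def hours_until_depleted(balance, utilization, vcpus, credits_per_hour):
    """Estimate hours until the credit balance reaches zero.

    Returns None when the instance earns credits at least as fast as it
    spends them, i.e. it is running at or below its baseline.
    """
    net_burn_per_hour = utilization * vcpus * 60 - credits_per_hour
    if net_burn_per_hour <= 0:
        return None
    return balance / net_burn_per_hour


if __name__ == "__main__":
    # Assumed t3.micro figures: 2 vCPUs, 12 credits/hour, 288 credits max.
    estimate = hours_until_depleted(balance=288, utilization=1.0,
                                    vcpus=2, credits_per_hour=12)
    print(f"Full balance lasts about {estimate:.1f} hours at 100% CPU")
```

A traffic spike lasting longer than this estimate would push a Standard mode instance down to baseline performance.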
SLA Non-Compliance
For business-critical applications and services that require high availability, T type instances may not be reliable enough. Services that are obligated to provide high availability and performance guarantees as per Service Level Agreements (SLAs) may find this type of instance inappropriate.
Additional Considerations
Deprecation of T2 Instances
The T2 instance type is now very old, having been introduced in July 2014. Amazon RDS has already deprecated T2-based instance types, with support ending on March 29, 2024. Therefore, it is highly recommended to avoid using T2 instances, as they are based on a nearly decade-old CPU architecture and carry significant performance throttling risks when CPU credits run out.
Training and Free Tier Usage
Many users are introduced to T2 instances through AWS's 12-month free tier, which offers 750 hours per month of t2.micro usage. However, AWS now also includes t3.micro in this free tier, which provides better performance, and usually offers a T4g free trial as well. T3 was introduced in August 2018, followed by the Graviton-based T4g in September 2020. While none of these are new, they still represent a significant improvement over T2 instances.
Unlimited Mode
T3 and T4g instances launch in Unlimited mode by default, which prevents applications from being throttled when credits run out but incurs extra charges once average CPU utilization over a rolling 24-hour period exceeds the baseline. For an instance with a 30% baseline, such as t3.large, that corresponds to running at 100% CPU for approximately 7 hours a day. The additional cost for CPU credits in Unlimited mode varies:
For T4g instances, CPU credits are charged at $0.04 per vCPU-Hour for Linux, RHEL, and SLES.
For T2 and T3 instances, CPU credits are charged at $0.05 per vCPU-Hour for Linux, RHEL, and SLES, and $0.096 per vCPU-Hour for Windows and Windows with SQL Web.
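The surplus charge can be approximated from these rates. The sketch below is a simplified estimate that assumes the instance has no accrued credits to absorb the excess and that usage above baseline is billed at the listed per vCPU-hour rate. The function and dictionary names are hypothetical, and the t3.large baseline of 30% with 2 vCPUs is taken from AWS's published instance table.

```python
# Surplus credit rates in USD per vCPU-hour, as listed above.
SURPLUS_RATES = {
    "t4g_linux": 0.04,
    "t3_linux": 0.05,
    "t3_windows": 0.096,
}


def surplus_charge(avg_utilization, baseline, vcpus, hours, rate_per_vcpu_hour):
    """Estimate the Unlimited mode surcharge for a period, in USD.

    Only utilization above the baseline is billed; at or below baseline
    the result is zero.
    """
    excess = max(0.0, avg_utilization - baseline)
    return excess * vcpus * hours * rate_per_vcpu_hour


if __name__ == "__main__":
    # Example: a t3.large averaging 50% CPU for a full day on Linux.
    charge = surplus_charge(avg_utilization=0.50, baseline=0.30, vcpus=2,
                            hours=24, rate_per_vcpu_hour=SURPLUS_RATES["t3_linux"])
    print(f"Estimated surcharge for the day: ${charge:.2f}")
```

The same function with the Windows rate shows how quickly the surcharge grows on licensed operating systems.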
Cost Considerations
The main reason to choose T series instances over M series is cost, and whether that saving materializes depends on the workload. The break-even point for a T series instance in Unlimited mode against a comparable M series instance (AWS's example compares t3.large with m5.large) is 42.5% average CPU utilization. At 100% utilization, a T series instance can cost about 1.5 times as much as an m5.large instance. Thus, if not carefully monitored, using T series instances under the assumption of cost savings can end up being more expensive.
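The break-even point can be reproduced approximately with a short calculation. The prices below are assumptions (us-east-1 On-Demand Linux rates for t3.large and m5.large at the time of writing) and should be replaced with current pricing. The function names are illustrative only.

```python
def hourly_cost_unlimited(t_price, utilization, baseline, vcpus, surplus_rate):
    """Hourly cost of a T instance in Unlimited mode at a steady utilization."""
    excess = max(0.0, utilization - baseline)
    return t_price + excess * vcpus * surplus_rate


def break_even_utilization(t_price, m_price, baseline, vcpus, surplus_rate):
    """Utilization at which the T instance costs the same as the M instance."""
    return baseline + (m_price - t_price) / (vcpus * surplus_rate)


if __name__ == "__main__":
    # Assumed prices in USD per hour; verify against the current price list.
    T3_LARGE, M5_LARGE = 0.0832, 0.096
    BASELINE, VCPUS, RATE = 0.30, 2, 0.05

    point = break_even_utilization(T3_LARGE, M5_LARGE, BASELINE, VCPUS, RATE)
    at_full = hourly_cost_unlimited(T3_LARGE, 1.0, BASELINE, VCPUS, RATE)
    print(f"Break-even utilization: {point:.1%}")
    print(f"Cost ratio at 100% CPU: {at_full / M5_LARGE:.2f}x m5.large")
```

With these assumed prices the result lands close to the break-even and cost-ratio figures quoted above; small differences come from rounding and price changes.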
While AWS T type instances are appealing for their cost-effectiveness and flexibility, their use in production environments involves certain risks. For business-critical applications that require high availability, it is recommended to opt for instance types with more stable and predictable resources. This approach helps ensure business continuity and customer satisfaction by preventing unexpected performance issues.
For those still considering T type instances, be aware of the potential for increased costs and performance instability. It is crucial to monitor your workloads and ensure that the instance type chosen aligns with your application's performance and availability requirements.
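One way to monitor a burstable instance is to watch its CPUCreditBalance metric in CloudWatch. The sketch below takes a CloudWatch client as a parameter (for example, one created with boto3.client("cloudwatch")) and returns the most recent average balance. The helper names, the 30-minute lookback, and the threshold of 50 credits are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def latest_credit_balance(cloudwatch, instance_id, lookback_minutes=30):
    """Return the most recent CPUCreditBalance average, or None if no data.

    `cloudwatch` is any object exposing get_metric_statistics with the
    boto3 CloudWatch client's keyword arguments.
    """
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUCreditBalance",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=end - timedelta(minutes=lookback_minutes),
        EndTime=end,
        Period=300,  # credit metrics are reported at 5-minute intervals
        Statistics=["Average"],
    )
    # Datapoints are not guaranteed to be ordered, so sort by timestamp.
    datapoints = sorted(response["Datapoints"], key=lambda d: d["Timestamp"])
    return datapoints[-1]["Average"] if datapoints else None


def is_at_risk(balance, threshold=50):
    """Flag an instance whose balance is known and below the threshold."""
    return balance is not None and balance < threshold
```

In practice a CloudWatch alarm on the same metric is usually preferable to polling; the function above is mainly useful for ad hoc checks and scripts.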