Amazon AppStream 2.0 Cost optimisation
There are various disparate documented locations that cover the different methods to apply cost savings within an Amazon AppStream 2.0 deployment.
Roy
Amazon Employee
Published Sep 9, 2024
**What pain points are we solving?**

The methods for saving cost within an Amazon AppStream 2.0 deployment are documented in various disparate locations. This blog brings those methods into one place and shows that, when you apply them strategically, they can influence each other, which requires some extra thought.

**Introduction/Purpose**

Amazon AppStream 2.0 is a managed application streaming service from AWS for End User Computing. In this post, we share design decisions and techniques to optimize the service's cost while delivering a positive user experience and driving business value. Cost optimization for Amazon AppStream 2.0 starts with carefully selecting the fleet type, instance family, and instance size.

Fleet policies, image builder cost optimization, and user fees also factor into the cost estimate when you build an Amazon AppStream 2.0 business case using the pricing tool.

This blog delves into each of these decision points, explores how they impact costs, and provides methods to control costs operationally and optimize the service. The graph below shows the impact of applying the levers associated with cost optimisation. It is for illustrative purposes only: the actual impact may vary based on your specific requirements and usage patterns. The optimization choices may overlap or compound, which changes the overall saving, but the savings can be significant when combined.
- Changing from an Always-On to an On-Demand fleet: circa 15-20% reduction
- Using standard medium instances: circa 40% reduction
- Using a Linux OS, if you are able to: circa 30% reduction
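To make the compounding effect concrete, here is a minimal sketch that combines the illustrative percentages above. It assumes, purely for illustration, that each lever applies independently to whatever cost remains after the previous one; real levers can overlap, so treat the result as a rough upper bound rather than a forecast. The function name and the 17.5% midpoint are assumptions of this sketch.

```python
from math import prod


def combined_saving(reductions):
    """Overall fractional saving when each reduction applies to the
    cost that remains after the previous reductions."""
    remaining = prod(1.0 - r for r in reductions)
    return 1.0 - remaining


# Illustrative figures from the list above; 0.175 is the midpoint of 15-20%.
levers = {"on_demand": 0.175, "standard_medium": 0.40, "linux": 0.30}

total = combined_saving(levers.values())
print(f"Combined saving: {total:.0%}")
```

Under these assumptions the combined saving is roughly 65%, not the 87.5% you would get by simply adding the three percentages together.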
**Let's start - Which Amazon AppStream 2.0 fleet do you choose?**

There are three fleet types, and the choice depends on the use case and on the end user experience you need. The business application being served and the user persona both influence which fleet type is most appropriate. Understanding these is key to optimizing the solution. Each fleet type has a cost profile that depends on the choices made.
- Always on
- On-demand
- Elastic
Always-On is exactly that, and suits applications that must be available 24x7x365. It may be a SaaS application serving a 24x7 manufacturing user base, or an operational application that requires immediate access. Because the instances run 24x7, the charges reflect this, and the application is always available instantly. Depending on the instance type and OS, the charges are per second for Linux or per hour for Microsoft Windows, both with a minimum charge of 15 minutes.

Here is a short video that summarizes the differences between Always-On and On-Demand.

On-Demand keeps instances in standby mode. When the user initiates a session, the instance is brought back into the running state, with a 1-2 minute delay before it is available. Where there is no need for an immediate response, On-Demand could be sufficient. If the session ends early and there is no activity, the session disconnects based on the disconnect timeout or idle disconnect timeout value. When the user reconnects, there is the same delay as the instance goes from standby into the 'Running' state; the fleet configuration may need to use session scripts to manage the user experience.

The Multi-Session feature in Amazon AppStream 2.0 enables multiple concurrent user sessions within a single AppStream 2.0 instance. Multiple users can use the same set of applications simultaneously, and because they share the resources of one instance rather than each needing their own, the feature can help reduce the overall cost of running applications on AppStream 2.0.

The charges are based on the instance type and OS chosen: per second for Linux and hourly for Windows, both with a minimum charge of 15 minutes. Visit the AppStream 2.0 pricing page and the simple pricing tool for details.
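The difference between the two billing models described above is easiest to see with short sessions. The sketch below is a simplified model, not the official pricing calculation: the hourly rate is a made-up placeholder, and the rule that partial hours round up to a full hour is an assumption of this example. Check the AppStream 2.0 pricing page for the actual rates and rules.

```python
import math

MIN_BILLED_SECONDS = 15 * 60  # the 15-minute minimum described above


def per_second_charge(running_seconds, hourly_rate):
    """Per-second billing with a 15-minute minimum (the Linux model above)."""
    billed_seconds = max(running_seconds, MIN_BILLED_SECONDS)
    return billed_seconds * hourly_rate / 3600


def hourly_charge(running_seconds, hourly_rate):
    """Hourly billing. ASSUMPTION: partial hours round up to a full hour."""
    return math.ceil(running_seconds / 3600) * hourly_rate


RATE = 0.10  # hypothetical USD per instance-hour, NOT a real AppStream 2.0 price

for minutes in (5, 70):
    seconds = minutes * 60
    print(
        f"{minutes} min session: "
        f"per-second model {per_second_charge(seconds, RATE):.3f}, "
        f"hourly model {hourly_charge(seconds, RATE):.3f}"
    )
```

With these placeholder numbers, a 5-minute session is billed as 15 minutes under the per-second model but as a full hour under the hourly model, which is one reason the OS choice interacts with the fleet type and session length.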
NB: There is a reduced charge while instances are in standby mode, to cover the cost of the reserved infrastructure.

**Elastic fleets**

Elastic fleets are a serverless fleet type that removes the need for customers to predict usage, create and manage AWS Auto Scaling policies, and create images. Instead, an Elastic fleet uses an Amazon AppStream 2.0 App Block: a virtual hard drive (VHD) that contains all the binaries needed to run the application. The VHD is uploaded to Amazon S3, and when a session is initiated the VHD is downloaded and mounted onto an instance. The download can take up to 90 seconds. Any setup scripts configured when the App Block was created, for example to set the mount point, are then run. Charges are per second, based on the instance type, with a minimum of 15 minutes.

The Amazon AppStream 2.0 Multi-Session feature is particularly useful where multiple users need to use the same set of applications concurrently, such as in educational institutions, corporate environments, or shared workspaces. It provides a cost-effective and scalable way to deliver desktop applications to multiple users from the cloud.

Dive deeper with this walkthrough: Stream applications at a lower cost with Amazon AppStream 2.0 Elastic fleets and Linux compatibility

**Choosing the right instance type**

Choosing the right-sized instance is key to ensuring that the end user experience meets or exceeds expectations while still delivering business value. A tried and tested user acceptance test (UAT) is recommended to establish what the software requires and how it performs when users initiate sessions. Visit the service quotas page to view the quotas associated with the instance types and their availability within the chosen Region. The actual quotas for your account could be higher or lower, depending on when you created your account.
NB: Request a quota increase that is aligned with the expected number of users when your fleet is at capacity.

The instance type that you specify determines the hardware of the host computers used for your fleet. Each instance type offers different compute, memory, and GPU capabilities. Instance types are grouped into instance families based on these capabilities, which can be aligned to the performance requirements of your applications. We recommend visiting the Amazon AppStream 2.0 pricing page to verify that the Region of your deployment has the required instance family.

Illustration of AppStream 2.0 instance family types. For the latest information, visit https://docs.aws.amazon.com/appstream2/latest/developerguide/instance-types.html

Each instance family is compute optimised for application streaming purposes, and the more graphics-intensive families come with GPU memory. The price depends on the instance type chosen from its family and on the Region that hosts the Amazon AppStream 2.0 deployment.

**My demand is not constant - How can I automate scaling up and down?**

Using policies, you can develop a plan that scales your fleet based on scaling needs and solution considerations.

First, visit the VPC design section of https://d1.awsstatic.com/whitepapers/best-practices-for-deploying-amazon-appstream-2.pdf and verify that your VPC can support the volume of users connecting to the fleet. You can then move on to effective fleet scaling, to meet current and anticipated Amazon AppStream 2.0 user demand while avoiding unnecessary resource usage costs.

The best practices for VPC design recommend that:

- The size of the subnet CIDR blocks matches the anticipated number of users plus any growth, and is factored into the scaling policies
- For each instance type that you plan to use, the number of fleet instances that your VPC and quotas can support is greater than the number of anticipated concurrent users for that instance type.
When you configure the fleet and its scaling policies, also consider:

- Disconnect timeout: how long after a user disconnects the session is terminated. If the fleet is On-Demand, reconnecting after this timeout results in a 1-2 minute delay. Conversely, setting it too long results in unnecessary cost.
- Idle disconnect timeout: how long the session can be idle before the user is disconnected. For example, with an idle disconnect timeout of 15 minutes and a disconnect timeout of 15 minutes, the session stays in a running state, and is charged, for 30 minutes.
- The minimum and maximum capacities within which the policy operates, and the percentage tolerances based on the values of the Amazon CloudWatch metrics CapacityUtilization, AvailableCapacity, and InsufficientCapacityError
- You can have multiple scale-out and scale-in policies
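The capacity limits and scale-out/scale-in policies above map onto Application Auto Scaling requests. The sketch below only builds the request parameters as plain dictionaries, so it runs without AWS credentials. The fleet name, policy names, and helper function names are hypothetical. If you use boto3 (an assumption, it is not required by this sketch), the dictionaries could be passed to the `register_scalable_target` and `put_scaling_policy` methods of the `application-autoscaling` client.

```python
def fleet_target(fleet_name):
    """Identifiers Application Auto Scaling uses for an AppStream 2.0 fleet."""
    return {
        "ServiceNamespace": "appstream",
        "ResourceId": f"fleet/{fleet_name}",
        "ScalableDimension": "appstream:fleet:DesiredCapacity",
    }


def scalable_target_request(fleet_name, min_capacity, max_capacity):
    """Parameters that set the capacity range the policies operate within."""
    if not 0 <= min_capacity <= max_capacity:
        raise ValueError("expected 0 <= min_capacity <= max_capacity")
    return {
        **fleet_target(fleet_name),
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


def step_policy_request(fleet_name, policy_name, adjustment, cooldown_seconds=120):
    """Parameters for a step scaling policy.

    A positive adjustment scales out, a negative one scales in.
    """
    if adjustment == 0:
        raise ValueError("adjustment must be non-zero")
    # Scale-out steps apply above the alarm threshold, scale-in steps below it.
    bound = "MetricIntervalLowerBound" if adjustment > 0 else "MetricIntervalUpperBound"
    return {
        **fleet_target(fleet_name),
        "PolicyName": policy_name,
        "PolicyType": "StepScaling",
        "StepScalingPolicyConfiguration": {
            "AdjustmentType": "ChangeInCapacity",
            "StepAdjustments": [{bound: 0.0, "ScalingAdjustment": adjustment}],
            "Cooldown": cooldown_seconds,
        },
    }


# Hypothetical fleet and policy names, for illustration only.
target = scalable_target_request("example-fleet", min_capacity=1, max_capacity=4)
scale_out = step_policy_request("example-fleet", "example-scale-out", adjustment=1)
scale_in = step_policy_request("example-fleet", "example-scale-in", adjustment=-1)
print(target)
print(scale_out)
print(scale_in)
```

Each step scaling policy still needs an Amazon CloudWatch alarm, for example on CapacityUtilization, to trigger it. That alarm is not shown here.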
**Scheduled policy**

An alternative cost optimization approach, suited to anticipated demand patterns such as a surge of users at 8am and a downturn at 5pm, is to set a static count of active instances for specific times or days. This strategy is advantageous when users log in consistently at set intervals throughout the day, as in training classes, call center shifts, or school computer labs. The "update-fleet" command in Amazon AppStream 2.0 makes it straightforward to define the precise number of active instances. Adjust the "Desired" value for your fleet's compute capacity, and the count of running instances will automatically synchronize with it.

**Target tracking scaling policy**

Target tracking scaling policies let you define a specific capacity utilization target for your fleet. A target tracking scaling policy can be configured via the AWS CLI. Application Auto Scaling creates and maintains the Amazon CloudWatch alarms that trigger the scaling policy.

**How can you view both live and historical data?**

Amazon AppStream 2.0 publishes metrics to Amazon CloudWatch to enable detailed tracking and deep analysis. These statistics are recorded for an extended period, so you can access historical information and gain a better understanding of how your fleets are performing. See Viewing Instance and Session Performance Metrics Using the Console.

The fleet dashboard provides a visual representation of workload patterns over a period of 4 hours, 1 day, 1 week, or 2 weeks. It uses the same data that is captured in Amazon CloudWatch, which also underpins the policies. In this example the policy is set to have a minimum of 1 instance, to scale out by 1 when utilisation exceeds 50%, and to have a maximum capacity of 4.

**Top tip - Turning image builders off is a good start!**

Image builder instances are charged per hour for Windows and per second for Linux. While they are in a pending, updating_agent, or running state, charges apply according to the instance type and size.

When you deploy an Amazon AppStream 2.0 image builder, you choose an instance family, which needs to match the fleet on which you wish to deploy your Amazon AppStream 2.0 instances.

App Block builder instances are charged per second. While they are in a pending, updating_agent, or running state, charges apply according to the instance type and size.

The App Block builder has only one instance family available, the general purpose instance family. Any charges for running either builder relate directly to the instance type you have chosen and the Region in which it is located.
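As a small illustration of the builder states listed above, the sketch below picks out builders that are currently incurring charges. It works on plain dictionaries shaped like the `ImageBuilders` entries returned by the AppStream 2.0 DescribeImageBuilders API (a `Name` and an upper-case `State`). The builder names in the sample data are hypothetical, and fetching the real list, for example with boto3, is assumed rather than shown.

```python
# States in which a builder is charged, as described above.
BILLABLE_STATES = {"PENDING", "UPDATING_AGENT", "RUNNING"}


def billable_builders(image_builders):
    """Return the names of builders whose state currently incurs charges."""
    return [
        builder["Name"]
        for builder in image_builders
        if builder.get("State") in BILLABLE_STATES
    ]


# Hypothetical sample data, for illustration only.
sample = [
    {"Name": "example-builder-a", "State": "RUNNING"},
    {"Name": "example-builder-b", "State": "STOPPED"},
    {"Name": "example-builder-c", "State": "UPDATING_AGENT"},
]

print(billable_builders(sample))
```

A scheduled job could run a check like this at the end of the working day and stop any builder that is still in the running state, for example with the StopImageBuilder API.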
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.