Sustainability at the core of data strategy
Part of a 3-part series taking a deeper look into building a sustainable data management practice. This post covers designing a data strategy that implements technology and deploys workloads with sustainability as a core principle.
With diverse use cases, tools, and evolving needs, an end-to-end data strategy is essential for adopting sustainable data management practices. Refer to the AWS modern data architecture page to find out how you can build such a strategy on AWS. By embedding sustainability into the core of your data strategy, you ensure that every data-related decision, from collection and storage to processing, consumption, and governance, aligns with sustainability principles. The data analytics lens of the AWS Well-Architected Framework will help you embed sustainability principles when designing new data management processes or optimizing existing ones.
The diagram below illustrates the five key principles for building sustainable data management practices. It is important to apply these strategies across the end-to-end data lifecycle, from data ingestion to data consumption.

Traditionally, workloads ran on large servers provisioned for peak traffic. Outside of those peaks, the servers sit underutilized while still consuming electricity. It is like running a large furnace to heat a small pot of soup. Cloud services provide a better solution with on-demand capacity provisioning through auto scaling: compute scales up automatically during peaks and scales down as demand subsides. Not only do you maximize utilization, but with a pay-as-you-go model you pay only for what you use.
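As a concrete illustration, here is a minimal sketch of attaching a target tracking scaling policy with boto3. The Auto Scaling group name, policy name, and target value are hypothetical; adjust them for your own workload:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Attach a target tracking policy to an existing Auto Scaling group.
# "data-processing-asg" is a placeholder group name.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="data-processing-asg",
    PolicyName="track-average-cpu",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        # Scale out when average CPU rises above 50% and back in below it,
        # so capacity follows demand instead of sitting idle at peak size.
        "TargetValue": 50.0,
    },
)
```

With target tracking, the group keeps average utilization near the target automatically, which is exactly the furnace-sized-to-the-pot behavior described above.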
Here are some recommendations for maximizing utilization and reducing waste:
- Use managed services where suitable: Managed services handle infrastructure provisioning and on-demand scaling so you don't have to manage it yourself. This removes operational overhead on your part, and because AWS takes on that responsibility with dedicated teams experienced in managing infrastructure, it is done efficiently (a minimal example follows this list).
- Use the correct type of compute: When using provisioned compute, qualify the right kind of compute for the workload; matching the compute type to the workload improves efficiency. For non-critical, interruption-tolerant jobs, use spare EC2 capacity in the data center through Amazon EC2 Spot Instances, which can also reduce the cost of your data processing jobs by up to 90% (see the Spot Instance sketch after this list). Use the recommendations in AWS Compute Optimizer for optimal resource configurations for EC2 instances, EBS volumes, ECS services, and more.
- Reduce storage footprint: Design processes with the principle of reducing the storage footprint. Create ingestion processes that filter out unnecessary data, compress data objects before you store them so they consume less space, and use the right format so downstream processes can consume objects efficiently. You can also automate clean-up mechanisms to remove duplicate or unused data (a compression and lifecycle sketch closes this list).
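For the managed services bullet, a minimal sketch of starting an AWS Glue job run with boto3. The job name is a hypothetical placeholder for an ETL job you have already defined; Glue provisions and scales the capacity for the run and releases it when the run finishes:

```python
import boto3

glue = boto3.client("glue")

# Start a run of a pre-defined serverless ETL job. Glue provisions the
# Spark capacity for this run and releases it when the job completes.
# "nightly-clickstream-etl" is a placeholder job name.
run = glue.start_job_run(JobName="nightly-clickstream-etl")
print("Started run:", run["JobRunId"])
```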
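For the compute bullet, a minimal sketch of launching an interruption-tolerant batch worker on Spot capacity; the AMI ID and instance type are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Launch a worker for an interruption-tolerant batch job on Spot capacity.
# The AMI ID is a placeholder; use an image appropriate for your job.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="m5.large",
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {
            # One-time request: if the instance is interrupted it terminates,
            # and your job scheduler can simply retry the work elsewhere.
            "SpotInstanceType": "one-time",
            "InstanceInterruptionBehavior": "terminate",
        },
    },
)
print("Launched:", response["Instances"][0]["InstanceId"])
```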
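For the storage footprint bullet, a sketch of compressing data into a columnar format before storing it, plus a lifecycle rule that expires stale objects automatically. The bucket name, prefixes, and retention period are assumptions, and Parquet support requires the pyarrow package:

```python
import boto3
import pandas as pd

# Write data as snappy-compressed Parquet: columnar, compact, and
# efficiently consumed by downstream analytics engines.
df = pd.DataFrame({"event_id": [1, 2, 3], "payload": ["a", "b", "c"]})
df.to_parquet("events.parquet", compression="snappy")

s3 = boto3.client("s3")
s3.upload_file("events.parquet", "my-data-lake-bucket", "raw/events.parquet")

# Automate clean-up: expire objects under a temporary prefix after 30 days
# so unused data does not accumulate. Bucket and prefix are placeholders.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-data-lake-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-temp-data",
                "Status": "Enabled",
                "Filter": {"Prefix": "tmp/"},
                "Expiration": {"Days": 30},
            }
        ]
    },
)
```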
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.