Persistent Storage for EKS: What Are the Options?
Published Jul 17, 2024
Amazon Elastic Kubernetes Service (EKS) is a managed service that simplifies the deployment, management, and scaling of containerized applications using Kubernetes on AWS. A critical aspect of running Kubernetes clusters on EKS is persistent storage — how Kubernetes stores data so it remains accessible even after specific pods or workloads have shut down.
EKS gives you access to several AWS storage services. We’ll discuss these services, their pros and cons, and their usefulness for production Kubernetes workloads.
By nature, Kubernetes pods are ephemeral, meaning they can be started and stopped at will. This is great for scaling and resilience but poses a problem when you need to store data beyond the life cycle of a single pod. That's where persistent storage comes into play.
In Kubernetes, persistent storage is achieved through Persistent Volumes (PVs) and Persistent Volume Claims (PVCs). A PV is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned through a StorageClass. It is a cluster resource, just as a node is. A PVC is a request for that storage: PVCs consume PV resources much as pods consume node resources.
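As a rough illustration of the claim mechanism, the following Python sketch builds a PVC manifest and a pod that mounts it, then prints both as JSON (kubectl accepts JSON as well as YAML). The names `data-claim` and `app`, the `nginx` image, the mount path, and the 10Gi size are placeholders chosen for illustration. Because no storage class is named, the claim would rely on whatever default StorageClass the cluster defines.

```python
import json


def make_pvc(name: str, size: str, access_mode: str = "ReadWriteOnce") -> dict:
    """Build a PersistentVolumeClaim manifest requesting `size` of storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},
        },
    }


def make_pod(name: str, claim_name: str, mount_path: str) -> dict:
    """Build a Pod manifest that mounts the volume bound to `claim_name`."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": "nginx",  # placeholder image (assumption)
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            # The pod refers to the claim by name, never to the PV directly.
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
        },
    }


if __name__ == "__main__":
    pvc = make_pvc("data-claim", "10Gi")
    pod = make_pod("app", pvc["metadata"]["name"], "/var/lib/app")
    manifest = {"apiVersion": "v1", "kind": "List", "items": [pvc, pod]}
    print(json.dumps(manifest, indent=2))
```

The output could be piped to `kubectl apply -f -`. If the pod is later deleted and recreated with the same claim name, it reattaches to the same data, which is the point of separating the claim from the pod.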
Amazon EBS provides block-level storage volumes that can be attached to a running Amazon EC2 instance hosting a Kubernetes node. Data is automatically replicated within an Availability Zone (AZ). However, an EBS volume is confined to a single AZ, which means it offers lower durability than other Amazon storage services like S3. EBS offers the low-latency performance needed by I/O-sensitive workloads that require persistent storage.
EBS is a good match for applications that need their own file system or that run a database. However, note that an EBS volume can be mounted on only one node at a time (the ReadWriteOnce access mode). This means that pods sharing the same EBS volume must all be scheduled on the same node.
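A minimal sketch of how an EBS-backed claim might be declared, assuming the Amazon EBS CSI driver (provisioner `ebs.csi.aws.com`) is installed in the cluster. The class name `ebs-gp3`, the claim name `db-data`, and the 20Gi size are hypothetical. The `WaitForFirstConsumer` binding mode delays volume creation until a pod is scheduled, so the volume is created in the same AZ as the node that will use it.

```python
import json

EBS_CSI_PROVISIONER = "ebs.csi.aws.com"


def ebs_storage_class(name: str, volume_type: str = "gp3") -> dict:
    """Build a StorageClass that provisions EBS volumes on demand."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": EBS_CSI_PROVISIONER,
        "parameters": {"type": volume_type},
        # Create the volume only once a consuming pod has been scheduled,
        # so it lands in that pod's Availability Zone.
        "volumeBindingMode": "WaitForFirstConsumer",
    }


def ebs_claim(name: str, storage_class: str, size: str) -> dict:
    """Build a PVC for an EBS volume; EBS is single-node, hence ReadWriteOnce."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    sc = ebs_storage_class("ebs-gp3")
    pvc = ebs_claim("db-data", sc["metadata"]["name"], "20Gi")
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [sc, pvc]}, indent=2))
```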
Amazon EFS is a file storage service for Amazon EC2 instances. EFS provides a simple, scalable, and fully managed elastic file system for use with AWS cloud services and on-premises resources. It is designed to provide massively parallel shared access for thousands of Amazon EC2 instances.
EFS is a good option for workloads that require multiple pods to read and write to the same storage volume simultaneously (the ReadWriteMany access mode), regardless of the node they're scheduled on. However, keep in mind that the performance of EFS can be lower than that of EBS, especially for workloads that require high IOPS (input/output operations per second).
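For comparison, here is a hedged sketch of a shared EFS-backed claim, assuming the Amazon EFS CSI driver (provisioner `efs.csi.aws.com`) is installed and an EFS file system already exists. The file system ID `fs-0123456789abcdef0` is a made-up placeholder, and the names `efs-shared` and `shared-content` are hypothetical. The `efs-ap` provisioning mode asks the driver to create an EFS access point per volume.

```python
import json

EFS_CSI_PROVISIONER = "efs.csi.aws.com"


def efs_storage_class(name: str, file_system_id: str) -> dict:
    """Build a StorageClass that provisions EFS access points on demand."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": EFS_CSI_PROVISIONER,
        "parameters": {
            "provisioningMode": "efs-ap",  # one access point per volume
            "fileSystemId": file_system_id,
            "directoryPerms": "700",
        },
    }


def shared_claim(name: str, storage_class: str, size: str = "5Gi") -> dict:
    """Build a ReadWriteMany PVC that pods on different nodes can share."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            # Kubernetes requires a size, but EFS is elastic and does not
            # enforce it as a capacity limit.
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    sc = efs_storage_class("efs-shared", "fs-0123456789abcdef0")  # placeholder ID
    pvc = shared_claim("shared-content", sc["metadata"]["name"])
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [sc, pvc]}, indent=2))
```

Any number of pods can then reference `shared-content` in their `volumes` section, whichever nodes they land on.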
Amazon FSx for NetApp ONTAP (FSx for ONTAP) provides high-performance, highly reliable file storage that is powered by the NetApp ONTAP storage operating system. FSx for ONTAP offers a variety of features, including data deduplication, compression, and thin provisioning, which can help reduce storage costs. Like EFS, it allows the same volume to be used by different nodes or pods simultaneously (the ReadWriteMany access mode).
FSx for ONTAP can be a good option for workloads that require advanced data management features, high performance, and improved availability. In addition, FSx for ONTAP can scale to multiple petabytes of data and is cost-effective for large deployments. However, FSx for ONTAP involves more operational overhead, as it lacks the full elasticity features of EFS.
When running mission-critical or production workloads, there are special considerations for database storage. Let’s take a look at the suitability of each storage option for demanding use cases.
Amazon EBS provides persistent block storage volumes for use with Amazon EC2 instances. Each EBS volume is automatically replicated within its Availability Zone (AZ) to offer fault tolerance, but is limited to a single AZ, which may not offer sufficient durability for production workloads. In addition, EBS can be difficult to scale for workloads with high data volumes.
EBS volumes can be easily attached to or detached from EC2 instances, allowing for storage scalability without impacting application availability. With provisioned IOPS (input/output operations per second), EBS can deliver high performance for I/O-intensive applications.
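One way provisioned performance might be expressed in Kubernetes is through StorageClass parameters. The sketch below assumes the Amazon EBS CSI driver and a `gp3` volume type. The class name `ebs-fast` and the figures of 6000 IOPS and 250 MiB/s are illustrative values, not recommendations. StorageClass parameter values must be strings, so the helper converts the numbers.

```python
import json


def provisioned_iops_class(name: str, iops: int, throughput_mibps: int) -> dict:
    """Build a gp3 StorageClass with explicitly provisioned IOPS and throughput."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "ebs.csi.aws.com",
        "parameters": {
            "type": "gp3",
            # Parameter values are strings in the StorageClass API.
            "iops": str(iops),
            "throughput": str(throughput_mibps),
            "encrypted": "true",
        },
        "volumeBindingMode": "WaitForFirstConsumer",
        # Lets a bound claim be grown later by raising its storage request.
        "allowVolumeExpansion": True,
    }


if __name__ == "__main__":
    print(json.dumps(provisioned_iops_class("ebs-fast", 6000, 250), indent=2))
```

With `allowVolumeExpansion` set, a claim created from this class can be resized by editing its storage request instead of being recreated.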
Amazon EFS is designed to provide high availability and durability, distributing data across multiple AZs automatically. It is especially useful for scenarios requiring shared access to file storage for applications running on multiple instances or containers, such as content management systems, web serving, and data analytics applications.
However, for mission-critical or high-performance workloads, EFS often does not provide the required performance. For applications that require low-latency access to storage, the inherent latency of a network file system like EFS may impact performance. In addition, EFS is not optimized for workloads that require high IOPS.
Amazon FSx for NetApp ONTAP (FSx for ONTAP) has several features specifically designed for production applications:
- Performance: FSx for ONTAP uses the ONTAP operating system to provide low-latency access and high IOPS, providing fast and consistent performance for production workloads.
- Data management: FSx for ONTAP offers features such as data replication, storage-efficient snapshots, and zero-capacity cloning, which simplify testing, backup, and disaster recovery processes.
- Cost-effectiveness: FSx for ONTAP enables significant cost savings through data reduction technologies such as deduplication and compression. These features reduce the amount of storage the data requires, which directly lowers storage costs.
Here are a few best practices that can help you effectively implement persistent storage in EKS, and their applicability to the three storage options discussed above.
Best Practice | Why It's Important | EBS | EFS | FSx for ONTAP |
---|---|---|---|---|
Leverage dynamic provisioning | Simplifies managing storage resources by creating storage volumes on-demand | Supported (highly scalable) | Supported (highly scalable) | |
Consider data protection | Ensures data safety with regular backups, data replication across zones for fault tolerance, and disaster recovery planning | Not included | Built-in | Built-in |
Right-size resources | Avoids over- or under-provisioning by analyzing storage needs based on application data usage and growth rates | Supported | Supported | Supported |
Use high availability | Enhances application resilience with deployment across multiple zones, data replication, and automatic failover for continuous accessibility | Limited to single AZ | Multi-AZ | Multi-AZ with advanced replication options |
Consider storage quotas | Prevents resource monopolization and ensures efficient utilization by setting limits on storage usage per Kubernetes namespace | Limited to 64 TB | Unlimited storage per pod | Unlimited storage per pod |
Monitor and adjust based on performance | Keeps storage performance and capacity optimal by regularly monitoring metrics and adjusting provisioning as needed | Supports high performance with provisioned IOPS | Does not support low latency use cases | High performance built in |
Implement security measures | Secures storage by implementing access controls, encryption in transit and at rest, and following AWS best practices | Supported | Supported | Supported |
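The storage quota practice in the table can be sketched with a Kubernetes ResourceQuota, which caps the total storage and the number of claims in a namespace. The namespace `team-a`, the quota name `storage-quota`, the class name `ebs-gp3`, and all limits below are hypothetical values for illustration.

```python
import json
from typing import Dict, Optional


def storage_quota(
    namespace: str,
    total_storage: str,
    max_claims: int,
    per_class: Optional[Dict[str, str]] = None,
) -> dict:
    """Build a ResourceQuota limiting PVC storage within one namespace."""
    hard = {
        # Sum of storage requests across all PVCs in the namespace.
        "requests.storage": total_storage,
        # Maximum number of PVCs in the namespace.
        "persistentvolumeclaims": str(max_claims),
    }
    # Optional tighter limits for individual storage classes.
    for storage_class, limit in (per_class or {}).items():
        key = f"{storage_class}.storageclass.storage.k8s.io/requests.storage"
        hard[key] = limit
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "storage-quota", "namespace": namespace},
        "spec": {"hard": hard},
    }


if __name__ == "__main__":
    quota = storage_quota("team-a", "500Gi", 20, per_class={"ebs-gp3": "200Gi"})
    print(json.dumps(quota, indent=2))
```

Once such a quota is applied, new claims in the namespace that would push usage past a limit are rejected at creation time.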
In conclusion, effectively managing persistent storage in Amazon EKS involves a strategic approach that incorporates selecting the right storage options, leveraging dynamic provisioning, ensuring data protection, right-sizing resources, and utilizing high availability features. By understanding and applying these best practices, organizations can optimize their Kubernetes deployments for performance, scalability, and cost-efficiency.
To learn more about persistent storage for EKS, visit our GitHub repository.