Persistent Storage for EKS: What Are the Options?

Published Jul 17, 2024
Amazon Elastic Kubernetes Service (EKS) is a managed service that simplifies the deployment, management, and scaling of containerized applications using Kubernetes on AWS. A critical aspect of running Kubernetes clusters on EKS is persistent storage: how Kubernetes stores data so that it remains accessible even after specific pods or workloads have shut down.
EKS gives you access to several AWS storage services. We’ll discuss these services, their pros and cons, and their usefulness for production Kubernetes workloads.

Introduction to Persistent Storage in Kubernetes

By nature, Kubernetes pods are ephemeral, meaning they can be started and stopped at will. This is great for scaling and resilience but poses a problem when you need to store data beyond the life cycle of a single pod. That's where persistent storage comes into play.
In Kubernetes, persistent storage is achieved through Persistent Volumes (PVs) and Persistent Volume Claims (PVCs). A PV is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned through a StorageClass. It is a cluster resource, just as a node is a cluster resource. A PVC is a request for that storage: it consumes PV resources in the same way that a pod consumes node resources.
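As a rough illustration of the PV/PVC relationship, the sketch below builds a PVC manifest and a pod that mounts it, using plain Python dicts serialized to JSON (which `kubectl apply -f` accepts alongside YAML). The names `app-data`, `ebs-gp3`, `demo-pod`, the `nginx` image, and the mount path are placeholders chosen for this example, not values from the article.

```python
import json


def make_pvc(name, storage_class, size, access_mode="ReadWriteOnce"):
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            # Assumed to name a StorageClass that already exists in the cluster.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def make_pod_with_claim(pod_name, claim_name, image, mount_path):
    """Build a Pod manifest that mounts the claim at mount_path."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [
                {
                    "name": "app",
                    "image": image,
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            # The pod refers to the claim by name, never to the PV directly.
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
        },
    }


if __name__ == "__main__":
    pvc = make_pvc("app-data", "ebs-gp3", "20Gi")
    pod = make_pod_with_claim("demo-pod", "app-data", "nginx", "/var/lib/app")
    print(json.dumps(pvc, indent=2))
    print(json.dumps(pod, indent=2))
```

Because the pod refers only to the claim, it can be deleted and recreated without losing the data held in the underlying volume.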

Persistent Storage Options for Amazon EKS

Amazon Elastic Block Store (EBS)

Amazon EBS provides block-level storage volumes that can be attached to a running Amazon EC2 instance hosting a Kubernetes node. Data is automatically replicated within an Availability Zone (AZ); however, an EBS volume is limited to a single AZ, which means it offers lower durability than other Amazon storage services like S3. EBS offers the low-latency performance needed by I/O-sensitive workloads that require persistent storage.
EBS is a good match for applications that require a file system or a database. However, note that an EBS volume can only be mounted on one node at a time (ReadWriteOnce). This means that pods that use the same EBS volume can't be scheduled on different nodes.
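The single-AZ and single-node constraints show up directly in how an EBS-backed StorageClass is usually written. The following is a minimal sketch that assumes the Amazon EBS CSI driver is installed in the cluster; the class name `ebs-gp3` is a placeholder for this example.

```python
import json


def ebs_storage_class(name, volume_type="gp3"):
    """Build a StorageClass manifest for the Amazon EBS CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "ebs.csi.aws.com",
        # Delay volume creation until a pod is scheduled, so the volume is
        # created in the same AZ as the node that will attach it.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowVolumeExpansion": True,
        "reclaimPolicy": "Delete",
        # StorageClass parameters are always strings.
        "parameters": {"type": volume_type, "encrypted": "true"},
    }


if __name__ == "__main__":
    print(json.dumps(ebs_storage_class("ebs-gp3"), indent=2))
```

A PVC that references this class would request the `ReadWriteOnce` access mode, matching the one-node-at-a-time limitation described above.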

Amazon Elastic File System (EFS)

Amazon EFS is a file storage service for Amazon EC2 instances. EFS provides a simple, scalable, and fully managed elastic file system for use with AWS cloud services and on-premises resources. It is designed to provide massively parallel shared access from thousands of Amazon EC2 instances.
EFS is a good option for workloads that require multiple pods to read and write to the same storage volume simultaneously (ReadWriteMany), regardless of the node they're scheduled on. However, keep in mind that the performance of EFS can be lower than that of EBS, especially for workloads that require high IOPS (input/output operations per second).
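A shared EFS volume could be requested along the following lines. This sketch assumes the Amazon EFS CSI driver is installed and that a file system already exists; the file system ID `fs-0123456789abcdef0` and the resource names are made-up placeholders.

```python
import json


def efs_storage_class(name, file_system_id):
    """Build a StorageClass that provisions EFS access points dynamically."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "efs.csi.aws.com",
        "parameters": {
            "provisioningMode": "efs-ap",  # one access point per volume
            "fileSystemId": file_system_id,
            "directoryPerms": "700",
        },
    }


def shared_pvc(name, storage_class, size="5Gi"):
    """Build a ReadWriteMany claim that pods on different nodes can share."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            # Kubernetes requires a size here; EFS itself grows elastically.
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    sc = efs_storage_class("efs-shared", "fs-0123456789abcdef0")
    pvc = shared_pvc("shared-content", "efs-shared")
    print(json.dumps([sc, pvc], indent=2))
```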

Amazon FSx for NetApp ONTAP (FSx for ONTAP)

Amazon FSx for NetApp ONTAP (FSx for ONTAP) provides high-performance, highly reliable file storage powered by the NetApp ONTAP storage operating system. FSx for ONTAP offers a variety of features, including data deduplication, compression, and thin provisioning, which can help reduce storage costs. Like EFS, it allows the same volume to be used by different nodes or pods simultaneously (ReadWriteMany).
FSx for ONTAP can be a good option for workloads that require advanced data management features, high performance, and improved availability. In addition, FSx for ONTAP can scale to multiple petabytes of data and is cost-effective for large deployments. However, FSx for ONTAP adds some storage operations overhead, because it lacks the full elasticity of EFS.
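FSx for ONTAP is commonly exposed to Kubernetes through NetApp's Trident CSI driver, which the article does not cover. The sketch below assumes Trident is installed and that a backend pointing at the FSx for ONTAP file system has already been configured; the class name `fsx-ontap-nas` is a placeholder.

```python
import json


def fsx_ontap_storage_class(name, backend_type="ontap-nas"):
    """Build a StorageClass for volumes served by a Trident ONTAP backend."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Assumption: the Trident CSI driver is the provisioner in use.
        "provisioner": "csi.trident.netapp.io",
        "parameters": {"backendType": backend_type},
        "allowVolumeExpansion": True,
    }


if __name__ == "__main__":
    print(json.dumps(fsx_ontap_storage_class("fsx-ontap-nas"), indent=2))
```

With a NAS backend such as this one, claims against the class can use the `ReadWriteMany` access mode, in the same way as the EFS case.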

Which Storage Option Is Suitable for EKS Production Workloads?

When running mission-critical or production workloads, there are special considerations for database storage. Let’s take a look at the suitability of each storage option for demanding use cases.

EBS for Production Workloads

Amazon EBS provides persistent block storage volumes for use with Amazon EC2 instances. Each EBS volume is automatically replicated within its Availability Zone (AZ) to offer fault tolerance, but it is confined to that single AZ, which may not offer sufficient durability for production workloads. In addition, EBS can be difficult to scale for workloads with high data volumes.
EBS volumes can be easily attached to or detached from EC2 instances, which allows storage to be added without impacting application availability. With provisioned IOPS, EBS can deliver high performance for I/O-intensive applications.
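With the EBS CSI driver, provisioned performance is expressed as StorageClass parameters. The helper below is a hypothetical sketch (the function name and the validation rules are choices made for this example) that builds the `parameters` map, relying on the driver's `type`, `iops`, and `throughput` keys. The figures 6000 IOPS and 250 MiB/s are illustrative only.

```python
def ebs_performance_parameters(volume_type, iops=None, throughput_mibps=None):
    """Build EBS CSI StorageClass parameters for provisioned performance."""
    if iops is not None and volume_type not in ("gp3", "io1", "io2"):
        raise ValueError("iops can only be set for gp3, io1, or io2 volumes")
    if throughput_mibps is not None and volume_type != "gp3":
        raise ValueError("throughput can only be set for gp3 volumes")

    params = {"type": volume_type}
    # StorageClass parameter values must be strings, so numbers are converted.
    if iops is not None:
        params["iops"] = str(iops)
    if throughput_mibps is not None:
        params["throughput"] = str(throughput_mibps)
    return params


if __name__ == "__main__":
    print(ebs_performance_parameters("gp3", iops=6000, throughput_mibps=250))
```

The resulting map would be placed under the `parameters` field of a StorageClass whose provisioner is `ebs.csi.aws.com`.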

EFS for Production Workloads

Amazon EFS is designed to provide high availability and durability, distributing data across multiple AZs automatically. It is especially useful for scenarios requiring shared access to file storage for applications running on multiple instances or containers, such as content management systems, web serving, and data analytics applications. 
However, for mission-critical or high-performance workloads, EFS often does not provide the required performance. For applications that require low-latency access to storage, the inherent latency of a network file system like EFS may impact performance. In addition, EFS is not optimized for workloads that require high IOPS.

FSx for ONTAP for Production Workloads

Amazon FSx for NetApp ONTAP (FSx for ONTAP) has several features specifically designed for production applications:
  • Performance: FSx for ONTAP uses the ONTAP operating system to deliver low-latency access and high IOPS, giving production workloads fast and consistent performance.
  • Data management: FSx for ONTAP offers features like data replication, storage-efficient snapshots, and zero-capacity cloning, which simplify testing, backup, and disaster recovery processes.
  • Cost-effectiveness: FSx for ONTAP enables significant cost savings through data reduction technologies like deduplication and compression. These features reduce the amount of storage required for data, which directly lowers storage costs.

7 Best Practices for Implementing Persistent Storage in EKS

Here are a few best practices that can help you effectively implement persistent storage in EKS, and their applicability to the three storage options discussed above.
Best Practice | Why It's Important | EBS | EFS | FSx for ONTAP
Leverage dynamic provisioning | Simplifies managing storage resources by creating storage volumes on demand |  | Supported (highly scalable) | Supported (highly scalable)
Consider data protection | Ensures data safety with regular backups, data replication across zones for fault tolerance, and disaster recovery planning | Not included | Built-in | Built-in
Right-size resources | Avoids over- or under-provisioning by analyzing storage needs based on application data usage and growth rates | Supported | Supported | Supported
Use high availability | Enhances application resilience with deployment across multiple zones, data replication, and automatic failover for continuous accessibility | Limited to single AZ | Multi-AZ | Multi-AZ with advanced replication options
Consider storage quotas | Prevents resource monopolization and ensures efficient utilization by setting limits on storage usage per Kubernetes namespace | Limited to 64 TB | Unlimited storage per pod | Unlimited storage per pod
Monitor and adjust based on performance | Keeps storage performance and capacity optimal by regularly monitoring metrics and adjusting provisioning as needed | Supports high performance with provisioned IOPS | Does not support low latency use cases | High performance built in
Implement security measures | Secures storage by implementing access controls, encryption in transit and at rest, and following AWS best practices | Supported | Supported | Supported
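The storage quota practice maps onto the Kubernetes ResourceQuota object. As a minimal sketch, the function below builds a quota that caps total requested storage, the number of claims, and the storage requested from one particular StorageClass in a namespace. The namespace `team-a`, the class name `ebs-gp3`, and all of the limits are placeholders for this example.

```python
import json


def storage_quota(namespace, total, max_claims, storage_class, class_total):
    """Build a ResourceQuota manifest limiting PVC usage in a namespace."""
    per_class_key = f"{storage_class}.storageclass.storage.k8s.io/requests.storage"
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "storage-quota", "namespace": namespace},
        "spec": {
            "hard": {
                # Sum of storage requested by all PVCs in the namespace.
                "requests.storage": total,
                # Maximum number of PVCs in the namespace.
                "persistentvolumeclaims": str(max_claims),
                # Sum of storage requested from this one StorageClass.
                per_class_key: class_total,
            }
        },
    }


if __name__ == "__main__":
    quota = storage_quota("team-a", "500Gi", 20, "ebs-gp3", "200Gi")
    print(json.dumps(quota, indent=2))
```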
In conclusion, effectively managing persistent storage in Amazon EKS involves a strategic approach that incorporates selecting the right storage options, leveraging dynamic provisioning, ensuring data protection, right-sizing resources, and utilizing high availability features. By understanding and applying these best practices, organizations can optimize their Kubernetes deployments for performance, scalability, and cost-efficiency.
To learn more about persistent storage for EKS, visit our GitHub repository.
 
