Designing Scalable and Versatile Storage Solutions on Amazon EKS with the Amazon EFS CSI Driver
Configure persistent shared storage for container workloads on Amazon EKS with the Amazon EFS CSI Driver.
AWS Admin
Amazon Employee
Published Sep 29, 2023
Last Modified Jun 21, 2024
Step 1: Set Environment Variables
Step 2: Verify or Create the IAM Role for Service Account
Step 3: Verify or Install the EFS CSI Driver Add-on
Step 4: Create an EFS File System
Step 5: Configure Mount Targets for the EFS File System
Step 6: Set Up a Storage Class for the Sample Application
Managing storage solutions for containerized applications requires careful planning and execution. Kubernetes workloads such as content management systems and video transcoding may benefit from using Amazon Elastic File System (EFS). EFS is designed to provide serverless, fully elastic file storage that lets you share file data without provisioning or managing storage capacity and performance. EFS is ideal for applications that need shared storage across multiple nodes or even across different Availability Zones. In contrast, Amazon Elastic Block Store (EBS) requires configuring volumes that are limited in size and confined to a single Availability Zone. The Amazon Elastic File System (EFS) CSI Driver add-on exposes EFS File Systems to your workloads, handling the complexities of this versatile and scalable storage solution.
In this tutorial, you will set up the Amazon EFS CSI Driver on your Amazon EKS cluster and configure persistent shared storage for container workloads. More specifically, you will install the EFS CSI Driver as an Amazon EKS add-on. Amazon EKS add-ons automate installation and management of a curated set of add-ons for EKS clusters. You'll configure the EFS CSI Driver to handle both lightweight applications, akin to microservices, and more substantial systems, comparable to databases or user authentication systems, achieving seamless storage management.
| About | |
|---|---|
| ✅ AWS experience | 200 - Intermediate |
| ⏱ Time to complete | 30 minutes |
| 🧩 Prerequisites | [AWS Account](https://aws.amazon.com/resources/create-account/) |
| 📢 Feedback | Any feedback, issues, or just a 👍 / 👎 ? |
| ⏰ Last Updated | 2023-09-29 |
Before you begin this tutorial, you need to:
- Install the latest version of kubectl. To check your version, run: `kubectl version --short`.
- Install the latest version of eksctl. To check your version, run: `eksctl info`.
This tutorial is the second installment in a series on optimizing stateful applications in Amazon EKS, focusing on the Amazon EFS CSI Driver for managing complex containerized applications that require persistent, shared storage across Availability Zones. This tutorial not only guides you through the configuration of the EFS CSI Driver in the "kube-system" namespace but also delves into the intricacies of storage management within the cluster. It covers the following components:
- Authentication: Leverage the pre-configured IAM Role for the Amazon EFS CSI Driver, integrated with the OpenID Connect (OIDC) endpoint, to ensure secure communication between Kubernetes pods and AWS services.
- EFS CSI Driver Setup: Deploy the Amazon EFS CSI Driver within the Amazon EKS cluster, focusing on Custom Resource Definitions (CRDs) and the installation of the driver itself.
- Sample Application Deployment: Build and deploy a stateful application that writes the current date to a shared EFS volume, defining a Persistent Volume Claim (PVC) and a pod based on the CentOS image. Use the PVC's 'accessMode' and 'storageClassName' fields to instruct the EFS CSI Driver on how to handle storage requests. For Dynamic Provisioning, the 'storageClassName' field automates the creation of Persistent Volumes (PVs).
- EFS Access Points and Security: Explore how the Amazon EFS CSI Driver leverages Amazon EFS access points as application-specific entryways into a file system. Access points enforce both a user identity and a root directory for every connection. Understand the role of port 2049 for NFS traffic and how to configure security groups to allow or restrict access based on CIDR ranges: by permitting traffic on this port, you enable the instances or pods within your designated CIDR range to interact with the shared file system, while retaining the flexibility to further restrict access to specific subnets.
- Mount Targets: Delve into the creation of mount targets for the EFS File System, which serve as crucial connection points within the VPC for EC2 instances. Note that an EFS File System allows only one mount target per Availability Zone, regardless of how many subnets that zone contains. If you are using an EKS cluster with nodes in private subnets, we recommend creating the mount targets for those specific subnets.
Note: If you are still in your initial 12-month period, you can get started with EFS for free by receiving 5 GB of EFS storage in the EFS Standard storage class.
Before interacting with your Amazon EKS cluster using command-line tools, it's essential to define specific environment variables that encapsulate your cluster's details. These variables will be used in subsequent commands, ensuring that they target the correct cluster and resources.
- First, confirm that you are operating within the correct cluster context. This ensures that any subsequent commands are sent to the intended Kubernetes cluster. You can verify the current context by executing the following command:
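For example:

```bash
# Print the kubectl context currently in use.
kubectl config current-context
```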
- Define the `CLUSTER_NAME` environment variable for your EKS cluster. Replace the sample value for cluster name.
- Define the `CLUSTER_REGION` environment variable for your EKS cluster. Replace the sample value for cluster region.
- Define the `CLUSTER_VPC` environment variable for your EKS cluster. A sketch of all three exports follows below.
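A minimal sketch of these exports, assuming hypothetical sample values for the cluster name and region (replace them with your own):

```bash
# Hypothetical sample values; replace with your own cluster details.
export CLUSTER_NAME=my-eks-cluster
export CLUSTER_REGION=us-east-2

# Look up the ID of the VPC associated with the cluster.
export CLUSTER_VPC=$(aws eks describe-cluster \
  --name $CLUSTER_NAME \
  --region $CLUSTER_REGION \
  --query "cluster.resourcesVpcConfig.vpcId" \
  --output text)
```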
Now we will verify that the necessary IAM roles for service accounts are configured in your Amazon EKS cluster. These IAM roles are essential for enabling AWS services to interact seamlessly with Kubernetes, allowing you to leverage AWS capabilities within your pods.
Make sure the required service accounts for this tutorial are correctly set up in your cluster.
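One way to check is to look the service account up directly:

```bash
# The EFS CSI controller's service account lives in the kube-system namespace.
kubectl get serviceaccount efs-csi-controller-sa -n kube-system -o yaml
```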
The expected output should look like this:
Optionally, if you do not already have the "efs-csi-controller-sa" service account set up, or you receive an error, the following commands will create the service account. Note that you must have an OpenID Connect (OIDC) endpoint associated with your cluster before you run these commands. Define the `ROLE_NAME` environment variable:
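For example, using the role name referenced later in this tutorial:

```bash
export ROLE_NAME=AmazonEKS_EFS_CSI_DriverRole
```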
Run the following command to create the IAM role, Kubernetes service account, and attach the AWS managed policy to the role:
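A sketch of the command, attaching the AWS managed `AmazonEFSCSIDriverPolicy` to the role:

```bash
eksctl create iamserviceaccount \
  --name efs-csi-controller-sa \
  --namespace kube-system \
  --cluster $CLUSTER_NAME \
  --region $CLUSTER_REGION \
  --role-name $ROLE_NAME \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEFSCSIDriverPolicy \
  --approve
```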
It takes a few minutes for the operation to complete. The expected output should look like this:
Run the following command to add the Kubernetes service account name to the trust policy for the IAM role:
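One way to do this, following the pattern in the AWS documentation of widening the trust condition to match all `efs-csi-*` service accounts:

```bash
# Fetch the current trust policy and relax the service account condition.
TRUST_POLICY=$(aws iam get-role --role-name $ROLE_NAME \
  --query 'Role.AssumeRolePolicyDocument' --output json | \
  sed -e 's/efs-csi-controller-sa/efs-csi-*/' -e 's/StringEquals/StringLike/')
```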
Update the IAM role:
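Apply the modified trust policy document from the previous step:

```bash
aws iam update-assume-role-policy --role-name $ROLE_NAME \
  --policy-document "$TRUST_POLICY"
```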
Here, we will verify that the EFS CSI Driver managed add-on is properly installed and active on your Amazon EKS cluster. The EFS CSI Driver is crucial for enabling Amazon EFS to work seamlessly with Kubernetes, allowing you to mount EFS File Systems as persistent volumes for your container workloads.
Check whether the EFS CSI driver add-on is installed on your cluster:
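For example:

```bash
eksctl get addon --name aws-efs-csi-driver --cluster $CLUSTER_NAME --region $CLUSTER_REGION
```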
The expected output should look like this:
Optionally, if the EFS CSI Driver add-on is not installed on your cluster, or you receive an error, the following commands show you the steps to install it. Note that you must have an OpenID Connect (OIDC) endpoint associated with your cluster before you run these commands.
List the add-ons available in eksctl, replacing the sample value for `kubernetes-version`:
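A sketch of the listing command, assuming Kubernetes v1.27 as the sample value:

```bash
eksctl utils describe-addon-versions --kubernetes-version 1.27 | grep AddonName
```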
Set an environment variable for the “aws-efs-csi-driver” add-on:
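For example, using a hypothetical `ADDON_NAME` variable:

```bash
export ADDON_NAME=aws-efs-csi-driver
```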
List the versions of add-ons available for Kubernetes v1.27:
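Reusing the variable from the previous step:

```bash
eksctl utils describe-addon-versions --kubernetes-version 1.27 \
  --name $ADDON_NAME | grep AddonVersion
```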
The expected output should look like this:
Retrieve the `Arn` for the "AmazonEKS_EFS_CSI_DriverRole" we created in the previous steps:
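One way to retrieve it, storing the result in a hypothetical `ROLE_ARN` variable for the next step:

```bash
export ROLE_ARN=$(aws iam get-role --role-name AmazonEKS_EFS_CSI_DriverRole \
  --query 'Role.Arn' --output text)
echo $ROLE_ARN
```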
Run the following command to install the EFS CSI Driver add-on, replacing the `service-account-role-arn` with the ARN from the previous step:
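A sketch of the installation command, assuming the `ROLE_ARN` variable from the previous step:

```bash
eksctl create addon \
  --name aws-efs-csi-driver \
  --cluster $CLUSTER_NAME \
  --region $CLUSTER_REGION \
  --service-account-role-arn $ROLE_ARN \
  --force
```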
In this section, you will create an EFS File System and create a security group. The security group permits ingress from the CIDR for your cluster’s VPC to the EFS service. The security group is further restricted to port 2049, the standard port for NFS.
- Retrieve the CIDR range for your cluster's VPC and store it in an environment variable:
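For example:

```bash
export CIDR_RANGE=$(aws ec2 describe-vpcs \
  --vpc-ids $CLUSTER_VPC \
  --query "Vpcs[].CidrBlock" \
  --output text \
  --region $CLUSTER_REGION)
```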
- Create a security group for your Amazon EFS mount points:
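A sketch, using a hypothetical group name:

```bash
export SECURITY_GROUP_ID=$(aws ec2 create-security-group \
  --group-name MyEfsSecurityGroup \
  --description "EFS security group for the EKS cluster" \
  --vpc-id $CLUSTER_VPC \
  --region $CLUSTER_REGION \
  --output text)
```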
- Create an inbound rule that allows inbound NFS traffic from the CIDR for your cluster's VPC.
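For example, allowing NFS traffic on port 2049 from the VPC CIDR captured above:

```bash
aws ec2 authorize-security-group-ingress \
  --group-id $SECURITY_GROUP_ID \
  --protocol tcp \
  --port 2049 \
  --cidr $CIDR_RANGE \
  --region $CLUSTER_REGION
```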
The expected output should look like this:
Now, we’ll create an Amazon EFS File System for our Amazon EKS cluster.
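One plausible form, capturing the new file system ID in an environment variable for later steps:

```bash
export FILE_SYSTEM_ID=$(aws efs create-file-system \
  --region $CLUSTER_REGION \
  --performance-mode generalPurpose \
  --query 'FileSystemId' \
  --output text)
```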
EFS allows only one mount target per Availability Zone, so you'll need to place each mount target on an appropriate subnet. If your worker nodes are on a private subnet, you should create the mount target on that subnet. We will now create mount targets for the EFS File System; a mount target is an IP address on a VPC subnet that accepts NFS traffic, and the Kubernetes nodes will open NFS connections to the IP address of the mount target.
- Determine the IP address of your cluster nodes:
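For example:

```bash
# The INTERNAL-IP column shows each node's IP address.
kubectl get nodes -o wide
```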
The expected output should look like this:
- Determine the IDs of the subnets in your VPC and which Availability Zone the subnet is in.
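For example:

```bash
aws ec2 describe-subnets \
  --filters "Name=vpc-id,Values=$CLUSTER_VPC" \
  --query 'Subnets[*].{SubnetId: SubnetId, AvailabilityZone: AvailabilityZone, CidrBlock: CidrBlock}' \
  --output table \
  --region $CLUSTER_REGION
```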
The expected output should look like this:
- Add Mount Targets for the Subnets Hosting Your Nodes: Run the following command to create the mount target, specifying each subnet. For example, if the cluster has a node with an IP address of 192.168.56.0, and this address falls within the CidrBlock of the subnet ID subnet-EXAMPLEe2ba886490, create a mount target for this specific subnet. Repeat this process for each subnet in every Availability Zone where you have a node, using the appropriate subnet ID.
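A sketch for the example subnet above:

```bash
aws efs create-mount-target \
  --file-system-id $FILE_SYSTEM_ID \
  --subnet-id subnet-EXAMPLEe2ba886490 \
  --security-groups $SECURITY_GROUP_ID \
  --region $CLUSTER_REGION
```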
In Kubernetes, there are two ways to provision storage for container applications: Static Provisioning and Dynamic Provisioning. In Static Provisioning, a cluster administrator manually creates Persistent Volumes (PVs) that specify the available storage for the cluster's users. These PVs are part of the Kubernetes API and can be easily used. On the other hand, Dynamic Provisioning automates the creation of PVs: Kubernetes uses Storage Classes to generate PVs automatically when a pod requests storage through PersistentVolumeClaims. This method simplifies the provisioning process and adjusts to the specific needs of the application. To learn more, refer to PersistentVolumes in the Kubernetes documentation. For our lab, we will use the Dynamic Provisioning method.
- First, download the Storage Class manifest:
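Assuming the dynamic provisioning example from the upstream aws-efs-csi-driver repository:

```bash
curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-efs-csi-driver/master/examples/kubernetes/dynamic_provisioning/specs/storageclass.yaml
```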
- Verify that the `$FILE_SYSTEM_ID` environment variable is configured:
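For example:

```bash
echo $FILE_SYSTEM_ID
```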
- Update the FileSystem identifier in the `storageclass.yaml` manifest:
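A sketch, assuming the manifest still contains the upstream placeholder ID `fs-92107410`:

```bash
sed -i "s/fs-92107410/$FILE_SYSTEM_ID/g" storageclass.yaml
```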
- Optionally, if you're running this on macOS, run the following command to update the manifest instead:
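BSD sed on macOS requires an explicit (here empty) backup suffix:

```bash
sed -i '' "s/fs-92107410/$FILE_SYSTEM_ID/g" storageclass.yaml
```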
- Deploy the Storage Class:
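For example:

```bash
kubectl apply -f storageclass.yaml
```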
The expected output should look like this:
Let's deploy a sample application that writes the current date to a shared location on the EFS Shared Volume. The deployment process begins by downloading a manifest file containing the specifications for a Persistent Volume Claim (PVC) and a pod based on the CentOS image.
- First, download the PersistentVolumeClaim manifest:
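Assuming the matching pod manifest from the same upstream example:

```bash
curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-efs-csi-driver/master/examples/kubernetes/dynamic_provisioning/specs/pod.yaml
```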
Note: This manifest includes our sample pod that utilizes the CentOS image. It incorporates specific parameters to record the current date and store it in the /data/out directory on the EFS shared volume. Furthermore, the manifest includes a PVC (Persistent Volume Claim) object that requests 5Gi of storage from the Storage Class we established in the preceding step.
- Now, we will deploy our sample application that will write "Current Date" to a shared location:
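For example:

```bash
kubectl apply -f pod.yaml
```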
- Run the following command to ensure the sample application was deployed successfully and is running as expected:
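For example:

```bash
kubectl get pods
```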
The expected output should look like this:
Note: It will take a couple of minutes for the pod to transition to the `Running` state. If the pod remains in a `ContainerCreating` state, make sure that you have included a mount target for the subnet where your node is located (as shown in Step 5). Without it, the pod will remain stuck in the `ContainerCreating` state.
- Verify that the data is being written to the /data/out location within the shared EFS volume. Use the following command to verify:
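Assuming the pod is named `efs-app`, as in the upstream example manifest:

```bash
kubectl exec efs-app -- bash -c "cat /data/out"
```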
The expected output should look like this:
Optionally, terminate the node hosting your pod and wait for the pod to be rescheduled. Alternatively, you can delete the pod and redeploy it. Then repeat the verification step, making sure the output still contains the earlier entries; this confirms the data persisted across the pod's lifecycle.
To avoid incurring future charges, you should delete the resources created during this tutorial.
With the completion of this tutorial, you have successfully configured Amazon EFS for persistent storage for your EKS-based container workloads. The sample pod, leveraging the CentOS image, has been configured to capture the current date and store it in the `/data/out` directory on the EFS shared volume. To align with best practices, we recommend running your container workloads on private subnets, exposing only ingress controllers in the public subnets. Furthermore, make sure that EFS mount targets are established in all Availability Zones where your EKS cluster resides; otherwise, you may encounter a 'Failed to Resolve' error. To troubleshoot EFS File System mounting issues, refer to the detailed troubleshooting instructions. To continue your journey, you're now ready to store your stateful workloads, like batch processes or machine learning training data, on your EFS volume.

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.