Let's look at Storage and DB!

A writeup summarizing the third session of the BeSA batch 5.

Published Apr 26, 2024
Last Modified May 4, 2024
Today's session starts with the customer looking at what storage AWS is able to offer them. AWS has three types of storage offerings:
  1. Block Storage: Basically think of your hard disk. You can update just the relevant block you need without having to update/overwrite the entire storage unit. It doesn't store any metadata about the data unit. e.g.: EBS, Instance Store
  2. File Storage: A network-connected file server that can be attached/mounted to various clients/EC2 instances. e.g.: EFS
  3. Object Storage: Think of Google Drive or Dropbox. It stores the object as well as its metadata. When the object needs to be updated, you end up storing a new version of the whole object. e.g.: S3
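The difference in update semantics is easier to see in code. Here is a toy Python sketch of my own (not an AWS API): block storage patches bytes in place at an offset, while object storage replaces the whole object (plus metadata) on every update.

```python
# Toy "disk" of 4 blocks, 4 bytes each (block storage model)
BLOCK_SIZE = 4
disk = bytearray(b"AAAABBBBCCCCDDDD")

def write_block(disk, index, data):
    """Block storage: overwrite just one block; the rest stays untouched."""
    assert len(data) == BLOCK_SIZE
    disk[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE] = data

write_block(disk, 1, b"XXXX")
print(bytes(disk))  # only block 1 changed

# Toy object store: key -> object + metadata (object storage model)
object_store = {}

def put_object(key, data, metadata):
    """Object storage: the object and its metadata are stored as one unit."""
    object_store[key] = {"data": data, "metadata": metadata}

put_object("report.txt", b"v1 of the report", {"content-type": "text/plain"})
# An "update" replaces the entire object, not a byte range inside it:
put_object("report.txt", b"v2 of the report", {"content-type": "text/plain"})
print(object_store["report.txt"]["data"])
```

This is also why object stores like S3 pair naturally with versioning: each update is a complete new copy anyway.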
Let's understand better what Direct Attached Storage (DAS), Network Attached Storage (NAS), and Storage Area Network (SAN) are before continuing on to the AWS storage services.
Elastic Block Store (EBS): Every time you connect to your EC2 instance, it could be different hardware in the background hosting your EC2. So in order to have data that persists across those changes, we use EBS volumes to store it. They are available in different sizes and types (General Purpose, Provisioned IOPS, and Throughput Optimized). EBS can be used as the root/boot volume.
EFS can be used when multiple servers need to be connected to the same data source that stores files for us. While EBS can be multi-attached too, there are limitations, so EFS is preferred for this. EFS is accessed over the NFS protocol. For Windows machines, AWS offers FSx with Active Directory integration. Since both of these are fully managed, AWS ensures the durability (retention) and availability.
During our discussion the customer mentions the need to store a large (unsure of how big it can grow) amount of unstructured data. AWS S3 would be a good option for the customer to store this data. Unlike the tape drives the customer is currently using, retrieval would be easier and faster with S3. S3 is a fully managed AWS service which provides eleven 9's of durability. S3 also provides different storage classes, from the cheapest S3 Glacier Deep Archive (good for archiving; retrieval can take up to 48 hours) to S3 Standard, which provides instantaneous access. Since the S3 namespace is global, bucket names must be globally unique. The data on S3 is stored within a region by default (important for GDPR), unless the customer specifically chooses multi-region storage. S3 buckets and the objects in them can be accessed via their URL, hence the unique naming requirement.
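"Eleven 9's of durability" (99.999999999%) is easier to appreciate with a quick back-of-the-envelope calculation. This is my own illustration, not from the session; the ten-million-object figure is an arbitrary example:

```python
# What eleven 9's of annual durability implies in practice
durability = 0.99999999999          # 99.999999999%
annual_loss_rate = 1 - durability   # chance a given object is lost in a year

objects_stored = 10_000_000         # say, ten million objects
expected_losses_per_year = objects_stored * annual_loss_rate
years_per_single_loss = 1 / expected_losses_per_year

print(f"Expected losses per year: {expected_losses_per_year:.4f}")
print(f"Roughly one object lost every {years_per_single_loss:,.0f} years")
```

In other words: store ten million objects and you'd statistically expect to lose about one object every ten thousand years.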
The customer has a large volume of data which they need to move from their datacenter to an AWS DC, and doing it over the internet would take way too long. In this case the AWS Snow family can help: Snowcone (8 TB) and Snowball (50/80 TB) can be used to physically move data over to the AWS DC.
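To see why shipping a device beats the wire, here is a rough calculation of my own (the 100 Mbps link speed is an assumed figure, not from the session):

```python
# How long would 80 TB (a full Snowball) take over a 100 Mbps uplink?
data_tb = 80
data_bits = data_tb * 1e12 * 8    # decimal terabytes -> bits
link_bps = 100e6                  # assumed 100 Mbps link, fully saturated

seconds = data_bits / link_bps
days = seconds / 86400
print(f"{data_tb} TB over 100 Mbps takes about {days:.0f} days")
```

That's over two months of continuous, perfect transfer, ignoring protocol overhead and retries, versus roughly a week of shipping time for a Snowball.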
For the database needed by the customer, AWS has options for both relational (for structured data) and non-relational (for unstructured data) databases. RDS is a fully managed database service provided by AWS for structured data, and DynamoDB for unstructured data.
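A small local sketch of the structured-vs-flexible distinction, using Python's built-in sqlite3 as a stand-in for a relational database and a plain dict for a DynamoDB-style key-value table (my own illustration; these are not the AWS APIs):

```python
import sqlite3

# Relational (RDS-style): a fixed schema, enforced up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Asha', 'Pune')")
row = conn.execute("SELECT name, city FROM customers WHERE id = 1").fetchone()
print(row)

# Non-relational (DynamoDB-style): items looked up by a partition key,
# and each item can carry a different set of attributes with no schema change.
items = {
    "user#1": {"name": "Asha", "city": "Pune"},
    "user#2": {"name": "Ravi", "devices": ["phone", "laptop"]},  # different shape
}
print(items["user#2"]["devices"])
```

The trade-off the session points at: the relational side gives you enforced structure and SQL queries; the key-value side gives you schema flexibility and simple key lookups.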
Watch the session recording here for the great analogy by Ashish Prajapati to understand Durability and Availability much better.
Disclaimer/Clarification: These are just personal notes I have created summarizing the session I attended. All credit and thanks to the speakers and organizers.
BeSA is a volunteer-run attempt to teach the skills needed to become a Solutions Architect. Watch it live here. Sign up for upcoming batches here.