You Don't Need an Army of Engineers to Migrate to the Cloud

An interview with AWS Hero Dave Stauffacher on automating cloud migrations and video of his talk at re:Invent 2023.

Mark Pergola
Amazon Employee
Published Oct 30, 2023
Last Modified Dec 8, 2023
AWS Hero Dave Stauffacher spends a lot of his time brainstorming new ways systems can fail. As the chief platform engineer at Direct Supply, it’s his job to prepare for the worst so the worst doesn’t happen. With a background in data storage and protection, Dave has helped Direct Supply safely navigate a 30,000% data growth over the last 15 years and more recently automated a 20TB file server migration—the subject of his talk at this year’s re:Invent. In this brief interview, Dave talks about automating cloud migrations and why it’s at the heart of generative AI. You can also watch his talk from AWS re:Invent 2023 below.
Who is your re:Invent talk for and what is the most important thing you want them to understand?
“My talk is going to target those who are planning a file system migration or looking to learn more about cloud automation and storage best practices. The most important thing I want attendees to take away from my talk is that with a bit of up-front planning and some helpful automation, a large-scale file system migration can be done quickly and successfully without an army of engineers. And of course, if I can do it, anyone can probably do it.
I’m not going to go into detail on the benefits of using infrastructure as code, and I’m not going into detail on how to streamline and structure your infrastructure code. If you’re new to building in the cloud, or you’re looking to better standardize and automate your cloud deployments, infrastructure as code is a great starting point. My focus is going to be more on the tools I had to build to manage a large-scale data migration. While Terraform played an important role in my project, it won’t be center stage.
I love sharing what I’ve learned with others and talking to people who are just starting to figure out the cloud and are hungry to learn more.”
What does someone need to understand first and are there community content resources that would help them prepare?
“I’ve written a few blog posts in the past about various storage-related topics that would be helpful to read (Using Storage Gateway for Backup and Performance Testing FSx Gateway) as well as my journey from datacenter engineer to AWS Hero. It would also be helpful to have a basic understanding of managing and operating terabyte-scale file systems.”
What didn't you cover in the talk that you wish you could because you only had an hour?
“I would love to get deeper into the details and "gotchas" in the scripts and automations I built, but that could be a separate talk altogether. Automating a process is not a substitute for knowing how a process works and should behave. When you’re automating a file server migration, you need to make sure the automations you build don’t put your data at risk of not being copied or of being overwritten. Understanding how the process is designed to work, and all of its nuances, will help you ensure anything you automate keeps your data safe. A good example of this is the work I did to manipulate the “preserve deleted files” parameters on each DataSync task based on where it fell in the overall migration plan, which was critical in minimizing the downtime associated with my file system migration.”
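To make the “preserve deleted files” idea concrete, here is a minimal Python sketch. It is not Dave's script: the phase names (`SEED`, `INCREMENTAL`, `CUTOVER`) and the rule mapping each phase to a setting are assumptions made up for illustration. The only part taken from AWS DataSync itself is the `PreserveDeletedFiles` task option, which accepts `PRESERVE` or `REMOVE`.

```python
from enum import Enum


class Phase(Enum):
    """Hypothetical migration phases; the names are illustrative only."""
    SEED = "seed"                # first bulk copy of the share
    INCREMENTAL = "incremental"  # catch-up syncs while users still write to the source
    CUTOVER = "cutover"          # final sync inside the downtime window


def datasync_options(phase: Phase) -> dict:
    """Build a DataSync Options fragment for the given phase.

    Assumed policy (not from the interview): leave destination files
    untouched during early passes, and only mirror source deletions
    on the final cutover pass.
    """
    preserve = "REMOVE" if phase is Phase.CUTOVER else "PRESERVE"
    return {"PreserveDeletedFiles": preserve}


if __name__ == "__main__":
    for phase in Phase:
        print(phase.value, datasync_options(phase))
```

A dictionary like this could then be supplied as the `Options` argument when updating a task through the AWS SDK. Keeping the decision in one small function makes it easier to review the rule before it is applied to a hundred tasks at once.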
What’s one question you wish someone would ask you about this topic?
“I want people to ask me questions that they feel will help them improve their projects and processes. I’m not shy about the lessons I’ve learned, especially the ones I’ve learned the hard way. In my talk, I’ll be sharing a lot of the lessons I learned around performance tuning a file system and scaling from a few DataSync tasks to more than one hundred. If there’s anything I can share that helps someone do their best work or grow their skills, I’m excited for the conversation!”
How did you become an expert in this area, and why is it an area you are passionate about?
“Data storage is an area of tech that comes naturally to me. Over the last 20 years, I’ve been responsible for supporting a massive (30,000%) storage growth, I’ve been an early adopter of new and transformational tech, and I’ve ridden through some statistically impossible failures. Have you ever met someone who lost two separate RAID sets on two separate arrays in the same night?”
Tech comes naturally to you, so what are you finding exciting right now and why?
“Everybody is talking about artificial intelligence, and I’ve been branching out to learn more about the foundations of AI and how I can leverage it to reduce repetitive toil. I’ve also been playing around with patterns for automating data recoveries and disaster recoveries. Maybe that will make a great topic for my next talk!”
How are you seeing GenAI impacting this topic and area of the cloud?
“AI is all about the data—data for training, data to be refined, data to be analyzed, summarized, and recombined. And the more use we find for AI, the more data we’re going to need to secure, access, scan, and create. Storage is going to be at the heart of that.
If you’re hoping to leverage AI to better understand and extract residual value from your existing data, that data is going to have to live somewhere it can be analyzed and used for training new data models. So, you’re going to need storage technologies to house all this data and make it available. Plus, you have to account for performance, protection of the data, maybe even replication of the data for development and QA testing of new training patterns. To prepare, I think it’s important to develop foundational knowledge on how AI works and how you can integrate your own data. There are many online resources that can help, including online labs for Amazon SageMaker Studio. If you’re attending re:Invent, there are plenty of sessions and labs that can help develop your skills working with AI.”
About Dave
Dave’s been working on storage for 20 years and has been an AWS Hero since 2019. He has really bad luck when traveling, so if you come to his talk at re:Invent, he has hinted that you might also learn what questions they ask when you’re put in charge of deciding if your flight home should make an emergency landing. To help you make the most of your time at AWS re:Invent 2023, learn which re:Invent storage talks Dave can’t wait to attend after he finishes giving his own. Connect with Dave on LinkedIn or X (Twitter) @DaveBuildsCloud.

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.