Back Up On-Premises Data Incrementally to Amazon S3 over SFTP using AWS Transfer Family
Businesses often want to securely back up on-premises data to the cloud using the familiar SFTP protocol, with a static IP address for the SFTP server. This post demonstrates how to use AWS Transfer Family to create an SFTP endpoint exposed through a static IP address, enabling incremental backups from your local servers to Amazon S3.
Sameer
Amazon Employee
Published Dec 11, 2024
This blog post covers the process of setting up an SFTP endpoint using AWS Transfer Family to facilitate daily incremental backups from an on-premises server to Amazon S3. This solution is helpful for organizations looking to securely transfer data to the cloud while maintaining their existing SFTP-based backup processes.
- Allocate an Elastic IP address. You can follow the steps in the section Allocate an Elastic IP address.
- Create an Amazon S3 bucket to store your data. You can follow the steps in Creating a bucket.
- Create an IAM role for Amazon S3 access. You can follow the steps in Create an IAM Role and Policy.
- For this demonstration, I use an Amazon EC2 instance as the source server from which data is uploaded to Amazon S3. If you already have a server, you can use it to send your data to Amazon S3; otherwise, follow the steps in the tutorial to launch an EC2 instance.
- Follow the steps in Generate SSH keys to create a private and public key pair, which is used in later steps.
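For the IAM role in the prerequisites, the two policy documents can be sketched as Python dictionaries. This is a minimal illustration, not the exact policy from the linked guide: the bucket name `my-backup-bucket` is a placeholder, and the list of S3 actions is an assumption about what a backup user typically needs, so trim or extend it for your case.

```python
import json


def transfer_trust_policy():
    """Trust policy that lets the AWS Transfer Family service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "transfer.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_access_policy(bucket_name):
    """Permissions policy scoped to a single bucket (actions are an assumption)."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowListingOfBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [bucket_arn],
            },
            {
                "Sid": "AllowObjectAccess",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-backup-bucket" is a placeholder, not a bucket from this post.
    print(json.dumps(transfer_trust_policy(), indent=2))
    print(json.dumps(s3_access_policy("my-backup-bucket"), indent=2))
```

The printed JSON can be pasted into the IAM console when creating the role and its policy.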
- Open the AWS Transfer Family console.
- In the navigation pane, click Servers, then click Create server.
- On the next page, choose SFTP (SSH File Transfer Protocol) - file transfer over Secure Shell, then click Next.
- On the next page, for Identity Provider for SFTP, FTPS, or FTP, choose Service managed.
- For Endpoint configuration, configure the following inputs:
  - Endpoint type: choose VPC hosted.
  - Custom hostname: choose None. You can also use an Amazon Route 53 DNS alias or another DNS service if required.
  - VPC: choose a VPC that spans the Availability Zone selected for the Elastic IP address.
  - Choose the Availability Zone and the Elastic IP address allocated earlier, then click Next.
- Choose Amazon S3 as the Domain.
- On the next page, keep the default parameters and click Next.
- Review the details and click Create server.
- Wait for the server status to change to Online.
- Select the server you created and click Add user.
- Enter a Username, for example 'username'.
- For the IAM role, select the role created in step 3 of the prerequisites.
- For Home directory, choose the S3 bucket that you created to store your data.
- For SSH public keys, paste the content of the public key created in step 5 of the prerequisites.
- Click Add.
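The Add user step can also be scripted. The sketch below assumes the boto3 library is installed and AWS credentials are configured; it builds the request for the Transfer Family `create_user` API from the same inputs the console asks for. The function names are illustrative helpers, and the server ID, role ARN, bucket name, and key path you pass in are your own values, not ones from this post.

```python
def build_create_user_request(server_id, user_name, role_arn, bucket_name, public_key_path):
    """Assemble the create_user parameters from the console inputs."""
    with open(public_key_path, encoding="utf-8") as key_file:
        public_key = key_file.read().strip()
    return {
        "ServerId": server_id,                # ID of the SFTP server created above
        "UserName": user_name,
        "Role": role_arn,                     # IAM role from the prerequisites
        "HomeDirectory": f"/{bucket_name}",   # bucket used as the home directory
        "SshPublicKeyBody": public_key,       # content of the public key file
    }


def add_sftp_user(request):
    """Send the request to AWS Transfer Family (needs boto3 and credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    transfer = boto3.client("transfer")
    return transfer.create_user(**request)
```

Keeping the request builder separate from the API call makes it easy to inspect the parameters before anything is created in your account.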
If you already have a source server whose data you want to back up, you can perform the following steps on that server. If your server only needs an IP address or endpoint to connect to, you can use the endpoint and IP address of the SFTP server created in the previous section.
- Connect to your on-premises server or Amazon EC2 instance. You can follow the steps in Connecting to Amazon EC2 instance.
- I've created a folder, app-logs, which is backed up to Amazon S3 every day. The app-logs folder and the SSH keys are stored in the same directory.
- Create a Python script that does the following:
  - Establishes an SSH connection with the SFTP server and opens an SFTP session.
  - Identifies files that were changed in the last 24 hours.
  - Transfers the modified files over SFTP.
  - Ensures the SFTP session and SSH connection are closed properly.
- Sample script for the above actions:
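A minimal sketch of such a script, assuming the paramiko library is installed (`pip install paramiko`) and that the server's host key is already in your known_hosts file. The environment variable names (`SFTP_HOST`, `SFTP_USER`, `SFTP_KEY`), the default key file name, and the function names are placeholders chosen for illustration; only the app-logs folder comes from the steps above.

```python
import os
import posixpath
import time


def find_recent_files(root, max_age_seconds=24 * 60 * 60, now=None):
    """Return paths (relative to root) of files modified within max_age_seconds."""
    now = time.time() if now is None else now
    recent = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if now - os.path.getmtime(full_path) <= max_age_seconds:
                recent.append(os.path.relpath(full_path, root))
    return sorted(recent)


def ensure_remote_dir(sftp, remote_dir):
    """Create remote_dir and any missing parent folders on the SFTP server."""
    current = ""
    for part in remote_dir.split("/"):
        if not part:
            continue
        current = posixpath.join(current, part)
        try:
            sftp.stat(current)
        except IOError:
            sftp.mkdir(current)


def run_backup(host, username, key_path, local_dir, remote_dir):
    """Upload files changed in the last 24 hours; return the list uploaded."""
    import paramiko  # third-party; imported here so the helpers work without it

    files = find_recent_files(local_dir)
    if not files:
        return []
    client = paramiko.SSHClient()
    # Unknown hosts are rejected, so the server must be in known_hosts.
    client.load_system_host_keys()
    sftp = None
    try:
        client.connect(host, port=22, username=username, key_filename=key_path)
        sftp = client.open_sftp()
        for rel_path in files:
            remote_path = posixpath.join(remote_dir, rel_path.replace(os.sep, "/"))
            ensure_remote_dir(sftp, posixpath.dirname(remote_path))
            sftp.put(os.path.join(local_dir, rel_path), remote_path)
    finally:
        # Close the SFTP session and SSH connection even if an upload fails.
        if sftp is not None:
            sftp.close()
        client.close()
    return files


if __name__ == "__main__" and os.environ.get("SFTP_HOST"):
    uploaded = run_backup(
        host=os.environ["SFTP_HOST"],            # server endpoint or Elastic IP
        username=os.environ.get("SFTP_USER", "username"),
        key_path=os.environ.get("SFTP_KEY", "sftp_key"),
        local_dir="app-logs",
        remote_dir="app-logs",
    )
    print(f"Uploaded {len(uploaded)} file(s)")
```

The 24-hour window is based on each file's modification time, so a file edited and then restored to identical content is still uploaded again. If the script misses a day, widen `max_age_seconds` for the next run to cover the gap.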
Now you can create a cron job on the source server to run the script every day, which backs up the data to Amazon S3.
In this post, I showed how to create an SFTP server using AWS Transfer Family and how to get a static IP address for it. I used an Amazon EC2 instance to show how to establish an SSH connection and set up a script that backs up data incrementally every day.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.