logo
Unveiling the Power of Zero-ETL Integration between Amazon RDS for MySQL and Amazon Redshift

Unveiling the Power of Zero-ETL Integration between Amazon RDS for MySQL and Amazon Redshift

Let's Explore the Power of Zero-ETL Integration between Amazon RDS for MySQL and Amazon Redshift

Published Jan 31, 2024
Hello Everyone,
This article is my first time post on community.aws.
In this article, I will share about new feature in Amazon RDS for MySQL that is Zero-ETL Integration between RDS for MySQL and Amazon Redshift.
Before I explain about Zero-ETL Integration between RDS for MySQL and Amazon Redshift, I will explain about what is Zero-ETL.

So what is Zero-ETL? Zero-ETL, or Zero Extract, Transform, Load, refers to a data integration approach that minimizes or eliminates the need for traditional ETL processes. ETL is a common method used in data warehousing to extract data from source systems, transform it into a suitable format, and then load it into a target data warehouse. Zero-ETL aims to simplify and streamline this process by reducing or eliminating the intermediate transformation step.
So with new feature Zero-ETL integration between Amazon RDS for MySQL and Amazon Redshift, you can easily replicate data without doing ETL process. This can result in faster and more efficient data movement between the source (Amazon RDS for MySQL) and the target (Amazon Redshift) databases, potentially saving time and resources in the data integration workflow.

In order to use this feature, we need to several requirement bellow:
1. At least have 1 Database instances in Amazon RDS for MySQL
2. At least have 1 Redshift cluster or Redshift serverless workgroup with namespace
3. Create parameter group for source data Amazon RDS for MySQL
4. Configure turn on case sensitivity on Amazon Redshift
In this article, I will explain step by step how to for point no 3 & 4. For detail how to create Database instance point no 1 please see this guide. For detail how to create redshift cluster please see this guide. For detail how to create Redshift serverless workgroup with namespace please see this guide. And also don’t forget to create both Amazon RDS for MySQL and Amazon Redshift must same region and region that supported Zero-ETL Integration.
The following Regions and engine versions are available for zero-ETL integrations with Amazon Redshift
List Region Amazon RDS for MySQL that supported Zero ETL

  1. Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/.
  2. In the navigation pane, choose Parameter groups.
  3. Choose Create parameter group.
The Create parameter group window appears.
  1. In the Parameter group family list, select a DB parameter group family.
  2. In the Type list, if applicable, select DB Parameter Group.
  3. In the Group name box, enter the name of the new DB parameter group.
  4. In the Description box, enter a description for the new DB parameter group then choose Create.
  5. Open parameter group that created before then modify parameters bellow
    · binlog_format = ROW
value of parameter binlog_format
· binlog_row_image = full
value of parameter binlog_row_image
· binlog_checksum = NONE
value of parameter binlog_checksum
· In addition, make sure that the binlog_row_value_options parameter is not set to PARTIAL_JSON
value of parameter binlog_row_value_options
Configure turn on case sensitivity on Amazon Redshift
In this article, i use Amazon Redshift Serverless so for configure turn on case sensitivity just run AWS CLI bellow :
1
2
3
aws redshift-serverless update-workgroup \
--workgroup-name target-workgroup \
--config-parameters parameterKey=enable_case_sensitive_identifier,parameterValue=true
Don't forget to change value target-workgroup to redshift serverless workgroup that you use.

  1. In Amazon RDS, click Zero-ETL integration at left side menu
  2. Click zero-ETL integration
  3. Input integration identifier name then click Next
  4. In Select source database, choose RDS database then click Next
  5. In Select target Redshift Datawarehouse, choose redshift datawarehouse then click Next
  6. In tags and encryption click Next
  7. Review again, if ok then click Create zero-ETL Integration.
  8. Wait about 30 minutes till finish
  9. In Amazon Redshift, click Zero-ETL integration at left side menu
  10. Click Create Database Integration then input database name for zero-ETL integration in Amazon Redshift. In this article, i use database name redshift_zero_etl

Create new Database if not exist using sql script bellow :
1
CREATE DATABASE demo_zero_etl;

1
2
3
4
5
6
7
8
9
10
11
12
13
USE demo_zero_etl;

CREATE TABLE products (
product_id int,
product_name varchar(255),
PRIMARY KEY (product_id)
);

CREATE TABLE customers (
customer_id int,
customer_name varchar(255),
PRIMARY KEY (customer_id)
);
Result new table creation in Amazon RDS for MySQL
Result new table creation in Amazon Redshift Serverless

Insert New data in table customers and products using SQL script bellow :
1
2
3
4
5
6
7
INSERT INTO products VALUES (1, 'Laptop X');
INSERT INTO products VALUES (2, 'Phone Y');
INSERT INTO products VALUES (3, 'Tablet Z');

INSERT INTO customers VALUES (1, 'Ani');
INSERT INTO customers VALUES (2, 'Budi');
INSERT INTO customers VALUES (3, 'Caca');
Result new data inserted in Amazon RDS for MySQL
latest all data in products tables Amazon RDS for MySQL
latest all data in customers tables Amazon RDS for MySQL
Result new data inserted in Amazon Redshift Serverless
latest all data in products table Amazon Redshift Serverless
latest all data in customers tables Amazon Redshift Serverless

Update data in table customers and products using SQL script bellow :
1
2
3
4
5
UPDATE products set product_name = 'Samsung S24'
WHERE product_id = 2;

UPDATE customers set customer_name = 'Andika'
WHERE customer_id = 3;
Before Update
Before update products table in RDS for MySQL
Before update products table in Redshift Serverless
Before update customers table in RDS for MySQL
Before Update customers table in Redshift Serverless
After Update
After Update products table in RDS for MySQL
After Update products table in Redshift Serverless
After Update products table in RDS for MySQL
After update customers table in Redshift Serverless
Delete data in tables
Delete data in table customers and products using SQL script bellow :
1
2
3
DELETE FROM products WHERE product_name = 'Tablet Z';

DELETE FROM customers WHERE customer_name = 'Budi';
Before Delete data
Before delete data product tables in RDS for MySQL
Before Delete data products table in Redshift Serverless
Before delete data customers table in RDS for MySQL
Before delete data customers table in Redshift Serverless
After Delete data
After delete data products table on RDS for MySQL
After delete data products table on Redshift Serverless
After delete data customers table on RDS form MySQL
After Delete data customers table in Redshift Serverless

After we explore this new feature, we can more easily replicate data without doing complicated ETL process and can save more time to doing other task. Until now, this feature still Public Review so i recommend to not yet implementation to production until Generally Available. Thanks for reading this article. If this article useful don't forget to like, share to others and follow my account

  • https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-rds-mysql-zero-etl-integration-amazon-redshift-public-preview/
  • https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.html
  • https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.setting-up.html
  • https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/zero-etl.setting-up.html#zero-etl.parameters
  • https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithDBInstanceParamGroups.html