Code Generation: Exploring Generative AI for Database Workloads
My code generation experience with a "Productive AI" tool built on Amazon Bedrock and its foundation models.
Nihilson Gnanadason
Amazon Employee
Published Dec 8, 2024
Nature's fury, in the form of Storm Darragh, recently swept across the UK, confining many of us indoors. As the wind howled outside, I found myself with an unexpected opportunity to delve into the fascinating world of generative AI. Specifically, I was curious to explore its potential in streamlining backend database-related workloads through code generation.
This exploration felt particularly fitting given my long-standing relationship with databases. My journey in this field spans decades, from contributing to JDBC/JCA implementations for WebSphere Application Server in 2004, supporting a diverse array of commercial and open-source databases, to my current fascination with cutting-edge technologies like Amazon Aurora DSQL, unveiled at re:Invent 2024.
As the storm raged on, I embarked on a different kind of adventure – one that would bridge my database experiences with the exciting frontier of AI-assisted development. Join me as I share insights from this indoor expedition, where the constraints of a stormy weekend paved the way for the integration of databases and artificial intelligence.
With just a few conversational exchanges, I was able to effortlessly conceptualize and bring to life a comprehensive database application. The process flowed seamlessly from designing a detailed PostgreSQL database schema to defining RESTful APIs, implementing a Python-based AWS Lambda function, and finally generating AWS CDK code for deployment.
What might typically require days of planning, coding, and configuration was accomplished in a matter of minutes through natural language prompts. The ability to understand context, generate code, and adapt to follow-up requests demonstrates how generative AI can significantly streamline the software development process, making it accessible even to those without deep technical expertise in every aspect of backend development and cloud infrastructure.
Response: Certainly! I'll provide you with a sample set of PostgreSQL schemas and tables that could represent a simplified e-commerce system. This example will include multiple schemas to demonstrate a more complex database structure.
Response: Certainly! Here's another example of PostgreSQL schemas and tables, this time modeling a university management system. This example will include schemas for academics, staff, facilities, and research.
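The generated DDL ran to many tables across these schemas and isn't reproduced in full here. As a rough illustration, this is the kind of output such a prompt produces; the schema, table, and column names below are my assumptions, trimmed for brevity, not the tool's exact response:

```sql
-- Illustrative excerpt only: schema and table names are assumptions,
-- not the exact DDL the tool generated.
CREATE SCHEMA academics;
CREATE SCHEMA staff;

CREATE TABLE academics.students (
    student_id  SERIAL PRIMARY KEY,
    first_name  VARCHAR(50) NOT NULL,
    last_name   VARCHAR(50) NOT NULL,
    email       VARCHAR(100) UNIQUE NOT NULL,
    enrolled_on DATE DEFAULT CURRENT_DATE
);

CREATE TABLE academics.courses (
    course_id   SERIAL PRIMARY KEY,
    course_name VARCHAR(100) NOT NULL,
    credits     INTEGER CHECK (credits > 0)
);

-- Junction table linking students to the courses they take
CREATE TABLE academics.enrollments (
    student_id  INTEGER REFERENCES academics.students (student_id),
    course_id   INTEGER REFERENCES academics.courses (course_id),
    enrolled_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (student_id, course_id)
);
```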
While this university management schema was structurally no different from the e-commerce system, I chose to continue with it, as it took my thoughts back to my college days and made me feel younger.
Response: Certainly! Based on the university management system database schema I provided, here's a list of potential APIs that could be developed to interact with this data. I'll organize these by schema and include common CRUD (Create, Read, Update, Delete) operations as well as some more specific operations that might be useful.
Response: Certainly! Here's an OpenAPI specification in YAML format for the APIs we discussed. Note that this is a simplified version and doesn't include all the endpoints we mentioned earlier, but it covers the main entities and operations. You can expand on this as needed.
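The full specification is too long to reproduce here; an abridged sketch of its shape might look like the following (the paths and schema fields are illustrative, following the assumed tables above, not the exact generated output):

```yaml
openapi: 3.0.0
info:
  title: University Management System API
  version: 1.0.0
paths:
  /students:
    get:
      summary: List all students
      responses:
        '200':
          description: A list of students
    post:
      summary: Create a new student
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Student'
      responses:
        '201':
          description: Student created
components:
  schemas:
    Student:
      type: object
      properties:
        first_name:
          type: string
        last_name:
          type: string
        email:
          type: string
```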
Response: Certainly! I'll create a Python-based Lambda function that implements the APIs we discussed using the RDS Data API to interface with an Aurora PostgreSQL database. This example will provide a basic structure that you can expand upon.
First, make sure you have the necessary permissions set up for your Lambda function to access RDS Data API. You'll also need to set up the appropriate environment variables.
Here's a Python script that implements the basic structure:
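The generated script is longer than what fits here; a minimal sketch of the basic structure, assuming API Gateway proxy events, `boto3`'s `rds-data` client, and the illustrative environment variable and table names used throughout this post:

```python
# Illustrative sketch of the kind of handler the tool produced; env var
# names, routes, and table names are assumptions, not the exact output.
import json
import os

import boto3

rds_data = boto3.client("rds-data")

CLUSTER_ARN = os.environ["CLUSTER_ARN"]  # assumed env var names
SECRET_ARN = os.environ["SECRET_ARN"]
DATABASE = os.environ.get("DB_NAME", "postgres")


def execute(sql, parameters=None):
    """Run a statement against Aurora via the RDS Data API."""
    return rds_data.execute_statement(
        resourceArn=CLUSTER_ARN,
        secretArn=SECRET_ARN,
        database=DATABASE,
        sql=sql,
        parameters=parameters or [],
    )


def handler(event, context):
    method = event.get("httpMethod")
    path = event.get("path", "")

    if method == "GET" and path == "/students":
        result = execute("SELECT student_id, first_name, last_name "
                         "FROM academics.students")
        return {"statusCode": 200, "body": json.dumps(result["records"])}

    if method == "POST" and path == "/students":
        body = json.loads(event.get("body") or "{}")
        execute(
            "INSERT INTO academics.students (first_name, last_name, email) "
            "VALUES (:first, :last, :email)",
            parameters=[
                {"name": "first", "value": {"stringValue": body["first_name"]}},
                {"name": "last", "value": {"stringValue": body["last_name"]}},
                {"name": "email", "value": {"stringValue": body["email"]}},
            ],
        )
        return {"statusCode": 201, "body": json.dumps({"status": "created"})}

    return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
```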
Response: Certainly! I'll create a new CDK code from scratch to deploy an Aurora PostgreSQL cluster along with the schema and APIs we discussed earlier. This code will be more streamlined and focused on the core requirements. Here's the new CDK code:
While the generated CDK code might look syntactically correct, there are a few glaring issues:
- It mixes Serverless cluster configuration with provisioned cluster configuration.
- The API resource creation can lead to duplicate resource errors.
- It does not use all the APIs identified originally. Let us ignore that for now, but you can prompt it to include all the needed APIs.
Now, to move forward, all that is required is hinting and instructing the AI by pointing out the issues and what changes are needed. Let me prompt it to update the CDK code to use a provisioned Aurora cluster, version 16.2, with the RDS Data API enabled, and to fix the API `add_resource` issue. (I was even able to get the Serverless V2 cluster configuration to work.)
Response: Certainly! I'll modify the CDK code to deploy an Aurora PostgreSQL cluster version 16.2 and enable the RDS Data API. I also understand the issue you're facing with the `add_resource` method potentially causing errors due to existing resources. To fix this, we can modify the API Gateway setup to ensure we don't try to add the same resource twice. Here's an updated version of the CDK code (a condensed sketch follows the list below). To use this CDK code, you'll need to create two Lambda functions:
- API Lambda (`lambda/api/index.py`): use the Python code created in the earlier step
- Schema creation Lambda (`lambda/schema/index.py`)
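The updated stack isn't reproduced in full here; below is a minimal sketch of what it might look like, assuming CDK v2 (`aws-cdk-lib`). The construct names, Python runtime, and environment variable names are my assumptions, not the tool's exact output. Note that at this stage the Lambda roles still lack RDS Data API permissions, which surfaces later in this post.

```python
from aws_cdk import (
    Duration,
    Stack,
    aws_apigateway as apigw,
    aws_ec2 as ec2,
    aws_lambda as _lambda,
    aws_rds as rds,
)
from constructs import Construct


class UniversityManagementSystemStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        vpc = ec2.Vpc(self, "UmsVpc", max_azs=2)

        # Provisioned Aurora PostgreSQL 16.2 cluster with the Data API enabled
        cluster = rds.DatabaseCluster(
            self, "UmsCluster",
            engine=rds.DatabaseClusterEngine.aurora_postgres(
                version=rds.AuroraPostgresEngineVersion.VER_16_2
            ),
            writer=rds.ClusterInstance.provisioned("Writer"),
            vpc=vpc,
            enable_data_api=True,
        )

        env = {
            "CLUSTER_ARN": cluster.cluster_arn,
            "SECRET_ARN": cluster.secret.secret_arn,
            "DB_NAME": "postgres",
        }

        # Schema creation Lambda (lambda/schema/index.py)
        schema_fn = _lambda.Function(
            self, "SchemaFunction",
            runtime=_lambda.Runtime.PYTHON_3_12,
            handler="index.handler",
            code=_lambda.Code.from_asset("lambda/schema"),
            timeout=Duration.minutes(5),
            environment=env,
        )

        # API Lambda (lambda/api/index.py)
        api_fn = _lambda.Function(
            self, "ApiFunction",
            runtime=_lambda.Runtime.PYTHON_3_12,
            handler="index.handler",
            code=_lambda.Code.from_asset("lambda/api"),
            timeout=Duration.seconds(30),
            environment=env,
        )

        api = apigw.LambdaRestApi(self, "UmsApi", handler=api_fn, proxy=False)

        def get_or_add(parent, path_part):
            # Reuse an existing child resource instead of adding it twice --
            # this is the fix for the duplicate add_resource error
            return parent.get_resource(path_part) or parent.add_resource(path_part)

        for path in ("students", "courses", "enrollments"):
            resource = get_or_add(api.root, path)
            resource.add_method("GET")
            resource.add_method("POST")
```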
Now I have fairly working code, so I will stop the code generation mania and seek help deploying all of these generated code artifacts (CDK and Lambda functions).
Response: Certainly! Here are the step-by-step deployment instructions for your AWS CDK project with the University Management System:
1. Set up your development environment:
- Ensure you have Node.js installed (v10.13.0 or later)
- Install AWS CDK: `npm install -g aws-cdk`
- Configure AWS CLI with your credentials: `aws configure`
2. Create and set up your CDK project:
- Create a new directory: `mkdir university-management-system`
- Navigate to the directory: `cd university-management-system`
- Initialize a new CDK project: `cdk init app --language python`
3. Set up a Python virtual environment:
- Create a virtual environment: `python -m venv .venv`
- Activate the virtual environment:
  - On Windows: `.venv\Scripts\activate`
  - On macOS and Linux: `source .venv/bin/activate`
4. Install required dependencies: `pip install -r requirements.txt`
- Add the following to your requirements.txt file and reinstall: `aws-cdk.aws-ec2`, `aws-cdk.aws-rds`, `aws-cdk.aws-lambda`, `aws-cdk.aws-apigateway`
5. Update your CDK stack:
- Replace the content of `university_management_system/university_management_system_stack.py` with the CDK code provided earlier.
6. Create Lambda function directories and files:
- Create directories: `mkdir -p lambda/schema lambda/api`
- Create `lambda/schema/index.py` and add the schema creation Lambda code
- Create `lambda/api/index.py` and add the API Lambda code
7. Synthesize the CloudFormation template:
- Run `cdk synth` in your project root directory
8. Deploy the stack:
- Run `cdk deploy` in your project root directory
- Confirm the deployment when prompted
9. Monitor the deployment:
- The CDK will provide updates on the deployment progress
- Once complete, it will output the API Gateway URL, Database Endpoint, and Database Secret ARN
10. Test the deployment:
- Use the provided API Gateway URL to test your endpoints
- You can use tools like Postman or curl to send requests to your API
11. Update and iterate:
- As you develop your application, you may need to update the Lambda functions or add new API routes
- After making changes, redeploy using `cdk deploy`
12. Clean up (when you're done):
- To avoid incurring unnecessary costs, run `cdk destroy` to remove all created resources
Additional notes
- Ensure you have sufficient permissions in your AWS account to create and manage the required resources.
- The first deployment might take some time as it creates the Aurora cluster and other resources.
- Always monitor your AWS costs and resources to avoid unexpected charges.
- For production use, consider implementing additional security measures, such as API key authentication for your API Gateway.
- Regularly update and patch your Lambda functions and database to maintain security.
Using all the generated code and following the deployment steps above, I was able to get the `cdk deploy` command working without making any changes to the code. In less than 600 seconds, the deployment completed, creating all the required resources in AWS (Aurora PostgreSQL cluster, Lambda functions, API Gateway resources, etc.). Verifying the CloudFormation stack confirms that all the resources were created.
Remember, this setup creates real AWS resources that may incur costs. Always review the resources being created and delete them when they're no longer needed.
Let us test the schema creation Lambda to check whether it is able to create the schema for us to get started with the database. The schema creation Lambda function required a few changes:
- The schema creation Lambda appears to be generated as a custom resource to be invoked during the deployment of the resources through CDK/CloudFormation. For testing purposes, I simply commented out the `cfnresponse.send()` calls so that I could quickly test from the Lambda VS Code IDE. You can also keep it wired up so that it is invoked during resource deployment when you run `cdk deploy`.
- Next, I also wanted the schema generation code to include sample data creation as well, and ran one more prompt to get it added.
Response: Certainly! I'll update the schema generation Lambda function to include sample data ingestion for all the tables. Here's the modified version of the Lambda function:
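A minimal sketch of what the schema Lambda might look like after both changes, with the `cfnresponse` call commented out for manual testing; the DDL, sample rows, and environment variable names are illustrative, not the exact generated code:

```python
import os

import boto3
# import cfnresponse  # only needed when run as a CloudFormation custom resource

rds_data = boto3.client("rds-data")

CLUSTER_ARN = os.environ["CLUSTER_ARN"]
SECRET_ARN = os.environ["SECRET_ARN"]
DATABASE = os.environ.get("DB_NAME", "postgres")

DDL_STATEMENTS = [
    "CREATE SCHEMA IF NOT EXISTS academics",
    """CREATE TABLE IF NOT EXISTS academics.students (
           student_id SERIAL PRIMARY KEY,
           first_name VARCHAR(50) NOT NULL,
           last_name  VARCHAR(50) NOT NULL,
           email      VARCHAR(100) UNIQUE NOT NULL)""",
]

SAMPLE_DATA = [
    """INSERT INTO academics.students (first_name, last_name, email)
       VALUES ('Ada', 'Lovelace', 'ada@example.edu')
       ON CONFLICT (email) DO NOTHING""",
]


def handler(event, context):
    # Create the schema objects, then load sample rows
    for sql in DDL_STATEMENTS + SAMPLE_DATA:
        rds_data.execute_statement(
            resourceArn=CLUSTER_ARN,
            secretArn=SECRET_ARN,
            database=DATABASE,
            sql=sql,
        )
    # When deployed as a custom resource, report success to CloudFormation:
    # cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
    return {"status": "schema and sample data created"}
```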
After these changes, invoking the Lambda with a test payload, I realized that the function did not have the required permissions to access the Aurora PostgreSQL cluster using the RDS Data API. Now, let us run one last prompt for the day to add the IAM role to the CDK code.
Response: Certainly! I'll update the CDK code to include a specific IAM role for the schema creation Lambda function, granting it the necessary permissions to execute statements using the RDS Data API. Here's the updated CDK code:
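Sketching the change against the stack outlined earlier (again illustrative, assuming CDK v2, not the exact generated code), the permissions boil down to allowing the RDS Data API actions on the cluster and read access to its Secrets Manager secret:

```python
from aws_cdk import aws_iam as iam

# Allow the schema Lambda to execute statements via the RDS Data API
schema_fn.add_to_role_policy(iam.PolicyStatement(
    actions=[
        "rds-data:ExecuteStatement",
        "rds-data:BatchExecuteStatement",
        "rds-data:BeginTransaction",
        "rds-data:CommitTransaction",
        "rds-data:RollbackTransaction",
    ],
    resources=[cluster.cluster_arn],
))
# The Data API also reads the database credentials from Secrets Manager
cluster.secret.grant_read(schema_fn)

# Equivalent one-liner provided by the CDK cluster construct:
# cluster.grant_data_api_access(schema_fn)
```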
I repeated the same for the API Lambda function and updated the code.
Now I just need to run another `cdk deploy` for the new changes in the CDK code (the IAM role for the Lambda) to be applied, and then I can test the Lambda function to confirm that it can create the schema successfully. I can also validate the sample data using the RDS Query Editor feature:
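For instance, a quick validation query in the Query Editor (the table name follows the illustrative schema sketched earlier):

```sql
-- Confirm the schema exists and the sample rows landed
SELECT * FROM academics.students LIMIT 10;
```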
Let us then test the API Lambda function as well:
The final test is to invoke the APIs and verify that it works for both GET and POST.
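For example, with curl (the URL placeholder stands in for the API Gateway endpoint output by `cdk deploy`; the path and field names follow the illustrative sketches above):

```bash
# Replace the placeholder with the API Gateway URL from the stack outputs
API_URL="https://<api-id>.execute-api.<region>.amazonaws.com/prod"

# POST: create a student
curl -X POST "$API_URL/students" \
  -H "Content-Type: application/json" \
  -d '{"first_name": "Grace", "last_name": "Hopper", "email": "grace@example.edu"}'

# GET: list students
curl "$API_URL/students"
```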
And validate it from the RDS Query Editor as well!
By leveraging AI-assisted code generation, I have accomplished in a matter of minutes what might traditionally take days or even weeks of manual coding and configuration. While the generated code still requires human oversight, testing, and customization to meet specific business needs, the productivity gains are substantial. Developers can now start from a much more advanced baseline, allowing them to deliver more value in less time.
This approach doesn't replace the need for skilled developers but rather augments their capabilities. It shifts the developer's role towards higher-level system design, business logic implementation, and fine-tuning performance and security – areas where human expertise and creativity are most valuable.
As generative AI continues to evolve, we can expect even more sophisticated code generation capabilities, further revolutionizing the software development lifecycle and enabling teams to build complex, scalable systems with unprecedented speed and efficiency.
Happy Code Generation :-)
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.