Code Generation: Exploring Generative AI for Database Workloads

My code generation experience using a "Productive AI" tool built on Amazon Bedrock, leveraging its foundation models.

Nihilson Gnanadason
Amazon Employee
Published Dec 8, 2024
Nature's fury, in the form of Storm Darragh, recently swept across the UK, confining many of us indoors. As the wind howled outside, I found myself with an unexpected opportunity to delve into the fascinating world of generative AI. Specifically, I was curious to explore its potential in streamlining backend database-related workloads through code generation.
This exploration felt particularly fitting given my long-standing relationship with databases. My journey in this field spans decades, from contributing to JDBC/JCA implementations for WebSphere Application Server in 2004, supporting a diverse array of commercial and open-source databases, to my current fascination with cutting-edge technologies like Amazon Aurora DSQL, unveiled at re:Invent 2024.
As the storm raged on, I embarked on a different kind of adventure – one that would bridge my database experiences with the exciting frontier of AI-assisted development. Join me as I share insights from this indoor expedition, where the constraints of a stormy weekend paved the way for the integration of databases and artificial intelligence.
With just a few conversational exchanges, I was able to effortlessly conceptualize and bring to life a comprehensive database application. The process flowed seamlessly from designing a detailed PostgreSQL database schema to defining RESTful APIs, implementing a Python-based AWS Lambda function, and finally generating AWS CDK code for deployment.
What might typically require days of planning, coding, and configuration was accomplished in a matter of minutes through natural language prompts. The ability to understand context, generate code, and adapt to follow-up requests demonstrates how generative AI can significantly streamline the software development process, making it accessible even to those without deep technical expertise in every aspect of backend development and cloud infrastructure.

Prompt 1: Can you provide a sample PostgreSQL schema and tables?

Response: Certainly! I'll provide you with a sample set of PostgreSQL schemas and tables that could represent a simplified e-commerce system. This example will include multiple schemas to demonstrate a more complex database structure.

Prompt 2: This is a bit boring. Can you give another example?

Response: Certainly! Here's another example of PostgreSQL schemas and tables, this time modeling a university management system. This example will include schemas for academics, staff, facilities, and research.
While this university management schema was structurally no different from the e-commerce example, I decided to continue with it, as it took my thoughts back to my college days and made me feel younger.
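For reference, here is a minimal sketch of what such a schema might look like. This is my own illustrative reconstruction, not the exact output I received; the table and column names are assumptions:

    -- Illustrative subset of a university management schema
    CREATE SCHEMA IF NOT EXISTS academics;

    CREATE TABLE IF NOT EXISTS academics.departments (
        department_id SERIAL PRIMARY KEY,
        name          VARCHAR(100) NOT NULL UNIQUE
    );

    CREATE TABLE IF NOT EXISTS academics.students (
        student_id    SERIAL PRIMARY KEY,
        first_name    VARCHAR(50) NOT NULL,
        last_name     VARCHAR(50) NOT NULL,
        email         VARCHAR(100) NOT NULL UNIQUE,
        department_id INTEGER REFERENCES academics.departments(department_id)
    );

    CREATE TABLE IF NOT EXISTS academics.courses (
        course_id     SERIAL PRIMARY KEY,
        title         VARCHAR(200) NOT NULL,
        credits       INTEGER NOT NULL CHECK (credits > 0),
        department_id INTEGER REFERENCES academics.departments(department_id)
    );

    CREATE TABLE IF NOT EXISTS academics.enrollments (
        student_id  INTEGER REFERENCES academics.students(student_id),
        course_id   INTEGER REFERENCES academics.courses(course_id),
        enrolled_on DATE DEFAULT CURRENT_DATE,
        PRIMARY KEY (student_id, course_id)
    );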

Prompt 3: Can you list all the possible APIs for the above database schema?

Response: Certainly! Based on the university management system database schema I provided, here's a list of potential APIs that could be developed to interact with this data. I'll organize these by schema and include common CRUD (Create, Read, Update, Delete) operations as well as some more specific operations that might be useful.

Prompt 4: Can you generate an OpenAPI specification for these APIs in YAML format?

Response: Certainly! Here's an OpenAPI specification in YAML format for the APIs we discussed. Note that this is a simplified version and doesn't include all the endpoints we mentioned earlier, but it covers the main entities and operations. You can expand on this as needed.
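To give a flavor of the output, an excerpt of such a specification might look like the following. Again, this is an illustrative sketch of the shape of the spec, not the verbatim response:

    openapi: 3.0.0
    info:
      title: University Management API
      version: 1.0.0
    paths:
      /students:
        get:
          summary: List all students
          responses:
            '200':
              description: A list of students
        post:
          summary: Create a new student
          responses:
            '201':
              description: Student created
      /students/{studentId}:
        get:
          summary: Get a student by ID
          parameters:
            - name: studentId
              in: path
              required: true
              schema:
                type: integer
          responses:
            '200':
              description: The requested student
            '404':
              description: Student not found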

Prompt 5: Now generate a Python-based Lambda function to implement all the above APIs and use the RDS Data API to interface with an Aurora PostgreSQL database.

Response: Certainly! I'll create a Python-based Lambda function that implements the APIs we discussed, using the RDS Data API to interface with an Aurora PostgreSQL database. This example will provide a basic structure that you can expand upon.
First, make sure you have the necessary permissions set up for your Lambda function to access the RDS Data API. You'll also need to set up the appropriate environment variables.
Here's a Python script that implements the basic structure:
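The generated function followed roughly this shape: one handler that routes on HTTP method and path, with boto3's rds-data client doing the actual querying. Here is a condensed sketch; the environment variable names and routes are my illustrative assumptions:

    import json
    import os

    import boto3

    # Client for the RDS Data API; the environment variable names below are
    # illustrative and must match whatever the CDK stack sets on the function.
    rds_data = boto3.client("rds-data")
    CLUSTER_ARN = os.environ["CLUSTER_ARN"]
    SECRET_ARN = os.environ["SECRET_ARN"]
    DB_NAME = os.environ.get("DB_NAME", "postgres")

    def execute(sql, parameters=None):
        # Run a SQL statement against Aurora over HTTPS; no connection pooling needed.
        return rds_data.execute_statement(
            resourceArn=CLUSTER_ARN,
            secretArn=SECRET_ARN,
            database=DB_NAME,
            sql=sql,
            parameters=parameters or [],
        )

    def lambda_handler(event, context):
        method = event.get("httpMethod")
        path = event.get("path", "")

        if method == "GET" and path == "/students":
            result = execute(
                "SELECT student_id, first_name, last_name FROM academics.students"
            )
            return {"statusCode": 200, "body": json.dumps(result["records"])}

        if method == "POST" and path == "/students":
            body = json.loads(event.get("body") or "{}")
            execute(
                "INSERT INTO academics.students (first_name, last_name, email) "
                "VALUES (:first, :last, :email)",
                parameters=[
                    {"name": "first", "value": {"stringValue": body["first_name"]}},
                    {"name": "last", "value": {"stringValue": body["last_name"]}},
                    {"name": "email", "value": {"stringValue": body["email"]}},
                ],
            )
            return {"statusCode": 201, "body": json.dumps({"message": "Student created"})}

        return {"statusCode": 404, "body": json.dumps({"message": "Not found"})}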

Prompt 6: Great. Now can you generate CDK code to deploy an Aurora PostgreSQL cluster with the schema and APIs generated?

Response: Certainly! I'll create new CDK code from scratch to deploy an Aurora PostgreSQL cluster along with the schema and APIs we discussed earlier. This code will be more streamlined and focused on the core requirements. Here's the new CDK code:
While the above CDK code might look syntactically correct, there are a few glaring issues:
  1. It mixes up the serverless cluster configuration with the provisioned configuration.
  2. The API resource creation can lead to duplicate resource errors.
  3. It has not used all the APIs identified originally. Let's ignore this for now, but you can prompt it to include all the needed APIs.
Now, to move forward, all that is required is hinting and instructing the AI by pointing out the issues and the changes needed. Let me prompt it to update the CDK code to use an Aurora provisioned cluster version 16.2 with the RDS Data API enabled and to fix the API add_resource issue. (I was even able to get the Serverless v2 cluster configuration to work.)

Prompt 7: Can you update the CDK code to use an Aurora provisioned cluster version 16.2 with the RDS Data API enabled and fix the API add_resource to prevent the existing-resource error?

Response: Certainly! I'll modify the CDK code to deploy an Aurora PostgreSQL cluster version 16.2 and enable the RDS Data API. I also understand the issue you're facing with the add_resource method potentially causing errors due to existing resources. To fix this, we can modify the API Gateway setup to ensure we don't try to add the same resource twice. Here's an updated version of the CDK code:
To use this CDK code, you'll need to create two Lambda functions:
  • API Lambda (lambda/api/index.py): (use the Python code created in the earlier step)
  • Schema Creation Lambda (lambda/schema/index.py):
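For reference, the two fixes map to CDK constructs roughly as follows. This is a sketch in CDK v2 Python; the construct IDs and instance size are illustrative, and vpc, api, and api_lambda are assumed to be defined elsewhere in the stack:

    from aws_cdk import aws_apigateway as apigateway
    from aws_cdk import aws_ec2 as ec2
    from aws_cdk import aws_rds as rds

    # Inside the stack class: a provisioned Aurora PostgreSQL 16.2 cluster
    # with the RDS Data API switched on.
    cluster = rds.DatabaseCluster(
        self, "UniversityCluster",
        engine=rds.DatabaseClusterEngine.aurora_postgres(
            version=rds.AuroraPostgresEngineVersion.VER_16_2
        ),
        writer=rds.ClusterInstance.provisioned(
            "writer", instance_type=ec2.InstanceType("r6g.large")
        ),
        vpc=vpc,               # assumes a VPC construct defined earlier
        enable_data_api=True,  # this flag enables the RDS Data API
    )

    def get_or_create_resource(parent, path_part):
        # Reuse an API Gateway resource if it already exists; creating the
        # same path segment twice is what triggers the duplicate-resource error.
        existing = parent.get_resource(path_part)
        return existing if existing is not None else parent.add_resource(path_part)

    students = get_or_create_resource(api.root, "students")
    students.add_method("GET", apigateway.LambdaIntegration(api_lambda))
    students.add_method("POST", apigateway.LambdaIntegration(api_lambda))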
Now I have fairly working code, so I will stop the code generation mania and seek help deploying all of these generated code artifacts (CDK and Lambda functions).

Prompt 8: What are the deployment steps using the generated CDK, Lambda, and schema code?

Response: Certainly! Here are the step-by-step deployment instructions for your AWS CDK project with the University Management System:
1. Set up your development environment:
  • Ensure you have Node.js installed (v10.13.0 or later)
  • Install AWS CDK: npm install -g aws-cdk
  • Configure AWS CLI with your credentials: aws configure
2. Create and set up your CDK project:
  • Create a new directory: mkdir university-management-system
  • Navigate to the directory: cd university-management-system
  • Initialize a new CDK project: cdk init app --language python
3. Set up a Python virtual environment:
  • Create a virtual environment: python -m venv .venv
  • Activate the virtual environment:
    • On Windows: .venv\Scripts\activate
    • On macOS and Linux: source .venv/bin/activate
4. Install required dependencies:
  • pip install -r requirements.txt
  • Add the following to your requirements.txt file and reinstall:
    • aws-cdk.aws-ec2
    • aws-cdk.aws-rds
    • aws-cdk.aws-lambda
    • aws-cdk.aws-apigateway
5. Update your CDK stack:
  • Replace the content of university_management_system/university_management_system_stack.py with the CDK code provided earlier.
6. Create Lambda function directories and files:
  • Create directories:
    • mkdir -p lambda/schema lambda/api
  • Create lambda/schema/index.py and add the schema creation Lambda code
  • Create lambda/api/index.py and add the API Lambda code
7. Synthesize the CloudFormation template:
  • Run cdk synth in your project root directory
8. Deploy the stack:
  • Run cdk deploy in your project root directory
  • Confirm the deployment when prompted
9. Monitor the deployment:
  • The CDK will provide updates on the deployment progress
  • Once complete, it will output the API Gateway URL, Database Endpoint, and Database Secret ARN
10. Test the deployment:
  • Use the provided API Gateway URL to test your endpoints
  • You can use tools like Postman or curl to send requests to your API
11. Update and iterate:
  • As you develop your application, you may need to update the Lambda functions or add new API routes
  • After making changes, redeploy using cdk deploy
12. Clean up (when you're done):
  • To avoid incurring unnecessary costs, run cdk destroy to remove all created resources
Additional notes
  • Ensure you have sufficient permissions in your AWS account to create and manage the required resources.
  • The first deployment might take some time as it creates the Aurora cluster and other resources.
  • Always monitor your AWS costs and resources to avoid unexpected charges.
  • For production use, consider implementing additional security measures, such as API key authentication for your API Gateway.
  • Regularly update and patch your Lambda functions and database to maintain security.

Deploy the generated code

Using all the generated code and following the deployment steps above, I was able to get the cdk deploy command working (see the screenshot below) without having to make any changes to the code. In less than 600 seconds, the deployment completed, creating all the required resources in AWS (Aurora PostgreSQL cluster, Lambda functions, API Gateway resources, etc.).
cdk deploy
Verifying the CloudFormation stack confirms all the resources created.
CloudFormation Stack
Remember, this setup creates real AWS resources that may incur costs. Always review the resources being created and delete them when they're no longer needed.

Test the generated code

Let us test the schema creation Lambda to check whether it's able to create the schema for us to get started with the database. The schema creation Lambda function required a few changes:
  1. The schema creation Lambda appears to be generated as a custom resource, intended to be invoked during resource deployment through CDK/CloudFormation. For testing purposes, I simply commented out the cfnresponse.send() calls so that I could quickly test from the new VS Code-based Lambda console editor (see the sketch after this list). You can also keep it wired up so that it is invoked during resource deployment when you run cdk deploy.
  2. I also wanted the schema generation code to include sample data creation, so I ran another prompt to get it added.
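For context, the custom-resource pattern in question looks something like the sketch below. Rather than deleting the cfnresponse.send() calls, guarding them keeps the same function usable both as a CloudFormation custom resource and for ad-hoc console tests. The DDL list and environment variable names are illustrative:

    import os

    import boto3
    import cfnresponse  # auto-available only for inline (ZipFile) code; bundle it otherwise

    rds_data = boto3.client("rds-data")
    CLUSTER_ARN = os.environ["CLUSTER_ARN"]  # illustrative env var names
    SECRET_ARN = os.environ["SECRET_ARN"]

    DDL_STATEMENTS = [
        "CREATE SCHEMA IF NOT EXISTS academics",
        "CREATE TABLE IF NOT EXISTS academics.departments ("
        "department_id SERIAL PRIMARY KEY, name VARCHAR(100) NOT NULL)",
    ]

    def lambda_handler(event, context):
        try:
            for statement in DDL_STATEMENTS:
                rds_data.execute_statement(
                    resourceArn=CLUSTER_ARN,
                    secretArn=SECRET_ARN,
                    database="postgres",
                    sql=statement,
                )
            # Only report back to CloudFormation when invoked as a custom
            # resource; a console test event carries no ResponseURL.
            if "ResponseURL" in event:
                cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
        except Exception as err:
            if "ResponseURL" in event:
                cfnresponse.send(event, context, cfnresponse.FAILED, {"Error": str(err)})
            raise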

Prompt 9: Can you update the schema generation Lambda function to include some code to ingest sample data into all the tables?

Response: Certainly! I'll update the schema generation Lambda function to include sample data ingestion for all the tables. Here's the modified version of the Lambda function:
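The updated function essentially appended a list of INSERT statements after the DDL, along these lines (reusing the rds_data client and ARNs from the sketch above; the sample rows are made up):

    # Illustrative sample rows; ON CONFLICT DO NOTHING makes the ingestion
    # safe to re-run against an already-populated database.
    SAMPLE_DATA_STATEMENTS = [
        "INSERT INTO academics.departments (name) "
        "VALUES ('Computer Science') ON CONFLICT DO NOTHING",
        "INSERT INTO academics.students (first_name, last_name, email, department_id) "
        "VALUES ('Ada', 'Lovelace', 'ada@example.edu', 1) ON CONFLICT DO NOTHING",
        "INSERT INTO academics.courses (title, credits, department_id) "
        "VALUES ('Databases 101', 3, 1) ON CONFLICT DO NOTHING",
    ]

    def ingest_sample_data():
        # Call this from lambda_handler after the DDL loop, so the tables exist.
        for statement in SAMPLE_DATA_STATEMENTS:
            rds_data.execute_statement(
                resourceArn=CLUSTER_ARN,
                secretArn=SECRET_ARN,
                database="postgres",
                sql=statement,
            )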
After these changes, I invoked the Lambda with a test payload and realized that the function did not have the required permissions to access the Aurora PostgreSQL cluster using the RDS Data API. Now, let us do one last prompt for the day to add the IAM role to the CDK code.

Prompt 10: Can you update the CDK code to include a role for the schema creation Lambda to perform rds-data:ExecuteStatement on the Aurora cluster?

Response: Certainly! I'll update the CDK code to include a specific IAM role for the schema creation Lambda function, granting it the necessary permissions to execute statements using the RDS Data API. Here's the updated CDK code:
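In CDK terms, the grant boils down to a policy statement on the Lambda role plus read access to the cluster's secret, since the Data API authenticates through Secrets Manager. A sketch, assuming the cluster and schema_lambda constructs from earlier:

    from aws_cdk import aws_iam as iam

    # Explicit statement mirroring what the prompt asked for:
    schema_lambda.add_to_role_policy(iam.PolicyStatement(
        actions=["rds-data:ExecuteStatement", "rds-data:BatchExecuteStatement"],
        resources=[cluster.cluster_arn],
    ))

    # The Data API authenticates via the cluster's Secrets Manager secret,
    # so the function also needs read access to it.
    cluster.secret.grant_read(schema_lambda)

    # Alternatively, CDK wraps both grants in a single helper:
    # cluster.grant_data_api_access(schema_lambda)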
I repeated the same for the API Lambda function and updated the code accordingly.
Now I just need to run another cdk deploy for the new changes in the CDK code (the IAM role for the Lambda) to take effect, and then I can test the Lambda function to confirm that it can create the schema successfully.
Schema Creation Lambda
I can also validate the sample data using the RDS Query Editor feature:
RDS Query Editor
Let us then test the API Lambda function as well:
Testing API Lambda Handler
The final test is to invoke the APIs and verify that they work for both GET and POST requests.
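For example, using the API Gateway URL from the stack outputs (the URL, path, and payload below are placeholders):

    # Replace the URL with the API Gateway URL output by cdk deploy
    API_URL="https://abc123.execute-api.us-east-1.amazonaws.com/prod"

    # POST: create a student
    curl -X POST "$API_URL/students" \
      -H "Content-Type: application/json" \
      -d '{"first_name": "Ada", "last_name": "Lovelace", "email": "ada@example.edu"}'

    # GET: list students
    curl "$API_URL/students"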
Testing APIs using curl request
And validate the results from the RDS Query Editor as well!
RDS Query Editor

Conclusion

By leveraging AI-assisted code generation, I have accomplished in a matter of minutes what might traditionally take days or even weeks of manual coding and configuration. While the generated code still requires human oversight, testing, and customization to meet specific business needs, the productivity gains are substantial. Developers can now start from a much more advanced baseline, allowing them to deliver more value in less time.
This approach doesn't replace the need for skilled developers but rather augments their capabilities. It shifts the developer's role towards higher-level system design, business logic implementation, and fine-tuning performance and security – areas where human expertise and creativity are most valuable.
As generative AI continues to evolve, we can expect even more sophisticated code generation capabilities, further revolutionizing the software development lifecycle and enabling teams to build complex, scalable systems with unprecedented speed and efficiency.
Happy Code Generation :-)
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
