AI Lends a Hand: Using AI to Assist with Open Source Development
As generative AI becomes more prevalent, I am finding ways to utilize it to improve the efficiency of my open-source work. I wanted to provide some examples of how integrating generative AI can accelerate development.
Abhijit Rajeshirke
Amazon Employee
Published Sep 11, 2024
In this article, I demonstrate how I have begun using AWS's generative AI capabilities to speed up development on my open-source projects, boto-formatter and resource-lister. These projects are part of the aws-samples and aws-labs GitHub repositories. As generative AI is being increasingly integrated into many modern activities, I am started experimenting and leveraging it to accelerate my own open-source work . I wanted to share some examples of how generative AI can help boost development efficiency. This is just the beginning of my learning journey with leveraging these powerful technologies.
Before going into details, lets understand what’s Boto-formatter. Boto-formatter is a Python library that streamlines working with the paginated responses returned by Amazon's Boto3 SDK for AWS. It wraps the Boto3 responses in an easy-to-use interface that handles pagination automatically and flattens the data into a consistent structure.
Specifically, boto-formatter saves developers effort in three ways:
1) It abstracts away pagination logic, so you don't have to write code to deal with paginated responses - the library handles it automatically.
2) It standardizes the variable column names that can occur in Boto3 results. The flattened responses have consistent keys regardless of the AWS service.
3) It provides helper functions to output the standardized results as CSV, JSON or other formats with just a few lines of code.
In short, for the 97 AWS services currently supported, boto-formatter reduces boilerplate code for fetching, paginating, and normalizing Boto3 responses. This simplifies reporting and analysis use cases with AWS data. The library is configuration-driven and extensible to any Boto3 list_* method, as outlined in its GitHub documentation boto-formatter
I recently released a new version of my boto-formatter library with a new "magic formatter" capability. This new feature enables chaining responses, allowing the library to handle use cases like listing all S3 buckets with S3 Inventory enabled or retrieving configuration details for all EMR clusters.
1. boto-formatter uses configuration file for each AWS services that it supports. I used Amazon Bedrock and Anthropic Claude V3 to automatically format and generate JSON configuration files for boto-formatter. This sped up a traditionally manual and tedious process.
2. I leveraged Amazon Q Developer throughout the development lifecycle to fix bugs, optimize code, and identify security vulnerabilities. This improved overall software quality and security.
The boto-formatter uses configuration files for each AWS service that contain sample flatten responses as outlined in the public boto3 API documentation. To add support for a new AWS service function, we simply need to create a configuration file for that service following the same format. The file should include a sample JSON response representing the anticipated output of the function based on the boto3 documentation. By leveraging these configuration files as examples, boto-formatter can easily be extended to handle new AWS services as needed.
https://github.com/awslabs/boto-formatter/tree/main/src/boto_formatter/service_config_mgr/service_configs
Generate and format the JSON Configuration files for specific AWS resource function there are two main steps, first step is getting the function’s reference response from boto3 public documentation and second step is format and customized this response for specific to boto-formatter library as layed out in below diagram.
Step 1: To get the function’s reference response from boto3 public documentation. I used Python code with the BeautifulSoup and urllib libraries to scrape the reference response for the function from the Boto3 public documentation. As an example, here is some sample code that retrieves the reference response for a Boto3 function by scraping the public documentation pages.
Sample response from boto3- public documentation is not formatted (see the comma after first element in Buckets array) and it’s just for reference purpose. Example
Step 2: Generate the valid Configuration JSON with help of LLM
In this step create Valid JSON from this response and format which is aligned with boto-formatter specific rules using prompt engineering .
Example Sample Prompt: Correct the errors in the string below and format into a JSON document. Go over each JSON key-value pair, and if the value is True or False, convert it to a "string". If the value is a datetime, convert it to a "string". If the value is a number, convert it to a "string". Ensure to return the formatted Json only
The output will be formatted as valid JSON, aligned with the formatting rules of boto-formatter. For example, all elements will be converted to strings and other errors will be removed.
The second AI service I used is Amazon Q Developer. I have installed the Amazon Q Developer Plugin for Visual Studio Code. Here is the documentation for setting up the Amazon Q Developer Plugin: https://marketplace.visualstudio.com/items?itemName=AmazonWebServices.amazon-q-vscode
Code Optimization:
For example, when optimizing a particular function, you can right click and select "Optimize" to get suggestions for improvements. Previously, in my get_prompt() function, I was not using f-strings for formatting strings. Amazon Q Developer suggested that I use f-strings instead and provided an example of how to do so.
Troubleshooting code errors:
On several occasions, I have utilized Amazon Q Developer to debug errors and troubleshoot issues in my code. Most recently, I ran into the error "AttributeError: 'str' object has no attribute 'read'" because I was mistakenly calling json.load rather than json.loads. Amazon Q Developer suggested fixing this by using json.loads instead, which resolved the attribute error I was seeing. The main difference being that json.load expects a file-like object to deserialize while json.loads takes a string as input. Switching to json.loads addressed my situation where I was passing a string rather than a file.
Identifying Security Vulnerabilities and other issues:
You can right click and send code to the Amazon Q developer prompt and write prompt something like "Identify security issues " to have it analyze the code for potential security vulnerabilities. The Amazon Q developer tool will review the code and provide recommendations for security issues it identifies. You can then go over these recommendations, review them, and implement any necessary changes to address the identified vulnerabilities.
Generative AI is progressing quickly, so please consult the most up-to-date documentation and guidance on the topic.
https://aws.amazon.com/ai/generative-ai
https://aws.amazon.com/q/developer/
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.