
Amazon Q Developer CLI Chat: AWS Translate vs Amazon Bedrock LLMs Comparison

Discover how I used Amazon Q chat to compare translation services without writing any code, revealing that Bedrock LLMs can offer up to 99% cost savings over AWS Translate.

Gonzalo Vásquez
Amazon Employee
Published May 28, 2025
Last Modified May 29, 2025
In this blog post, I'll share my recent exploration comparing AWS Translate with various Amazon Bedrock LLMs for translation tasks, a project completed entirely through Amazon Q Developer CLI chat. This conversational AI assistant helped me iterate through multiple script versions, handle errors, and produce a comprehensive analysis without my writing a single line of code manually.

Project Background

Machine translation is a critical component for businesses operating globally. While AWS Translate has been a reliable service for years, the emergence of large language models (LLMs) in Amazon Bedrock presents new possibilities for translation tasks. I wanted to understand if these newer models could provide comparable or better translations at a potentially lower cost.

The Power of Amazon Q chat

What makes this project unique is that I completed it entirely through conversation with Amazon Q chat. Instead of manually writing Python scripts, debugging code, and handling exceptions, I simply described what I wanted to accomplish, and Q chat generated, modified, and executed the code for me.

Setting Up the Test Environment

I began by asking Q chat to create a Python virtual environment and install the necessary dependencies:
```bash
python -m venv translation_venv
source translation_venv/bin/activate
pip install boto3 tabulate
```
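
Everything the scripts did went through two boto3 clients, one for AWS Translate and one for the Bedrock runtime. Here's a minimal sketch of that setup (the region is my assumption; use one where the Bedrock models you want are enabled):

```python
import boto3

# Region is an assumption; pick one where your Bedrock models are enabled.
translate_client = boto3.client("translate", region_name="us-east-1")
bedrock_client = boto3.client("bedrock-runtime", region_name="us-east-1")
```
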
Next, I had Q chat create a dataset of 10 product descriptions across various categories including Electronics, Home & Kitchen, Clothing, Beauty, Sports & Outdoors, Toys & Games, Pet Supplies, Books, Office Products, and Automotive. It's worth noting that even these product descriptions and categories were generated by Q chat — I didn't have to write a single product description manually!
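
The dataset itself was just a JSON file. I haven't reproduced the generated descriptions here, but structurally it looked something like this sketch (field names and sample text are illustrative, not Q chat's actual output):

```python
import json

# Illustrative structure only; the real descriptions and all ten categories
# were generated by Q chat during the session.
products = [
    {"category": "Electronics",
     "description": "Wireless earbuds with active noise cancellation."},
    {"category": "Home & Kitchen",
     "description": "Stainless steel French press that brews four cups."},
]

with open("products.json", "w") as f:
    json.dump(products, f, indent=2)
```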

Iterative Development Through Conversation

One of the most powerful aspects of using Q chat was the iterative development process. When errors occurred, Q chat would analyze them and suggest fixes. Here's how the iteration process worked:
1. Initial Script Creation: Q chat generated a basic script to test AWS Translate against a few Bedrock models.
2. Error Handling Iterations: When the script encountered validation errors with certain models, Q chat modified it to abort on errors and save partial results, allowing us to analyze what went wrong.
3. Model-Specific Format Adjustments: Through multiple iterations, Q chat adjusted the request formats for different model families:
• Modified Claude models to use their specific message format
• Updated Amazon Nova models to use the correct content array structure
• Fixed Jamba models to use messages instead of prompt
• Adjusted Command R+ models to use message instead of prompt
4. Parallel Processing Implementation: After getting the basic script working, Q chat enhanced it with parallel processing capabilities to improve efficiency.
Each iteration built upon the previous one, with Q chat learning from errors and improving the code without requiring me to write a single line manually.
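
To give a sense of what those model-specific adjustments involved, here's a hedged sketch of the kind of per-family request builder the script converged on. The field names follow the publicly documented Bedrock request schemas, but treat this as illustrative rather than a copy of Q chat's actual code:

```python
import json

def build_request(model_id: str, prompt: str) -> str:
    """Build an invoke_model body for the given model family (sketch)."""
    if model_id.startswith("anthropic.claude"):
        # Claude models use the Anthropic messages format.
        body = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        }
    elif model_id.startswith("amazon.nova"):
        # Nova models expect content as an array of objects with text fields.
        body = {"messages": [{"role": "user", "content": [{"text": prompt}]}]}
    elif model_id.startswith("ai21.jamba"):
        # Jamba models take messages instead of a bare prompt.
        body = {"messages": [{"role": "user", "content": prompt}]}
    elif model_id.startswith("cohere.command-r"):
        # Command R / R+ models take a single message field.
        body = {"message": prompt}
    else:
        # Other families (Titan, Mistral, Command Light, ...) have their own
        # schemas, omitted here for brevity.
        raise ValueError(f"No request builder for {model_id}")
    return json.dumps(body)
```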

The Testing Framework

The final testing framework that Q chat developed did the following:
1. Loaded product descriptions from a JSON file
2. Translated each description using AWS Translate
3. Translated each description using multiple Bedrock LLMs in parallel
4. Recorded translation time, cost, and success rate
5. Generated a comprehensive comparison report
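
As a rough sketch of how steps 2 through 4 can be wired together, reusing the bedrock_client and build_request names from the earlier sketches (the target language and worker count are my assumptions, not details from the original scripts):

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def translate_with_model(model_id: str, text: str) -> dict:
    """Translate one description with one Bedrock model and time the call."""
    # Target language is an assumption; the post doesn't state the language pair.
    prompt = f"Translate this product description to Spanish:\n\n{text}"
    start = time.perf_counter()
    response = bedrock_client.invoke_model(
        modelId=model_id, body=build_request(model_id, prompt)
    )
    elapsed = time.perf_counter() - start
    return {"model": model_id, "seconds": elapsed, "raw": response["body"].read()}

def run_benchmark(products: list[dict], model_ids: list[str]) -> list[dict]:
    """Fan every (product, model) pair out to a thread pool and collect results."""
    results = []
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [
            pool.submit(translate_with_model, m, p["description"])
            for p in products
            for m in model_ids
        ]
        for future in as_completed(futures):
            results.append(future.result())
    return results
```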

Overcoming Challenges Through Iteration

During testing, we encountered several challenges that Q chat helped resolve through iterative improvements:
1. API Format Inconsistencies: When Amazon Nova models failed with the error “expected type: JSONArray, found: String”, Q chat modified the request format to use an array of objects with text fields.
2. Throttling Issues: When Claude 3.5 Sonnet experienced throttling, Q chat added retry logic and continued with other models rather than aborting the entire test.
3. Model Availability: When Llama 3.1 405B Instruct returned validation exceptions, Q chat created a filtered model list excluding unavailable models.
4. Response Parsing: Q chat developed custom parsing logic for each model family's unique response format.
Each error was an opportunity for Q chat to improve the script, demonstrating the power of conversational AI for iterative development.
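
For the throttling fix in particular, the standard pattern is exponential backoff around the Bedrock call. Here's a minimal sketch of that idea (I didn't record the exact retry parameters Q chat chose):

```python
import time
from botocore.exceptions import ClientError

def invoke_with_retry(model_id: str, body: str, max_attempts: int = 5):
    """Retry a Bedrock call on throttling with exponential backoff (sketch)."""
    for attempt in range(max_attempts):
        try:
            return bedrock_client.invoke_model(modelId=model_id, body=body)
        except ClientError as err:
            if err.response["Error"]["Code"] != "ThrottlingException":
                raise  # validation and other errors should surface immediately
            time.sleep(2 ** attempt)  # back off 1s, 2s, 4s, ...
    raise RuntimeError(f"{model_id} still throttled after {max_attempts} attempts")
```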

Test Results

After running the tests across all 10 product descriptions, Q chat compiled the results into a summary report. Here are the key findings:

Cost Comparison

Top 5 Most Cost-Effective Models:
1. Jamba 1.5 Mini: $0.000079 per 1K chars (100% success rate)
2. Nova Micro: $0.000087 per 1K chars (100% success rate)
3. Titan Text G1 - Lite: $0.000139 per 1K chars (100% success rate)
4. Mistral 7B Instruct: $0.000146 per 1K chars (100% success rate)
5. Command Light: $0.000181 per 1K chars (100% success rate)
AWS Translate Cost:
• $0.015000 per 1K chars (100% success rate)
• Ranked 22nd out of 22 models (most expensive option)

Speed Comparison

Top 5 Fastest Models:
1. AWS Translate: 0.3284 seconds avg (100% success rate)
2. Nova Micro: 0.9795 seconds avg (100% success rate)
3. Nova Lite: 1.0862 seconds avg (100% success rate)
4. Nova Pro: 1.2361 seconds avg (100% success rate)
5. Command R: 1.3375 seconds avg (100% success rate)

Key Insights

1. Cost Efficiency: Bedrock LLMs are significantly more cost-effective than AWS Translate for translation tasks. Jamba 1.5 Mini is approximately 190 times cheaper than AWS Translate per 1,000 characters.
2. Speed Trade-off: AWS Translate remains the fastest option, processing translations about 3 times faster than the quickest Bedrock model (Nova Micro).
3. Balanced Performance: Nova Micro stands out as offering an excellent balance between cost and speed, being the second most cost-effective and second-fastest model in our tests.
4. Success Rate: Most models achieved a 100% success rate, with only Claude 3.5 Sonnet experiencing throttling issues.
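
To make the cost gap concrete: translating one million characters costs $15.00 at AWS Translate's $0.015 per 1K characters, versus roughly $0.08 at Jamba 1.5 Mini's $0.000079 per 1K characters, a saving of about 99.5%.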

The Power of Conversational Development

What impressed me most about this project was how Q chat handled the entire development cycle:
1. Error-Driven Iteration: Each error led to an improved version of the script
2. Complex Problem-Solving: Q chat figured out the correct request formats for each model family
3. Parallel Processing Implementation: Q chat optimized the script for efficiency
4. Data Analysis: Q chat generated comprehensive reports and visualizations
All of this was accomplished through natural language conversation, without me writing a single line of code manually.

Recommendations

Based on the test results, here are my recommendations:
• For cost-sensitive applications: Use Jamba 1.5 Mini
• For time-sensitive applications: Use AWS Translate
• For balanced performance: Consider Nova Micro

Conclusion

This exploration revealed two important things:
1. Amazon Bedrock LLMs offer a compelling alternative to AWS Translate for translation tasks, particularly when cost is a primary concern. The significant cost savings (up to 99%) make these models worth considering, especially for high-volume translation needs.
2. Amazon Q chat represents a paradigm shift in how we can approach development tasks. By handling the entire development cycle through conversation — from initial script creation through multiple iterations to fix errors with model-specific formats — Q chat demonstrated how AI assistants can dramatically accelerate development work.
For organizations already using AWS Translate, it may be worth exploring a hybrid approach where time-sensitive translations use AWS Translate, while bulk translations that are less time-sensitive leverage more cost-effective Bedrock models like Jamba 1.5 Mini or Nova Micro.
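
As an illustration of what that hybrid routing could look like (the function name, language codes, and model ID are my assumptions, reusing helpers from the earlier sketches):

```python
import json

def hybrid_translate(text: str, latency_sensitive: bool) -> str:
    """Route between AWS Translate and a cheap Bedrock model (illustrative)."""
    if latency_sensitive:
        # AWS Translate was roughly 3x faster than the quickest Bedrock model.
        result = translate_client.translate_text(
            Text=text, SourceLanguageCode="en", TargetLanguageCode="es"
        )
        return result["TranslatedText"]
    # Bulk, non-urgent work goes to the most cost-effective model.
    model_id = "ai21.jamba-1-5-mini-v1:0"
    response = invoke_with_retry(
        model_id, build_request(model_id, f"Translate to Spanish:\n\n{text}")
    )
    payload = json.loads(response["body"].read())
    return payload["choices"][0]["message"]["content"]
```
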
Have you experimented with using Amazon Q chat for development tasks or LLMs for translation? I'd love to hear about your experiences in the comments below!
PS: This whole blog post was generated by Q chat too; at the end of the session, I asked it to generate this content.

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
