
Exploring Claude 3.7 Sonnet's Hybrid Reasoning on Amazon Bedrock
Discover how to leverage Claude 3.7 Sonnet's hybrid reasoning capabilities on Amazon Bedrock with practical Python examples comparing standard and extended thinking modes.
✨ What Makes Claude 3.7 Sonnet Special?
- Hybrid Reasoning - A single model that can toggle between standard responses and detailed reasoning
- Extended Thinking Mode - Analyses problems in detail with transparent step-by-step thinking
- Adjustable Reasoning Budget - Control how many tokens are allocated to the thinking process
- Massive Output Capacity - Up to 15x longer output than predecessor models (up to 128K tokens)
- Enhanced Coding Capabilities - Industry-leading performance on coding benchmarks
Before running the examples, you'll need:
- An AWS account with access to Amazon Bedrock
- AWS CLI installed and configured with appropriate permissions
- Python 3.x installed
- The latest version of boto3 and AWS CLI
To enable access to the model:
1. Navigate to the Amazon Bedrock console
2. Go to "Model access" under "Bedrock configurations"
3. Select "Modify model access" and request access for Claude 3.7 Sonnet
Once access is granted, you can confirm the model is visible in your Region with the quick check after the Region list below.
At the time of writing, Claude 3.7 Sonnet is available in these AWS Regions:
- us-east-1 (N. Virginia)
- us-east-2 (Ohio)
- us-west-2 (Oregon)
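As a quick sanity check, you can verify your boto3 version and list the Anthropic models visible in your Region. This is a minimal sketch; it assumes your credentials are already configured:

```python
import boto3

print(boto3.__version__)  # Claude 3.7 Sonnet support requires a recent boto3

# List Anthropic models visible in your Region to confirm availability
bedrock = boto3.client("bedrock", region_name="us-east-1")
for model in bedrock.list_foundation_models(byProvider="Anthropic")["modelSummaries"]:
    if "claude-3-7" in model["modelId"]:
        print(model["modelId"])
```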
🧠 Example 1: Comparing Standard and Extended Thinking Modes

To compare the two modes, I sent the same question once with default settings and once with extended thinking enabled:

"What would be the impact on global sea levels if all ice in Greenland melted?"
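Here is a minimal sketch of the comparison using the Bedrock Converse API. The Region, the cross-region inference profile ID, and the 4,000-token budget are illustrative choices; adjust them for your setup:

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")
MODEL_ID = "us.anthropic.claude-3-7-sonnet-20250219-v1:0"

messages = [{
    "role": "user",
    "content": [{"text": "What would be the impact on global sea levels "
                          "if all ice in Greenland melted?"}],
}]

# Standard mode: a plain Converse call with default inference parameters
standard = client.converse(modelId=MODEL_ID, messages=messages)

# Extended thinking mode: enable reasoning and allocate a token budget
extended = client.converse(
    modelId=MODEL_ID,
    messages=messages,
    inferenceConfig={"maxTokens": 8000},  # must exceed budget_tokens
    additionalModelRequestFields={
        "thinking": {"type": "enabled", "budget_tokens": 4000}
    },
)

# Extended thinking responses interleave reasoning and text content blocks
for block in extended["output"]["message"]["content"]:
    if "reasoningContent" in block:
        print("THINKING:", block["reasoningContent"]["reasoningText"]["text"])
    elif "text" in block:
        print("ANSWER:", block["text"])
```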
Output Analysis: Standard vs. Extended Thinking

I received two distinctly different responses:
- Detailed quantification: The thinking process references "2.85-3 million cubic kilometers of ice" - a detail not included in either final response.
- Structured approach: Claude organises its thoughts into numbered points (timeframe, global impact, current situation, comparison) before synthesising a more readable response.
- Self-instruction: Claude tells itself "In providing my answer, I'll focus on the estimated sea level rise..." showing how it plans its final response.
- Tone differences: The standard mode response is more direct and confident, while the extended thinking shows a more deliberative, academic approach weighing facts.
- Token usage: Extended thinking used 532 output tokens compared to 161 for standard mode - more than 3 times as many tokens (the snippet after this list shows where these numbers come from).
- Different final format: The extended thinking mode response uses a different structure with a more conversational ending, asking if the user would like more elaboration on any aspect.
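The token counts above come from the `usage` block that the Converse API returns with every response. A small sketch, reusing the `standard` and `extended` responses from the earlier example:

```python
# Every Converse response includes a usage block with token counts
for label, resp in (("standard", standard), ("extended", extended)):
    usage = resp["usage"]
    print(f"{label}: {usage['outputTokens']} output tokens, "
          f"{usage['totalTokens']} total")
```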
🔧 Example 2: Tool Use with Reasoning

The second example combines extended thinking with tool use. I gave Claude a simple calculator tool and asked:

"I need to calculate the compound interest on an investment of $5,000 with an annual interest rate of 6.5% compounded monthly for 8 years."
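Below is a sketch of how such a calculator tool can be declared and passed to the model. The tool name, input schema, and token budget are illustrative; it reuses the `client` and `MODEL_ID` from Example 1:

```python
# A simple calculator tool the model can call
tool_config = {
    "tools": [{
        "toolSpec": {
            "name": "calculator",
            "description": "Evaluates an arithmetic expression and returns the result.",
            "inputSchema": {"json": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "The expression to evaluate",
                    }
                },
                "required": ["expression"],
            }},
        }
    }]
}

messages = [{
    "role": "user",
    "content": [{"text": "I need to calculate the compound interest on an "
                          "investment of $5,000 with an annual interest rate "
                          "of 6.5% compounded monthly for 8 years."}],
}]

response = client.converse(
    modelId=MODEL_ID,
    messages=messages,
    toolConfig=tool_config,
    inferenceConfig={"maxTokens": 8000},
    additionalModelRequestFields={
        "thinking": {"type": "enabled", "budget_tokens": 4000}
    },
)
```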
Output Analysis: Tool Use with Reasoning

- Structured Problem Solving: Claude first breaks down the problem, identifies the formula and values needed, and realises it needs computational help.
- Appropriate Tool Selection: It correctly determines that the calculator tool is needed to evaluate the complex expression.
- Formula Translation: Claude correctly translates the mathematical formula A = P(1 + r/n)^(nt) into a calculable expression.
- Complete Response: After receiving the calculation result, Claude formats a clear, comprehensive response that explains both the result and how it was calculated.
- Tool Use Steps: The code demonstrates the full life cycle of tool use - from thinking, to tool request, to result processing, to final response.
🔍 Important Implementation Details

- Reasoning and Inference Parameters: Reasoning is not compatible with `temperature`, `top_p`, or `top_k` modifications, nor with forced tool use. When comparing standard and reasoning modes, I used default values for these parameters to ensure a fair comparison.
- Budget Tokens: You must specify how many tokens to allocate for reasoning via the `budget_tokens` parameter. The minimum is 1,024 tokens, but 4,000+ tokens are recommended for complex problems.
- Max Tokens Requirement: The `maxTokens` value must be higher than `budget_tokens`. A good rule of thumb is to set `maxTokens` at least twice as high as `budget_tokens`.
- Filtered Content in Follow-ups: When using tool results in a follow-up request, you must filter out the `reasoningContent` blocks from the previous response to avoid validation errors.
- Tool Config in Follow-ups: When sending tool results back to the model, you must include the same `toolConfig` in the follow-up request.
- Python Exponentiation: Note that Claude uses `^` for exponentiation in mathematical expressions, but Python uses `**`. The sketch after this list handles this conversion.
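To make these points concrete, here is a sketch of the result-processing half of the tool-use cycle, continuing the Example 2 request. `eval` stands in for a real expression parser, and the message shapes follow the Converse API's `toolUse`/`toolResult` conventions:

```python
output_message = response["output"]["message"]

if response.get("stopReason") == "tool_use":
    # Pull out the toolUse block the model emitted
    tool_use = next(b["toolUse"] for b in output_message["content"]
                    if "toolUse" in b)

    # Claude writes ^ for exponentiation; Python needs **
    expression = tool_use["input"]["expression"].replace("^", "**")
    result = eval(expression)  # ~8398.35 for this prompt; use a safe parser in production

    # Drop reasoningContent blocks before echoing the assistant turn back,
    # otherwise the follow-up request fails validation
    assistant_content = [b for b in output_message["content"]
                         if "reasoningContent" not in b]

    follow_up = client.converse(
        modelId=MODEL_ID,
        messages=messages + [
            {"role": "assistant", "content": assistant_content},
            {"role": "user", "content": [{"toolResult": {
                "toolUseId": tool_use["toolUseId"],
                "content": [{"json": {"result": result}}],
            }}]},
        ],
        toolConfig=tool_config,  # must match the original request
        inferenceConfig={"maxTokens": 8000},
        additionalModelRequestFields={
            "thinking": {"type": "enabled", "budget_tokens": 4000}
        },
    )
    print(follow_up["output"]["message"]["content"][-1]["text"])
```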
These hybrid reasoning capabilities are particularly valuable for:
- Educational Tools: Showing students the step-by-step reasoning process for solving complex problems
- Research Assistance: Breaking down complex research questions into logical components
- Math and Science Problem Solving: Tackling multi-step calculations with transparent working
- Decision Making Transparency: Understanding how AI arrives at recommendations or conclusions
- Complex Planning: Creating detailed plans with clear reasoning behind each step
From my experiments, a few best practices stand out:
- Adjust Budget Based on Complexity: Use higher reasoning budgets (6,000+ tokens) for very complex problems and lower budgets for simpler ones (see the helper sketch after this list).
- Explicitly Request Step-by-Step Thinking: When you want detailed reasoning, phrases like "Think step by step" or "Show your work" can help guide the model.
- Consider Performance Trade-offs: Extended thinking increases token usage and response time, so use it strategically when deeper reasoning is valuable.
- Examine Thinking Process for Verification: The thinking process can reveal potential issues in the model's reasoning that might not be apparent in the final response.
- Code Defensively: Handle different response structures and potential errors when working with reasoning and tool use in production code.
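One way to encode the budget guidance above is a small convenience function. `thinking_config` is a hypothetical helper, not part of boto3; it simply bundles the budget with the "maxTokens at least 2x budget_tokens" rule of thumb:

```python
def thinking_config(budget_tokens: int) -> dict:
    """Build Converse kwargs for a given reasoning budget,
    keeping maxTokens at twice budget_tokens."""
    return {
        "inferenceConfig": {"maxTokens": budget_tokens * 2},
        "additionalModelRequestFields": {
            "thinking": {"type": "enabled", "budget_tokens": budget_tokens}
        },
    }

# Simple question: modest budget. Hard multi-step problem: 6,000+ tokens
response = client.converse(modelId=MODEL_ID, messages=messages,
                           **thinking_config(6000))
```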
Claude 3.7 Sonnet's hybrid reasoning on Amazon Bedrock makes it a compelling option for transparent, step-by-step problem-solving.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.