
Anthropic's Claude 3.7 thinking... 🤔 with multimodal content in Amazon Bedrock

From generalizing the birthday problem to understanding a 22-page academic paper in seconds:

Enrique Rodriguez
Amazon Employee
Published Feb 26, 2025

Claude 3.7 Multimodal with Reasoning Examples

This blog demonstrates Claude 3.7's multimodal capabilities combined with explicit reasoning, showcasing three different types of interactions: text, image, and document analysis.

Examples

1. Text Reasoning

text_thinking.ipynb demonstrates Claude's ability to solve complex problems through step-by-step reasoning:
  • Probability calculations
  • Computer science concepts
  • Mathematical problem-solving
from claude_think import ClaudeThink

# Ask a generalized birthday-problem question and stream back the model's
# reasoning along with its final answer.
ct = ClaudeThink()
reasoning, answer = ct.converse_stream([{
    "text": "What is the minimum number of people needed so that there is at least "
            "a 50% probability that three of them share the same birthday?"
}])
Thinking...
This problem can be understood as the birthday problem, extended to finding at least 3 people who share a birthday with a probability of at least 50%.
Let's set it up the same way as the classic birthday problem. Suppose there are n people in the room and we want to calculate the probability that at least 3 of them share a birthday.
Assume there are 365 possible birthdays (ignoring leap years for simplicity).
It is easier to first compute the complementary probability: the probability that NO day is shared by 3 or more people. For that to happen, each birthday can have at most 0, 1, or 2 people.
This is a balls-and-boxes distribution (people and days), where we want to distribute n people across 365 days so that no day gets more than 2 people.
This calculation gets complicated, so let's try another approach...
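Claude's full reasoning continues beyond this excerpt. As an independent sanity check (not part of the notebook), a small Monte Carlo simulation in Python can estimate the same quantity; the group sizes probed below are illustrative, and the known exact answer for the three-person case is 88 people:

import random

def prob_three_share(n, trials=10_000):
    """Estimate the probability that at least 3 of n people share a birthday."""
    hits = 0
    for _ in range(trials):
        counts = [0] * 365
        for _ in range(n):
            counts[random.randrange(365)] += 1
        if max(counts) >= 3:
            hits += 1
    return hits / trials

# Probe a few group sizes around the expected threshold; the estimate
# should cross 0.5 near n = 88.
for n in (80, 85, 88, 90):
    print(n, prob_three_share(n))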

2. Image Analysis

image_thinking.ipynb shows how Claude can:
  • Interpret complex graphs
  • Extract quantitative information
  • Reconstruct visualizations with Python code
from claude_think import ClaudeThink

ct = ClaudeThink()
...
# Send the image bytes together with the instruction as a multimodal message.
content = [
    {"text": "Explain this image and reconstruct it with Python"},
    {
        "image": {
            "format": "png",
            "source": {"bytes": file_bytes},
        }
    },
]
reasoning, answer = ct.converse_stream(content)
Thinking...

This image is a graph showing the AIME (AI for Mathematics Education) accuracy during training of a DeepSeek-R1-Zero model. Let me break it down:

Title: "DeepSeek-R1-Zero AIME accuracy during training"
X-axis: "Steps" (training steps/iterations) ranging from 0 to about 8000
Y-axis: "Accuracy" ranging from approximately 0.15 to 0.9

The graph shows multiple lines:

Blue line with circle markers: labeled as "r1-zero-pass@1" - starts at around 0.15 accuracy and gradually increases to about 0.7 by the end of training
Red line with circle markers: labeled as "r1-zero-cons@16" - starts at around 0.25 accuracy and increases to about 0.87 by the end of training
Green dashed horizontal line: labeled as "o1-0912-pass@1" - appears to be a baseline at around 0.75 accuracy
Purple dashed horizontal line: labeled as "o1-0912-cons@64" - appears to be another baseline at around 0.83 accuracy

Both the red and blue lines show training progression with fluctuations, but overall upward trends in accuracy as training proceeds.

To reconstruct this with Python, I'll need to use matplotlib to create a similar plot. I'll create some simulated data that resembles what's shown in the graph:

Final Answer:
Image Explanation

This image shows a graph titled "DeepSeek-R1-Zero AIME accuracy during training." It's tracking the accuracy of an AI model on AIME (American Invitational Mathematics Examination) problems during training.

The graph contains:

X-axis: Training steps from 0 to approximately 8000
Y-axis: Accuracy score from about 0.15 to 0.9
Two training curves:
Red line labeled "r1-zero-cons@16" that starts around 0.25 and improves to about 0.87
Blue line labeled "r1-zero-pass@1" that starts lower (around 0.15) and reaches about 0.7
Two horizontal reference lines:
Green dashed line labeled "o1-0912-pass@1" at about 0.75
Purple dashed line labeled "o1-0912-cons@64" at about 0.83

The graph shows the progression of model accuracy over training steps, with the red line (consensus-based evaluation) consistently outperforming the blue line (pass@1 evaluation).
Python Code to Recreate the Graph...
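The generated code is truncated above. For a rough idea of what such a reconstruction might look like, the sketch below plots simulated data with the shapes and labels Claude describes (the values are made up; only the labels and approximate levels come from the description):

import numpy as np
import matplotlib.pyplot as plt

# Simulated curves that only approximate the description above.
steps = np.linspace(0, 8000, 40)
pass1 = 0.15 + 0.57 * (1 - np.exp(-steps / 2500)) + np.random.normal(0, 0.02, steps.size)
cons16 = 0.25 + 0.65 * (1 - np.exp(-steps / 2500)) + np.random.normal(0, 0.02, steps.size)

plt.plot(steps, pass1, "o-", color="tab:blue", label="r1-zero-pass@1")
plt.plot(steps, cons16, "o-", color="tab:red", label="r1-zero-cons@16")
plt.axhline(0.75, linestyle="--", color="green", label="o1-0912-pass@1")
plt.axhline(0.83, linestyle="--", color="purple", label="o1-0912-cons@64")
plt.xlabel("Steps")
plt.ylabel("Accuracy")
plt.title("DeepSeek-R1-Zero AIME accuracy during training")
plt.legend()
plt.show()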

3. Document Analysis

document_thinking.ipynb demonstrates Claude's ability to:
  • Read and analyze PDF documents
  • Explain complex academic papers
  • Break down technical concepts into simple terms
from claude_think import ClaudeThink

ct = ClaudeThink()
...
# Attach the PDF bytes as a document block alongside the question.
content = [
    {"text": "Can you explain this paper in common language? Why is it a breakthrough?"},
    {
        "document": {
            "name": "250112948",
            "format": "pdf",
            "source": {"bytes": file_bytes},
        }
    },
]
reasoning, answer = ct.converse_stream(content)
Thinking...

I need to explain the DeepSeek-R1 paper in common language and identify why it's considered a breakthrough. Let me analyze the key aspects of the paper:

Key Aspects of the Paper:

Introduction of DeepSeek-R1-Zero and DeepSeek-R1 - a new generation of reasoning models
Training approach using Reinforcement Learning (RL) without Supervised Fine-Tuning (SFT)
Competitive performance with OpenAI's o1 models
Distillation of reasoning capabilities to smaller models
Breakthrough elements and novel contributions

Let me organize this into a clear explanation focusing on:

What the paper is about
The key innovation (using pure RL for reasoning)
Why this is considered a breakthrough
The results and implications
The importance of the distillation findings

Final Answer:
DeepSeek-R1: A Breakthrough in AI Reasoning Through Reinforcement Learning
What's This Paper About?

This paper introduces DeepSeek-R1, a new AI model designed specifically to excel at reasoning tasks like mathematics, coding, and scientific problems. What makes this work special is how they trained the model to reason.

Usage

Each notebook can be run independently. The ClaudeThink class in claude_think.py (a rough sketch appears after the list below) provides the core functionality for:
  • Managing conversations with Claude
  • Handling different types of input (text, images, documents)
  • Displaying real-time reasoning process
  • Formatting responses
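claude_think.py itself is not reproduced in this post. As an illustrative sketch of how such a wrapper could be built on the Bedrock ConverseStream API with extended thinking enabled (the model ID, thinking budget, and event parsing below are assumptions for the sketch, not the repository's actual code):

import boto3

class ClaudeThink:
    """Illustrative sketch of a Claude 3.7 wrapper, not the repository's implementation."""

    def __init__(self, model_id="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
                 budget_tokens=2000, region="us-east-1"):
        self.client = boto3.client("bedrock-runtime", region_name=region)
        self.model_id = model_id
        self.budget_tokens = budget_tokens

    def converse_stream(self, content):
        """Send one user turn and return (reasoning, answer) as accumulated strings."""
        response = self.client.converse_stream(
            modelId=self.model_id,
            messages=[{"role": "user", "content": content}],
            additionalModelRequestFields={
                "thinking": {"type": "enabled", "budget_tokens": self.budget_tokens}
            },
        )
        reasoning, answer = "", ""
        for event in response["stream"]:
            delta = event.get("contentBlockDelta", {}).get("delta", {})
            if "reasoningContent" in delta:
                reasoning += delta["reasoningContent"].get("text", "")
            elif "text" in delta:
                answer += delta["text"]
        return reasoning, answer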
Enjoy!
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
