Reasoning capability of DeepSeek-R1 distill model vs its base model
Comparing a DeepSeek-R1 distilled model with its base model in terms of reasoning and mathematical capability
Yudho Ahmad Diponegoro
Amazon Employee
Published Feb 12, 2025
It has become a hot topic how the DeepSeek-R1 large language model (LLM) by DeepSeek gained remarkable reasoning capabilities[1]. Reinforcement learning was used in the training process to let the model learn to perform Chain of Thought (CoT) reasoning so that it can solve more complex problems accurately, in addition to being able to do self-reflection. This ability, which allows an artificial intelligence to reason through a problem before answering, is novel in the generative AI space.
The data generated during that training process, along with other data, was also used to teach smaller models how to answer complex problems with step-by-step reasoning, or CoT. This methodology worked in boosting the performance of the smaller models at solving complex problems with reasoning. The base models come from Llama 3.1, Llama 3.3, and Qwen2.5, ranging from 1.5B to 70B parameters.
This blog post shows the author's experiment comparing the reasoning ability of a base model and its DeepSeek-R1 distilled counterpart. The model used is DeepSeek-R1 Distill Llama 70B, which is based on Llama 3.3 70B Instruct. The experiment compares the responses of these two models to sample prompts to show the improved reasoning capability.
This experiment was run on a Graviton4 instance, c8g.16xlarge, with 64 vCPUs and 128 GB RAM. The models were run on CPU using llama.cpp, in their Q4 (4-bit quantized) versions. Please read the other blog post to understand more about the performance and cost of running the model, in addition to the step-by-step guidance on deploying it.
For each run of the experiment with the base model, the command used is like the one below.
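As a minimal sketch, assuming a GGUF build of Llama 3.3 70B Instruct Q4 and llama.cpp's llama-cli binary, the invocation looks roughly like this (the model filename, thread count, and token limit are placeholders):

```
# Illustrative llama.cpp run of the base model on the c8g.16xlarge.
# The GGUF filename, -t (threads), and -n (max generated tokens) are placeholders.
./llama-cli \
  -m ./Llama-3.3-70B-Instruct-Q4_K_M.gguf \
  -t 64 \
  -n 2048 \
  -p "<prompt>"
```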
For the DeepSeek-R1 distilled model, the command used is like the one below.
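Again as a sketch with placeholder paths, only the model file changes; a larger -n is useful here because the distilled model emits its thinking before the final answer:

```
# Illustrative llama.cpp run of the distilled model; filename and values are placeholders.
./llama-cli \
  -m ./DeepSeek-R1-Distill-Llama-70B-Q4_K_M.gguf \
  -t 64 \
  -n 4096 \
  -p "<prompt>"
```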
For this experiment, I used the following prompt:
Yesterday 6 pm, I ran my on-demand EC2 instance with $0.7 hourly price. Then I went for dinner at my parent’s place. It took some time to drive there. Once arrived, I quickly launched another one with same instance family under spot with 55% lower price that time. I got a cup of chamomile soon after dinner and fell asleep. Then I woke up due to my noisy alarm which I set at 3:15. I forgot to turn off those instances! After doing my morning routing for 45 mins, I quickly turned them off. On that day, my total EC2 compute bill is $9.835 contributed merely by those 2 instances. How many minutes did it took for me to drive to my parent’s place?
Please reason step by step, and put your final answer within \boxed{}.
<think>\n
The expected correct answer is shown by the calculation below:
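A worked derivation, assuming both instances were stopped at 4:00 am (the 3:15 alarm plus the 45-minute routine) and letting T be the driving time in minutes:

```
% On-demand: 6:00 pm to 4:00 am = 10 hours at $0.70/hour = $7.00.
% Spot: 55% lower price = 0.7 x 0.45 = $0.315/hour, running (10 - T/60) hours.
\begin{aligned}
0.7 \times 10 + 0.315\left(10 - \tfrac{T}{60}\right) &= 9.835 \\
0.315\left(10 - \tfrac{T}{60}\right) &= 2.835 \\
10 - \tfrac{T}{60} &= 9 \\
T &= 60 \text{ minutes}
\end{aligned}
```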
Below is part of the answer from the original Llama 3.3 70B Q4 (4-bit quantized) model. I had to stop it after some tokens.
We can see that the answer is incorrect. The logic, however, seems to be correct; the model simply made a miscalculation when multiplying. This could have been avoided if the problem had been tackled in a different way.
Below is the answer from the DeepSeek-R1 Distill Llama 70B Q4 (also 4-bit quantized) model.
The answer is correct, despite the long generated response, which demonstrated its inner thinking. More interestingly, it demonstrates some self-reflection and an ability to doubt itself, which can be important in avoiding errors in problem solving.
For example, I observed these in the output above:
Wait, let me clarify the timeline.
Wait, no. Let me think again.
Wait, maybe it's better to convert all times to a 24-hour format to calculate durations.
But wait, is the spot instance priced per hour? Or is it a different pricing model?
Wait, that seems straightforward, but let me verify.
Since the output is fed back as input for the next token generation, these questions become part of the context for the subsequent reasoning, which can be crucial in avoiding errors.
Now let's push the models even further. I modified the prompt so that there are two missing variables: one is the duration of travel and the other is the EC2 spot price discount. Technically, with one equation and two unknowns, it can't be solved. The prompt is below.
Yesterday 6 pm, I ran my on-demand EC2 instance with $0.7 hourly price. Then I went for dinner at my parent’s place. It took some time to drive there. Once arrived, I quickly launched another one with same instance family under spot with some discount I can’t remember anymore. I got a cup of chamomile soon after dinner and fell asleep. Then I woke up due to my noisy alarm which I set at 3:15. I forgot to turn off those instances! After doing my morning routing for 45 mins, I quickly turned them off. On that day, my total EC2 compute bill is $9.835 contributed merely by those 2 instances. Can you calculate how many minutes did it took for me to drive to my parent’s place?
Please reason step by step, and put your final answer within \boxed{}.
<think>\n
Below is the response of the original Llama 70B Q4 model after I stopped it at some point to avoid unnecessary generated tokens.
I think the model performed well. It did acknowledge that the problem can't be solved without error. However, it still tried to answer.
Now below is the answer from the DeepSeek-R1 Distill Llama 70B Q4 model. I was fascinated by the way it answered, so I increased the maximum output tokens to 4096 and let it think and respond until it hit that token limit.
In there, it also realized the problem of the two missing variables, which makes the question unsolvable. However, what interested me is that it kept trying with certain spot discount assumptions and even said, "Alternatively, perhaps the spot price is the minimum possible, approaching zero, but that would make T approach 10 hours, which is too long."
To me, it looks like the model is somehow getting to the point that there can be a spot discount D which acts as a threshold for the equation to make sense (e.g., not resulting in an unreasonable number of traveling hours). In a previous run with a slightly different prompt, this model was able to mention that such an assumed discount D is not possible since it would result in a negative variable. I think this is an interesting capability: it tries to reason as much as it can to solve a very difficult problem.
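For reference, under the same timeline assumption as before (both instances stopped at 4:00 am), the bill gives one equation in two unknowns, which also illustrates where such a threshold discount comes from:

```
% T is the driving time in minutes, D is the unknown spot discount (0 < D < 1).
\begin{aligned}
0.7 \times 10 + 0.7(1 - D)\left(10 - \tfrac{T}{60}\right) &= 9.835 \\
(1 - D)\left(10 - \tfrac{T}{60}\right) &= 4.05
\end{aligned}
% Since the spot instance runs at most 10 hours (T >= 0), we need 1 - D >= 0.405,
% i.e. D <= 59.5%; assuming a larger discount would force T to be negative.
```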
There are multiple possible ways of deploying the DeepSeek-R1 Distill models on AWS. This blog post summarizes some of these deployment methods, including the one on CPU with Graviton4 that I previously published here.
Reasoning models such as DeepSeek-R1 and its distilled models can be a potential solution to many problems that previously couldn't be solved by typical LLMs. Their default CoT way of solving problems can lead to fewer errors in the final answer. But in exchange, they can be more verbose in their generated responses and may consume more output tokens.
Given the way the reasoning model answered, I think it does have an interesting capability to solve more complex problems, or to discover new things. However, it has its own place and it might not be for every use case, in a good way :)
The experiment above was run entirely on a CPU instance, a Graviton4 c8g.16xlarge. For that to work, the model used was a 4-bit quantized version of DeepSeek-R1 Distill Llama 70B.
Feel free to do more experiments yourselves!
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.