How to shut down SageMaker Studio apps automatically?

Amazon SageMaker Studio lacks a built-in auto-shutdown for idle apps, which makes cost control harder. In this post I'll explain a Terraform-based solution that shuts down apps automatically.

Published May 15, 2024

Introduction

While Amazon SageMaker offers a robust suite of features, it lacks native functionality to automatically shut down Studio apps when they're inactive, which is crucial for controlling cost while resources sit idle. In this post, I'll walk through a simple yet effective approach to shutting down Studio apps automatically when they're not in use. We'll deploy it step by step with Terraform as our Infrastructure as Code (IaC) tool.

The script

Below is a bash script that installs an idle-checker server extension, which shuts down apps after a period of inactivity (2 hours by default). Adjust TIMEOUT_IN_MINS to suit your requirements.
#!/bin/bash
# This script installs the idle notebook auto-checker server extension to SageMaker Studio
# The original extension has a lab extension part where users can set the idle timeout via a Jupyter Lab widget.
# In this version the script installs the server side of the extension only. The idle timeout
# can be set via a command-line script, which this script also creates and places in the
# user's home folder
#
# Installing the server side extension does not require Internet connection (as all the dependencies are stored in the
# install tarball) and can be done via VPCOnly mode.

set -eux

# timeout in minutes
export TIMEOUT_IN_MINS=120

# Should already be running in user home directory, but just to check:
cd /home/sagemaker-user

# By working in a directory starting with ".", we won't clutter up users' Jupyter file tree views
mkdir -p .auto-shutdown

# Create the command-line script for setting the idle timeout
cat > .auto-shutdown/set-time-interval.sh << EOF
#!/opt/conda/bin/python
import json
import requests
TIMEOUT=${TIMEOUT_IN_MINS}
session = requests.Session()
# Getting the xsrf token first from Jupyter Server
response = session.get("http://localhost:8888/jupyter/default/tree")
# calls the idle_checker extension's interface to set the timeout value
response = session.post("http://localhost:8888/jupyter/default/sagemaker-studio-autoshutdown/idle_checker",
            json={"idle_time": TIMEOUT, "keep_terminals": False},
            params={"_xsrf": response.headers['Set-Cookie'].split(";")[0].split("=")[1]})
if response.status_code == 200:
    print("Succeeded, idle timeout set to {} minutes".format(TIMEOUT))
else:
    print("Error!")
    print(response.status_code)
EOF

chmod +x .auto-shutdown/set-time-interval.sh

# "wget" is not part of the base Jupyter Server image, you need to install it first if needed to download the tarball
sudo yum install -y wget
# You can download the tarball from GitHub or alternatively, if you're using VPCOnly mode, you can host on S3
wget -O .auto-shutdown/extension.tar.gz https://github.com/aws-samples/sagemaker-studio-auto-shutdown-extension/raw/main/sagemaker_studio_autoshutdown-0.1.5.tar.gz

# Or instead, could serve the tarball from an S3 bucket in which case "wget" would not be needed:
# aws s3 --endpoint-url [S3 Interface Endpoint] cp s3://[tarball location] .auto-shutdown/extension.tar.gz

# Installs the extension
cd .auto-shutdown
tar xzf extension.tar.gz
cd sagemaker_studio_autoshutdown-0.1.5

# Activate studio environment just for installing extension
export AWS_SAGEMAKER_JUPYTERSERVER_IMAGE="${AWS_SAGEMAKER_JUPYTERSERVER_IMAGE:-'jupyter-server'}"
if [ "$AWS_SAGEMAKER_JUPYTERSERVER_IMAGE" = "jupyter-server-3" ] ; then
    eval "$(conda shell.bash hook)"
    conda activate studio
fi;
pip install --no-dependencies --no-build-isolation -e .
jupyter serverextension enable --py sagemaker_studio_autoshutdown
if [ "$AWS_SAGEMAKER_JUPYTERSERVER_IMAGE" = "jupyter-server-3" ] ; then
    conda deactivate
fi;

# Restarts the jupyter server
nohup supervisorctl -c /etc/supervisor/conf.d/supervisord.conf restart jupyterlabserver

# Waiting for 30 seconds to make sure the Jupyter Server is up and running
sleep 30

# Calling the script to set the idle timeout and activate the extension
/home/sagemaker-user/.auto-shutdown/set-time-interval.sh
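
The helper written by the script reads the XSRF token out of the Set-Cookie header by splitting on ";" and "=". Here is a minimal sketch of that parsing step on its own. The function name extract_xsrf_token and the sample header value are hypothetical, and it assumes, as the script does, that the _xsrf cookie comes first in the header.

```python
def extract_xsrf_token(set_cookie_header: str) -> str:
    """Return the value of the leading _xsrf cookie in a Set-Cookie header.

    Follows the same idea as the expression in set-time-interval.sh:
        header.split(";")[0].split("=")[1]
    but fails loudly if the first cookie is not _xsrf (an added check,
    not part of the original script).
    """
    # Everything before the first ";" is the "name=value" pair.
    first_cookie = set_cookie_header.split(";")[0]
    # partition keeps everything after the first "=" as the value.
    name, _, value = first_cookie.partition("=")
    if name.strip() != "_xsrf":
        raise ValueError(f"expected the _xsrf cookie first, got {name!r}")
    return value


# Made-up header value, for illustration only.
sample_header = "_xsrf=2|abc|def|123; Path=/jupyter/default"
token = extract_xsrf_token(sample_header)
```

If the Jupyter Server ever returns another cookie first, this version raises an error instead of silently posting the wrong token.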

How to deploy this script using Terraform?

Next, let's add the Terraform resource aws_sagemaker_studio_lifecycle_config and load the script above into it as base64-encoded content.
resource "aws_sagemaker_studio_lifecycle_config" "jupyter" {
  studio_lifecycle_config_name     = "lcc-jupyter-server-autoshutdown"
  studio_lifecycle_config_app_type = "JupyterServer"
  studio_lifecycle_config_content  = filebase64("${path.module}/scripts/lcc_jupyter_server_autoshutdown.sh")
  tags                             = var.tags
}
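
The filebase64 call reads the script from disk and base64-encodes it before it is stored as the lifecycle configuration content. To check the encoded size before running Terraform, the same encoding can be sketched in Python. The function name encode_lifecycle_script is hypothetical, and the 16384-character limit is an assumption based on my reading of the SageMaker API constraints, so verify it against the current API reference.

```python
import base64

# Assumed maximum length of the encoded content; check the SageMaker API docs.
MAX_ENCODED_LENGTH = 16384


def encode_lifecycle_script(script: bytes) -> str:
    """Base64-encode a lifecycle script and check the assumed size limit."""
    encoded = base64.b64encode(script).decode("ascii")
    if len(encoded) > MAX_ENCODED_LENGTH:
        raise ValueError(
            f"encoded script is {len(encoded)} characters, "
            f"above the assumed limit of {MAX_ENCODED_LENGTH}"
        )
    return encoded


# Example with an inline script; in practice you would read the .sh file's bytes.
example = encode_lifecycle_script(b"#!/bin/bash\necho hello\n")
```

Base64 grows the content by roughly a third, so a script that looks small on disk can still come close to the limit.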
With the lifecycle configuration defined in Terraform, the next step is to attach it to the SageMaker Studio domain so that all users inherit it by default. Managing it centrally keeps the configuration consistent across users.
resource "aws_sagemaker_domain" "default" {
  domain_name             = var.name
  app_network_access_type = var.app_network_access_type
  auth_mode               = var.auth_mode
  kms_key_id              = var.kms_key_id
  subnet_ids              = var.subnet_ids
  vpc_id                  = var.vpc_id
  tags                    = var.tags

  default_user_settings {
    execution_role  = var.role_arn != null ? var.role_arn : aws_iam_role.default[0].arn
    security_groups = var.security_groups

    jupyter_server_app_settings {
      lifecycle_config_arns = [aws_sagemaker_studio_lifecycle_config.jupyter.arn]

      default_resource_spec {
        instance_type        = "system"
        lifecycle_config_arn = aws_sagemaker_studio_lifecycle_config.jupyter.arn
      }
    }
  }
}
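
After terraform apply, you can confirm that the domain picked up the lifecycle configuration. The sketch below checks a domain description for both settings used above: the ARN appears in the attached list and is also set as the default. It assumes the response shape of boto3's describe_domain call. The function name and the sample ARN are hypothetical.

```python
def lifecycle_config_is_default(domain: dict, lcc_arn: str) -> bool:
    """Check that lcc_arn is attached to the domain and set as the default
    for JupyterServer apps, given a describe_domain-style response dict."""
    settings = domain.get("DefaultUserSettings", {}).get("JupyterServerAppSettings", {})
    attached = lcc_arn in settings.get("LifecycleConfigArns", [])
    default_arn = settings.get("DefaultResourceSpec", {}).get("LifecycleConfigArn")
    return attached and default_arn == lcc_arn


# Made-up ARN and response fragment, for illustration only.
arn = "arn:aws:sagemaker:eu-west-1:123456789012:studio-lifecycle-config/lcc-jupyter-server-autoshutdown"
sample_domain = {
    "DefaultUserSettings": {
        "JupyterServerAppSettings": {
            "LifecycleConfigArns": [arn],
            "DefaultResourceSpec": {"InstanceType": "system", "LifecycleConfigArn": arn},
        }
    }
}
ok = lifecycle_config_is_default(sample_domain, arn)

# Against a real domain (requires AWS credentials; the domain ID is a placeholder):
#   import boto3
#   domain = boto3.client("sagemaker").describe_domain(DomainId="d-xxxxxxxxxxxx")
#   print(lifecycle_config_is_default(domain, arn))
```

The check requires both settings because a configuration that is attached but not set as the default would not run automatically for new apps.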

A complete Terraform module to deploy and manage SageMaker.

I've also developed a comprehensive Terraform module that orchestrates the creation of essential resources for setting up a SageMaker Studio environment.
