How to launch Ray Clusters on AWS

Getting started with Ray clusters on AWS

Naga Gaddamu
Amazon Employee
Published Oct 24, 2023
Last Modified Apr 20, 2024

Introduction to Ray

Ray is a general-purpose library for distributed computing, with an ecosystem of native libraries for scaling ML workloads. It runs anywhere: on a laptop, in a public cloud, on Kubernetes, or on-premises.
In simple terms, Ray provides a small set of primitives that let you take an existing Python application and run it in a distributed manner at scale, taking away the undifferentiated heavy lifting.
As a developer, you can use Ray to take advantage of the resources available in a distributed environment without having to worry about the underlying infrastructure. Ray lets workloads scale beyond a single machine and is designed to be fault tolerant.
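
To make the idea of primitives concrete, here is a minimal sketch of Ray's task primitive. The names square and squares_with_ray are illustrative and do not come from the original post; ray.init, ray.remote, ray.get, and ray.shutdown are part of Ray's public API. Ray is imported inside the function so the plain Python function stays usable on a machine where Ray is not installed yet.

```python
def square(x: int) -> int:
    """An ordinary Python function; nothing Ray-specific here."""
    return x * x


def squares_with_ray(values):
    """Run square() as parallel Ray tasks and return the results in input order."""
    import ray  # imported lazily so square() works even without Ray installed

    # Starts a local Ray runtime when no running cluster is found.
    ray.init(ignore_reinit_error=True)
    try:
        remote_square = ray.remote(square)  # wrap the function as a Ray task
        # .remote() schedules a task and immediately returns an object reference.
        refs = [remote_square.remote(v) for v in values]
        # ray.get blocks until every task has finished and preserves list order.
        return ray.get(refs)
    finally:
        ray.shutdown()
```

Calling squares_with_ray(range(4)) should give the same values as [square(v) for v in range(4)], with the work done by Ray worker processes instead of the calling process.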
In this post, we will learn how to get started with Ray on AWS.

Ray and AWS: a powerful combination

AWS is designed to let applications perform at scale. It supports not only online businesses such as websites but also compute-intensive applications such as machine learning models. With the growing adoption of ML and generative AI, you can use Ray to address common production challenges for generative AI and to scale ML workloads. With Ray, you can step away from the heavy lifting of managing distributed training and focus on the models and applications themselves.

AWS Environment Setup to Launch a Ray Cluster

  1. An AWS account to create and manage cloud resources.
  2. Python 3.x installed.
  3. An AWS user with IAM permissions to create EC2 instances.
  4. The AWS CLI installed, with credentials configured to authenticate the IAM user. See the installation guide:
    https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
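
The locally checkable items in this list can be verified with a short script. This is a sketch using only the Python standard library; the function name check_prerequisites is illustrative, and the check only looks for an executable named aws on the PATH.

```python
import shutil
import sys


def check_prerequisites() -> dict:
    """Return a name -> bool map for the prerequisites that can be checked locally."""
    return {
        # The post asks for Python 3.x.
        "python3": sys.version_info.major == 3,
        # shutil.which returns None when no `aws` executable is on the PATH.
        "aws_cli": shutil.which("aws") is not None,
    }


for name, ok in check_prerequisites().items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```

The script does not verify credentials or IAM permissions. Running aws sts get-caller-identity in a terminal is one way to confirm which identity the CLI is configured to use.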

Step 1: Launch an EC2 instance

  1. Log in to your AWS Management Console.
  2. Navigate to the EC2 dashboard.
  3. Click on "Launch Instance" to create a new virtual machine.
  4. Choose an Amazon Machine Image (AMI) that suits your needs. A standard Linux distribution like Amazon Linux or Ubuntu Server is a good starting point.
  5. Select an instance type based on your workload requirements. Ray runs on many instance types; for experimentation, a t2.micro (free-tier eligible) instance is sufficient, although its 1 vCPU and 1 GiB of memory leave little room beyond a hello-world test.
  6. Configure instance details, storage, security groups, and add tags if needed.
  7. Review your settings and launch the instance. You'll need an SSH key pair to access the instance securely.
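
The console steps above can also be scripted. The following is a sketch of an alternative, not the method used in this post: it assumes the boto3 package is installed (pip install boto3) and that your AWS credentials are already configured. The helper names, the Name tag value, and any AMI ID, key pair name, or region you pass in are placeholders for illustration; run_instances and its parameters are part of the boto3 EC2 client.

```python
def build_launch_request(ami_id: str, key_name: str,
                         instance_type: str = "t2.micro",
                         name_tag: str = "ray-head") -> dict:
    """Build keyword arguments for the EC2 run_instances call (one instance)."""
    return {
        "ImageId": ami_id,              # AMI chosen in step 4 (placeholder value)
        "InstanceType": instance_type,  # t2.micro, as suggested in step 5
        "KeyName": key_name,            # existing SSH key pair (placeholder value)
        "MinCount": 1,
        "MaxCount": 1,
        "TagSpecifications": [
            {"ResourceType": "instance",
             "Tags": [{"Key": "Name", "Value": name_tag}]}
        ],
    }


def launch_instance(request: dict, region: str) -> str:
    """Launch the instance and return its instance ID. Creates a billable resource."""
    import boto3  # imported lazily; only needed when actually calling AWS

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**request)
    return response["Instances"][0]["InstanceId"]
```

Nothing is launched until launch_instance is called. The request above relies on the default VPC and security group; storage and security group settings from step 6 would need to be added to the request.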

Step 2: Connect to Your EC2 Instance

Once your EC2 instance is running, connect to it using EC2 Instance Connect.

Step 3: Install Ray

Now that you are logged in to the EC2 instance, you can install Ray using pip:

    pip install ray
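
After the install finishes, a quick check can confirm that Ray is visible to the Python interpreter you plan to use. This sketch uses only the standard library (Python 3.8 or later); the helper name installed_version is illustrative.

```python
import importlib.metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


version = installed_version("ray")
if version:
    print(f"ray {version} is installed")
else:
    print("ray is not installed for this interpreter")
```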

Step 4: Write your first Ray program on EC2

  1. Make a new directory and create a Python file named raytest.py.
  2. Run your program with Python, for example:
    python3 raytest.py
    You'll see Ray in action as it prints 'Hello, Let's get started with Ray on AWS!'
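
A minimal sketch of what raytest.py could contain is shown here. It is an illustration, not the exact program from the original post: the function names are assumptions, while the printed message is the one quoted above. Ray is imported inside main() so that a missing install produces a readable message instead of a traceback.

```python
MESSAGE = "Hello, Let's get started with Ray on AWS!"


def hello() -> str:
    """The work we want Ray to run as a task."""
    return MESSAGE


def main() -> None:
    import ray  # imported here so a missing install is reported clearly below

    ray.init()  # on a single instance this starts a local Ray runtime
    try:
        remote_hello = ray.remote(hello)       # turn the function into a Ray task
        print(ray.get(remote_hello.remote()))  # run the task and print its result
    finally:
        ray.shutdown()


if __name__ == "__main__":
    try:
        main()
    except ImportError:
        print("Ray is not installed; see Step 3.")
```

On a single small instance the task runs in a local Ray worker process, so this confirms the installation works before any additional nodes are added.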

Scaling and Distributing Ray workloads on AWS

One of the significant benefits of Ray is its ability to scale out to multiple nodes. To scale your Ray cluster on AWS:
  1. Launch additional EC2 instances following Step 1.
  2. Install Ray on each instance as in Step 3.
  3. On the second instance, modify the ray.init() call in your program to point to the head node, i.e. the public IP of the first instance.
    This allows your program to use multiple nodes for distributed computing. Make sure the instances' security groups allow them to reach each other.
  4. Execute your program, and Ray will distribute tasks across all connected instances.
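
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the original post's code: it assumes the first instance was started as the head node with ray start --head (which listens on port 6379 by default), that the other instances joined with ray start --address=<head-ip>:6379, and that the script runs on an instance that is part of the cluster. The helper names are hypothetical.

```python
import socket

RAY_HEAD_PORT = 6379  # default port used by `ray start --head`


def head_address(head_ip: str, port: int = RAY_HEAD_PORT) -> str:
    """Build the address string passed to ray.init(address=...)."""
    return f"{head_ip}:{port}"


def where_am_i() -> str:
    """Return the hostname of the machine that executes this function."""
    return socket.gethostname()


def hosts_used(head_ip: str, num_tasks: int = 8) -> set:
    """Run small tasks on the cluster and report which hosts executed them."""
    import ray  # imported lazily; requires Ray on the instance running this

    ray.init(address=head_address(head_ip))  # connect to the existing cluster
    try:
        remote_task = ray.remote(where_am_i)
        refs = [remote_task.remote() for _ in range(num_tasks)]
        return set(ray.get(refs))
    finally:
        ray.shutdown()
```

Calling hosts_used with the head node's IP returns the set of hostnames that ran at least one task. Very short tasks may all land on a single node, so seeing one hostname does not by itself mean the other instances failed to join.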
Happy scaling with Ray on AWS!

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
