#TGIFun YOLambda: Running Serverless YOLOv8/9
Run inference at scale with YOLOv8/9 in a secure and reliable way with AWS Lambda.
João Galego
Amazon Employee
Published Apr 5, 2024
Last Modified May 6, 2024
In this episode of #TGIFun, I'd like to demonstrate a quick and easy way to deploy YOLOv8/9 on AWS Lambda using the AWS SAM (Serverless Application Model) CLI.
Hosting YOLO on Lambda strikes a good balance between performance, scalability and cost efficiency. Plus, it's always fun to put stuff inside Lambda functions. If you're interested in exploring other deployment options though, feel free to scroll all the way down to the References section.
All code and documentation is available on GitHub.
So what's YOLOv8 and why should you care? Let's start with a short recap...
YOLOv8 (You Only Look Once) is a state-of-the-art computer vision model that supports multiple tasks. It builds on top of an already long history of YOLO models, and it was designed to be smaller and faster than previous iterations.
While a full description of the YOLOv8 architecture is well beyond the scope of this article, it's useful to gain some intuition on what's happening behind the scenes.
Referring back to the original (YOLOv1) paper, YOLO models work by dividing the input image into a grid and predicting a set of bounding boxes (as we will see shortly, these are expressed as 2+2-tuples of top-left (x1, y1) and bottom-right (x2, y2) coordinates), as well as their associated confidence scores and class probabilities, to generate the final predictions. It goes without saying that I'm obviously oversimplifying things here.
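To make the box format concrete, here is a minimal sketch of how raw predictions could be turned into (x1, y1, x2, y2) boxes with a score and a class. The row layout (center x, center y, width, height, followed by one score per class), the `Detection` type, and the 0.25 threshold are assumptions made for illustration, not something taken from the project's code.

```python
from typing import Iterable, List, NamedTuple, Sequence, Tuple


class Detection(NamedTuple):
    """A single detected object (hypothetical structure for illustration)."""
    x1: float  # top-left corner
    y1: float
    x2: float  # bottom-right corner
    y2: float
    score: float
    class_id: int


def center_to_corners(cx: float, cy: float, w: float, h: float) -> Tuple[float, float, float, float]:
    """Convert a (center, size) box into top-left / bottom-right corners."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def decode(raw_rows: Iterable[Sequence[float]], conf_threshold: float = 0.25) -> List[Detection]:
    """Keep the best class per row and drop rows below the confidence threshold.

    Assumes each row looks like (cx, cy, w, h, score_class_0, score_class_1, ...).
    """
    detections = []
    for cx, cy, w, h, *class_scores in raw_rows:
        # Index of the highest class score for this candidate box
        class_id = max(range(len(class_scores)), key=class_scores.__getitem__)
        score = class_scores[class_id]
        if score >= conf_threshold:
            detections.append(Detection(*center_to_corners(cx, cy, w, h), score, class_id))
    return detections
```

For example, `decode([(50, 50, 20, 10, 0.1, 0.9)])` yields one detection whose corners are (40, 45) and (60, 55), with class 1 and score 0.9.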
Over the years, there have been many improvements like faster NMS implementations (Non-Maximum Suppression, in case you're wondering) or the use of "bag-of-freebies" and "bag-of-specials" approaches (best names ever!) that have made YOLO faster and stronger.
Fortunately, you won't need to care about those at all to work on this project. If you're interested in such things though, I strongly encourage you to read up on the history of YOLO.
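For the curious, the NMS step mentioned above can be sketched in a few lines. This is a plain, class-agnostic greedy version written for readability; the function names and the 0.5 IoU threshold are illustrative assumptions, and production implementations are usually vectorized and applied per class.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, score): corner coordinates plus a confidence score
ScoredBox = Tuple[float, float, float, float, float]


def iou(a: ScoredBox, b: ScoredBox) -> float:
    """Intersection over Union of two boxes given by their corners."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[ScoredBox], iou_threshold: float = 0.5) -> List[ScoredBox]:
    """Greedy NMS: keep the best box, drop everything that overlaps it too much, repeat."""
    remaining = sorted(boxes, key=lambda box: box[4], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [box for box in remaining if iou(best, box) <= iou_threshold]
    return kept
```

Given two heavily overlapping boxes and one far away, only the higher-scoring box of the overlapping pair and the distant box survive.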
Note: This introduction was written before the release of YOLOv9.
If you want to learn more, just scroll all the way down to the References section.
In this project, we're going to create a simple object detection app that accepts an image, sends it to the YOLO model, and returns a list of detected objects, which we can then place on top of the original image.
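Placing detections on top of the original image usually means mapping box coordinates from the model's input resolution back to the image's own size. The sketch below assumes a hypothetical detection layout of (x1, y1, x2, y2, score, class_id), a 640x640 model input, and a plain resize without letterboxing; the app's real output format may differ.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def scale_box(box: Box, model_size: Tuple[int, int], image_size: Tuple[int, int]) -> Box:
    """Map a box from model input pixels to original image pixels.

    Sizes are (width, height). Assumes the image was simply resized (no padding).
    """
    sx = image_size[0] / model_size[0]
    sy = image_size[1] / model_size[1]
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


def to_annotations(
    detections: List[Tuple[float, float, float, float, float, int]],
    labels: Dict[int, str],
    model_size: Tuple[int, int] = (640, 640),
    image_size: Tuple[int, int] = (640, 640),
) -> List[Dict]:
    """Turn raw detections into drawable annotations with a label and a scaled box."""
    annotations = []
    for x1, y1, x2, y2, score, class_id in detections:
        annotations.append({
            "label": labels.get(class_id, f"class {class_id}"),
            "score": score,
            "box": scale_box((x1, y1, x2, y2), model_size, image_size),
        })
    return annotations
```

Each annotation can then be handed to whatever drawing library you prefer to render the rectangle and its label.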
We're going to use a vanilla YOLOv8 model, but you're more than welcome to use a fine-tuned model or to train your own YOLO.
Sounds fun? Then buckle up and let's build it together!
Before we get started, make sure these tools are installed and properly configured:
Let's start by cloning the repository
Switch to the feat/YOLOv9 branch if you're feeling experimental!
As a best practice, I recommend you create a Conda environment or something similar to keep everything isolated
Once the environment is activated, we can kick things off and install the project dependencies
One of those dependencies is the ultralytics package, which includes the yolo CLI. We can use it to download the YOLOv8 model and convert it to the ONNX format:
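If you'd rather stay in Python, the same export can be sketched with the ultralytics Python API. This is a minimal sketch, assuming the ultralytics package is installed and that yolov8n.pt is the model you want; the helper names `onnx_path_for` and `export_to_onnx` are made up for this example.

```python
from pathlib import Path


def onnx_path_for(weights: str) -> Path:
    """Expected name of the exported file, e.g. yolov8n.pt -> yolov8n.onnx."""
    return Path(weights).with_suffix(".onnx")


def export_to_onnx(weights: str = "yolov8n.pt") -> Path:
    """Download the weights if they are missing and export them to ONNX."""
    # Imported lazily so the helper above works without ultralytics installed
    from ultralytics import YOLO

    model = YOLO(weights)
    exported = model.export(format="onnx")  # returns the path of the exported file
    return Path(exported)
```

Calling `export_to_onnx()` should leave a yolov8n.onnx file next to the downloaded weights.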
The YOLOv8 series offers a wide range of models both in terms of size (nano >> xl) and specialized tasks like segmentation or pose estimation. If you want to try a different model, please refer to the official documentation (Supported Tasks and Modes).
Replace yolov8n with yolov9c in the commands below to work with YOLOv9. Just keep in mind that the performance and the output of our app may not be the same.
As an optional step, we can run ONNX Simplifier (based on ONNX Optimizer) against our model to get rid of redundant operations. Everything counts to reduce the size of our model and make it run faster.
Once this is done, we can look "under the hood" and take a peek at the computational graph with a tool like Netron:
Let's move our model to a dedicated folder
and use the AWS SAM CLI to build a container image for our app
While in development, we can test our app by using sam local
Whenever you're ready, just start the deployment
While this is running, let's take a closer look at our template
Here are a few important things to notice:
- Resources - there's one for the Lambda function (YOLOFunction) and another one for the YOLOv8 model (YOLOModel), which will be added to our function as a Lambda layer; the Lambda function itself will be accessible through a Lambda function URL.
- Settings - the memory size is set to the maximum allowed value (10 GB) to improve performance; cf. "AWS Lambda now supports up to 10 GB of memory and 6 vCPU cores for Lambda Functions" for more information.
- Security - authentication to our function URL is handled by IAM, which means that all requests must be signed using AWS Signature Version 4 (SigV4); cf. "Invoking Lambda function URLs" for additional details.
The deployment should be done by now. Don't forget to note down the function URL.
You can use tools like awscurl to test the app (awscurl will handle the SigV4 signing for you) or create your own test scripts.
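If you go down the custom script route, the SigV4 part is what awscurl would otherwise do for you. Below is a rough, standard-library-only sketch of how the signed headers for a POST request to a function URL could be computed. It assumes long-lived credentials (no session token), a JSON body, a URL without a query string, and a path that needs no extra encoding; the function name `sigv4_headers` is made up for this example. For anything serious, a maintained signer such as the one shipped with botocore is a safer choice.

```python
import datetime
import hashlib
import hmac
from typing import Dict
from urllib.parse import urlparse


def _sign(key: bytes, message: str) -> bytes:
    """HMAC-SHA256 helper used to derive the signing key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def sigv4_headers(url: str, body: bytes, access_key: str, secret_key: str,
                  region: str, now: datetime.datetime, service: str = "lambda") -> Dict[str, str]:
    """Return the headers for a SigV4-signed POST request (now must be in UTC)."""
    parsed = urlparse(url)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")

    # Headers that take part in the signature, keyed by lowercase name
    headers = {
        "content-type": "application/json",
        "host": parsed.netloc,
        "x-amz-date": amz_date,
    }
    signed_headers = ";".join(sorted(headers))
    canonical_headers = "".join(f"{name}:{headers[name]}\n" for name in sorted(headers))

    # Step 1: canonical request (method, path, query, headers, signed headers, payload hash)
    canonical_request = "\n".join([
        "POST",
        parsed.path or "/",
        "",  # empty query string (assumption)
        canonical_headers,
        signed_headers,
        hashlib.sha256(body).hexdigest(),
    ])

    # Step 2: string to sign
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Step 3: derive the signing key and compute the signature
    key = ("AWS4" + secret_key).encode("utf-8")
    for part in (date_stamp, region, service, "aws4_request"):
        key = _sign(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

    headers["authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return headers
```

The returned dictionary can be passed as the headers of a `urllib.request.Request` pointing at your function URL, together with the same body bytes that were signed.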
And that's it! We just crossed the finish line...
Sooo, what's next? Here are a few recommendations:
- Explore the code - it's just there for the taking, plus I left some Easter eggs and L400 references in there for the brave ones.
- Check out the feat/YOLOv9 branch to test the newest member of the YOLO family
- Build your own app - I'm pretty sure you already have a cool use case in mind
- Share with the community - leave a comment below if you do something awesome
I hope you enjoyed it, see you next time!
This is the first article in the #TGIFun series, a personal space where I'll be sharing some small, hobby-oriented projects with a wide variety of applications. As the name suggests, new articles come out on Friday. PS: If you like this format, don't forget to give it a thumbs up. Work hard, have fun, make history!
- (Redmon et al., 2015) You Only Look Once: Unified, Real-Time Object Detection
- (Terven & Cordova-Esparza, 2023) A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS
- HowTo: deploying YOLOv8 on AWS Lambda (an alternative implementation)
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.