Build a Serverless Application for Image Label Detection
Learn how to use Amazon Rekognition and AWS Lambda to extract image labels using the Go programming language.
Abhishek Gupta
Amazon Employee
Published Jun 16, 2023
Last Modified Mar 14, 2024
Amazon Rekognition is a service that lets you analyze images and videos in your applications. You can identify objects, people, text, scenes, and activities, and detect inappropriate content. You can also do facial analysis, face comparison, and face search for various use cases like user verification and public safety. Amazon Rekognition is built on deep learning technology that doesn't require machine learning expertise to use. It has an easy-to-use API that can analyze any image or video file in Amazon S3.
In this tutorial, you will learn how to build a serverless solution for image label detection using Amazon Rekognition, AWS Lambda, and the Go programming language. A label refers to any of the following: objects (flower, tree, or table), events (a wedding, graduation, or birthday party), concepts (a landscape, evening, or nature), or activities (getting out of a car). For example, a photo of people on a tropical beach may contain labels such as Palm Tree (object), Beach (scene), Running (action), and Outdoors (concept).
Label detection in Rekognition also works for video content. However, this tutorial focuses on image label detection.
We will cover how to:
- Deploy the solution using AWS CloudFormation.
- Verify the solution.
We will be using the following Go libraries:
- AWS Go SDK, specifically for Amazon Rekognition.
- Go bindings for AWS CDK to implement "Infrastructure-as-code" (IaC) for the entire solution and deploy it with the AWS Cloud Development Kit (CDK) CLI.
Here is how the application works:
- Images uploaded to Amazon S3 trigger a Lambda function.
- The Lambda function extracts a list of labels (with their name, category, and confidence level) and saves it to an Amazon DynamoDB table.
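To make the trigger step concrete, here is a minimal sketch of how a function could pull the bucket name and object key out of an S3 event notification. It uses only the Go standard library and decodes just the fields it needs; the type and function names (s3Event, uploadedObject, parseUploads) and the sample payload are illustrative assumptions, and the tutorial's actual function may well rely on ready-made event types instead.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// s3Event covers only the fields needed here from an S3 event notification.
// (Hypothetical type for illustration; the real payload has many more fields.)
type s3Event struct {
	Records []struct {
		S3 struct {
			Bucket struct {
				Name string `json:"name"`
			} `json:"bucket"`
			Object struct {
				Key string `json:"key"`
			} `json:"object"`
		} `json:"s3"`
	} `json:"Records"`
}

// uploadedObject identifies one uploaded image.
type uploadedObject struct {
	Bucket string
	Key    string
}

// parseUploads extracts the bucket and key for every record in the payload.
func parseUploads(payload []byte) ([]uploadedObject, error) {
	var ev s3Event
	if err := json.Unmarshal(payload, &ev); err != nil {
		return nil, fmt.Errorf("decode S3 event: %w", err)
	}
	objs := make([]uploadedObject, 0, len(ev.Records))
	for _, r := range ev.Records {
		// Object keys arrive URL-encoded in S3 notifications.
		key, err := url.QueryUnescape(r.S3.Object.Key)
		if err != nil {
			return nil, fmt.Errorf("decode key %q: %w", r.S3.Object.Key, err)
		}
		objs = append(objs, uploadedObject{Bucket: r.S3.Bucket.Name, Key: key})
	}
	return objs, nil
}

func main() {
	// Made-up sample payload with a single record.
	payload := []byte(`{"Records":[{"s3":{"bucket":{"name":"example-bucket"},"object":{"key":"my+photo.jpg"}}}]}`)
	objs, err := parseUploads(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, o := range objs {
		fmt.Printf("bucket=%s key=%s\n", o.Bucket, o.Key)
	}
}
```

Each decoded bucket and key pair is what a label detection step would then pass to Amazon Rekognition.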
Before starting this tutorial, you will need the following:
- An AWS Account (if you don't yet have one, you can create one and set up your environment here).
- Go programming language (v1.18 or higher).
- Git.
Clone the project and change to the right directory:
The AWS Cloud Development Kit (AWS CDK) is a framework that lets you define your cloud infrastructure as code in one of its supported programming languages and provision it through AWS CloudFormation.
To start the deployment, simply invoke cdk deploy and wait for a bit. You will see a list of resources that will be created and will need to provide your confirmation to proceed. Enter y to start creating the AWS resources required for the application.
If you want to see the AWS CloudFormation template which will be used behind the scenes, run cdk synth and check the cdk.out folder.
You can keep track of the stack creation progress in the terminal or navigate to the AWS console: CloudFormation > Stacks > RekognitionLabelDetectionGolangStack.
Once the stack creation is complete, you should have:
- An S3 bucket: the source bucket to upload images.
- An AWS Lambda function to extract image labels using Amazon Rekognition.
- A DynamoDB table to store the label data for each image.
- And a few other resources (such as IAM roles).
You will also see the following output in the terminal (resource names will differ in your case). In this case, these are the names of the S3 buckets created by CDK:
You are ready to verify the solution.
To try the solution, you can either use an image of your own or use the sample files provided in the GitHub repository. I will use the AWS CLI to upload the file, but you can use the AWS console as well.
This Lambda function will extract labels from the image and store them in a DynamoDB table.
Upload another file:
Check the DynamoDB table in the AWS console - you should see results of the label detection for both the images.
The DynamoDB table is designed with source file name as the partition key and (detected) label name as the sort key. This allows for a couple of query patterns:
- You can get all the labels for a given image.
- You can query for the metadata (category and confidence) for a specific source image and its label.
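The two query patterns can be illustrated with a small in-memory model of the composite key. This is only a sketch of the access patterns, not DynamoDB code: the table type, the field names, and the sample values are assumptions made for illustration and may not match the attribute names the tutorial's table actually uses.

```go
package main

import (
	"fmt"
	"sort"
)

// item models one row: partition key = source file, sort key = label name.
// Field names are illustrative, not the tutorial's actual schema.
type item struct {
	SourceFile string // partition key
	LabelName  string // sort key
	Category   string
	Confidence float64
}

// table is an in-memory stand-in: partition key -> sort key -> item.
type table map[string]map[string]item

// put stores an item under its partition key and sort key.
func (t table) put(it item) {
	if t[it.SourceFile] == nil {
		t[it.SourceFile] = map[string]item{}
	}
	t[it.SourceFile][it.LabelName] = it
}

// labelsFor is pattern 1: all labels for one image (a lookup on the partition key only).
func (t table) labelsFor(sourceFile string) []string {
	names := make([]string, 0, len(t[sourceFile]))
	for name := range t[sourceFile] {
		names = append(names, name)
	}
	// Sorted here to mimic a query returning a partition's items ordered by sort key.
	sort.Strings(names)
	return names
}

// metadata is pattern 2: one item addressed by partition key plus sort key.
func (t table) metadata(sourceFile, label string) (item, bool) {
	it, ok := t[sourceFile][label]
	return it, ok
}

func main() {
	// Made-up sample data.
	tbl := table{}
	tbl.put(item{SourceFile: "beach.jpg", LabelName: "Palm Tree", Category: "Plants", Confidence: 98.5})
	tbl.put(item{SourceFile: "beach.jpg", LabelName: "Beach", Category: "Outdoors", Confidence: 96.0})

	fmt.Println(tbl.labelsFor("beach.jpg"))
	if it, ok := tbl.metadata("beach.jpg", "Beach"); ok {
		fmt.Printf("%s: category=%s confidence=%.1f\n", it.LabelName, it.Category, it.Confidence)
	}
}
```

Because the label name is the sort key, both lookups stay within a single partition: the first reads every item for the image, the second addresses exactly one.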
You can use the AWS CLI to query the DynamoDB table:
Now that you have verified the end-to-end solution, you can clean up the resources and explore the Lambda function logic.
Once you're done, to delete all the services, simply use:
Here is a quick overview of the Lambda function logic. Please note that some code (error handling, logging etc.) has been omitted for brevity since we only want to focus on the important parts.
The Lambda function is triggered when a new image is uploaded to the source bucket. The function iterates through the list of files and calls the labelDetection function for each image.
Let's go through the labelDetection function.
- The labelDetection function uses the Amazon Rekognition DetectLabels API to return a list of labels.
- The function then iterates through each of those labels and stores the label name, category, and confidence score in the DynamoDB table.
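The control flow described above can be sketched as follows. This is a minimal, standard-library-only illustration rather than the function's actual code: Detector and Store are hypothetical interfaces standing in for the Amazon Rekognition and DynamoDB clients, handle plays the role of the Lambda handler, and the sample labels are made up.

```go
package main

import (
	"errors"
	"fmt"
)

// Label holds the three fields stored per detected label.
type Label struct {
	Name       string
	Category   string
	Confidence float64
}

// Detector and Store are hypothetical stand-ins for the Rekognition and
// DynamoDB clients; a real function would call the AWS SDK instead.
type Detector interface {
	DetectLabels(bucket, key string) ([]Label, error)
}

type Store interface {
	PutLabel(sourceFile string, l Label) error
}

// labelDetection fetches the labels for one image and stores each of them.
func labelDetection(d Detector, s Store, bucket, key string) error {
	labels, err := d.DetectLabels(bucket, key)
	if err != nil {
		return fmt.Errorf("detect labels for %s: %w", key, err)
	}
	for _, l := range labels {
		if err := s.PutLabel(key, l); err != nil {
			return fmt.Errorf("store label %s for %s: %w", l.Name, key, err)
		}
	}
	return nil
}

// handle iterates over the uploaded files and calls labelDetection for each.
func handle(d Detector, s Store, bucket string, keys []string) error {
	for _, key := range keys {
		if err := labelDetection(d, s, bucket, key); err != nil {
			return err
		}
	}
	return nil
}

// fakeDetector returns canned labels per object key.
type fakeDetector map[string][]Label

func (f fakeDetector) DetectLabels(_, key string) ([]Label, error) {
	labels, ok := f[key]
	if !ok {
		return nil, errors.New("no such image")
	}
	return labels, nil
}

// memStore records every stored label, keyed by source file.
type memStore map[string][]Label

func (m memStore) PutLabel(sourceFile string, l Label) error {
	m[sourceFile] = append(m[sourceFile], l)
	return nil
}

func main() {
	// Made-up sample labels for a single image.
	d := fakeDetector{"beach.jpg": {
		{Name: "Palm Tree", Category: "Plants", Confidence: 98.5},
		{Name: "Beach", Category: "Outdoors", Confidence: 96.0},
	}}
	s := memStore{}
	if err := handle(d, s, "source-bucket", []string{"beach.jpg"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range s["beach.jpg"] {
		fmt.Printf("%s (%s): %.1f\n", l.Name, l.Category, l.Confidence)
	}
}
```

Hiding the two services behind small interfaces is a design choice made for this sketch; it lets the loop be exercised locally with fakes before any AWS resources exist.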
In this tutorial, you used AWS CDK to deploy a Go Lambda function to detect labels in images using Amazon Rekognition and store the results in a DynamoDB table. Here are a few things you can try out to extend this solution:
- Explore how to build a solution to analyze videos stored in an S3 bucket.
- Even better, try processing streaming video to extract labels in real time.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.