XRAI Glass - Let's Build a Startup Showcase!

Learn how XRAI Glass is making the world a better place, one word at a time, with Amazon Translate and Amazon Transcribe

Giuseppe Battista
Amazon Employee
Published Jun 14, 2024
XRAI Glass is an AI startup that combines the latest in extended reality (XR) and artificial intelligence (AI) to give people the tools to engage with the world in new ways. It uses Amazon Translate and Amazon Transcribe to provide real-time transcription and translation.
Hear from Tim Scarfe, CTO & Co-founder, and Jacqueline Press, Chief Brand Ambassador, about how they built such an impactful startup.
We also had the pleasure of interviewing Tim at "Let's Build a Startup!" last year. Here's our highlight!
What do you think about XRAI Glass? Let us know in the comments, and don't forget to drop a like!
Also, let us know if your startup is using tech to make the world a better place!
 

OK, but how do you get started with Translate and Transcribe?

Let's have a look at how you can get started with Amazon Translate and Amazon Transcribe using a simple event-driven, serverless architecture. We're going to build a simple choreography:
  1. An audio file is uploaded to an Amazon Simple Storage Service (Amazon S3) bucket under the prefix /audio.
  2. The upload triggers an AWS Lambda function that uses Transcribe to extract text from the audio file. The transcription is written to the same bucket under the prefix /transcriptions.
  3. In turn, when a file is created under /transcriptions, another Lambda function is triggered to perform a translation job. For simplicity, we'll assume that the audio was recorded in English and that we want to translate it into Italian. This Lambda function writes the translation results to /translations.
[Figure: Architecture and Choreography for Transcript and Translation. A diagram depicting the stages of the process and the services used.]
In my opinion, choreographies are great for building small prototypes and PoCs. In this example, we're also using S3 as a message bus. For a production version of this architecture, I'd highly recommend looking into a more sophisticated orchestration tool such as AWS Step Functions.
Important caveat: whenever you implement choreographies with Amazon S3, S3 prefixes, and Lambda, you must be very careful not to cause recursive invocations of your Lambda functions, as you will incur unwanted charges. Be mindful of the S3 paths your functions write to. For example, if one of the two functions in this example were to write into /audio, you'd end up with an infinite loop! To limit the damage if that happens, throttle the invocation of your functions and consider adding kill switches to all your choreographers. A common strategy is to give each choreographer a deny-list of S3 paths. If one of these paths is detected in the event that triggered the function, the function can throttle itself or ignore the event, send the event to a dead-letter queue, and send alerts to the relevant notification systems. Others prefer to keep a separate bucket for every stage: just be aware of your AWS account limits and request a service quota increase if you need more buckets.
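The deny-list idea above can be sketched as a small guard that each function runs before doing any work. This is only an illustration: the helper names (`shouldProcess`, `assertSafeOutput`) and the way prefixes are normalized are assumptions made for this example, not something provided by AWS or taken from the original project.

```typescript
// Hypothetical guard for one choreographer. The function names and the
// prefixes used with them are illustrative assumptions, not an AWS API.

// S3 keys have no leading slash, so "/audio" and "audio/" are normalized
// to the same form ("audio/") before comparing.
function normalizePrefix(prefix: string): string {
  const trimmed = prefix.replace(/^\/+/, '').replace(/\/+$/, '');
  return `${trimmed}/`;
}

function isUnderPrefix(key: string, prefix: string): boolean {
  return key.replace(/^\/+/, '').startsWith(normalizePrefix(prefix));
}

// Returns false when the triggering key sits under a denied prefix, so the
// handler can ignore the event (and raise an alert) instead of processing it.
function shouldProcess(eventKey: string, denyList: readonly string[]): boolean {
  return !denyList.some((prefix) => isUnderPrefix(eventKey, prefix));
}

// Throws before writing if the output would land under the prefix that
// triggers this same function, which is what would start the loop.
function assertSafeOutput(outputKey: string, triggerPrefix: string): void {
  if (isUnderPrefix(outputKey, triggerPrefix)) {
    throw new Error(`Refusing to write ${outputKey}: it would re-trigger this function`);
  }
}
```

In this sketch, the transcription function would call `shouldProcess(key, ['/transcriptions', '/translations'])` on the incoming event and `assertSafeOutput(outputKey, '/audio')` before writing anything. Failing loudly is a deliberate choice here: a rejected write is much cheaper than a loop.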

Create the CDK Stack

  1. Install the AWS CDK CLI
    If you haven't already installed the AWS CDK, install it using npm:
    npm install -g aws-cdk
  2. Create a New CDK Project
    Create a new directory for your CDK project and initialize it with the following commands:
    mkdir transcribe-translate
    cd transcribe-translate
    cdk init app --language typescript
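  3. Add the required AWS CDK dependencies
    If cdk init created a CDK v2 project (the default with current CLI versions), the S3, Lambda, IAM, S3 notifications, and Node.js Lambda constructs already ship in the aws-cdk-lib package it installs, so this step can be skipped. The separate packages below apply only to CDK v1 projects and should not be mixed into a v2 app:
    npm install @aws-cdk/aws-s3 @aws-cdk/aws-lambda @aws-cdk/aws-iam @aws-cdk/aws-s3-notifications @aws-cdk/aws-lambda-nodejs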

Build the Infrastructure

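Edit the lib/transcribe-translate-stack.ts file to define the resources: the S3 bucket, the two Lambda functions, the IAM permissions they need to call Transcribe and Translate and to read from and write to the bucket, the S3 event notifications for the /audio and /transcriptions prefixes, and an output exposing the bucket name.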

Trigger the Transcription Job

In the following example, we're only going to trigger the transcription job; we won't monitor it for successful completion. This is an optimistic approach that works only when prototyping! For a production implementation, I'd suggest polling the Transcribe API so you know when the job has completed successfully.
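As a rough idea of what such polling could look like, here is a minimal sketch. The `waitForJob` helper and its parameters are hypothetical; the status lookup is injected as a function so the loop stays independent of any SDK. The four status values mirror the ones Transcribe reports in `TranscriptionJobStatus`.

```typescript
// Job states reported by Transcribe for a transcription job.
type JobStatus = 'QUEUED' | 'IN_PROGRESS' | 'COMPLETED' | 'FAILED';

// Hypothetical helper: polls `getStatus` until the job reaches a terminal
// state or the attempt budget runs out. The defaults are arbitrary.
async function waitForJob(
  getStatus: () => Promise<JobStatus>,
  maxAttempts = 30,
  delayMs = 5000,
): Promise<JobStatus> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === 'COMPLETED' || status === 'FAILED') {
      return status; // terminal state: stop polling
    }
    if (attempt < maxAttempts) {
      // Wait before asking again so we don't hammer the API.
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Job still running after ${maxAttempts} attempts`);
}
```

In a real implementation, `getStatus` would typically wrap a `GetTranscriptionJob` call and return the job's `TranscriptionJobStatus`. Keep in mind that polling inside a Lambda function consumes execution time, which is one more reason to consider an orchestrator for production.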
Create a file in your local repository under lambda/transcribe.mjs
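As a sketch of the core of this function, the snippet below builds the input for a StartTranscriptionJob request from the bucket and key found in the S3 event. It is written in TypeScript to match the CDK project; drop the type annotations to use it in an .mjs file. The helper names, the `en-US` language code, the unique suffix, and the output key layout are assumptions made for illustration.

```typescript
// The fields we need from an S3 event record (hypothetical type name).
interface S3ObjectRef {
  bucket: string;
  key: string; // as delivered in the event: URL-encoded, spaces as "+"
}

// S3 event notifications URL-encode object keys, so decode before use.
function decodeS3Key(rawKey: string): string {
  return decodeURIComponent(rawKey.replace(/\+/g, ' '));
}

// Builds the request parameters for StartTranscriptionJob.
// `uniqueSuffix` (for example a timestamp) keeps job names unique.
function buildTranscriptionRequest(source: S3ObjectRef, uniqueSuffix: string) {
  const key = decodeS3Key(source.key);
  const fileName = key.split('/').pop() ?? key;
  // Job names only allow letters, digits, '.', '_' and '-'.
  const jobName = `${fileName.replace(/[^0-9a-zA-Z._-]/g, '-')}-${uniqueSuffix}`;
  return {
    TranscriptionJobName: jobName,
    LanguageCode: 'en-US', // assumption: the audio is in US English
    Media: { MediaFileUri: `s3://${source.bucket}/${key}` },
    OutputBucketName: source.bucket,
    // Write under transcriptions/ so the next stage is triggered.
    OutputKey: `transcriptions/${jobName}.json`,
  };
}
```

The handler would then pass this object to `StartTranscriptionJobCommand` from `@aws-sdk/client-transcribe` and send it with a `TranscribeClient`.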
When the job completes, a new object is created under /transcriptions. This will trigger the next phase in the choreography.

Trigger the Translation Job

Create a file under lambda/translate.mjs
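A minimal sketch of the logic this function needs is shown below, again in TypeScript (remove the types for an .mjs file). It assumes the synchronous TranslateText API rather than a batch translation job, and the helper names and the .txt output naming are hypothetical. The transcript is read from `results.transcripts`, which is where Transcribe places the text in its JSON output.

```typescript
// The part of the Transcribe output file that we rely on.
interface TranscribeOutput {
  results: { transcripts: { transcript: string }[] };
}

// Pulls the plain text out of the JSON document that Transcribe wrote.
function extractTranscript(rawJson: string): string {
  const parsed = JSON.parse(rawJson) as TranscribeOutput;
  return parsed.results.transcripts.map((t) => t.transcript).join(' ').trim();
}

// Builds the parameters for a TranslateText request, English to Italian
// as assumed in this walkthrough.
function buildTranslateRequest(text: string) {
  return { Text: text, SourceLanguageCode: 'en', TargetLanguageCode: 'it' };
}

// Maps transcriptions/foo.json to translations/foo.txt so the result lands
// under a prefix that does not trigger any function.
function translationKeyFor(transcriptionKey: string): string {
  return transcriptionKey
    .replace(/^transcriptions\//, 'translations/')
    .replace(/\.json$/, '.txt');
}
```

The handler would read the new object from S3, pass the result of `buildTranslateRequest` to `TranslateTextCommand` from `@aws-sdk/client-translate`, and write the returned `TranslatedText` to the key computed by `translationKeyFor`. TranslateText limits the size of each request, so long transcripts would need to be split into chunks first.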
 

Deploy

Deploy the stack with cdk deploy. After deployment, you should see a list of Outputs; that's where you'll find your S3 bucket name.
[Figure: Outputs. CloudFormation outputs shown in the terminal.]

Test

To test, upload an MP3 file under audio/ in the S3 bucket listed in the outputs. You can use the AWS CLI (for example, aws s3 cp) or upload the file through the AWS console. After a while, transcriptions/ and translations/ should appear in the bucket, containing the results of each stage of the choreography.
[Figure: S3 console. The AWS console for S3 showing the paths.]

Cleanup

Once you're done, tear down the infrastructure by running cdk destroy from the project directory.

Wrap Up

I hope XRAI's success story has inspired you to build something that makes the world a bit better with fantastic tech such as Amazon Transcribe and Amazon Translate. Follow Let's Build a Startup on Twitch and on Community Livestream for more startup-related content on AWS. Drop a comment if you have a startup idea that could make the world a better place!

Author

Giuseppe Battista is a Senior Solutions Architect at Amazon Web Services. He leads solutions architecture for Early Stage Startups in the UK and Ireland. He hosts the Twitch show "Let's Build a Startup" on twitch.tv/aws and is head of the Unicorn's Den accelerator. Follow Giuseppe on LinkedIn
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
