
Unleashing the Power of Cloud and AI: Automating Music Discovery with a Smartphone Camera
In this post, we'll explore how cloud computing and artificial intelligence can transform the way you discover and play music. You'll learn how to use your smartphone camera to scan an album cover, use AWS services such as Amazon S3, AWS Lambda, and Amazon Rekognition to identify the album automatically, and then call the Spotify API to start listening.
- You need an AWS account to deploy this solution. If you don’t have an existing account, you can sign up for one. The instructions in this post use the AWS Region us-east-1. Make sure you deploy your resources in a Region with AWS Machine Learning services available.
- Set up the Boto3 AWS SDK and Python: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html
- Before proceeding, make sure you have the necessary permissions to utilize Amazon S3, AWS Lambda, and Amazon Rekognition. You can refer to the AWS documentation on IAM access management to ensure your credentials have the required permissions:
https://docs.aws.amazon.com/IAM/latest/UserGuide/access.html
- The use of these services will incur costs. If you access them through your own AWS account, you will be responsible for paying those costs.
- For an end-to-end solution, you need to provide your own front-end or mobile application where users can upload the images they want detected and labeled to Amazon S3. A sample mobile app is provided later in this post. To learn more about front-end deployment on AWS, refer to Front-end Web & Mobile on AWS.
- The picture taken by the user is stored in an Amazon Simple Storage Service (Amazon S3) bucket. This S3 bucket should be configured with a lifecycle policy that deletes the image after usage. To learn more about S3 lifecycle policies, see Managing your storage lifecycle.
- This architecture uses an AWS Lambda function that serves as the business logic for the solution. The Lambda function calls Amazon Rekognition through the Boto3 Python API. Amazon Rekognition is a computer vision service that uses machine learning (ML) models to analyze the uploaded images.
- We use Rekognition Custom Labels so that this solution can fit a personalized use case. With the aid of custom labels specifically trained to recognize various album covers, Amazon Rekognition accurately identifies the items present in the images.
- The album names are stored as partition keys in an Amazon DynamoDB table, a fully managed NoSQL database service, along with their Spotify URIs. When a user scans an album, Rekognition detects the cover and returns the label (that is, the album name). Lambda then looks up the corresponding Spotify URI in DynamoDB to play the album.
- Spotify is a music streaming platform that also offers an API, enabling developers to create applications that leverage its capabilities. In our use case, we make HTTP requests to Spotify's endpoint to specify which album should be played. This information is retrieved by a Lambda function through a DynamoDB lookup. Once Spotify authorization is obtained, the requested album begins playing.
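The Lambda function shown later in this post sends the playback request with a bearer token read from an environment variable. In practice, Spotify user access tokens expire after about an hour, so you need a refresh step. The following is a minimal sketch of building a refresh-token request against Spotify's documented token endpoint; the credential names (`client_id`, `client_secret`, `refresh_token`) are assumed to come from a one-time Authorization Code flow that you run yourself and are illustrative, not part of the solution above:

```python
# Illustrative sketch: building a Spotify refresh-token request.
# The player endpoint requires a user-authorized token (user-modify-playback-state
# scope); tokens expire, so a stored refresh token is exchanged for a fresh one.
import base64
import urllib.parse

TOKEN_URL = "https://accounts.spotify.com/api/token"

def build_refresh_request(client_id, client_secret, refresh_token):
    """Return (headers, body) for Spotify's refresh-token grant."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {
        "Authorization": f"Basic {basic}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    })
    return headers, body

# Usage (for example, at the top of the Lambda handler, with urllib3):
#   headers, body = build_refresh_request(client_id, client_secret, refresh_token)
#   resp = urllib3.PoolManager().request("POST", TOKEN_URL, headers=headers, body=body)
#   access_token = json.loads(resp.data)["access_token"]
```

The actual HTTP call is left as a comment so the sketch stays self-contained; the JSON response contains a fresh `access_token` to use as the Bearer token.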
- On the Amazon Rekognition Custom Labels console, select 'Projects' from the left sidebar.
- Click 'Create Project' and enter a project name.
- On the Project page, click 'Create Dataset'.
- Select the option 'Start with a training dataset and test dataset' to have more control over the training and testing images.
- Upload images, taken from various angles, of the album covers you want to include in the dataset.
- For the training dataset, label the images based on the corresponding album names.
- Click 'Train Model' to start the training process.
- Review the performance metrics to ensure the model can accurately label the test images.
- Once training is successful, click on the model and navigate to the 'Use Model' section.
- Click 'Start' to begin using the custom image recognition model to detect the album covers it was trained on.
- The custom Rekognition model is now set up and ready to use for your application.
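Once the model is running, `detect_custom_labels` returns a list of candidate labels, each with a confidence score. The Lambda code later in this post simply takes the first entry; a small helper like the following (an illustrative addition, with a confidence threshold you should tune for your own dataset) picks the highest-confidence match instead and avoids acting on weak detections:

```python
# Illustrative helper: pick the best match from a detect_custom_labels response.
# The input follows Rekognition's documented response shape:
#   {"CustomLabels": [{"Name": ..., "Confidence": ...}, ...]}
# min_confidence is an assumed threshold, not a value from this post.
def best_label(response, min_confidence=80.0):
    """Return the highest-confidence label name, or None if nothing clears the bar."""
    candidates = [
        label for label in response.get("CustomLabels", [])
        if label.get("Confidence", 0.0) >= min_confidence
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda label: label["Confidence"])["Name"]
```

You would pass the full response from `rekognition.detect_custom_labels(...)` to this helper instead of indexing `CustomLabels[0]` directly.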
- On the Lambda console, choose Functions in the navigation pane.
- Choose Create function.
- Choose Author from scratch.
- Name your function, choose Python 3.8 (or a later supported version) for Runtime, and choose Create function.
- Replace the text in the Lambda function code editor with the following sample code and choose Save:
import json
import os
from urllib.parse import unquote_plus

import boto3
import urllib3
from botocore.exceptions import ClientError

# Initialize AWS session and clients
session = boto3.Session()
dynamodb = session.resource('dynamodb')
rekognition = session.client('rekognition')

# Get environment variables
TABLE_NAME = os.environ.get('DYNAMODB_TABLE_NAME')
PROJECT_VERSION_ARN = os.environ.get('PROJECT_VERSION_ARN')
BUCKET_NAME = os.environ.get('BUCKET_NAME')
SPOTIFY_API_KEY = os.environ.get('SPOTIFY_API_KEY')


def lambda_handler(event, context):
    """
    Main Lambda function handler for detecting album covers and playing them on Spotify.

    Args:
        event (dict): S3 event notification payload
        context (LambdaContext): Lambda context object

    Returns:
        dict: Response object containing status code and response body
    """
    try:
        # Extract the S3 object key from the event
        # (S3 URL-encodes keys in event notifications)
        obj_name = unquote_plus(event['Records'][0]['s3']['object']['key'])

        # Detect custom labels (album) using Amazon Rekognition
        album = detect_album(obj_name)

        # Get Spotify URI for the detected album
        spotify_uri = get_spotify_uri(album)

        # Play the album on Spotify
        status_code, response_body = play_spotify_album(spotify_uri)

        return {
            "statusCode": status_code,
            "body": response_body
        }
    except Exception as e:
        print(f"Error: {str(e)}")
        return {
            "statusCode": 500,
            "body": json.dumps({"error": str(e)})
        }


def detect_album(obj_name):
    """
    Detect the album using Amazon Rekognition custom labels.

    Args:
        obj_name (str): S3 object key of the image

    Returns:
        str: Detected album name
    """
    try:
        response = rekognition.detect_custom_labels(
            ProjectVersionArn=PROJECT_VERSION_ARN,
            Image={
                'S3Object': {
                    'Bucket': BUCKET_NAME,
                    'Name': obj_name,
                }
            }
        )
        labels = response['CustomLabels']
        if not labels:
            raise ValueError(f"No album cover detected in image: {obj_name}")
        return labels[0]['Name']
    except ClientError as e:
        print(f"Rekognition error: {e.response['Error']['Message']}")
        raise


def get_spotify_uri(album):
    """
    Retrieve Spotify URI for the given album from DynamoDB.

    Args:
        album (str): Album name

    Returns:
        str: Spotify URI for the album
    """
    table = dynamodb.Table(TABLE_NAME)
    try:
        # 'album' is the table's partition key, so a direct lookup
        # is cheaper and faster than scanning the whole table
        response = table.get_item(Key={'album': album})
        item = response.get('Item')
        if not item:
            raise ValueError(f"No Spotify URI found for album: {album}")
        return item['uri']
    except ClientError as e:
        print(f"DynamoDB error: {e.response['Error']['Message']}")
        raise


def play_spotify_album(uri):
    """
    Play the album on Spotify using the Spotify API.

    Args:
        uri (str): Spotify URI for the album

    Returns:
        tuple: HTTP status code and response body
    """
    url = "https://api.spotify.com/v1/me/player/play"
    headers = {
        "Authorization": f"Bearer {SPOTIFY_API_KEY}",
        "Content-Type": "application/json"
    }
    data = {
        "context_uri": f"spotify:album:{uri}",
        "position_ms": 0  # Start from the beginning of the album
    }
    try:
        with urllib3.PoolManager() as http:
            response = http.request(
                'PUT',
                url,
                body=json.dumps(data).encode('utf-8'),
                headers=headers
            )
        return response.status, response.data.decode('utf-8')
    except urllib3.exceptions.HTTPError as e:
        print(f"Spotify API error: {str(e)}")
        raise
- On the Amazon S3 console, choose Create bucket.
- Enter a unique bucket name.
- On the Lambda console, navigate to the Lambda function you created.
- On the Configuration tab, choose Add trigger.
- Select the trigger type as S3 and choose the bucket you created.
- Set Event type to All object create events and choose Add.
- Alternatively, you can configure the same notification from the S3 side: on the Amazon S3 console, navigate to the bucket you created.
- Under Properties and Event Notifications, choose Create event notification.
- Enter an event name (for example, Trigger LambdaFunctionName) and set the events to All object create events.
- For Destination, select Lambda Function and choose the Lambda function you created in the prior steps.
- Choose Save.
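With the trigger in place, Lambda receives an S3 event notification for every upload. The following sketch shows the relevant shape of that payload and the key extraction the Lambda function relies on; the bucket and key names are illustrative. Note that S3 URL-encodes object keys in event notifications, so unquoting protects against keys that contain spaces or special characters:

```python
# Illustrative sketch: the S3 "ObjectCreated" event delivered to Lambda,
# and how to pull the uploaded object's key out of it.
from urllib.parse import unquote_plus

def extract_object_key(event):
    """Return the decoded object key from an S3 event notification."""
    record = event["Records"][0]
    # S3 URL-encodes keys in notifications (spaces arrive as '+')
    return unquote_plus(record["s3"]["object"]["key"])

# Example payload (trimmed to the fields used here; names are illustrative)
sample_event = {
    "Records": [{
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "album-scanner-uploads"},
            "object": {"key": "image_1700000000000.jpg"},
        },
    }]
}
```

You can paste an event like this into the Lambda console's test feature to exercise the function without uploading a real image.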
- On the DynamoDB console, choose Tables in the navigation pane.
- Choose Create table.
- For Table name, enter a name for the table.
- For Partition key, use ‘album’ (String).
- Verify that all entries on the page are accurate, leave the rest of the settings as default, and choose Create.
- After creating the table, navigate to the 'Items' tab and choose 'Create item'.
- For each album in your Rekognition training dataset, enter the album name as the 'album' partition key and the corresponding Spotify album URI as the 'uri' attribute.
- You can find the Spotify URI by navigating to the album's page on the Spotify website and copying the unique identifier from the URL (e.g. '41GuZcammIkupMPKH2OJ6I' for Astroworld).
- Repeat this process to add all album names and URIs from your Rekognition training dataset.
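If you have many albums, entering items by hand gets tedious. The console steps above can also be scripted: the following sketch builds the items and writes them with a Boto3 batch writer. Only the Astroworld URI comes from the step above; the table name and any other albums you add are your own, and the album names must match your Rekognition labels exactly:

```python
# Illustrative sketch: seeding the DynamoDB table programmatically.
# Keys must match the table schema above: partition key 'album', attribute 'uri'.
ALBUMS = {
    "Astroworld": "41GuZcammIkupMPKH2OJ6I",
    # add the rest of your training-set albums here
}

def build_items(albums):
    """Turn an {album: uri} mapping into DynamoDB items."""
    return [{"album": name, "uri": uri} for name, uri in albums.items()]

def seed_table(table, albums):
    """Write every album/uri pair; `table` is a boto3 DynamoDB Table resource."""
    with table.batch_writer() as batch:
        for item in build_items(albums):
            batch.put_item(Item=item)

# Usage with boto3:
#   table = boto3.resource("dynamodb").Table("your-table-name")
#   seed_table(table, ALBUMS)
```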
In this section, we will discuss the steps involved in creating the mobile application.
- Select your preferred IDE and language for development. We are using Expo & React Native (JavaScript) to code this app.
// https://docs.expo.dev/versions/latest/sdk/camera/
// Import necessary modules
import { CameraView, useCameraPermissions } from 'expo-camera';
import { useState, useRef } from 'react';
import { Button, StyleSheet, Text, TouchableOpacity, View, Image } from 'react-native';
import AWS from 'aws-sdk';
import 'dotenv/config'; // Load environment variables (in Expo, prefer EXPO_PUBLIC_-prefixed variables or app config)

// Configure the AWS SDK
// Note: embedding long-lived AWS credentials in a mobile app is for demo
// purposes only; in production, use temporary credentials (for example,
// through an Amazon Cognito identity pool).
AWS.config.update({
  accessKeyId: process.env.AWS_ACCESS_KEY_ID,
  secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
  region: process.env.AWS_REGION,
});

// Create an S3 client instance
const s3 = new AWS.S3();

export default function App() {
  const [facing, setFacing] = useState('back');
  const [permission, requestPermission] = useCameraPermissions();
  const [photoUri, setPhotoUri] = useState(null);
  const cameraRef = useRef(null);

  if (!permission) {
    // Camera permissions are still loading.
    return <View />;
  }

  if (!permission.granted) {
    // Camera permissions are not granted yet.
    return (
      <View style={styles.container}>
        <Text style={{ textAlign: 'center' }}>We need your permission to show the camera</Text>
        <Button onPress={requestPermission} title="Grant Permission" />
      </View>
    );
  }

  // Switch between the back and front camera
  function toggleCameraFacing() {
    setFacing(current => (current === 'back' ? 'front' : 'back'));
  }

  // Take a picture and kick off the upload
  const takePicture = async () => {
    if (cameraRef.current) {
      const photo = await cameraRef.current.takePictureAsync();
      setPhotoUri(photo.uri);
      // Upload the image to S3
      uploadToS3(photo.uri, `image_${Date.now()}.jpg`);
    }
  };

  // Upload the captured image to the S3 bucket that triggers the Lambda function
  const uploadToS3 = async (imageUri, fileName) => {
    try {
      const file = await fetch(imageUri).then((response) => response.blob());
      const params = {
        Bucket: process.env.S3_BUCKET_NAME,
        Key: fileName,
        Body: file,
      };
      await s3.upload(params).promise();
      console.log('Image uploaded to S3 successfully');
    } catch (error) {
      console.error('Error uploading image to S3:', error);
    }
  };

  return (
    <View style={styles.container}>
      <CameraView style={styles.camera} ref={cameraRef} facing={facing}>
        <View style={styles.buttonContainer}>
          <TouchableOpacity style={styles.button} onPress={toggleCameraFacing}>
            <Text style={styles.text}>Flip Camera</Text>
          </TouchableOpacity>
          <TouchableOpacity style={styles.button} onPress={takePicture}>
            <Text style={styles.text}>Take Picture</Text>
          </TouchableOpacity>
        </View>
      </CameraView>
      {photoUri && (
        <Image source={{ uri: photoUri }} style={styles.previewImage} />
      )}
    </View>
  );
}

// Styles for the app
const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: 'center',
  },
  camera: {
    flex: 1,
  },
  buttonContainer: {
    flexDirection: 'row',
    justifyContent: 'space-between',
    margin: 20,
  },
  button: {
    backgroundColor: '#000',
    padding: 10,
    borderRadius: 5,
  },
  text: {
    fontSize: 18,
    color: '#fff',
  },
  previewImage: {
    width: '100%',
    height: 300, // 'auto' can render a zero-height image; use a fixed height or aspectRatio
    marginTop: 10,
  },
});
- Amazon Rekognition documentation: https://docs.aws.amazon.com/rekognition/
- AWS Lambda documentation: https://docs.aws.amazon.com/lambda/
- AWS S3 documentation: https://docs.aws.amazon.com/s3/
- Boto3 AWS SDK documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/index.html
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.