Setting up source attribution URLs when using Amazon S3 as a connector for Amazon Q Business applications

Learn how to ensure S3 source attribution links work in an Amazon Q Business application.

Published Jul 18, 2024

Background and Problem Statement

Business users can build an Amazon Q Business application with a completely “no code” approach, with no dependency on engineering teams. Amazon S3 is one of the simplest connectors they can use as the repository for the documents behind their generative AI applications.
When conversing with an Amazon Q Business application that uses Amazon S3 as the connector, users expect to see the sources from which Amazon Q Business draws its answers. This is called source attribution, and it is one of the key features of Amazon Q Business for ensuring that responses are grounded in the source documents.
However, the default source attribution links point to the Amazon S3 bucket URLs. Because an Amazon S3 bucket is private by default, a user who opens one of these links receives an “Access Denied” error. This error disrupts the experience that a business user expects from an Amazon Q Business application. This article supplements the Amazon Q Business product documentation by explaining how to address this particular issue.

Prerequisites

  1. Create an Amazon Q Business application. Refer here for steps to create one.
  2. Connect the Amazon Q Business application to Amazon S3 as the data source. Refer here for associated steps.

Solution

The issue described above can be addressed by adding a metadata file for each document in the Amazon S3 bucket that the Amazon Q Business application is connected to. Here is the supporting product documentation.
Follow the steps below to set up the metadata file so that each Amazon S3 URL is replaced with an alias URL that can be made accessible:
  1. Create a metadata file (a JSON document) for each document in the Amazon S3 bucket that is connected to the Amazon Q Business application. The metadata file should follow this naming convention: <documentName>.<extension>.metadata.json.
  2. Create the attribute "_source_uri" within the metadata file. Its value is the URL that should replace the corresponding Amazon S3 URL, so that a user can open the link when interacting with the Amazon Q Business application. This URL is built on an alias for the bucket.
    Note: The “alias” could be, for example, an Amazon CloudFront distribution set up on the Amazon S3 bucket.
    • Here are the steps to create a CloudFront distribution for an S3 bucket.
    • Here is an example screenshot of the metadata file with the attribute "_source_uri" added:
  3. Ensure the JSON file is valid.
  4. Upload the metadata JSON file to the Amazon S3 bucket that stores the corresponding source document. Here is an example screenshot of the folder structure in S3 with the metadata file added.
  5. Go to Amazon Q Business in the AWS console, select the Amazon Q Business application created as a prerequisite above, select the data source that points to the Amazon S3 bucket, and sync the data source. This step crawls and indexes the metadata file(s).
  6. Test your Amazon Q Business application and check the links in the Sources section of the response. They should point to the "_source_uri" value set in the metadata file.
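
Steps 1 to 3 can be scripted when a bucket holds many documents. The following is a minimal Python sketch, not a definitive implementation. It assumes that "_source_uri" sits inside an "Attributes" object, as in the documented S3 metadata format, and that the alias serves objects under the same path as their S3 key. The function names, the CloudFront domain, and the sample document key are hypothetical and used only for illustration.

```python
import json
from urllib.parse import quote


def metadata_key(document_key: str) -> str:
    """Return the metadata file name for a document: <name>.<ext>.metadata.json."""
    return f"{document_key}.metadata.json"


def build_metadata(document_key: str, alias_base_url: str) -> str:
    """Return JSON text that overrides the source URL for one document."""
    # Percent-encode the key; "/" is left as-is so folder prefixes survive.
    source_uri = f"{alias_base_url.rstrip('/')}/{quote(document_key)}"
    # Assumption: "_source_uri" is nested under "Attributes".
    body = json.dumps({"Attributes": {"_source_uri": source_uri}}, indent=2)
    # Step 3: parse the text back, which raises an error if it is not valid JSON.
    json.loads(body)
    return body


if __name__ == "__main__":
    key = "reports/Q2 summary.pdf"                   # hypothetical document key
    alias = "https://d111111abcdef8.cloudfront.net"  # hypothetical alias domain
    print(metadata_key(key))
    print(build_metadata(key, alias))
```

The resulting text would be saved under the name returned by metadata_key and uploaded as described in step 4. Keeping the two functions free of AWS calls makes them easy to check locally before anything is written to the bucket.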

Conclusion

By following the steps in this article, you can make the source attribution links work when S3 is used as a data source connector, while keeping the bucket private and meeting business users' needs with an Amazon Q Business application.
 
