GenAI-powered Intelligent Document Processing

Extract relevant information from bills and invoices using natural language queries

Karan Desai
Amazon Employee
Published Jan 24, 2025
Business owners across industries, whether they run a financial services firm, a trading company, a restaurant, a manufacturing plant, or anything else, have to handle a large number of bills and invoices from vendors and customers. These documents come in various shapes and forms, handwritten, printed, or scanned, and need to be processed by various teams such as sales, finance, and marketing. This is a long, tedious process, but with the power of Generative AI on AWS, we can automate it and save hours of manual work.

Solution Overview

The solution we are building here leverages Generative AI to intelligently extract the required information from documents, using the Anthropic Claude Haiku foundation model running on Amazon Bedrock. It also uses Amazon Textract to pre-process the scanned documents. This has a twofold advantage: it improves accuracy by harnessing Textract's ML models, which are customized to recognize bills and invoices, and it reduces cost, because feeding Textract's output, a blob of plain text, to the Bedrock foundation model consumes far fewer input tokens than uploading entire PDF or PNG files.
The response from Bedrock is a JSON object in a format defined by the user, containing the expected information fields. This response is presented to the user in a web UI and also written to an Amazon DynamoDB table, where it is stored for other applications to use. The solution architecture is shown here:
Solution Architecture
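To make the response format concrete, a response for an invoice might look like the following (these field names are purely illustrative; you define the expected fields in your prompt):

{
  "invoice_number": "INV-1042",
  "vendor_name": "Example Supplies Inc.",
  "invoice_date": "2025-01-10",
  "total_amount": "1,250.00",
  "currency": "USD"
}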
The entire process is automated with Python code, which can run either as a Streamlit application from a local machine, a cloud-based container, or an EC2 instance, or as a Lambda function on AWS. The user simply uploads invoice files from their local machine, or provides an Amazon S3 bucket location for batch processing of all documents within that bucket or a folder inside it.
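As a minimal sketch of the batch mode, the application can enumerate every object under an S3 prefix with boto3; the bucket and prefix names below are placeholders:

import boto3

s3 = boto3.client("s3")
# Page through all documents under a prefix for batch processing
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-invoice-bucket", Prefix="2025/01/"):
    for obj in page.get("Contents", []):
        print(obj["Key"])  # each key is one document to process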

Implementation Details

Sample Python code for building a Streamlit application is provided for your reference. You can modify it to suit your specific requirements.

Prerequisites:

1. An AWS account with console and CLI access
2. AWS CLI (command line interface) installed on your machine. Installation steps: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
3. An AWS IAM user with an access key and secret key created and saved on your machine: https://docs.aws.amazon.com/keyspaces/latest/devguide/create.keypair.html
4. AWS IAM permissions to interact with the S3, DynamoDB, Textract, and Bedrock services. As a security best practice, apply least-privilege permissions for the user instead of granting overly permissive access such as Administrator unless absolutely required: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_change-permissions.html

Provision services and permissions:

Update your S3 bucket permissions to allow your IAM user access to the bucket and the objects inside it. Sample permissions policy:
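A minimal bucket policy example is shown below; the account ID, user name, and bucket name are placeholders that you must replace with your own values:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowDocProcessingUser",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:user/doc-processing-user"
      },
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-invoice-bucket",
        "arn:aws:s3:::my-invoice-bucket/*"
      ]
    }
  ]
}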
Update the IAM user's permissions to grant access to S3, DynamoDB, Textract, and Bedrock. The following is a sample policy; you can modify it to suit your requirements. Make sure to follow the principle of least privilege and only allow access to the services you intend to use and the operations you wish to perform.
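One possible shape for that policy, kept deliberately narrow (the action list matches the API calls used later in this post):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DocProcessingAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "textract:DetectDocumentText",
        "bedrock:InvokeModel",
        "dynamodb:PutItem"
      ],
      "Resource": "*"
    }
  ]
}

For tighter security, replace the wildcard Resource with the specific ARNs of your bucket, table, and model.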
Configure AWS CLI on your machine with the IAM user’s credentials if not done already: https://docs.aws.amazon.com/cli/v1/userguide/cli-chap-configure.html
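For example (the values shown are placeholders; pick a region where Amazon Bedrock and the Claude Haiku model are available):

aws configure
AWS Access Key ID [None]: <your access key>
AWS Secret Access Key [None]: <your secret key>
Default region name [None]: us-east-1
Default output format [None]: json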

Running the solution

Upload the documents you want to process to your S3 bucket. You can create folders within the bucket to organize the documents, for example, a separate folder by month, week, or day, so that you can limit your processing workflow to only those documents.
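You can also upload documents from the command line with the AWS CLI; the bucket and folder names below are placeholders:

aws s3 cp ./invoices s3://my-invoice-bucket/2025/01/ --recursive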
Uploaded documents in S3
Create and activate a Python virtual environment, then install Streamlit (and boto3 for the AWS SDK) if not already installed: https://python.land/virtual-environments/virtualenv
python3 -m venv path/to/venv
source path/to/venv/bin/activate
pip install streamlit boto3
Copy the Python code provided at the end of this page to your local machine, modify it as needed, and save it as a .py file, for example, doc-processing-demo-code.py
Run the Streamlit application from the command line on your machine:
streamlit run doc-processing-demo-code.py
This will open the Streamlit application in a new tab or window in your default web browser, and will look like this:
Streamlit UI screenshot
You can modify the extraction prompt to describe the information you want to extract from the documents. Provide the location of the S3 bucket or folder where your documents are stored, and start processing. As the documents are processed, you will see the output on screen, along with useful details such as the number of input and output tokens consumed and the time taken to process each document.
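For instance, an extraction prompt might read (the wording is illustrative only):

Extract the invoice number, vendor name, invoice date, line items, and total amount from this document. Return the result as valid JSON with exactly those fields and no other text.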
Log in to the AWS console and go to the DynamoDB service. Select the table you created earlier and verify that it has been populated with the details extracted from the documents, as shown below:
Data inserted in DynamoDB table
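You can also spot-check the table from the AWS CLI; the table name below is a placeholder:

aws dynamodb scan --table-name invoice-data --max-items 5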

Conclusion

You have built a solution that extracts information from documents based on requirements expressed as natural language prompts, making it usable by non-technical business users with no programming experience. The solution can be extended to other document types such as employee salary slips, patient health records, and tax documents. If you build an interesting solution, share it with us by writing a post about it here on the AWS Community!

Appendix: Sample code

The following code is provided for reference only. Do not deploy it in production as-is; update it to meet your organization's code quality and security guardrails.
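The sketch below is a minimal version of such an application. It assumes the Claude 3 Haiku model ID on Bedrock, a DynamoDB table named invoice-data with a string partition key document_id, the region us-east-1, and the IAM permissions configured earlier; all of these are placeholders to adapt to your environment.

import json
import time

import boto3
import streamlit as st

# All names below (region, table, model ID) are assumptions - change as needed.
REGION = "us-east-1"
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"

s3 = boto3.client("s3", region_name=REGION)
textract = boto3.client("textract", region_name=REGION)
bedrock = boto3.client("bedrock-runtime", region_name=REGION)
table = boto3.resource("dynamodb", region_name=REGION).Table("invoice-data")


def extract_text(bucket: str, key: str) -> str:
    """Pre-process a document in S3 with Textract and return its raw text.

    Note: detect_document_text handles single-page documents; use the
    asynchronous Textract APIs for multi-page PDFs.
    """
    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return "\n".join(
        block["Text"] for block in response["Blocks"] if block["BlockType"] == "LINE"
    )


def query_model(document_text: str, prompt: str) -> dict:
    """Send the Textract output plus the extraction prompt to Claude Haiku."""
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [
            {"role": "user", "content": f"{prompt}\n\nDocument text:\n{document_text}"}
        ],
    })
    response = bedrock.invoke_model(modelId=MODEL_ID, body=body)
    return json.loads(response["body"].read())


st.title("GenAI-powered Intelligent Document Processing")
bucket = st.text_input("S3 bucket name")
prefix = st.text_input("Folder (prefix) within the bucket", value="")
prompt = st.text_area(
    "Extraction prompt",
    value="Extract the invoice number, vendor name, invoice date, and total "
          "amount from this document. Return the result as valid JSON only.",
)

if st.button("Start processing") and bucket:
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):  # skip zero-byte "folder" objects
                continue
            start = time.time()
            result = query_model(extract_text(bucket, key), prompt)
            answer = result["content"][0]["text"]
            usage = result.get("usage", {})

            st.subheader(key)
            st.code(answer, language="json")
            st.write(
                f"Input tokens: {usage.get('input_tokens')} | "
                f"Output tokens: {usage.get('output_tokens')} | "
                f"Time: {time.time() - start:.1f}s"
            )

            # Persist the extracted fields for other applications to consume.
            item = {"document_id": key}
            try:
                # parse_float=str because DynamoDB does not accept Python floats
                item.update(json.loads(answer, parse_float=str))
            except json.JSONDecodeError:
                item["raw_response"] = answer
            table.put_item(Item=item)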
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
