AI-Powered Text Extraction with Amazon Bedrock

Learn to extract highlighted text from images using Amazon Bedrock. Automate document processing and information extraction.

Akash Ninave
Amazon Employee
Published Mar 6, 2025
Automating highlighted text extraction with Amazon Bedrock and Amazon S3: building a serverless solution that extracts highlighted text from images, generates explanations, and stores the results as formatted PDFs in Amazon S3 using Amazon Bedrock and Python.
This project demonstrates the power of combining AI services with cloud storage and serverless computing to create a useful tool for researchers, students, or anyone who needs to quickly digitize and explain highlighted text from physical documents or digital images.
Simply upload an image of your highlighted text, and the AI will extract the highlighted passages and provide:
  1. Comprehensive Explanations: For each piece of highlighted text, receive a detailed explanation that breaks down complex concepts into easily understandable insights.
  2. Curated Resource Links: Gain access to a wealth of knowledge with carefully selected links to public articles that provide additional context and information related to your extracted text.
  3. Relevant Video Content: Enhance your understanding with links to YouTube videos that offer visual explanations and expert discussions on the topics you've highlighted.

Infrastructure:

  1. Amazon Bedrock:
    • Used for AI-powered text extraction and explanation generation
    • Specifically uses an "anthropic.claude-3-5" (Claude 3.5 Sonnet) model
  2. Amazon S3:
    • Stores the generated PDFs in a pre-defined bucket
    • The bucket name is hardcoded in the script (BUCKET_NAME variable)
  3. AWS IAM:
    • Implicit use for managing permissions to access Bedrock and S3 services
  4. Local Environment:
    • Python script running on a local machine or server
    • Handles image processing, PDF generation, and AWS service interactions
  5. ReportLab:
    • Python library used for generating well-formatted PDFs
  6. Boto3:
    • AWS SDK for Python, used to interact with AWS services (Bedrock and S3)

Key Components:

  • S3 Client: For uploading PDFs and generating pre-signed URLs.
  • Bedrock Runtime Client: For interacting with the Bedrock AI model.
  • PDF Generation: Using ReportLab to create formatted PDFs from extracted text.
  • Error Handling and Logging: Basic error handling and logging mechanisms are in place.

Prerequisites:

  • AWS Account with appropriate permissions
  • Python 3.8 or later
  • boto3 library installed
  • reportlab library installed
  • Access to Amazon Bedrock (specifically the Claude 3.5 Sonnet model)
  • AWS CLI configured with appropriate credentials
  • An S3 bucket created for storing the generated PDFs
  • An image file containing highlighted text for processing

Data flow:

  • User provides an image containing highlighted text.
  • The application reads the image file.
  • AWS Bedrock is called to extract the highlighted text and generate explanations.
  • The extracted text and explanations are formatted into a PDF.
  • The PDF is uploaded to a timestamped folder in the S3 bucket.
  • A pre-signed URL for the PDF is generated and returned to the user.
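The timestamped-folder step above can be sketched as follows. This is a minimal sketch, not the article's exact code; the `extracted` prefix and the naming scheme are assumptions:

```python
import time
import uuid

def build_object_key(prefix="extracted", ext="pdf"):
    # Folder named by the upload timestamp; file named by a UUID so
    # two uploads in the same second cannot collide.
    folder = time.strftime("%Y-%m-%d_%H-%M-%S")
    return f"{prefix}/{folder}/{uuid.uuid4().hex}.{ext}"
```

Each run thus lands in its own folder, which keeps outputs from different processing jobs cleanly separated in the bucket.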

Code block:

Note:
Replace the bucket name on line 21 with your actual S3 bucket name.
Replace the image path on line 151 with the path to your actual image.

Warning:

It is not a best practice to expose your bucket name in the code; read it from an environment variable instead.
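For example, a sketch of reading the bucket name from the environment rather than hard-coding it (`BUCKET_NAME` matches the variable name used in the script; the fallback value here is purely illustrative):

```python
import os

# Read the bucket name from the environment instead of hard-coding it.
# The fallback value is illustrative only.
BUCKET_NAME = os.environ.get("BUCKET_NAME", "my-example-bucket")
```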

Reference input image:

[Figure: input image containing highlighted text]

Output:

[Figure: extracted output]

Understanding the code:

Import Statements and Basic Setup:
This section imports the necessary libraries:
  • logging: for application logging
  • boto3: the AWS SDK for Python
  • reportlab: for PDF generation
  • utility imports for time, UUID, and I/O operations
It also initializes the AWS clients for the S3 and Bedrock services.
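The setup described above might look like the following sketch. The region and variable names are assumptions, not the article's exact code; boto3 is imported inside the function so the snippet loads even before the SDK is installed:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("highlight-extractor")

REGION = "us-east-1"  # assumed region; use the one where Bedrock is enabled for you

def make_clients(region=REGION):
    """Create the S3 and Bedrock Runtime clients used by the script."""
    import boto3  # AWS SDK for Python (see Prerequisites)
    s3 = boto3.client("s3", region_name=region)
    bedrock = boto3.client("bedrock-runtime", region_name=region)
    return s3, bedrock
```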
PDF Generation Function:
This function creates a PDF from the input text with proper formatting, pagination, and text wrapping.
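The wrapping and pagination logic can be sketched with the standard library before the lines are handed to ReportLab for drawing. The line width and page length below are assumed values, not the article's settings:

```python
import textwrap

def paginate(text, chars_per_line=90, lines_per_page=45):
    # Wrap each paragraph to the line width, then chunk the wrapped
    # lines into fixed-size pages for ReportLab to draw one by one.
    lines = []
    for para in text.splitlines():
        lines.extend(textwrap.wrap(para, chars_per_line) or [""])
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]
```

ReportLab's canvas then draws each page's lines and calls `showPage()` between pages.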
S3 Upload Function:
This function manages S3 operations:
  • Creates timestamped folders
  • Converts text to PDF
  • Handles S3 upload
  • Generates temporary access URLs
  • Includes error handling
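A sketch of such an upload helper, assuming boto3 is configured with credentials (the function and parameter names are illustrative, not the article's exact code):

```python
import io
import logging

def upload_pdf(pdf_bytes, bucket, key, expires=3600):
    """Upload an in-memory PDF to S3 and return a time-limited URL."""
    import boto3  # deferred import; see Prerequisites
    s3 = boto3.client("s3")
    try:
        s3.upload_fileobj(io.BytesIO(pdf_bytes), bucket, key,
                          ExtraArgs={"ContentType": "application/pdf"})
        # The pre-signed URL gives secure, temporary access to the object.
        return s3.generate_presigned_url(
            "get_object",
            Params={"Bucket": bucket, "Key": key},
            ExpiresIn=expires,
        )
    except Exception:
        logging.exception("Failed to upload %s to s3://%s", key, bucket)
        raise
```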
Bedrock Integration Function:
This function handles:
  • Image file reading
  • Message formatting for Bedrock
  • API call execution
  • Response timing
  • Comprehensive error handling
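The message-formatting step can be sketched as follows. The request body uses the Anthropic Messages format that Bedrock's `InvokeModel` API expects for Claude models; the prompt text and `max_tokens` value are illustrative assumptions:

```python
import base64
import json

def build_request_body(image_path, prompt, media_type="image/png"):
    # Claude on Bedrock accepts images as base64 data alongside a text prompt.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 2048,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64",
                            "media_type": media_type,
                            "data": image_b64}},
                {"type": "text", "text": prompt},
            ],
        }],
    })
```

The resulting JSON string is passed as the `body` argument to `bedrock_runtime.invoke_model(...)`, wrapped in try/except for the error handling described above.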
Main Orchestration Function:
This main function:
  • Sets up model and input parameters
  • Coordinates the extraction process
  • Handles response processing
  • Manages the upload process
  • Provides status feedback
The entry point:
This ensures the script runs only when executed directly.

Key Features and Best Practices

  1. Error Handling: The script includes comprehensive error handling and logging, ensuring robustness and easier debugging.
  2. Modularity: Functions are well-separated, promoting code reusability and maintainability.
  3. Security: The use of pre-signed URLs ensures secure, time-limited access to the uploaded files.
  4. Scalability: By leveraging AWS services, the solution can easily scale to handle large volumes of documents.
  5. Flexibility: The AI model and input instructions can be easily modified to adapt to different use cases.

Potential Improvements and Extensions

  1. Parallel Processing: Implement multiprocessing to handle multiple images concurrently.
  2. Integration with Document Management Systems: Extend the script to integrate with popular document management systems.
  3. User Interface: Develop a web interface for easy upload and processing of images.
  4. Automated Workflow: Integrate with AWS S3 event notification and SNS to create an automated workflow.
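The parallel-processing idea might be sketched like this, where `worker` stands in for the single-image pipeline (all names here are illustrative placeholders, not part of the article's script):

```python
from concurrent.futures import ProcessPoolExecutor

def process_many(image_paths, worker, max_workers=4):
    # Fan each image out to its own process; `worker` would be the
    # single-image extraction pipeline described above.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, image_paths))
```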

Conclusion:

This implementation demonstrates how AWS services can be combined to create practical solutions useful for:
  • Researchers digitizing research papers
  • Students organizing study materials
  • Business professionals extracting key information from documents
As AI and cloud technologies continue to evolve, we can expect even more powerful and efficient solutions for document processing and information extraction. By staying up-to-date with these technologies and continuously improving our processes, we can unlock new levels of productivity and insight from our document repositories.
Note: Remember to follow security best practices and handle errors appropriately when implementing similar solutions in production environments.

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
