AWS Logo
Menu
Options and Trade-offs for PDF processing with Claude models

Options and Trade-offs for PDF processing with Claude models

Compare PDF processing approaches for Claude models to find what works best for your use case.

Published Mar 13, 2025
PDF documents are essential in many business workflows, and AWS customers often use Claude to process these files. Whether you use PDF files to provide Claude context in an interactive assistant scenario ("document chat"), extracting key information, analyzing content or checking compliance, it's crucial to understand the available options for PDF processing.
This blog explores common approaches for using PDFs with Claude, and highlights the advantages and limitations of each method.

๐Ÿ“„ Common PDF Use Cases with Claude

AWS customers leverage Claude models in Amazon Bedrock to process PDFs for various purposes:
  1. Document Chat: Performing in-depth analysis and answering questions about document content
  2. Document Summarization: Extracting key points from lengthy documents
  3. Data Extraction: Pulling structured information from tables and specific text sections
  4. Compliance Verification: Checking if documents meet regulatory standards
  5. Editing Assistance: Providing suggestions or corrections for document drafts
  6. Translation: Converting documents from one language to another

๐Ÿ‘‰ Following Along

You can follow this blog along using the companion notebook on the AWS Samples Anthropic on AWS GitHub repository. The notebook contains sample code that implements the different PDF processing options described here, and evaluates their suitability for the most common use case, Document Chat.

๐Ÿ” PDF Processing Options in Amazon Bedrock

1. Bedrock DocumentBlock API

What it is: A managed API for sending documents alongside user prompts, with Bedrock handling document transformation. With this API, Claude receives text that's been extracted from the PDF by Bedrock, as context together with the actual prompt sent by the user - in order for Claude to use the extracted text as additional context. This helps increase the accuracy of Claude's output.
Key capabilities:
  • Supports PDF and other files up to 4.5MB per file
  • Handles up to 5 documents per request
  • Can apply OCR to process scanned documents
  • Extracts text and passes it as context to the model
Limitations:
  • Extracts text only, ignoring visual elements like tables, images, and logos
  • OCR results depend on quality of the scan (readability of the PDF file)
  • Claude does not receive visual context (such as layout or images)

2. Bedrock Knowledge Bases

What it is: A RAG (Retrieval-Augmented Generation) solution for handling large document collections that exceed Claude's context window. With RAG, the PDF is typically divided into smaller chunks and an embedding model stores a vector of the chunk in a vector DB. When a user prompt is sent to Claude, the same embedding model is used to find chunks with semantic similarity to that prompt. A number of matching chunks is then sent along with the user prompt to help Claude produce more informed output. Amazon Bedrock Knowledge Bases is fully managed functionality for creating RAG setups.
Key capabilities:
  • Scales to very large documents or document collections (thousands or millions)
  • Chunks documents into manageable pieces
  • Creates embeddings for vector search
  • Retrieves the most relevant context for each query
Limitations:
  • Only sends chunks (ranked "highlights") rather than complete context
  • Search process isn't perfect - may miss relevant context
  • May promote less relevant context that negatively affects Claude's responses
  • Adds latency to end-to-end query processing due to the multi-stage process including the creation of, and search for dense embeddings

3. Anthropic Native PDF Support

What it is: Anthropic's own API provides sophisticated PDF handling - but this method is not yet available on Amazon Bedrock.
Key capabilities:
  • Made for assistant/chatbot use cases, for users to "talk to their document"
  • Extracts text from PDFs
  • Converts each page into an image for visual context
  • Provides both text and visual modalities to the model
  • Handles requests up to 32MB with maximum 100 pages per request
Limitations:
  • Not currently implemented on Amazon Bedrock

4. Self-Managed Solutions

What it is: Customers sometimes implement custom implementations that replicate some of the capabilities of the previously discussed options in code, without relying on specific Bedrock (or Anthropic) functionality. Examples are customers that implement functionality similar to the Anthropic PDF Support with their own code using Claude's multimodal capabilities in Amazon Bedrock, and others that implement novel ways of extracting information from PDFs.
Key capabilities:
  • Perform text extraction/OCR, potentially with Amazon Textract
  • Converts PDF pages to images using tools like pdf2image
  • Includes up to 20 images per Bedrock request (Bedrock limit)
  • Provides either text, visual context or both to Claude, depending on the requirements
  • Most flexible for implementing custom document processing strategies
Limitations:
  • More complex to implement and maintain
  • Limited to 20 images per request (3.75MB, 8000px ร— 8000px max per image)
  • Larger documents require multiple requests with careful prompt engineering
  • Increases implementation complexity and request latency

๐Ÿงช Practical Comparison

For a hands-on comparison of these approaches you should check out our companion notebook that demonstrates each method with sample code and practical evaluation.

๐ŸŽฏ Choosing the Right Approach

ApproachBest ForLimitationsCost
DocumentBlock APISimple, text-focused PDFs with minimal visual elements/tablesMax. 5 docs per request, 4.5MB each, text-onlyLower: per-token input cost, only for extracted text
Knowledge BasesVery large documents or collections exceeding context limitsLess detailed and consistent context, added latencyHigher: cost for running and using vector DB infrastructure, on top of input tokens
Anthropic nativeSimple, highly accurate, scalable talk-to-your-PDFNot currently on BedrockMedium: per-input token cost on images and extracted text
Self-managedCustom, complex processing requirementsComplex to implement, 20-page limit per requestFlexible: depends on processing setup, typically per-input token cost on images and extracted text

๐Ÿ“‹ Conclusion

When working with PDFs in Amazon Bedrock, your choice depends on your specific requirements:
  • For simple text extraction from standard text-only born-digital documents, the DocumentBlock API offers a fully-managed, low-latency solution.
  • For very large documents or collections, Bedrock Knowledge Bases provide excellent scalability.
  • The Anthropic native PDF support offers a simple, yet comprehensive solution. Best for document chat use cases, but not yet available on Amazon Bedrock.
  • When a customized solution to document processing is needed, self-managed solutions can provide more complete context and flexible document processing, but they add complexity.
As you build PDF processing workflows with Claude on Amazon Bedrock, carefully consider these trade-offs to select the approach that best balances accuracy, scalability, and implementation effort for your specific use case.
ย 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.

Comments