
GenAI driven Invoice Processing
How GenAI changed the Machine Learning landscape?
Published Jan 16, 2025
2025 is here - Can't believe that a year of AI has grown from RAG to CAG to Agentic RAG. Everyone is talking that AI is a hype or AI is going to be the same as some of the recent technological advancements (like AR and Virtual World etc.). Please read through the entire article.
My study continues - as a Student who wants to learn things I like to compare things and how AI and LLMs are shortening the efforts.
Back in Early'2020 - during the start of the pandemic, I started my learning journey into AI/ML. And as with everyone, I started off with YOLO. I developed a solution for a customer who aimed at automating invoice processing to address the inefficiencies associated with manual data entry.
The earlier system I designed was utilizing a combination of optical character recognition (OCR) technology and rule-based scripts to extract data from invoices. While it marked a significant improvement over manual methods, the solution faced challenges, specifically we had to create the datasets and perform model training over YOLOv4 (and later on in 2 months YOLOv5) for close to 3 months.
Plus, in handling the vast variability in invoice formats - required constant updates to accommodate the new vendor templates. One of the biggest challenge for me was the language and clarity of the data that being processed.
Also faced some familiar challenges like Annotation Quality, Overfitting, Convergence Issues that nowadays every ML engineer knows about... By the way, for an Application developer, they were BUGS (Lol). And I solved them using "Logic" (you know what I mean).
Bottom line is - I milked that cow for 5 to 6 months. (Don't frown - that was an achievement in those days). Btw 5 to 6 months was including model creation, fine-tuning, application creation and support with bug fixes.
Turn your heads to 2025 now - and I just rebuild the same application in 5 days flat. That too without writing a single line of code. I just wrote prompts on Amazon Q... :-)
Here comes the shocker now --- I was slow in that as well, you know why AWS has already built that solution and made it open source. My GitHub has no value now.
For everyone's benefit, here is the Repository, which I ended up deploying on my account (again under my Community Builder credits and some nominal charges for bedrock).
Yes you guessed it right, Bedrock has become one stop solution for every ML engineer working with AWS.
Amazon Bedrock is a fully managed service that provides access to high-performing foundation models from leading AI companies. These models are adept at understanding and generating human-like text, making them ideal for extracting key details from invoices, such as invoice numbers, dates, and amounts. By utilizing Amazon Bedrock, businesses can automate the extraction process, reducing manual intervention and improving accuracy.
Instead of building a UI with Angular (referring to 2020 version), I just deployed the Streamlit built UI from the repository.
Streamlit is an open-source framework that enables the rapid development of interactive web applications in Python. By integrating Streamlit with Amazon Bedrock, businesses can create user-friendly interfaces for reviewing and interacting with processed invoice data. This combination allows for real-time data visualization and decision-making, enhancing operational efficiency.
- Setting Up the Environment:
- AWS Configuration: Ensure that your AWS environment is configured with the necessary permissions to access Amazon Bedrock and other related services.
- Python Environment: Set up a Python environment with the required libraries, including Streamlit and AWS SDKs.
- Storing Invoices in Amazon S3:
- Upload invoices to an Amazon S3 bucket, organizing them appropriately for batch processing.
- Processing Invoices with Amazon Bedrock:
- Utilize Amazon Bedrock's foundation models to extract key information from the invoices.
- Implement error handling and validation to ensure data accuracy.
- Developing the Streamlit Application:
- Create a Streamlit app that displays the original invoices alongside the extracted data for easy review.
- Implement features that allow users to interact with and validate the processed data.
- Deployment:
- Deploy the Streamlit application on a scalable platform such as Amazon SageMaker, Amazon EC2, or Amazon ECS to ensure accessibility and performance.
The integration of Amazon Bedrock and Streamlit represents a substantial advancement over the 2020 solution:
- Enhanced Flexibility: Amazon Bedrock's foundation models are capable of understanding and adapting to various invoice formats without the need for extensive reprogramming.
- Improved Accuracy: The AI models employed by Amazon Bedrock offer higher accuracy in data extraction compared to traditional OCR and rule-based methods.
- User-Friendly Interface: Streamlit provides an interactive platform for users to review and validate extracted data, enhancing user experience and operational efficiency.
Conclusion:
The world of AI is going to change the world. The application that took 5 months can now be completed in 5 days.
If you are thinking that AI is a hype - open your eyes and a little bit of Mind. Prepare your self, up-skill your self. Else It's going to be you, who would be changed.
Feel free to reach out to me if you want to know - how I learned ML after being an Application Developer/Architect for 14 years.