
Extracting Text from documents and Entity recognition | S02 E01 | Winging It

Brian and Darko use Machine Learning to sort documents

Darko Mesaros
Amazon Employee
Brian Ketelsen
Amazon Employee
Published Aug 16, 2023

Screenshot of the code generated by Amazon CodeWhisperer

Today, Brian and Darko continue their adventure in document sorting. Yes, document sorting. Imagine this: you have hundreds, NO, thousands of documents scattered across various storage systems and devices. What can you make of them? Do you know where that specific note you got from school is? Can you find that invoice you badly need? Well, what if there was a way to extract the text and comprehend what is in each of those docs? And maybe, just maybe, store it all in a central place for you to access, all that (and more) with the power of Machine Learning. THIS is what we worked on in this stream. We extracted text from PDFs using Amazon Textract, passed it on to Amazon Comprehend, and worked out how to do all that in Python. (Thank you, CodeWhisperer ❤️)
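If you want a feel for the flow before watching, here is a minimal sketch of the Textract-to-Comprehend pipeline described above. It is not the code from the stream. The function names (`extract_lines`, `find_entities`, `describe_document`) and the `min_score` threshold are our own placeholders, and it assumes a single-page document small enough to send as raw bytes to the synchronous `detect_document_text` call, and text short enough for a single `detect_entities` call.

```python
from collections import defaultdict


def extract_lines(textract_client, document_bytes):
    """Return the lines of text Textract finds in a document.

    Assumes a single-page document passed as raw bytes; larger,
    multi-page PDFs would need Textract's asynchronous API instead.
    """
    response = textract_client.detect_document_text(
        Document={"Bytes": document_bytes}
    )
    # Textract returns PAGE, LINE and WORD blocks; keep only the lines.
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def find_entities(comprehend_client, text, language_code="en", min_score=0.8):
    """Group the entities Comprehend detects by type (DATE, PERSON, ...).

    min_score is a hypothetical cutoff chosen for this sketch.
    """
    if not text.strip():
        # Comprehend rejects empty input, so skip the call entirely.
        return {}
    response = comprehend_client.detect_entities(
        Text=text, LanguageCode=language_code
    )
    grouped = defaultdict(list)
    for entity in response.get("Entities", []):
        if entity["Score"] >= min_score:
            grouped[entity["Type"]].append(entity["Text"])
    return dict(grouped)


def describe_document(textract_client, comprehend_client, document_bytes):
    """Extract the text of a document, then find the entities in it."""
    text = "\n".join(extract_lines(textract_client, document_bytes))
    return {
        "text": text,
        "entities": find_entities(comprehend_client, text),
    }
```

With boto3 installed and AWS credentials configured, you would call `describe_document(boto3.client("textract"), boto3.client("comprehend"), data)`, where `data` is the content of a file opened in binary mode. Passing the clients in as arguments keeps the functions easy to try out with stand-in objects before spending anything on real API calls.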

If you are interested in learning all this with us, check out the recording below 👇

🐦 Reach out to the hosts and guests: