Extracting Text from Documents and Entity Recognition | S02 E01 | Winging It
Brian and Darko use Machine Learning to sort documents

Today, Brian and Darko continue their adventure in document sorting. Yes, document sorting. Imagine this: you have hundreds, no, THOUSANDS of documents scattered across various storage systems and devices. What can you make of them? Do you know where that specific note you got from school is? Can you find that invoice you badly need? Well, what if there was a way to extract the text and comprehend what is in each of those docs? And maybe, just maybe, store it all in a central place for you to access, all of that (and more) with the power of Machine Learning. THIS is what we worked on in this stream. We extracted text from PDFs using Amazon Textract, passed it on to Amazon Comprehend, and worked out how to do all that in Python. (Thank you, CodeWhisperer ❤️)
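If you want a feel for the Textract-to-Comprehend flow before watching, here is a minimal sketch in Python. It is not the code from the stream. The function names, the `min_score` threshold of 0.8, and the choice of English as the language are all our assumptions for illustration. It uses the synchronous `detect_document_text` call, which we assume is given a single-page document as raw bytes; multi-page PDFs would need Textract's asynchronous, S3-based operations instead. Comprehend also limits how much text it accepts per call, so check the Detect Entities documentation before sending long documents.

```python
from collections import defaultdict


def extract_lines(textract_client, document_bytes):
    """Return the text of every LINE block Textract finds in the document."""
    response = textract_client.detect_document_text(
        Document={"Bytes": document_bytes}
    )
    return [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]


def find_entities(comprehend_client, text, language_code="en", min_score=0.8):
    """Group detected entity strings by type, dropping low-confidence hits.

    min_score is an arbitrary threshold chosen for this sketch.
    """
    if not text.strip():
        # Comprehend rejects empty input, so skip the call entirely.
        return {}
    response = comprehend_client.detect_entities(
        Text=text, LanguageCode=language_code
    )
    grouped = defaultdict(list)
    for entity in response["Entities"]:
        if entity["Score"] >= min_score:
            grouped[entity["Type"]].append(entity["Text"])
    return dict(grouped)


def summarize_document(textract_client, comprehend_client, document_bytes):
    """Run both steps: extract the text, then detect entities in it."""
    text = "\n".join(extract_lines(textract_client, document_bytes))
    return {"text": text, "entities": find_entities(comprehend_client, text)}


def main(path):
    """Hypothetical entry point; needs AWS credentials and a region set up."""
    import boto3  # imported here so the helpers above work without it

    with open(path, "rb") as handle:
        document_bytes = handle.read()
    result = summarize_document(
        boto3.client("textract"), boto3.client("comprehend"), document_bytes
    )
    for entity_type, values in result["entities"].items():
        print(entity_type, values)
```

Calling `main("invoice.pdf")` (a made-up file name) would print the entity types Comprehend found, such as dates or organizations, next to the matching text. Passing the clients in as arguments keeps the helpers easy to try out with stand-in objects before you spend anything on real API calls.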
If you are interested in learning all this with us, check out the recording below 👇
- Amazon CodeWhisperer
- Detect Entities API documentation
- Amazon Textract
- Amazon Comprehend
- Just - a better MAKE
🐦 Reach out to the hosts and guests: