Disappointing AI: Using AWS for OCR & Celebrity Recognition

Or when AWS mistook The Witcher for a Serbian footballer.

Photo by Rock'n Roll Monkey on Unsplash

According to Forbes, 83% of enterprise workloads will be in the cloud by 2020, including 41% on public platforms such as AWS, Google Cloud Platform or Microsoft Azure. During one of my recent projects as a data scientist, I had to start getting used to cloud computing, storage and deployment. Another good step in this direction is to start experimenting with cloud services such as image recognition, character recognition and speech recognition, if only to find out how well they actually perform.

Testing OCR services during a Hackathon

Testing cloud services that offer Optical Character Recognition (OCR) was the topic of a hackathon I attended a few weeks back at the company I currently work for. It gave me the chance to test the Artificial Intelligence services of AWS, focusing primarily on OCR (Textract, Rekognition), but also on fun services such as celebrity recognition (Rekognition). Indeed, AWS Rekognition is supposed to be quite a complete service that can detect objects, recognize faces and detect text. The reality is slightly different.
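
For readers who want to try the same services, here is a minimal sketch of how they can be called from Python with boto3. It is not the code we used during the hackathon. The helper names, the 80% confidence threshold and the `eu-west-1` region are all assumptions made for illustration, and `analyze_image` needs valid AWS credentials to run.

```python
from typing import Any, Dict, List, Tuple

Detection = Tuple[str, float]  # (text or name, confidence in percent)


def rekognition_lines(response: Dict[str, Any], min_confidence: float = 80.0) -> List[Detection]:
    """Keep LINE-level detections from a Rekognition DetectText response."""
    return [
        (d["DetectedText"], d["Confidence"])
        for d in response.get("TextDetections", [])
        if d.get("Type") == "LINE" and d.get("Confidence", 0.0) >= min_confidence
    ]


def textract_lines(response: Dict[str, Any], min_confidence: float = 80.0) -> List[Detection]:
    """Keep LINE blocks from a Textract DetectDocumentText response."""
    return [
        (b["Text"], b["Confidence"])
        for b in response.get("Blocks", [])
        if b.get("BlockType") == "LINE" and b.get("Confidence", 0.0) >= min_confidence
    ]


def celebrity_names(response: Dict[str, Any]) -> List[Detection]:
    """List (name, match confidence) pairs from a RecognizeCelebrities response."""
    return [(c["Name"], c["MatchConfidence"]) for c in response.get("CelebrityFaces", [])]


def analyze_image(path: str, region: str = "eu-west-1") -> Dict[str, List[Detection]]:
    """Send one local image to Rekognition and Textract (region is an assumption)."""
    import boto3  # imported lazily so the parsing helpers work without the AWS SDK

    with open(path, "rb") as f:
        image_bytes = f.read()

    rekognition = boto3.client("rekognition", region_name=region)
    textract = boto3.client("textract", region_name=region)

    return {
        "rekognition_text": rekognition_lines(
            rekognition.detect_text(Image={"Bytes": image_bytes})
        ),
        "textract_text": textract_lines(
            textract.detect_document_text(Document={"Bytes": image_bytes})
        ),
        "celebrities": celebrity_names(
            rekognition.recognize_celebrities(Image={"Bytes": image_bytes})
        ),
    }
```

Keeping the response parsing separate from the API calls makes it easy to compare the output of Rekognition and Textract on the same image, and to lower the threshold when checking what the services return for handwriting.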

A tiny reminder about OCR

OCR is the ability of a system to convert typed, handwritten or printed text into machine-encoded text, whether it comes from a scanned document, a photo of a document, a sign or any other text displayed in a photo. It is currently a hot topic, as many “old” institutions such as government agencies, banks and insurance companies aim to digitize all their documents, including printed contracts that may contain both printed and handwritten characters. In general, OCR can be separated into two groups:

  • HCR: Handwritten Character Recognition
  • PCR: Printed Character Recognition

HCR tends to be a lot more challenging than PCR for an obvious reason: most people have horribly confusing handwriting. The difference also showed up clearly in the results we obtained.