Abstract:Millions of visually impaired people depend on relatives and friends to perform their everyday tasks. One relevant step towards self-sufficiency is to provide them with means to verify the value and operation presented in payment machines. In this work, we developed and released a smartphone application, named Pay Voice, that uses image processing, optical character recognition (OCR) and voice synthesis to recognize the value and operation presented in POS and PIN pad machines, and thus informing the user with auditive and visual feedback. The proposed approach presented significant results for value and operation recognition, especially for POS, due to the higher display quality. Importantly, we achieved the key performance indicators, namely, more than 80% of accuracy in a real-world scenario, and less than $5$ seconds of processing time for recognition. Pay Voice is publicly available on Google Play and App Store for free.
Abstract:Cross-domain biometrics has been emerging as a new necessity, which poses several additional challenges, including harsh illumination changes, noise, pose variation, among others. In this paper, we explore approaches to cross-domain face verification, comparing self-portrait photographs ("selfies") to ID documents. We approach the problem with proper image photometric adjustment and data standardization techniques, along with deep learning methods to extract the most prominent features from the data, reducing the effects of domain shift in this problem. We validate the methods using a novel dataset comprising 50 individuals. The obtained results are promising and indicate that the adopted path is worth further investigation.