Omar Mohamed

An End-to-End OCR Framework for Robust Arabic-Handwriting Recognition using a Novel Transformers-based Model and an Innovative 270 Million-Words Multi-Font Corpus of Classical Arabic with Diacritics

Aug 26, 2022

Aly Mostafa, Omar Mohamed, Ali Ashraf, Ahmed Elbehery, Salma Jamal, Anas Salah, Amr S. Ghoneim

Abstract: This research is the second phase in a series of investigations on developing an Optical Character Recognition (OCR) system for Arabic historical documents and examining how different modeling procedures interact with the problem. The first study examined the effect of Transformers on our custom-built Arabic dataset. One of its downsides was the size of the training data: due to a lack of resources, a mere 15,000 images out of our 30 million were used. In this phase, we also add an image enhancement layer, time and space optimization, and a post-correction layer to help the model predict the correct word for the given context. Notably, we propose an end-to-end text recognition approach using a Vision Transformer, namely BEIT, as the encoder and a vanilla Transformer as the decoder, eliminating CNNs for feature extraction and reducing the model's complexity. The experiments show that our end-to-end model outperforms convolutional backbones. The model attained a character error rate (CER) of 4.46%.
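
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a character error rate is commonly computed: the Levenshtein (edit) distance between the recognized text and the reference, divided by the reference length. The function names `levenshtein` and `cer` are illustrative, and nothing here is taken from the authors' evaluation code; in particular, whether the paper counts diacritics or normalizes text before scoring is not stated in the abstract.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn hyp into ref."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return levenshtein(ref, hyp) / len(ref)
```

Because Python strings are sequences of Unicode code points, Arabic diacritics stored as combining marks count as separate characters in this sketch, so a missed diacritic is scored as one error.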

Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset

Oct 09, 2021

Omar Mohamed, Salah A. Aly

Abstract: Recently, there have been tremendous research outcomes in the fields of speech recognition and natural language processing. This is due to well-developed multi-layer deep learning paradigms such as wav2vec2.0, Wav2vecU, WavBERT, and HuBERT, which provide better representation learning and capture more information. Such paradigms are first run on hundreds of unlabeled data and are then fine-tuned on a small dataset for specific tasks. This paper introduces a deep-learning-based emotion recognition model for Arabic speech dialogues. The developed model employs state-of-the-art audio representations, including wav2vec2.0 and HuBERT. The experimental and performance results of our model surpass previously known results.

* 6 pages, 6 figures

Autonomous Navigation in Dynamic Environments: Deep Learning-Based Approach

Feb 03, 2021

Omar Mohamed, Zeyad Mohsen, Mohamed Wageeh, Mohamed Hegazy

Abstract: Mobile robotics is a research area that has witnessed incredible advances over the last decades. Robot navigation is an essential task for mobile robots, and many methods have been proposed to allow robots to navigate within different environments. This thesis studies different deep learning-based approaches, highlighting the advantages and disadvantages of each scheme. These approaches are so promising that some of them can navigate a robot in unknown and dynamic environments. In this thesis, one of the deep learning methods, based on a convolutional neural network (CNN), is implemented in software. Several preparatory studies were needed to complete this thesis, such as an introduction to Linux, the Robot Operating System (ROS), C++, Python, and the Gazebo simulator. Within this work, we modified the drone network (namely, DroNet) approach so that it can be used in an indoor environment with a ground robot in different cases. The DroNet approach suffers from the absence of goal-oriented motion. Therefore, this thesis mainly focuses on tackling this problem via mapping using simultaneous localization and mapping (SLAM) and path planning using Dijkstra's algorithm. The combination of the ground-robot-based DroNet, mapping, and path planning then leads to goal-oriented motion that follows the shortest path while avoiding dynamic obstacles. Finally, we propose a low-cost approach for indoor applications such as restaurants and museums, based on using a monocular camera instead of a laser scanner.

* BSc Degree, Graduation Project Thesis, Institute of Aviation Engineering & Technology, Egypt, 2020
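
The path-planning step mentioned in the abstract can be illustrated with a minimal sketch of Dijkstra's algorithm on an occupancy grid, the kind of map a SLAM stage might produce. Everything here is an assumption made for illustration: the name `dijkstra_grid`, the 0 = free / 1 = occupied encoding, 4-connected moves, and a unit cost per move. It is not the thesis implementation, which the abstract does not describe.

```python
import heapq


def dijkstra_grid(grid, start, goal):
    """Shortest 4-connected path on an occupancy grid.

    grid  : list of rows, 0 = free cell, 1 = occupied cell (assumed encoding)
    start : (row, col) tuple, assumed to be a free cell
    goal  : (row, col) tuple
    Returns the path as a list of cells from start to goal, or None
    if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}      # best known cost to each visited cell
    parent = {}            # back-pointers used to rebuild the path
    heap = [(0, start)]    # priority queue ordered by cost

    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist[cell]:
            continue       # stale queue entry, a cheaper route was found
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + 1  # unit cost per move (assumption)
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))

    if goal not in dist:
        return None

    # Walk the back-pointers from goal to start, then reverse.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path
```

With a dynamic obstacle, one simple option is to mark the newly occupied cells in the grid and re-run the planner from the robot's current cell; whether the thesis replans this way is not stated in the abstract.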
