Abstract:Voice based applications are ruling over the era of automation because speech has a lot of factors that determine a speakers information as well as speech. Modern Automatic Speech Recognition (ASR) is a blessing in the field of Human-Computer Interaction (HCI) for efficient communication among humans and devices using Artificial Intelligence technology. Speech is one of the easiest mediums of communication because it has a lot of identical features for different speakers. Nowadays it is possible to determine speakers and their identity using their speech in terms of speaker recognition. In this paper, we presented a method that will provide a speakers geographical identity in a certain region using continuous Bengali speech. We consider eight different divisions of Bangladesh as the geographical region. We applied the Mel Frequency Cepstral Coefficient (MFCC) and Delta features on an Artificial Neural Network to classify speakers division. We performed some preprocessing tasks like noise reduction and 8-10 second segmentation of raw audio before feature extraction. We used our dataset of more than 45 hours of audio data from 633 individual male and female speakers. We recorded the highest accuracy of 85.44%.
Abstract:This research paper presents a unique Bengali OCR system with some capabilities. The system excels in reconstructing document layouts while preserving structure, alignment, and images. It incorporates advanced image and signature detection for accurate extraction. Specialized models for word segmentation cater to diverse document types, including computer-composed, letterpress, typewriter, and handwritten documents. The system handles static and dynamic handwritten inputs, recognizing various writing styles. Furthermore, it has the ability to recognize compound characters in Bengali. Extensive data collection efforts provide a diverse corpus, while advanced technical components optimize character and word recognition. Additional contributions include image, logo, signature and table recognition, perspective correction, layout reconstruction, and a queuing module for efficient and scalable processing. The system demonstrates outstanding performance in efficient and accurate text extraction and analysis.
Abstract:In this era, the moment has arrived to move away from disease as the primary emphasis of medical treatment. Although impressive, the multiple techniques that have been developed to detect the diseases. In this time, there are some types of diseases COVID-19, normal flue, migraine, lung disease, heart disease, kidney disease, diabetics, stomach disease, gastric, bone disease, autism are the very common diseases. In this analysis, we analyze disease symptoms and have done disease predictions based on their symptoms. We studied a range of symptoms and took a survey from people in order to complete the task. Several classification algorithms have been employed to train the model. Furthermore, performance evaluation matrices are used to measure the model's performance. Finally, we discovered that the part classifier surpasses the others.
Abstract:The term "paraphrasing" refers to the process of presenting the sense of an input text in a new way while preserving fluency. Scientific research distribution is gaining traction, allowing both rookie and experienced scientists to participate in their respective fields. As a result, there is now a massive demand for paraphrase tools that may efficiently and effectively assist scientists in modifying statements in order to avoid plagiarism. Natural Language Processing (NLP) is very much important in the realm of the process of document paraphrasing. We analyze and discuss existing studies on paraphrasing in the English language in this paper. Finally, we develop an algorithm to paraphrase any text document or paragraphs using WordNet and Natural Language Tool Kit (NLTK) and maintain "Using Synonyms" techniques to achieve our result. For 250 paragraphs, our algorithm achieved a paraphrase accuracy of 94.8%
Abstract:A deep learning model gives an incredible result for image processing by studying from the trained dataset. Spinach is a leaf vegetable that contains vitamins and nutrients. In our research, a Deep learning method has been used that can automatically identify spinach and this method has a dataset of a total of five species of spinach that contains 3785 images. Four Convolutional Neural Network (CNN) models were used to classify our spinach. These models give more accurate results for image classification. Before applying these models there is some preprocessing of the image data. For the preprocessing of data, some methods need to happen. Those are RGB conversion, filtering, resize & rescaling, and categorization. After applying these methods image data are pre-processed and ready to be used in the classifier algorithms. The accuracy of these classifiers is in between 98.68% - 99.79%. Among those models, VGG16 achieved the highest accuracy of 99.79%.
Abstract:An AI chatbot provides an impressive response after learning from the trained dataset. In this decade, most of the research work demonstrates that deep neural models superior to any other model. RNN model regularly used for determining the sequence-related problem like a question and it answers. This approach acquainted with everyone as seq2seq learning. In a seq2seq model mechanism, it has encoder and decoder. The encoder embedded any input sequence, and the decoder embedded output sequence. For reinforcing the seq2seq model performance, attention mechanism added into the encoder and decoder. After that, the transformer model has introduced itself as a high-performance model with multiple attention mechanism for solving the sequence-related dilemma. This model reduces training time compared with RNN based model and also achieved state-of-the-art performance for sequence transduction. In this research, we applied the transformer model for Bengali general knowledge chatbot based on the Bengali general knowledge Question Answer (QA) dataset. It scores 85.0 BLEU on the applied QA data. To check the comparison of the transformer model performance, we trained the seq2seq model with attention on our dataset that scores 23.5 BLEU.