Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

M. Sohel Rahman

Entropy-Driven Genetic Optimization for Deep-Feature-Guided Low-Light Image Enhancement

May 16, 2025

Nirjhor Datta, Afroza Akther, M. Sohel Rahman

Abstract:Image enhancement methods often prioritize pixel level information, overlooking the semantic features. We propose a novel, unsupervised, fuzzy-inspired image enhancement framework guided by NSGA-II algorithm that optimizes image brightness, contrast, and gamma parameters to achieve a balance between visual quality and semantic fidelity. Central to our proposed method is the use of a pre trained deep neural network as a feature extractor. To find the best enhancement settings, we use a GPU-accelerated NSGA-II algorithm that balances multiple objectives, namely, increasing image entropy, improving perceptual similarity, and maintaining appropriate brightness. We further improve the results by applying a local search phase to fine-tune the top candidates from the genetic algorithm. Our approach operates entirely without paired training data making it broadly applicable across domains with limited or noisy labels. Quantitatively, our model achieves excellent performance with average BRISQUE and NIQE scores of 19.82 and 3.652, respectively, in all unpaired datasets. Qualitatively, enhanced images by our model exhibit significantly improved visibility in shadowed regions, natural balance of contrast and also preserve the richer fine detail without introducing noticable artifacts. This work opens new directions for unsupervised image enhancement where semantic consistency is critical.

Via

Access Paper or Ask Questions

GroundHog: Revolutionizing GLDAS Groundwater Storage Downscaling for Enhanced Recharge Estimation in Bangladesh

Mar 28, 2025

Saleh Sakib Ahmed, Rashed Uz Zzaman, Saifur Rahman Jony, Faizur Rahman Himel, Afroza Sharmin, A. H. M. Khalequr Rahman, M. Sohel Rahman, Sara Nowreen

Abstract:Long-term groundwater level (GWL) measurement is vital for effective policymaking and recharge estimation using annual maxima and minima. However, current methods prioritize short-term predictions and lack multi-year applicability, limiting their utility. Moreover, sparse in-situ measurements lead to reliance on low-resolution satellite data like GLDAS as the ground truth for Machine Learning models, further constraining accuracy. To overcome these challenges, we first develop an ML model to mitigate data gaps, achieving $R^2$ scores of 0.855 and 0.963 for maximum and minimum GWL predictions, respectively. Subsequently, using these predictions and well observations as ground truth, we train an Upsampling Model that uses low-resolution (25 km) GLDAS data as input to produce high-resolution (2 km) GWLs, achieving an excellent $R^2$ score of 0.96. Our approach successfully upscales GLDAS data for 2003-2024, allowing high-resolution recharge estimations and revealing critical trends for proactive resource management. Our method allows upsampling of groundwater storage (GWS) from GLDAS to high-resolution GWLs for any points independently of officially curated piezometer data, making it a valuable tool for decision-making.

Via

Access Paper or Ask Questions

Ophthalmic Biomarker Detection Using Ensembled Vision Transformers -- Winning Solution to IEEE SPS VIP Cup 2023

Oct 21, 2023

H. A. Z. Sameen Shahgir, Khondker Salman Sayeed, Tanjeem Azwad Zaman, Md. Asif Haider, Sheikh Saifur Rahman Jony, M. Sohel Rahman

Figure 1 for Ophthalmic Biomarker Detection Using Ensembled Vision Transformers -- Winning Solution to IEEE SPS VIP Cup 2023

Figure 2 for Ophthalmic Biomarker Detection Using Ensembled Vision Transformers -- Winning Solution to IEEE SPS VIP Cup 2023

Figure 3 for Ophthalmic Biomarker Detection Using Ensembled Vision Transformers -- Winning Solution to IEEE SPS VIP Cup 2023

Figure 4 for Ophthalmic Biomarker Detection Using Ensembled Vision Transformers -- Winning Solution to IEEE SPS VIP Cup 2023

Abstract:This report outlines our approach in the IEEE SPS VIP Cup 2023: Ophthalmic Biomarker Detection competition. Our primary objective in this competition was to identify biomarkers from Optical Coherence Tomography (OCT) images obtained from a diverse range of patients. Using robust augmentations and 5-fold cross-validation, we trained two vision transformer-based models: MaxViT and EVA-02, and ensembled them at inference time. We find MaxViT's use of convolution layers followed by strided attention to be better suited for the detection of local features while EVA-02's use of normal attention mechanism and knowledge distillation is better for detecting global features. Ours was the best-performing solution in the competition, achieving a patient-wise F1 score of 0.814 in the first phase and 0.8527 in the second and final phase of VIP Cup 2023, scoring 3.8% higher than the next-best solution.

Via

Access Paper or Ask Questions

Image Contrast Enhancement using Fuzzy Technique with Parameter Determination using Metaheuristics

Jan 30, 2023

Mohimenul Kabir, Jaiaid Mobin, Ahmad Hassanat, M. Sohel Rahman

Abstract:In this work, we have presented a way to increase the contrast of an image. Our target is to find a transformation that will be image specific. We have used a fuzzy system as our transformation function. To tune the system according to an image, we have used Genetic Algorithm and Hill Climbing in multiple ways to evolve the fuzzy system and conducted several experiments. Different variants of the method are tested on several images and two variants that are superior to others in terms of fitness are selected. We have also conducted a survey to assess the visual improvement of the enhancements made by the two variants. The survey indicates that one of the methods can enhance the contrast of the images visually.

* 14 pages, 7 figures, Image Processing, Computer Vision, Evolutionary Computation

Via

Access Paper or Ask Questions

RamanNet: A generalized neural network architecture for Raman Spectrum Analysis

Jan 20, 2022

Nabil Ibtehaz, Muhammad E. H. Chowdhury, Amith Khandakar, Susu M. Zughaier, Serkan Kiranyaz, M. Sohel Rahman

Figure 1 for RamanNet: A generalized neural network architecture for Raman Spectrum Analysis

Figure 2 for RamanNet: A generalized neural network architecture for Raman Spectrum Analysis

Figure 3 for RamanNet: A generalized neural network architecture for Raman Spectrum Analysis

Figure 4 for RamanNet: A generalized neural network architecture for Raman Spectrum Analysis

Abstract:Raman spectroscopy provides a vibrational profile of the molecules and thus can be used to uniquely identify different kind of materials. This sort of fingerprinting molecules has thus led to widespread application of Raman spectrum in various fields like medical dignostics, forensics, mineralogy, bacteriology and virology etc. Despite the recent rise in Raman spectra data volume, there has not been any significant effort in developing generalized machine learning methods for Raman spectra analysis. We examine, experiment and evaluate existing methods and conjecture that neither current sequential models nor traditional machine learning models are satisfactorily sufficient to analyze Raman spectra. Both has their perks and pitfalls, therefore we attempt to mix the best of both worlds and propose a novel network architecture RamanNet. RamanNet is immune to invariance property in CNN and at the same time better than traditional machine learning models for the inclusion of sparse connectivity. Our experiments on 4 public datasets demonstrate superior performance over the much complex state-of-the-art methods and thus RamanNet has the potential to become the defacto standard in Raman spectra data analysis

Via

Access Paper or Ask Questions

Data transformation based optimized customer churn prediction model for the telecommunication industry

Jan 11, 2022

Joydeb Kumar Sana, Mohammad Zoynul Abedin, M. Sohel Rahman, M. Saifur Rahman

Figure 1 for Data transformation based optimized customer churn prediction model for the telecommunication industry

Figure 2 for Data transformation based optimized customer churn prediction model for the telecommunication industry

Figure 3 for Data transformation based optimized customer churn prediction model for the telecommunication industry

Figure 4 for Data transformation based optimized customer churn prediction model for the telecommunication industry

Abstract:Data transformation (DT) is a process that transfers the original data into a form which supports a particular classification algorithm and helps to analyze the data for a special purpose. To improve the prediction performance we investigated various data transform methods. This study is conducted in a customer churn prediction (CCP) context in the telecommunication industry (TCI), where customer attrition is a common phenomenon. We have proposed a novel approach of combining data transformation methods with the machine learning models for the CCP problem. We conducted our experiments on publicly available TCI datasets and assessed the performance in terms of the widely used evaluation measures (e.g. AUC, precision, recall, and F-measure). In this study, we presented comprehensive comparisons to affirm the effect of the transformation methods. The comparison results and statistical test proved that most of the proposed data transformation based optimized models improve the performance of CCP significantly. Overall, an efficient and optimized CCP model for the telecommunication industry has been presented through this manuscript.

* 24 pages

Via

Access Paper or Ask Questions

Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Jan 03, 2022

Parnian Afshar, Arash Mohammadi, Konstantinos N. Plataniotis, Keyvan Farahani, Justin Kirby, Anastasia Oikonomou, Amir Asif, Leonard Wee, Andre Dekker, Xin Wu(+15 more)

Figure 1 for Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Figure 2 for Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Figure 3 for Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Figure 4 for Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Abstract:Lung cancer is one of the deadliest cancers, and in part its effective diagnosis and treatment depend on the accurate delineation of the tumor. Human-centered segmentation, which is currently the most common approach, is subject to inter-observer variability, and is also time-consuming, considering the fact that only experts are capable of providing annotations. Automatic and semi-automatic tumor segmentation methods have recently shown promising results. However, as different researchers have validated their algorithms using various datasets and performance metrics, reliably evaluating these methods is still an open challenge. The goal of the Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark created through 2018 IEEE Video and Image Processing (VIP) Cup competition, is to provide a unique dataset and pre-defined metrics, so that different researchers can develop and evaluate their methods in a unified fashion. The 2018 VIP Cup started with a global engagement from 42 countries to access the competition data. At the registration stage, there were 129 members clustered into 28 teams from 10 countries, out of which 9 teams made it to the final stage and 6 teams successfully completed all the required tasks. In a nutshell, all the algorithms proposed during the competition, are based on deep learning models combined with a false positive reduction technique. Methods developed by the three finalists show promising results in tumor segmentation, however, more effort should be put into reducing the false positive rate. This competition manuscript presents an overview of the VIP-Cup challenge, along with the proposed algorithms and results.

Via

Access Paper or Ask Questions

Segmentation of Lung Tumor from CT Images using Deep Supervision

Nov 17, 2021

Farhanaz Farheen, Md. Salman Shamil, Nabil Ibtehaz, M. Sohel Rahman

Figure 1 for Segmentation of Lung Tumor from CT Images using Deep Supervision

Figure 2 for Segmentation of Lung Tumor from CT Images using Deep Supervision

Figure 3 for Segmentation of Lung Tumor from CT Images using Deep Supervision

Figure 4 for Segmentation of Lung Tumor from CT Images using Deep Supervision

Abstract:Lung cancer is a leading cause of death in most countries of the world. Since prompt diagnosis of tumors can allow oncologists to discern their nature, type and the mode of treatment, tumor detection and segmentation from CT Scan images is a crucial field of study worldwide. This paper approaches lung tumor segmentation by applying two-dimensional discrete wavelet transform (DWT) on the LOTUS dataset for more meticulous texture analysis whilst integrating information from neighboring CT slices before feeding them to a Deeply Supervised MultiResUNet model. Variations in learning rates, decay and optimization algorithms while training the network have led to different dice co-efficients, the detailed statistics of which have been included in this paper. We also discuss the challenges in this dataset and how we opted to overcome them. In essence, this study aims to maximize the success rate of predicting tumor regions from two dimensional CT Scan slices by experimenting with a number of adequate networks, resulting in a dice co-efficient of 0.8472.

Via

Access Paper or Ask Questions

A Shallow U-Net Architecture for Reliably Predicting Blood Pressure from Photoplethysmogram and Electrocardiogram Signals

Nov 12, 2021

Sakib Mahmud, Nabil Ibtehaz, Amith Khandakar, Anas Tahir, Tawsifur Rahman, Khandaker Reajul Islam, Md Shafayet Hossain, M. Sohel Rahman, Mohammad Tariqul Islam, Muhammad E. H. Chowdhury

Figure 1 for A Shallow U-Net Architecture for Reliably Predicting Blood Pressure from Photoplethysmogram and Electrocardiogram Signals

Figure 2 for A Shallow U-Net Architecture for Reliably Predicting Blood Pressure from Photoplethysmogram and Electrocardiogram Signals

Figure 3 for A Shallow U-Net Architecture for Reliably Predicting Blood Pressure from Photoplethysmogram and Electrocardiogram Signals

Figure 4 for A Shallow U-Net Architecture for Reliably Predicting Blood Pressure from Photoplethysmogram and Electrocardiogram Signals

Abstract:Cardiovascular diseases are the most common causes of death around the world. To detect and treat heart-related diseases, continuous Blood Pressure (BP) monitoring along with many other parameters are required. Several invasive and non-invasive methods have been developed for this purpose. Most existing methods used in the hospitals for continuous monitoring of BP are invasive. On the contrary, cuff-based BP monitoring methods, which can predict Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP), cannot be used for continuous monitoring. Several studies attempted to predict BP from non-invasively collectible signals such as Photoplethysmogram (PPG) and Electrocardiogram (ECG), which can be used for continuous monitoring. In this study, we explored the applicability of autoencoders in predicting BP from PPG and ECG signals. The investigation was carried out on 12,000 instances of 942 patients of the MIMIC-II dataset and it was found that a very shallow, one-dimensional autoencoder can extract the relevant features to predict the SBP and DBP with the state-of-the-art performance on a very large dataset. Independent test set from a portion of the MIMIC-II dataset provides an MAE of 2.333 and 0.713 for SBP and DBP, respectively. On an external dataset of forty subjects, the model trained on the MIMIC-II dataset, provides an MAE of 2.728 and 1.166 for SBP and DBP, respectively. For both the cases, the results met British Hypertension Society (BHS) Grade A and surpassed the studies from the current literature.

* 22 pages, Figure 8, Table 13

Via

Access Paper or Ask Questions

XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages

Jun 25, 2021

Tahmid Hasan, Abhik Bhattacharjee, Md Saiful Islam, Kazi Samin, Yuan-Fang Li, Yong-Bin Kang, M. Sohel Rahman, Rifat Shahriyar

Figure 1 for XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages

Figure 2 for XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages

Figure 3 for XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages

Figure 4 for XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages

Abstract:Contemporary works on abstractive text summarization have focused primarily on high-resource languages like English, mostly due to the limited availability of datasets for low/mid-resource ones. In this work, we present XL-Sum, a comprehensive and diverse dataset comprising 1 million professionally annotated article-summary pairs from BBC, extracted using a set of carefully designed heuristics. The dataset covers 44 languages ranging from low to high-resource, for many of which no public dataset is currently available. XL-Sum is highly abstractive, concise, and of high quality, as indicated by human and intrinsic evaluation. We fine-tune mT5, a state-of-the-art pretrained multilingual model, with XL-Sum and experiment on multilingual and low-resource summarization tasks. XL-Sum induces competitive results compared to the ones obtained using similar monolingual datasets: we show higher than 11 ROUGE-2 scores on 10 languages we benchmark on, with some of them exceeding 15, as obtained by multilingual training. Additionally, training on low-resource languages individually also provides competitive performance. To the best of our knowledge, XL-Sum is the largest abstractive summarization dataset in terms of the number of samples collected from a single source and the number of languages covered. We are releasing our dataset and models to encourage future research on multilingual abstractive summarization. The resources can be found at \url{https://github.com/csebuetnlp/xl-sum}.

* Findings of the Association for Computational Linguistics, ACL 2021 (camera-ready)

Via

Access Paper or Ask Questions