Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sidike Paheding

Fairfield University

Evaluating the Impact of Underwater Image Enhancement on Object Detection Performance: A Comprehensive Study

Nov 26, 2024

Ali Awad, Ashraf Saleem, Sidike Paheding, Evan Lucas, Serein Al-Ratrout, Timothy C. Havens

Figure 1 for Evaluating the Impact of Underwater Image Enhancement on Object Detection Performance: A Comprehensive Study

Figure 2 for Evaluating the Impact of Underwater Image Enhancement on Object Detection Performance: A Comprehensive Study

Figure 3 for Evaluating the Impact of Underwater Image Enhancement on Object Detection Performance: A Comprehensive Study

Figure 4 for Evaluating the Impact of Underwater Image Enhancement on Object Detection Performance: A Comprehensive Study

Abstract:Underwater imagery often suffers from severe degradation that results in low visual quality and object detection performance. This work aims to evaluate state-of-the-art image enhancement models, investigate their impact on underwater object detection, and explore their potential to improve detection performance. To this end, we selected representative underwater image enhancement models covering major enhancement categories and applied them separately to two recent datasets: 1) the Real-World Underwater Object Detection Dataset (RUOD), and 2) the Challenging Underwater Plant Detection Dataset (CUPDD). Following this, we conducted qualitative and quantitative analyses on the enhanced images and developed a quality index (Q-index) to compare the quality distribution of the original and enhanced images. Subsequently, we compared the performance of several YOLO-NAS detection models that are separately trained and tested on the original and enhanced image sets. Then, we performed a correlation study to examine the relationship between enhancement metrics and detection performance. We also analyzed the inference results from the trained detectors presenting cases where enhancement increased the detection performance as well as cases where enhancement revealed missed objects by human annotators. This study suggests that although enhancement generally deteriorates the detection performance, it can still be harnessed in some cases for increased detection performance and more accurate human annotation.

Via

Access Paper or Ask Questions

Cross-view geo-localization: a survey

Jun 14, 2024

Abhilash Durgam, Sidike Paheding, Vikas Dhiman, Vijay Devabhaktuni

Abstract:Cross-view geo-localization has garnered notable attention in the realm of computer vision, spurred by the widespread availability of copious geotagged datasets and the advancements in machine learning techniques. This paper provides a thorough survey of cutting-edge methodologies, techniques, and associated challenges that are integral to this domain, with a focus on feature-based and deep learning strategies. Feature-based methods capitalize on unique features to establish correspondences across disparate viewpoints, whereas deep learning-based methodologies deploy convolutional neural networks to embed view-invariant attributes. This work also delineates the multifaceted challenges encountered in cross-view geo-localization, such as variations in viewpoints and illumination, the occurrence of occlusions, and it elucidates innovative solutions that have been formulated to tackle these issues. Furthermore, we delineate benchmark datasets and relevant evaluation metrics, and also perform a comparative analysis of state-of-the-art techniques. Finally, we conclude the paper with a discussion on prospective avenues for future research and the burgeoning applications of cross-view geo-localization in an intricately interconnected global landscape.

Via

Access Paper or Ask Questions

What Is Near?: Room Locality Learning for Enhanced Robot Vision-Language-Navigation in Indoor Living Environments

Sep 10, 2023

Muraleekrishna Gopinathan, Jumana Abu-Khalaf, David Suter, Sidike Paheding, Nathir A. Rawashdeh

Abstract:Humans use their knowledge of common house layouts obtained from previous experiences to predict nearby rooms while navigating in new environments. This greatly helps them navigate previously unseen environments and locate their target room. To provide layout prior knowledge to navigational agents based on common human living spaces, we propose WIN (\textit{W}hat \textit{I}s \textit{N}ear), a commonsense learning model for Vision Language Navigation (VLN) tasks. VLN requires an agent to traverse indoor environments based on descriptive navigational instructions. Unlike existing layout learning works, WIN predicts the local neighborhood map based on prior knowledge of living spaces and current observation, operating on an imagined global map of the entire environment. The model infers neighborhood regions based on visual cues of current observations, navigational history, and layout common sense. We show that local-global planning based on locality knowledge and predicting the indoor layout allows the agent to efficiently select the appropriate action. Specifically, we devised a cross-modal transformer that utilizes this locality prior for decision-making in addition to visual inputs and instructions. Experimental results show that locality learning using WIN provides better generalizability compared to classical VLN agents in unseen environments. Our model performs favorably on standard VLN metrics, with Success Rate 68\% and Success weighted by Path Length 63\% in unseen environments.

Via

Access Paper or Ask Questions

The Forward-Forward Algorithm as a feature extractor for skin lesion classification: A preliminary study

Jul 02, 2023

Abel Reyes-Angulo, Sidike Paheding

Abstract:Skin cancer, a deadly form of cancer, exhibits a 23\% survival rate in the USA with late diagnosis. Early detection can significantly increase the survival rate, and facilitate timely treatment. Accurate biomedical image classification is vital in medical analysis, aiding clinicians in disease diagnosis and treatment. Deep learning (DL) techniques, such as convolutional neural networks and transformers, have revolutionized clinical decision-making automation. However, computational cost and hardware constraints limit the implementation of state-of-the-art DL architectures. In this work, we explore a new type of neural network that does not need backpropagation (BP), namely the Forward-Forward Algorithm (FFA), for skin lesion classification. While FFA is claimed to use very low-power analog hardware, BP still tends to be superior in terms of classification accuracy. In addition, our experimental results suggest that the combination of FFA and BP can be a better alternative to achieve a more accurate prediction.

* This is a camera-ready version of the paper for the LXAI @ ICML'23 workshop

Via

Access Paper or Ask Questions

Forward-Forward Algorithm for Hyperspectral Image Classification: A Preliminary Study

Jul 01, 2023

Sidike Paheding, Abel A. Reyes-Angulo

Abstract:The back-propagation algorithm has long been the de-facto standard in optimizing weights and biases in neural networks, particularly in cutting-edge deep learning models. Its widespread adoption in fields like natural language processing, computer vision, and remote sensing has revolutionized automation in various tasks. The popularity of back-propagation stems from its ability to achieve outstanding performance in tasks such as classification, detection, and segmentation. Nevertheless, back-propagation is not without its limitations, encompassing sensitivity to initial conditions, vanishing gradients, overfitting, and computational complexity. The recent introduction of a forward-forward algorithm (FFA), which computes local goodness functions to optimize network parameters, alleviates the dependence on substantial computational resources and the constant need for architectural scaling. This study investigates the application of FFA for hyperspectral image classification. Experimental results and comparative analysis are provided with the use of the traditional back-propagation algorithm. Preliminary results show the potential behind FFA and its promises.

Via

Access Paper or Ask Questions

GAF-NAU: Gramian Angular Field encoded Neighborhood Attention U-Net for Pixel-Wise Hyperspectral Image Classification

Apr 21, 2022

Sidike Paheding, Abel A. Reyes, Anush Kasaragod, Thomas Oommen

Figure 1 for GAF-NAU: Gramian Angular Field encoded Neighborhood Attention U-Net for Pixel-Wise Hyperspectral Image Classification

Figure 2 for GAF-NAU: Gramian Angular Field encoded Neighborhood Attention U-Net for Pixel-Wise Hyperspectral Image Classification

Figure 3 for GAF-NAU: Gramian Angular Field encoded Neighborhood Attention U-Net for Pixel-Wise Hyperspectral Image Classification

Figure 4 for GAF-NAU: Gramian Angular Field encoded Neighborhood Attention U-Net for Pixel-Wise Hyperspectral Image Classification

Abstract:Hyperspectral image (HSI) classification is the most vibrant area of research in the hyperspectral community due to the rich spectral information contained in HSI can greatly aid in identifying objects of interest. However, inherent non-linearity between materials and the corresponding spectral profiles brings two major challenges in HSI classification: interclass similarity and intraclass variability. Many advanced deep learning methods have attempted to address these issues from the perspective of a region/patch-based approach, instead of a pixel-based alternate. However, the patch-based approaches hypothesize that neighborhood pixels of a target pixel in a fixed spatial window belong to the same class. And this assumption is not always true. To address this problem, we herein propose a new deep learning architecture, namely Gramian Angular Field encoded Neighborhood Attention U-Net (GAF-NAU), for pixel-based HSI classification. The proposed method does not require regions or patches centered around a raw target pixel to perform 2D-CNN based classification, instead, our approach transforms 1D pixel vector in HSI into 2D angular feature space using Gramian Angular Field (GAF) and then embed it to a new neighborhood attention network to suppress irrelevant angular feature while emphasizing on pertinent features useful for HSI classification task. Evaluation results on three publicly available HSI datasets demonstrate the superior performance of the proposed model.

* 8 Pages, 9 Figures

Via

Access Paper or Ask Questions