Abstract:We introduce KANICE (Kolmogorov-Arnold Networks with Interactive Convolutional Elements), a novel neural architecture that combines Convolutional Neural Networks (CNNs) with Kolmogorov-Arnold Network (KAN) principles. KANICE integrates Interactive Convolutional Blocks (ICBs) and KAN linear layers into a CNN framework. This leverages KANs' universal approximation capabilities and ICBs' adaptive feature learning. KANICE captures complex, non-linear data relationships while enabling dynamic, context-dependent feature extraction based on the Kolmogorov-Arnold representation theorem. We evaluated KANICE on four datasets: MNIST, Fashion-MNIST, EMNIST, and SVHN, comparing it against standard CNNs, CNN-KAN hybrids, and ICB variants. KANICE consistently outperformed baseline models, achieving 99.35% accuracy on MNIST and 90.05% on the SVHN dataset. Furthermore, we introduce KANICE-mini, a compact variant designed for efficiency. A comprehensive ablation study demonstrates that KANICE-mini achieves comparable performance to KANICE with significantly fewer parameters. KANICE-mini reached 90.00% accuracy on SVHN with 2,337,828 parameters, compared to KANICE's 25,432,000. This study highlights the potential of KAN-based architectures in balancing performance and computational efficiency in image classification tasks. Our work contributes to research in adaptive neural networks, integrates mathematical theorems into deep learning architectures, and explores the trade-offs between model complexity and performance, advancing computer vision and pattern recognition. The source code for this paper is publicly accessible through our GitHub repository (https://github.com/m-ferdaus/kanice).
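The efficiency trade-off quoted above can be made concrete with a short worked calculation. It uses only the SVHN accuracies and parameter counts stated in the abstract; no other figures are assumed.

$$
\frac{2{,}337{,}828}{25{,}432{,}000} \approx 0.092,
\qquad
\frac{25{,}432{,}000}{2{,}337{,}828} \approx 10.9,
\qquad
90.05\% - 90.00\% = 0.05 \text{ percentage points}
$$

That is, KANICE-mini uses roughly 9.2% of KANICE's parameters (about an 11-fold reduction) while giving up 0.05 percentage points of SVHN accuracy.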
Abstract:Imbalanced datasets are a significant challenge in real-world scenarios. They lead to models that underperform on underrepresented classes, which is a critical issue in infrastructure inspection. This paper introduces the Enhanced Feature Pyramid Network (E-FPN), a deep learning model for the semantic segmentation of culverts and sewer pipes within imbalanced datasets. The E-FPN incorporates architectural innovations like sparsely connected blocks and depth-wise separable convolutions to improve feature extraction and handle object variations. To address dataset imbalance, the model employs strategies like class decomposition and data augmentation. Experimental results on the culvert-sewer defects dataset and a benchmark aerial semantic segmentation drone dataset show that the E-FPN outperforms state-of-the-art methods, achieving an average Intersection over Union (IoU) improvement of 13.8% and 27.2%, respectively. Additionally, class decomposition and data augmentation together boost the model's performance by approximately 6.9% IoU. The proposed E-FPN presents a promising solution for enhancing object segmentation in challenging, multi-class real-world datasets, with potential applications extending beyond culvert-sewer defect detection.
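Since the results above are reported as average Intersection over Union (IoU), a minimal sketch of how a mean IoU can be computed from flat label sequences may help. The function names and the toy labels are illustrative assumptions, not taken from the paper, and the paper's exact averaging convention is not stated in the abstract.

```python
def per_class_iou(y_true, y_pred, num_classes):
    """Return one IoU per class; None when a class is absent from both inputs."""
    ious = []
    for c in range(num_classes):
        # Pixels where both the label and the prediction are class c.
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # Pixels where either the label or the prediction is class c.
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        ious.append(inter / union if union else None)
    return ious


def mean_iou(y_true, y_pred, num_classes):
    """Average IoU over the classes that appear in labels or predictions."""
    vals = [v for v in per_class_iou(y_true, y_pred, num_classes) if v is not None]
    return sum(vals) / len(vals)


# Toy example (hypothetical labels): two classes, four pixels.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
print(per_class_iou(y_true, y_pred, 2))  # class 0: 1/2, class 1: 2/3
print(mean_iou(y_true, y_pred, 2))
```

Averaging per class, rather than over pixels, keeps a rare class from being swamped by a frequent one, which is why IoU is a common choice for imbalanced segmentation data.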
Abstract:This paper introduces Semantic Haar-Adaptive Refined Pyramid Network (SHARP-Net), a novel architecture for semantic segmentation. SHARP-Net integrates a bottom-up pathway featuring Inception-like blocks with varying filter sizes ($3\times3$ and $5\times5$), parallel max-pooling, and additional spatial detection layers. This design captures multi-scale features and fine structural details. Throughout the network, depth-wise separable convolutions are used to reduce complexity. The top-down pathway of SHARP-Net focuses on generating high-resolution features through upsampling and information fusion using $1\times1$ and $3\times3$ depth-wise separable convolutions. We evaluated our model on our challenging Culvert-Sewer Defects dataset and the benchmark DeepGlobe Land Cover dataset. Our experimental evaluation demonstrated the effectiveness of the base model (excluding Haar-like features) in handling irregular defect shapes, occlusions, and class imbalances. It outperformed state-of-the-art methods, including U-Net, CBAM U-Net, ASCU-Net, FPN, and SegFormer, achieving average improvements of 14.4% and 12.1% on the Culvert-Sewer Defects and DeepGlobe Land Cover datasets, respectively, with IoU scores of 77.2% and 70.6%. Additionally, the training time was reduced. Furthermore, the integration of carefully selected and fine-tuned Haar-like features enhanced the performance of deep learning models by at least 20%. The proposed SHARP-Net, incorporating Haar-like features, achieved an impressive IoU of 94.75%, representing a 22.74% improvement over the base model. These features were also applied to other deep learning models, showing a 35.0% improvement, proving their versatility and effectiveness. SHARP-Net thus provides a powerful and efficient solution for accurate semantic segmentation in challenging real-world scenarios.
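To illustrate why depth-wise separable convolutions reduce complexity, the sketch below compares weight counts for a standard convolution and its depth-wise separable counterpart. The channel sizes (64 in, 128 out) are hypothetical and are not taken from SHARP-Net; bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depth-wise k x k convolution followed by a 1 x 1 point-wise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)
sep = separable_conv_params(3, 64, 128)
print(std, sep)  # 73728 8768
print(round(std / sep, 2))
```

For this assumed layer the separable form needs roughly one eighth of the weights, and the saving grows with the kernel size and the number of output channels.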
Abstract:Spatiotemporal networks' observational capabilities are crucial for accurate data gathering and informed decisions across multiple sectors. This study focuses on the Spatiotemporal Ranged Observer-Observable Bipartite Network (STROOBnet), linking observational nodes (e.g., surveillance cameras) to events within defined geographical regions, enabling efficient monitoring. Using data from Real-Time Crime Camera (RTCC) systems and Calls for Service (CFS) in New Orleans, where RTCC combats rising crime amidst reduced police presence, we address the network's initial observational imbalances. Aiming for uniform observational efficacy, we propose the Proximal Recurrence approach. It outperformed traditional clustering methods like k-means and DBSCAN by offering holistic event frequency and spatial consideration, enhancing observational coverage.
Abstract:The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation, particularly on multiclass imbalanced datasets with limited samples. DAU-FI Net integrates multiscale spatial-channel attention mechanisms and feature injection to enhance precision in object localization. The core employs a multiscale depth-separable convolution block, capturing localized patterns across scales. This block is complemented by a spatial-channel squeeze and excitation (scSE) attention unit, modeling inter-dependencies between channels and spatial regions in feature maps. Additionally, additive attention gates refine segmentation by connecting encoder-decoder pathways. To augment the model, engineered features (Gabor filters for textural analysis, and Sobel and Canny filters for edge detection) are injected, guided by semantic masks, to expand the feature space strategically. Comprehensive experiments on a challenging sewer pipe and culvert defect dataset and a benchmark dataset validate DAU-FI Net's capabilities. Ablation studies highlight incremental benefits from attention blocks and feature injection. DAU-FI Net achieves state-of-the-art mean Intersection over Union (IoU) of 95.6% and 98.8% on the defect test set and the benchmark, respectively, surpassing prior methods by 8.9% and 12.6%. The proposed architecture provides a robust solution, advancing semantic segmentation for multiclass problems with limited training data. Our sewer-culvert defects dataset, featuring pixel-level annotations, opens avenues for further research in this crucial domain. Overall, this work delivers key innovations in architecture, attention, and feature engineering to elevate semantic segmentation efficacy.
Abstract:In this paper, we propose a Light Gradient Boosting Machine (LightGBM) model to forecast dominant wave periods in oceanic waters. First, we use data collected from CDIP buoys and apply various data filtering methods, which yield a high-quality dataset for training and validation. We then extract wave-based features such as wave heights, periods, skewness, and kurtosis, along with atmospheric features such as humidity, pressure, and air temperature for the buoys. Afterward, we train LightGBM and Extra Trees (ET) models through an hv-block cross-validation scheme to forecast dominant wave periods up to 30 days ahead. On the training dataset, LightGBM has R2 scores of 0.94, 0.94, and 0.94 for 1-day, 15-day, and 30-day ahead prediction, while ET has R2 scores of 0.88, 0.86, and 0.85. On the test dataset, LightGBM again has R2 scores of 0.94, 0.94, and 0.94, and ET has R2 scores of 0.88, 0.86, and 0.85, for the same horizons. The similar R2 scores on the training and test datasets suggest that the machine learning models developed in this paper are robust. Since LightGBM outperforms ET for all the windows tested, it is taken as the final algorithm. Note that the performance of both methods does not decrease significantly as the forecast horizon increases. The proposed method also outperforms the numerical approaches considered in this paper on the test dataset. For 1-day ahead prediction, the proposed algorithm has Scatter Index (SI), Bias, Correlation Coefficient (CC), and Root Mean Squared Error (RMSE) of 0.09, 0.00, 0.97, and 1.78, compared to 0.268, 0.40, 0.63, and 2.18 for the European Centre for Medium-Range Weather Forecasts (ECMWF) model, the best-performing of the other methods on the test dataset.
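The hv-block cross-validation scheme mentioned above can be sketched as follows. This is a simplified variant under stated assumptions: contiguous validation blocks, with `h` observations dropped on each side of the block so that serially correlated neighbours do not leak into training. The function name, the fold layout, and the value of `h` are hypothetical; the abstract does not give the settings used in the paper.

```python
def hv_block_splits(n_samples, n_folds, h):
    """Yield (train_indices, val_indices) for a simplified hv-block scheme.

    Each validation block is a contiguous run of time-ordered samples.
    The h samples immediately before and after the block are excluded
    from training to limit leakage through temporal dependence.
    """
    fold_size = n_samples // n_folds
    for k in range(n_folds):
        start = k * fold_size
        # The last fold absorbs any remainder.
        stop = n_samples if k == n_folds - 1 else start + fold_size
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start - h or i >= stop + h]
        yield train, val


# Toy example: 10 time steps, 5 folds, a gap of 1 sample on each side.
for train, val in hv_block_splits(10, 5, 1):
    print(val, train)
```

Compared with a shuffled k-fold split, this keeps each validation block separated in time from the training data, which matters for buoy records where adjacent samples are strongly correlated.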
Abstract:This paper proposes a machine learning method based on the Extra Trees (ET) algorithm for forecasting significant wave heights in oceanic waters. To derive multiple features from the CDIP buoys, which make point measurements, we first nowcast various parameters and then forecast them at 30-min intervals. The proposed algorithm has Scatter Index (SI), Bias, Correlation Coefficient (CC), and Root Mean Squared Error (RMSE) of 0.130, -0.002, 0.97, and 0.14, respectively, for one-day ahead prediction and 0.110, -0.001, 0.98, and 0.122, respectively, for 14-day ahead prediction on the testing dataset. While other state-of-the-art methods can only forecast up to 120 hours ahead, we extend the horizon to 14 days. This 14-day limit is not an inherent forecasting limit; it arises from our experimental setup. Our proposed setup includes spectral features, hv-block cross-validation, and stringent quality-control (QC) criteria. The proposed algorithm performs significantly better than the state-of-the-art methods commonly used for significant wave height forecasting for one-day ahead prediction. Moreover, the improved performance of the proposed machine learning method compared to the numerical methods shows that this performance can be extended to even longer time periods, allowing for early prediction of significant wave heights in oceanic waters.
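The four error metrics reported above can be computed as in the sketch below. The definitions are common ones and are assumptions here: Bias as the mean of prediction minus observation, SI as RMSE divided by the mean observation, and CC as the Pearson correlation. Definitions of SI in particular vary across the wave-forecasting literature, and the abstract does not state which one the paper uses. The sample values are made up.

```python
import math


def forecast_metrics(obs, pred):
    """Return (SI, Bias, CC, RMSE) for paired observations and predictions."""
    n = len(obs)
    errors = [p - o for o, p in zip(obs, pred)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    # Scatter Index: RMSE normalised by the mean observation (assumed definition).
    si = rmse / mean_obs
    # Pearson correlation coefficient, written out to avoid version-specific helpers.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_o = sum((o - mean_obs) ** 2 for o in obs)
    var_p = sum((p - mean_pred) ** 2 for p in pred)
    cc = cov / math.sqrt(var_o * var_p)
    return si, bias, cc, rmse


# Made-up wave heights in metres: every prediction is 0.1 m too high.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 2.1, 3.1, 4.1]
si, bias, cc, rmse = forecast_metrics(obs, pred)
print(round(si, 3), round(bias, 3), round(cc, 3), round(rmse, 3))
```

A constant offset like the one in this toy example leaves CC at 1 while Bias and RMSE are non-zero, which is one reason the metrics are reported together.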
Abstract:In this paper, we present a novel approach for predicting rogue waves in oceans using statistical machine learning methods. Since the ocean is composed of many wave systems, the change from a bimodal or multimodal directional distribution to a unimodal one is taken as the warning criterion. We also explore various features that help in predicting rogue waves. Analysis of the results shows that spectral features are significant in predicting rogue waves. We find that nonlinear classifiers have better prediction accuracy than linear ones. Finally, we propose a Random Forest classifier-based algorithm to predict rogue waves in oceanic conditions. The proposed algorithm has an Overall Accuracy of 89.57% to 91.81% and a Balanced Accuracy of 79.41% to 89.03%, depending on the forecast time window. Moreover, due to the model-free nature of the evaluation criteria and the interdisciplinary character of the approach, similar studies may be motivated in other nonlinear dispersive media, such as nonlinear optics, plasma, and solids, governed by similar equations, which will allow for the early detection of extreme waves.
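The gap between Overall Accuracy and Balanced Accuracy reported above is typical when one class is rare. The sketch below shows the two measures side by side, taking balanced accuracy as the mean of per-class recall (a standard definition, assumed here). The labels are invented for illustration and do not come from the paper.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Invented example: 8 ordinary windows (0) and 2 rogue-wave windows (1).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]   # one of the two rogue-wave windows is missed
print(overall_accuracy(y_true, y_pred))   # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.75
```

Missing half of the rare class barely moves overall accuracy but pulls balanced accuracy down sharply, so the latter is the more informative figure for rare-event prediction.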