Vijitha Herath

BandRC: Band Shifted Raised Cosine Activated Implicit Neural Representations

May 16, 2025

Pandula Thennakoon, Avishka Ranasinghe, Mario De Silva, Buwaneka Epakanda, Roshan Godaliyadda, Parakrama Ekanayake, Vijitha Herath

Abstract: In recent years, implicit neural representations (INRs) have gained popularity in the computer vision community, mainly because of their strong performance in many computer vision tasks. These networks can extract a continuous signal representation from a discrete one. Previous studies have repeatedly shown that INR performance correlates strongly with the activation functions used in its multilayer perceptrons. Although numerous competitive activation functions have been proposed, they share a common set of challenges: spectral bias (a lack of sensitivity to high-frequency content in signals), limited robustness to signal noise, difficulty in simultaneously capturing both local and global features, and the requirement for manual parameter tuning. To address these issues, we introduce a novel activation function, Band Shifted Raised Cosine Activated Implicit Neural Networks (BandRC), tailored to further enhance signal representation capacity. We also incorporate deep prior knowledge extracted from the signal to adjust the activation functions through a task-specific model. Through a mathematical analysis and a series of experiments, which include image reconstruction (with a +8.93 dB PSNR improvement over the nearest counterpart), denoising (with a +0.46 dB increase in PSNR), super-resolution (with a +1.03 dB improvement over the nearest state-of-the-art (SOTA) method for 6X super-resolution), inpainting, and 3D shape reconstruction, we demonstrate the dominance of BandRC over existing state-of-the-art activation functions.

* Submitted as a conference paper to ICCV 2025
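
The BandRC abstract reports its gains as PSNR differences in dB. As a reading aid, the sketch below uses the standard PSNR definition, 10 * log10(peak^2 / MSE), to show what such a gain means in terms of mean squared error. The function names and toy signals are illustrative assumptions, not code or data from the paper.

```python
import math

def mse(reference, estimate):
    """Mean squared error between two equally sized flat signals."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

def psnr(reference, estimate, peak=1.0):
    """PSNR in dB: 10 * log10(peak^2 / MSE); infinite for a perfect match."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain at a fixed peak value."""
    return 10.0 ** (gain_db / 10.0)

# Toy signals in [0, 1] (illustrative values only, not results from the paper).
reference = [0.0, 0.5, 1.0, 0.5]
coarse = [0.1, 0.4, 0.9, 0.6]      # every sample off by 0.1
fine = [0.01, 0.49, 0.99, 0.51]    # every sample off by 0.01

print(round(psnr(reference, coarse), 2))
print(round(psnr(reference, fine), 2))
print(round(mse_ratio_for_gain(8.93), 2))
```

Under this definition, a tenfold reduction in per-sample error raises PSNR by 20 dB, and the reported +8.93 dB reconstruction gain corresponds to a mean squared error roughly 7.8 times smaller than that of the nearest counterpart.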

Enhanced SCanNet with CBAM and Dice Loss for Semantic Change Detection

May 07, 2025

Athulya Ratnayake, Buddhi Wijenayake, Praveen Sumanasekara, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake

Abstract: Semantic Change Detection (SCD) in remote sensing imagery requires accurately identifying land-cover changes across multi-temporal image pairs. Despite substantial advancements, including the introduction of transformer-based architectures, current SCD models continue to struggle with challenges such as noisy inputs, subtle class boundaries, and significant class imbalance. In this study, we propose enhancing the Semantic Change Network (SCanNet) by integrating the Convolutional Block Attention Module (CBAM) and employing Dice loss during training. CBAM sequentially applies channel attention to highlight feature maps with the most meaningful content, followed by spatial attention to pinpoint critical regions within these maps. This sequential approach ensures precise suppression of irrelevant features and spatial noise, resulting in more accurate and robust detection performance compared to attention mechanisms that apply both processes simultaneously or independently. Dice loss, designed explicitly for handling class imbalance, further boosts sensitivity to minority change classes. Quantitative experiments conducted on the SECOND dataset demonstrate consistent improvements. Qualitative analysis confirms these improvements, showing clearer segmentation boundaries and more accurate recovery of small-change regions. These findings highlight the effectiveness of attention mechanisms and Dice loss in improving feature representation and addressing class imbalance in semantic change detection tasks.

* 7 pages, 3 figures, conference
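
The claim that Dice loss helps with class imbalance can be illustrated with the standard single-class soft Dice formulation, 1 - 2*sum(p*t) / (sum(p) + sum(t)). The sketch below is a minimal illustration under that assumption; the function names and the 100-pixel toy mask are hypothetical, and the multi-class loss actually used with SCanNet may differ in detail.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for one class: 1 - 2*sum(p*t) / (sum(p) + sum(t)).

    eps guards against division by zero when both masks are empty.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)

def pixel_accuracy(pred, target, threshold=0.5):
    """Fraction of pixels whose thresholded prediction matches the target."""
    hits = sum((p >= threshold) == (t >= threshold) for p, t in zip(pred, target))
    return hits / len(target)

# Toy change mask: 100 pixels, of which only 2 belong to the change class.
target = [1.0, 1.0] + [0.0] * 98
all_background = [0.0] * 100   # a model that never predicts change
perfect = list(target)

print(pixel_accuracy(all_background, target))
print(dice_loss(all_background, target))
print(dice_loss(perfect, target))
```

A predictor that ignores the minority class still reaches 98% pixel accuracy on this mask, but its Dice loss stays close to 1, which is why an overlap-based loss puts more pressure on small change regions than a per-pixel average does.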

Performance Benchmarking of Psychomotor Skills Using Wearable Devices: An Application in Sport

Nov 25, 2024

Mahela Pandukabhaya, Tharaka Fonseka, Madhumini Kulathunge, Roshan Godaliyadda, Parakrama Ekanayake, Chanaka Senanayake, Vijitha Herath

Abstract: This study proposes a versatile framework for optimizing psychomotor learning through human motion analysis. Using a wearable IMU sensor system, the motion trajectories of a given psychomotor task are acquired and then linked to points in a performance space using a predefined set of quality metrics specific to the psychomotor skill. This enables the identification of a benchmark cluster in the performance space, allowing correspondences to be established between the performance clusters and sets of trajectories in the motion space. As a result, common or specific deviations in the performance space can be identified, enabling remedial actions in the motion space to optimize performance. The proposed framework is thoroughly validated using a table tennis forehand stroke as a case study. The resulting quantitative and visual representation of performance empowers individuals to optimize their skills and achieve peak performance.

* 15 pages, 14 figures, 5 tables, currently under review at IEEE Access

Iso-Diffusion: Improving Diffusion Probabilistic Models Using the Isotropy of the Additive Gaussian Noise

Mar 25, 2024

Dilum Fernando, Dhananjaya Jayasundara, Roshan Godaliyadda, Chaminda Bandara, Parakrama Ekanayake, Vijitha Herath

Abstract: Denoising Diffusion Probabilistic Models (DDPMs) have accomplished much in the realm of generative AI. Despite their high performance, there is room for improvement, especially in sample fidelity, by utilizing statistical properties that impose structural integrity, such as isotropy. Minimizing the mean squared error between the additive and predicted noise alone does not constrain the predicted noise to be isotropic. Thus, we were motivated to use the isotropy of the additive noise as a constraint on the objective function to enhance the fidelity of DDPMs. Our approach is simple and can be applied to any DDPM variant. We validate our approach with experiments on four synthetic 2D datasets as well as on unconditional image generation. As the results demonstrate, incorporating this constraint improves the fidelity metrics Precision and Density on both the 2D datasets and unconditional image generation.
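
The abstract does not give the form of the isotropy constraint. The sketch below shows one plausible reading, built on the fact that standard Gaussian noise in d dimensions has an expected squared norm of d. The penalty term, the `weight` parameter, and all function names are assumptions made for illustration and should not be read as the authors' objective.

```python
import random

def squared_norm(v):
    """Squared Euclidean norm of one noise vector."""
    return sum(x * x for x in v)

def mse_loss(true_noise, pred_noise):
    """Standard DDPM objective: mean squared error over a batch of noise vectors."""
    total, count = 0.0, 0
    for t, p in zip(true_noise, pred_noise):
        total += sum((a - b) ** 2 for a, b in zip(t, p))
        count += len(t)
    return total / count

def isotropy_penalty(pred_noise):
    """ASSUMED regularizer: squared gap between the batch-mean squared norm
    per dimension and 1, the value expected for standard Gaussian noise."""
    dim = len(pred_noise[0])
    mean_sq_norm = sum(squared_norm(p) for p in pred_noise) / len(pred_noise)
    return (mean_sq_norm / dim - 1.0) ** 2

def combined_loss(true_noise, pred_noise, weight=0.1):
    """MSE plus the assumed isotropy penalty; weight is a hypothetical knob."""
    return mse_loss(true_noise, pred_noise) + weight * isotropy_penalty(pred_noise)

rng = random.Random(0)
dim, batch = 16, 2000
true_noise = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(batch)]
shrunk = [[0.5 * x for x in eps] for eps in true_noise]  # under-scaled prediction

print(isotropy_penalty(true_noise))
print(isotropy_penalty(shrunk))
print(combined_loss(true_noise, shrunk))
```

In this toy setup the penalty is close to zero for genuine Gaussian samples and clearly positive for the under-scaled prediction, which is the kind of deviation that the MSE term alone does not single out.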

GAUSS: Guided Encoder-Decoder Architecture for Hyperspectral Unmixing with Spatial Smoothness

Apr 16, 2022

Yasiru Ranasinghe, Kavinga Weerasooriya, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake, Dhananjaya Jayasundara, Lakshitha Ramanayake, Neranjan Senarath, Dulantha Wickramasinghe

Abstract: In recent hyperspectral unmixing (HU) literature, the application of deep learning (DL) has become more prominent, especially with the autoencoder (AE) architecture. We propose a split architecture and use a pseudo-ground truth for abundances to guide the 'unmixing network' (UN) optimization. Preceding the UN, an 'approximation network' (AN) is proposed, which improves the association between the centre pixel and its neighbourhood. Hence, it accentuates spatial correlation in the abundances, as its output is the input to the UN and the reference for the 'mixing network' (MN). In the Guided Encoder-Decoder Architecture for Hyperspectral Unmixing with Spatial Smoothness (GAUSS), we propose using one-hot encoded abundances as the pseudo-ground truth to guide the UN, computed using the k-means algorithm so that no prior HU method is required. Furthermore, in contrast to the standard AE for HU, we relax the single-layer constraint on the MN by introducing the UN-generated abundances. We also experimented with two modifications of the network pre-trained using the GAUSS method. In GAUSS$_\textit{blind}$, we concatenate the UN and the MN to back-propagate the reconstruction error gradients to the encoder. In GAUSS$_\textit{prime}$, the abundance results of a reliable signal processing (SP) method are used as the pseudo-ground truth with the GAUSS architecture. According to quantitative and graphical results for four experimental datasets, the three architectures either surpassed or matched the performance of existing HU algorithms from both the DL and SP domains.

* 16 pages, 6 figures
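
The k-means step that produces one-hot pseudo-ground-truth abundances in the GAUSS abstract can be sketched as follows. This is a minimal illustration using plain Lloyd's k-means with a deterministic seed; the toy three-band spectra, the initialization choice, and the function names are assumptions and not the authors' implementation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(pixels, k, iters=20):
    """Plain Lloyd's k-means; the first k pixels seed the centroids so the
    result is deterministic (an illustrative choice, not from the paper)."""
    centroids = [list(p) for p in pixels[:k]]
    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: nearest centroid for every pixel spectrum.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in pixels]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def one_hot(labels, k):
    """One-hot abundance vector per pixel: 1 for its cluster, 0 elsewhere."""
    return [[1.0 if lab == c else 0.0 for c in range(k)] for lab in labels]

# Toy pixel spectra with 3 bands and 2 materials (illustrative values only).
pixels = [
    [0.9, 0.1, 0.1],
    [0.1, 0.1, 0.9],
    [0.8, 0.2, 0.1],
    [0.2, 0.1, 0.8],
]
labels, centroids = kmeans(pixels, k=2)
pseudo_abundances = one_hot(labels, 2)
print(labels)
print(pseudo_abundances)
```

Each one-hot row sums to one, so the pseudo-ground truth already satisfies the usual abundance sum-to-one constraint while requiring no prior unmixing method.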

Holistic Interpretation of Public Scenes Using Computer Vision and Temporal Graphs to Identify Social Distancing Violations

Dec 13, 2021

Gihan Jayatilaka, Jameel Hassan, Suren Sritharan, Janith Bandara Senananayaka, Harshana Weligampola, Roshan Godaliyadda, Parakrama Ekanayake, Vijitha Herath, Janaka Ekanayake, Samath Dharmaratne

Abstract: The COVID-19 pandemic has caused an unprecedented global public health crisis. Given its inherent nature, social distancing measures are proposed as the primary strategies to curb the spread of this pandemic. Therefore, identifying situations where these protocols are violated has implications for curtailing the spread of the disease and promoting a sustainable lifestyle. This paper proposes a novel computer vision-based system that analyzes CCTV footage to provide a threat level assessment of COVID-19 spread. The system strives to holistically capture and interpret the information content of CCTV footage spanning multiple frames to recognize instances of various violations of social distancing protocols across time and space, as well as to identify group behaviors. This functionality is achieved primarily by using a temporal graph-based structure to represent the information in the CCTV footage, together with a strategy to holistically interpret the graph and quantify the threat level of the given scene. The individual components are tested and validated on a range of scenarios, and the complete system is tested against human expert opinion. The results reflect the dependence of the threat level on people, their physical proximity, interactions, protective clothing, and group dynamics. The system achieves an accuracy of 76%, thus enabling a deployable threat monitoring system in cities to permit normalcy and sustainability in society.

* 35 pages, 22 figures
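
The temporal graph idea in the abstract above can be illustrated with a very small sketch that covers only the physical-proximity component. It builds one graph per frame, with people as nodes and close pairs as edges, and then counts how many frames each edge persists. The person IDs, coordinates, distance threshold, and function names are all hypothetical; the paper's graph also encodes interactions, protective clothing, and group dynamics, and its threat scoring is not reproduced here.

```python
import math
from collections import Counter
from itertools import combinations

def proximity_edges(positions, min_distance):
    """Edges of one frame's graph: pairs of person IDs closer than min_distance."""
    edges = set()
    # Sorting by ID gives every pair a canonical (smaller ID, larger ID) order.
    for (id_a, pos_a), (id_b, pos_b) in combinations(sorted(positions.items()), 2):
        if math.dist(pos_a, pos_b) < min_distance:
            edges.add((id_a, id_b))
    return edges

def temporal_contact_counts(frames, min_distance):
    """Number of frames in which each pair of people stays within min_distance."""
    counts = Counter()
    for positions in frames:
        counts.update(proximity_edges(positions, min_distance))
    return counts

# Hypothetical tracked positions (arbitrary ground-plane units) over 3 frames.
frames = [
    {"p1": (0.0, 0.0), "p2": (0.5, 0.0), "p3": (5.0, 5.0)},
    {"p1": (0.0, 0.0), "p2": (0.6, 0.0), "p3": (3.0, 3.0)},
    {"p1": (0.0, 0.0), "p2": (4.0, 0.0), "p3": (0.5, 0.5)},
]
contacts = temporal_contact_counts(frames, min_distance=1.0)
print(contacts)
```

Counting edges across frames separates a sustained close contact (p1 and p2, two frames) from a brief one (p1 and p3, one frame), which is the kind of distinction a single-frame check cannot make.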

Hands Off: A Handshake Interaction Detection and Localization Model for COVID-19 Threat Control

Oct 18, 2021

A. S. Jameel Hassan, Suren Sritharan, Gihan Jayatilaka, Roshan I. Godaliyadda, Parakrama B. Ekanayake, Vijitha Herath, Janaka B. Ekanayake

Abstract: The COVID-19 outbreak has affected millions of people across the globe and continues to spread at a drastic scale. Of the numerous steps taken to control the spread of the virus, social distancing has been a crucial and effective practice. However, recent reports of social distancing violations suggest the need for non-intrusive detection techniques to ensure safety in public spaces. In this paper, a real-time detection model is proposed to identify handshake interactions in a range of realistic scenarios with multiple people in the scene, and to detect multiple interactions in a single frame. This is the first work that performs dyadic interaction localization in a multi-person setting. The efficacy of the proposed model was evaluated across two different datasets on more than 3200 frames, demonstrating robust localization in different environments. Because it localizes dyadic interactions in a multi-person setting, the proposed model can be used in public spaces to identify handshake interactions and thereby help identify and mitigate COVID-19 transmission.

* 6 pages

A generalized forecasting solution to enable future insights of COVID-19 at sub-national level resolutions

Aug 21, 2021

Umar Marikkar, Harshana Weligampola, Rumali Perera, Jameel Hassan, Suren Sritharan, Gihan Jayatilaka, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake, Janaka Ekanayake (+2 more)

Abstract: COVID-19 continues to have a significant impact on public health. To minimize this impact, policy makers undertake containment measures which, when carried out disproportionately to the actual threat as a result of erroneous threat assessment, cause undesirable long-term socio-economic complications. In addition, macro-level or national-level decision making fails to consider the localized sensitivities of small regions. Hence, the need arises for region-wise threat assessments that provide insights into the behaviour of COVID-19 through time, enabled by accurate forecasts. In this study, a forecasting solution is proposed to predict daily new cases of COVID-19 in regions small enough for containment measures to be implemented locally, by targeting three main shortcomings in the literature: the unreliability of existing data caused by inconsistent testing patterns in smaller regions, the weak deployability of forecasting models for predicting cases in previously unseen regions, and model training biases caused by the imbalanced nature of data in COVID-19 epi-curves. Hence, the contributions of this study are three-fold: an optimized smoothing technique that smooths less deterministic epi-curves based on the epidemiological dynamics of the region, a Long Short-Term Memory (LSTM) based forecasting model trained on data from select regions to create a representative and diverse training set that maximizes deployability in regions lacking historical data, and an adaptive loss function used during training to mitigate the data imbalances seen in epi-curves. The proposed smoothing technique, the generalized training strategy, and the adaptive loss function largely increased the overall accuracy of the forecast, which enables efficient containment measures at a more localized micro-level.

An Optical physics inspired CNN approach for intrinsic image decomposition

May 21, 2021

Harshana Weligampola, Gihan Jayatilaka, Suren Sritharan, Parakrama Ekanayake, Roshan Ragel, Vijitha Herath, Roshan Godaliyadda

Abstract: Intrinsic Image Decomposition (IID) is the open problem of generating the constituents of an image. Generating reflectance and shading from a single image is a challenging task, especially when there is no ground truth. There is a lack of unsupervised learning approaches for decomposing an image into reflectance and shading using a single image. We propose a neural network architecture capable of this decomposition using physics-based parameters derived from the image. Through experimental results, we show that (a) the proposed methodology outperforms the existing deep learning-based IID techniques and (b) the derived parameters improve the efficacy significantly. We conclude with a closer analysis of the results (numerical and example images) showing several avenues for improvement.

* 5 pages, 3 figures, 1 table, ICIP 2021

Convolutional Autoencoder for Blind Hyperspectral Image Unmixing

Nov 18, 2020

Yasiru Ranasinghe, Sanjaya Herath, Kavinga Weerasooriya, Mevan Ekanayake, Roshan Godaliyadda, Parakrama Ekanayake, Vijitha Herath

Abstract: In the remote sensing context, spectral unmixing is a technique to decompose a mixed pixel into two fundamental representatives: endmembers and abundances. In this paper, a novel architecture is proposed to perform blind unmixing on hyperspectral images. The proposed architecture consists of convolutional layers followed by an autoencoder. The encoder transforms the feature space produced by the convolutional layers into a latent space representation. Then, from these latent characteristics, the decoder reconstructs the roll-out image of the monochrome image at the input of the architecture, with each single-band image fed sequentially. Experimental results on real hyperspectral data show that the proposed algorithm outperforms existing unmixing methods at abundance estimation and generates competitive results for endmember extraction, with RMSE and SAD as the respective metrics.

* 7 pages, 4 figures, conference
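
The two metrics named in the abstract above have standard definitions: root mean squared error (RMSE) for abundance estimates, and spectral angle distance (SAD), the angle between two spectra, for endmembers. The sketch below implements both on toy values; the function names and numbers are illustrative assumptions, not data from the paper.

```python
import math

def rmse(estimated, reference):
    """Root mean squared error between two equally sized vectors."""
    n = len(reference)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimated, reference)) / n)

def sad(estimated, reference):
    """Spectral angle distance in radians: arccos of the normalized dot product."""
    dot = sum(e * r for e, r in zip(estimated, reference))
    norm_e = math.sqrt(sum(e * e for e in estimated))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_e * norm_r)))
    return math.acos(cosine)

# Toy endmember spectra (illustrative values only).
reference_endmember = [0.2, 0.4, 0.6, 0.8]
scaled_endmember = [0.1, 0.2, 0.3, 0.4]   # same shape, half the brightness

# Toy abundance vectors for one pixel with two materials.
reference_abundance = [0.7, 0.3]
estimated_abundance = [0.5, 0.5]

print(sad(scaled_endmember, reference_endmember))
print(rmse(scaled_endmember, reference_endmember))
print(rmse(estimated_abundance, reference_abundance))
```

SAD is insensitive to a uniform scaling of the spectrum, so the half-brightness endmember scores an angle of essentially zero even though its RMSE is not zero. This is one reason SAD is commonly paired with endmember extraction while RMSE is used for abundances, which are already constrained to a fixed scale.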
