Roshan Ragel

S2TPVFormer: Spatio-Temporal Tri-Perspective View for temporally coherent 3D Semantic Occupancy Prediction

Jan 24, 2024

Sathira Silva, Savindu Bhashitha Wannigama, Roshan Ragel, Gihan Jayatilaka

Abstract: Holistic understanding and reasoning in 3D scenes play a vital role in the success of autonomous driving systems. The evolution of 3D semantic occupancy prediction as a pretraining task for autonomous driving and robotic downstream tasks captures finer 3D details compared to methods like 3D detection. Existing approaches predominantly focus on spatial cues, often overlooking temporal cues. Query-based methods tend to converge on computationally intensive voxel representations for encoding 3D scene information. This study introduces S2TPVFormer, an extension of TPVFormer, utilizing a spatiotemporal transformer architecture for coherent 3D semantic occupancy prediction. Emphasizing the importance of spatiotemporal cues in 3D scene perception, particularly in 3D semantic occupancy prediction, our work explores the under-examined role of temporal cues. Leveraging the Tri-Perspective View (TPV) representation, our spatiotemporal encoder generates temporally rich embeddings, improving prediction coherence while maintaining computational efficiency. To achieve this, we propose a novel Temporal Cross-View Hybrid Attention (TCVHA) mechanism, facilitating effective spatiotemporal information exchange across TPV views. Experimental evaluations on the nuScenes dataset demonstrate a substantial 3.1% improvement in mean Intersection over Union (mIoU) for 3D Semantic Occupancy compared to TPVFormer, confirming the effectiveness of the proposed S2TPVFormer in enhancing 3D scene perception.
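
For readers unfamiliar with the metric quoted in this abstract, the following is a minimal sketch of how mean Intersection over Union is commonly computed over flattened per-voxel class labels. It is not taken from the paper: the function name `mean_iou`, the flat-list input format, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration, and benchmark protocols may additionally ignore certain labels.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two equal-length sequences of integer labels.

    Assumption: classes that appear in neither sequence are skipped rather
    than counted as IoU 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    inter = [0] * num_classes  # voxels where both sequences agree on class c
    union = [0] * num_classes  # voxels where either sequence has class c

    for p, t in zip(pred, target):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            union[p] += 1
            union[t] += 1

    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Hypothetical labels for four voxels and two classes.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the hypothetical example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.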

An Optical physics inspired CNN approach for intrinsic image decomposition

May 21, 2021

Harshana Weligampola, Gihan Jayatilaka, Suren Sritharan, Parakrama Ekanayake, Roshan Ragel, Vijitha Herath, Roshan Godaliyadda

Abstract: Intrinsic Image Decomposition (IID) is the open problem of recovering the constituents of an image. Generating reflectance and shading from a single image is a challenging task, especially when no ground truth is available. There is a lack of unsupervised learning approaches for decomposing a single image into reflectance and shading. We propose a neural network architecture capable of this decomposition using physics-based parameters derived from the image. Through experimental results, we show that (a) the proposed methodology outperforms the existing deep learning-based IID techniques and (b) the derived parameters improve the efficacy significantly. We conclude with a closer analysis of the results (numerical and example images) showing several avenues for improvement.

* 5 pages, 3 figures, 1 table, ICIP 2021

A Retinex based GAN Pipeline to Utilize Paired and Unpaired Datasets for Enhancing Low Light Images

Jun 27, 2020

Harshana Weligampola, Gihan Jayatilaka, Suren Sritharan, Roshan Godaliyadda, Parakrama Ekanayaka, Roshan Ragel, Vijitha Herath

Abstract: Low light image enhancement is an important challenge for the development of robust computer vision algorithms. Machine learning approaches to this problem have been either unsupervised, supervised on a paired dataset, or supervised on an unpaired dataset. This paper presents a novel deep learning pipeline that can learn from both paired and unpaired datasets. Convolutional Neural Networks (CNNs) optimized to minimize a standard loss and Generative Adversarial Networks (GANs) optimized to minimize the adversarial loss are used for different steps of the low light image enhancement process. Cycle consistency loss and a patched discriminator are utilized to further improve the performance. The paper also analyses the functionality and the performance of different components, hidden layers, and the entire pipeline.

Non-contact Infant Sleep Apnea Detection

Oct 10, 2019

Gihan Jayatilaka, Harshana Weligampola, Suren Sritharan, Pankayraj Pathmanathan, Roshan Ragel, Isuru Nawinne

Abstract: Sleep apnea is a breathing disorder in which a person repeatedly stops breathing during sleep. Early detection is crucial for infants because the condition might bring long-term adversities. The existing accurate detection mechanism (pulse oximetry) is a skin-contact measurement. The existing non-contact mechanisms (acoustics, video processing) are not accurate enough. This paper presents a novel algorithm for the detection of sleep apnea with video processing. The solution is non-contact, accurate, and lightweight enough to run on a single-board computer. The paper discusses the accuracy of the algorithm on real data, the advantages of the new algorithm, and its limitations, and suggests future improvements.

* Gihan Jayatilaka, Harshana Weligampola and Suren Sritharan are equally contributing authors

Near Real-Time Data Labeling Using a Depth Sensor for EMG Based Prosthetic Arms

Nov 10, 2018

Geesara Prathap, Titus Nanda Kumara, Roshan Ragel

Abstract: Automatically recognizing sEMG (surface electromyography) signals that belong to a particular action (e.g., lateral arm raise) is a challenging task, because EMG signals vary considerably even for the same action owing to several factors. To overcome this issue, the raw signals need a proper segmentation that exposes the pattern recurring for a particular action. A repetitive pattern does not always match directly, because the same action can be carried out over different time durations. Thus, a depth sensor (Kinect) was used for pattern identification: three joint angles, which are clearly separable for a particular action, were recorded continuously while the sEMG signals were recorded. To segment out a repetitive pattern in the angle data, the MDTW (Moving Dynamic Time Warping) approach is introduced. This technique makes it possible to retrieve suspected motions of interest from raw signals. MDTW is based on the DTW algorithm, but it moves through the whole dataset in a pre-defined manner and is capable of picking up almost all the suspected segments inside a given dataset in an optimal way. Elevated bicep curl and lateral arm raise movements are taken as motions of interest to show how the proposed technique can be employed to achieve automatic identification and labelling. The full implementation is available at https://github.com/GPrathap/OpenBCIPython
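
The idea of moving a DTW comparison through a recording can be illustrated with a small sketch. This is not the authors' MDTW implementation (see the repository linked in the abstract for that). It assumes a one-dimensional angle signal, a fixed window length, a fixed stride, and a distance threshold, and the names `dtw_distance` and `moving_dtw` are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def moving_dtw(signal, template, window, stride, threshold):
    """Slide a window over `signal`; keep windows whose DTW distance to
    `template` is at most `threshold`.

    Assumption: fixed window and stride stand in for the paper's
    "pre-defined manner", which the abstract does not specify.
    Returns a list of (start, end, distance) tuples, end exclusive.
    """
    matches = []
    for start in range(0, len(signal) - window + 1, stride):
        segment = signal[start:start + window]
        dist = dtw_distance(segment, template)
        if dist <= threshold:
            matches.append((start, start + window, dist))
    return matches


if __name__ == "__main__":
    # Hypothetical joint-angle trace containing one raise-and-lower motion.
    template = [0, 1, 2, 1, 0]
    signal = [0, 0, 0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 0]
    print(moving_dtw(signal, template, window=5, stride=5, threshold=0.5))
```

Because DTW tolerates stretching in time, a slower repetition such as `[0, 1, 1, 2, 1, 0]` still has distance 0 to the template. With a stride smaller than the window, neighbouring windows can both match the same motion, so a real pipeline would likely need to merge overlapping candidates.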

High Throughput Virtual Screening with Data Level Parallelism in Multi-core Processors

Dec 04, 2013

Upul Senanayake, Rahal Prabuddha, Roshan Ragel

Abstract: Improving the throughput of molecular docking, a computationally intensive phase of the virtual screening process, is a highly sought-after area of research since it carries significant weight in the drug design process. With such improvements, the world might find cures for incurable diseases like HIV disease and cancer sooner. Our approach presented in this paper is to utilize a multi-core environment to introduce Data Level Parallelism (DLP) to the Autodock Vina software, which is widely used for molecular docking. Autodock Vina already exploits Instruction Level Parallelism (ILP) in multi-core environments and is therefore optimized for such environments. However, our results clearly show that our approach enhances the throughput of the already optimized software by more than six times. This will dramatically reduce the time consumed by the lead identification phase in drug design, along with the current shift in processor technology from multi-core to many-core. Therefore, we believe that the contribution of this project will make it possible to expand the number of small molecules docked against a drug target and improve the chances of designing drugs for incurable diseases.

* Information and Automation for Sustainability (ICIAfS), 2012 IEEE 6th International Conference on
