Jayanta Mukhopadhyay

Sub-Aperture Feature Adaptation in Single Image Super-resolution Model for Light Field Imaging

Jul 26, 2022

Aupendu Kar, Suresh Nehra, Jayanta Mukhopadhyay, Prabir Kumar Biswas

Abstract: With the availability of commercial Light Field (LF) cameras, LF imaging has emerged as an up-and-coming technology in computational photography. However, the spatial resolution is significantly constrained in commercial microlens-based LF cameras because of the inherent multiplexing of spatial and angular information. This limited resolution becomes the main bottleneck for other applications of light field cameras. This paper proposes an adaptation module in a pretrained Single Image Super Resolution (SISR) network to leverage the powerful SISR model instead of using highly engineered, light-field-specific super-resolution models. The adaptation module consists of a Sub-aperture Shift block and a fusion block. It adapts the SISR network to further exploit the spatial and angular information in LF images and thereby improve super-resolution performance. Experimental validation shows that the proposed method outperforms existing light field super-resolution algorithms. It also achieves PSNR gains of more than 1 dB across all the datasets compared with the same pretrained SISR models for scale factor 2, and PSNR gains of 0.6 to 1 dB for scale factor 4.
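The gains above are reported in PSNR, so a short reminder of how that metric is defined may help in reading them. The following is a minimal sketch of the standard definition, not the paper's evaluation code; the function name `psnr`, the flat pixel-list inputs, and the default peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    Hypothetical helper: the inputs are flat sequences of pixel intensities,
    and `peak` is the largest possible intensity (255 for 8-bit images).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two images.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 1 dB corresponds to the error shrinking by a factor of 10^0.1, that is, by roughly 21%.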

* © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Deriving Explanation of Deep Visual Saliency Models

Sep 08, 2021

Sai Phani Kumar Malladi, Jayanta Mukhopadhyay, Chaker Larabi, Santanu Chaudhury

Abstract: Deep neural networks have shown their profound impact on achieving human-level performance in visual saliency prediction. However, it is still unclear how they learn the task and what this means for understanding the human visual system. In this work, we develop a technique to derive explainable saliency models from their corresponding deep neural architecture based saliency models by applying human perception theories and the conventional concepts of saliency. This technique helps us understand the learning pattern of the deep network at its intermediate layers through their activation maps. Initially, we consider two state-of-the-art deep saliency models, namely UNISAL and MSI-Net, for our interpretation. We use a set of biologically plausible log-Gabor filters to identify and reconstruct their activation maps using our explainable saliency model. The final saliency map is generated from these reconstructed activation maps. We also build our own deep saliency model, named cross-concatenated multi-scale residual block based network (CMRNet), for saliency prediction. Then, we evaluate and compare the performance of the explainable models derived from UNISAL, MSI-Net and CMRNet on three benchmark datasets with other state-of-the-art methods. Hence, we propose that this approach to explainability can be applied to any deep visual saliency model for interpretation, which makes it a generic one.

Tabular Structure Detection from Document Images for Resource Constrained Devices Using A Row Based Similarity Measure

Aug 26, 2020

Soumyadeep Dey, Jayanta Mukhopadhyay, Shamik Sural

Abstract: Tabular structures are used to present crucial information in a structured and crisp manner. Detection of such regions is of great importance for proper understanding of a document. Tabular structures can be of various layouts and types. Therefore, detection of these regions is a hard problem. Most of the existing techniques detect tables from a document image by using prior knowledge of the structures of the tables. However, these methods are not applicable to generalized tabular structures. In this work, we propose a similarity measure to find similarities between pairs of rows in a tabular structure. This similarity measure is utilized to identify a tabular region. Since the tabular regions are detected by exploiting the similarities among all rows, the method is inherently independent of the layouts of the tabular regions present in the training data. Moreover, the proposed similarity measure can be used to identify tabular regions without using the large sets of parameters associated with recent deep learning based methods. Thus, the proposed method can easily be used with resource-constrained devices such as mobile devices without much overhead.

RECAL: Reuse of Established CNN classifer Apropos unsupervised Learning paradigm

Jun 15, 2019

Jayasree Saha, Jayanta Mukhopadhyay

Abstract: Recently, clustering with deep network frameworks has attracted the attention of several researchers in the computer vision community. Deep frameworks gain extensive attention due to their efficiency and scalability towards large-scale and high-dimensional data. In this paper, we transform a supervised CNN classifier architecture into an unsupervised clustering model, called RECAL, which jointly learns a discriminative embedding subspace and cluster labels. RECAL is made up of feature extraction layers, which are convolutional, followed by unsupervised classifier layers, which are fully connected. A multinomial logistic regression function (softmax) is stacked on top of the classifier layers. We train this network using the stochastic gradient descent (SGD) optimizer. However, the successful implementation of our model revolves around the design of the loss function. Our loss function uses the heuristic that true partitioning entails lower entropy, given that the class distribution is not heavily skewed. This is a trade-off between the situations of "skewed distribution" and "low-entropy". To handle this, we have proposed classification entropy and class entropy, which are the two components of our loss function. In this approach, the size of the mini-batch should be kept large. Experimental results indicate the consistent and competitive behavior of our model for clustering well-known digit, multi-viewed object and face datasets. Moreover, we use this model to generate unsupervised patch segmentation for multi-spectral LISS-IV images. We observe that it is able to distinguish built-up area, wet land, vegetation and waterbody from the underlying scene.
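The trade-off between "low entropy" and "skewed distribution" described above can be illustrated with a two-term entropy objective. The sketch below is one plausible reading of that idea, not the RECAL loss itself: the function names, the `weight` parameter, and the exact way the two terms are combined are assumptions, and the abstract does not give the formulas for classification entropy and class entropy.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution; zero entries contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def clustering_loss(batch_probs, weight=1.0):
    """Hypothetical two-term objective over a mini-batch of softmax outputs.

    batch_probs: list of per-sample probability vectors (each sums to 1).
    Term 1 (minimised): mean per-sample entropy, rewarding confident assignments.
    Term 2 (maximised): entropy of the batch-averaged distribution, penalising
    a skewed partition in which most samples fall into one cluster.
    """
    n = len(batch_probs)
    k = len(batch_probs[0])
    per_sample = sum(entropy(p) for p in batch_probs) / n
    mean_dist = [sum(p[j] for p in batch_probs) / n for j in range(k)]
    balance = entropy(mean_dist)
    return per_sample - weight * balance
```

Under this sketch, a batch that is both confident and balanced scores lower than either a collapsed batch (all samples in one cluster) or a uniformly uncertain one. The second term is estimated from the batch average, which is consistent with the remark that the mini-batch should be large.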

Efficient Retrieval of Logos Using Rough Set Reducts

Apr 10, 2019

Ushasi Chaudhuri, Partha Bhowmick, Jayanta Mukhopadhyay

Abstract: Searching for similar logos in the registered logo database is a very important and tedious task at the trademark office. Speed and accuracy are two aspects that one must attend to while developing a system for retrieval of logos. In this paper, we propose a rough-set based method to quantify the structural information in a logo image that can be used to efficiently index an image. A logo is split into a number of polygons, and for each polygon, we compute the tight upper and lower approximations based on the principles of a rough set. This representation is used for forming feature vectors for retrieval of logos. Experimentation on a standard data set shows the usefulness of the proposed technique. It is computationally efficient and also provides retrieval results at high accuracy.
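For readers unfamiliar with rough sets, the lower and upper approximations mentioned above have a simple set-theoretic definition. The sketch below shows that general definition on plain Python sets; it is not the paper's polygon-based construction, and the function name and the way the partition is supplied as `blocks` are assumptions made for illustration.

```python
def rough_approximations(blocks, target):
    """Lower and upper rough-set approximations of `target`.

    blocks: an iterable of disjoint sets that partition the universe
            (for example, grid cells that group pixels together).
    target: the set to approximate.
    Returns (lower, upper), where lower is the union of blocks lying entirely
    inside the target and upper is the union of blocks that overlap it.
    """
    target = set(target)
    lower, upper = set(), set()
    for block in blocks:
        block = set(block)
        if block and block <= target:
            lower |= block  # block is certainly inside the target
        if block & target:
            upper |= block  # block possibly belongs to the target
    return lower, upper
```

For example, with blocks `{1, 2}`, `{3, 4}`, `{5, 6}` and target `{1, 2, 3}`, the lower approximation is `{1, 2}` and the upper approximation is `{1, 2, 3, 4}`; the gap between the two describes how coarsely the partition captures the shape.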

Visual Based Navigation of Mobile Robots

Dec 15, 2017

Shailja, Soumabh Bhowmick, Jayanta Mukhopadhyay

Abstract: We have developed an algorithm to generate a complete map of the traversable region for a personal assistant robot using monocular vision only. Using multiple images taken by a simple webcam, obstacle detection and avoidance algorithms have been developed. Simple Linear Iterative Clustering (SLIC) has been used for segmentation to reduce the memory and computation cost. A simple mapping technique based on inverse perspective mapping and occupancy grids, which is robust and supports very fast updates, has been used to create the map for indoor navigation.

* Bachelor Thesis, Electrical Engineering Department, IIT Kharagpur, 2016

Bayesian Optimisation with Prior Reuse for Motion Planning in Robot Soccer

Oct 18, 2017

Abhinav Agarwalla, Arnav Kumar Jain, KV Manohar, Arpit Saxena, Jayanta Mukhopadhyay

Abstract: We integrate learning and motion planning for soccer-playing differential drive robots using Bayesian optimisation. Trajectories generated using end-slope cubic Bezier splines are first optimised globally through Bayesian optimisation for a set of candidate points with obstacles. The optimised trajectories, along with robot and obstacle positions and velocities, are stored in a database. The closest planning situation is identified from the database using a k-Nearest Neighbour approach. It is further optimised online through reuse of prior information from the previously optimised trajectory. Our approach reduces the computation time of trajectory optimisation considerably. Velocity profiling generates velocities consistent with robot kinodynamic constraints, and avoids collision and slipping. Extensive testing is done on the developed simulator, as well as on physical differential drive robots. Our method shows marked improvements in mitigating tracking error, and in reducing traversal and computational time over competing techniques under the constraints of performing tasks in real time.
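Two building blocks named above, cubic Bezier trajectories and nearest-neighbour lookup of a stored planning situation, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the use of a plain Euclidean distance, the choice of a single nearest neighbour (k = 1), and the layout of database entries as `(feature_vector, trajectory)` pairs are all hypothetical and do not come from the paper.

```python
import math


def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve with control points p0..p3 at parameter t in [0, 1]."""
    u = 1.0 - t
    return tuple(
        u ** 3 * a + 3 * u * u * t * b + 3 * u * t * t * c + t ** 3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def nearest_situation(database, query):
    """Return the stored entry whose feature vector is closest to `query`.

    database: list of (feature_vector, trajectory) pairs, where the feature
              vector is a hypothetical encoding of robot and obstacle state.
    query:    feature vector of the current planning situation.
    """
    return min(database, key=lambda entry: math.dist(entry[0], query))
```

In such a scheme the retrieved trajectory would serve as the starting point for online refinement, so the optimiser does not begin from scratch for every new situation.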

* Accepted at ACM India Joint Conference on Data Science and Management of Data 2018

5-DoF Monocular Visual Localization Over Grid Based Floor

Sep 14, 2017

Manash Pratim Das, Gaurav Gardi, Jayanta Mukhopadhyay

Abstract: Reliable localization is one of the most important parts of an MAV system. Localization in an indoor GPS-denied environment is a relatively difficult problem. Current vision-based algorithms track optical features to calculate odometry. We present a novel localization method which can be applied in an environment having orthogonal sets of equally spaced lines that form a grid. With the help of a monocular camera and using the properties of the grid lines below, the MAV is localized inside each sub-cell of the grid and consequently over the entire grid, giving a relative localization over the grid. We demonstrate the effectiveness of our system onboard a customized MAV platform. The experimental results show that our method provides accurate 5-DoF localization over grid lines and that it can be performed in real time.

* Accepted to International Conference on Indoor Positioning and Indoor Navigation 2017
