Abstract:Image captioning is a widely known problem in the area of AI. Caption generation from floor plan images has applications in indoor path planning, real estate, and providing architectural solutions. Several methods have been explored in the literature for generating captions or semi-structured descriptions from floor plan images. Since a caption alone is insufficient to capture fine-grained details, researchers have also proposed generating descriptive paragraphs from images. However, these descriptions have a rigid structure and lack flexibility, making them difficult to use in real-time scenarios. This paper offers two models, Description Synthesis from Image Cue (DSIC) and Transformer Based Description Generation (TBDG), for floor plan image-to-text generation to fill the gaps in existing methods. Both models take advantage of modern deep neural networks for visual feature extraction and text generation. The difference between the two models lies in the way they take input from the floor plan image. The DSIC model takes only visual features automatically extracted by a deep neural network, while the TBDG model also learns textual captions extracted from the input floor plan images along with paragraphs. The specific keywords generated in TBDG, learned together with paragraphs, make it more robust on general floor plan images. Experiments were carried out on a large-scale publicly available dataset and compared with state-of-the-art techniques to show the proposed models' superiority.
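As a rough illustration of the visual-feature-to-text pipeline the DSIC model builds on, the sketch below pairs a pretrained CNN encoder with an LSTM decoder; all module names, layer sizes, and the vocabulary size are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative encoder-decoder captioner in the spirit of DSIC (not the authors' code):
# a CNN extracts visual features, an LSTM decoder emits description tokens.
import torch
import torch.nn as nn
import torchvision.models as models

class CNNEncoder(nn.Module):
    def __init__(self, embed_dim=256):
        super().__init__()
        backbone = models.resnet18(weights=None)                    # pretrained weights optional
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])   # drop the classifier head
        self.fc = nn.Linear(backbone.fc.in_features, embed_dim)

    def forward(self, images):                                      # (B, 3, H, W)
        feats = self.cnn(images).flatten(1)                         # (B, 512)
        return self.fc(feats)                                       # (B, embed_dim)

class LSTMDecoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, img_feats, captions):                         # captions: (B, T) token ids
        tok = self.embed(captions)                                  # (B, T, embed_dim)
        seq = torch.cat([img_feats.unsqueeze(1), tok], dim=1)       # prepend the image feature
        hidden, _ = self.lstm(seq)
        return self.out(hidden)                                     # (B, T+1, vocab_size)

encoder, decoder = CNNEncoder(), LSTMDecoder(vocab_size=5000)
images = torch.randn(2, 3, 224, 224)
captions = torch.randint(0, 5000, (2, 12))
logits = decoder(encoder(images), captions)                         # train with cross-entropy
```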
Abstract:Reconstructing an indoor scene and generating a layout/floor plan in 3D or 2D is a widely known problem, and quite a few algorithms have been proposed in the literature recently. However, most existing methods either use RGB-D images, thus requiring a depth camera, or depend on panoramic photos, assuming that there is little to no occlusion in the rooms. In this work, we propose GRIHA (Generating Room Interior of a House using ARCore), a framework for generating a layout from RGB images captured with a simple mobile phone camera. We take advantage of Simultaneous Localization and Mapping (SLAM) to estimate the 3D transformations required for layout generation. SLAM technology is built into recent mobile libraries such as ARCore by Google; hence, the proposed method is fast and efficient. It gives the user the freedom to generate a layout by merely taking a few conventional photos, rather than relying on specialized depth hardware or occlusion-free panoramic images. We have compared GRIHA with other existing methods and obtained superior results. The system has also been tested on multiple hardware platforms to evaluate its hardware dependency and efficiency.
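As a rough illustration of the geometry involved, the sketch below back-projects an image point marked on a wall base onto the floor plane, given a camera pose and intrinsics of the kind a SLAM library such as ARCore exposes; the matrices, camera height, and clicked pixel are made-up placeholders, not values from GRIHA.

```python
# Hypothetical example (not GRIHA's implementation): map an image point to
# floor-plane coordinates using an assumed camera pose and intrinsics.
import numpy as np

K = np.array([[1500.0,    0.0, 960.0],   # assumed intrinsics (fx, fy, cx, cy)
              [   0.0, 1500.0, 540.0],
              [   0.0,    0.0,   1.0]])
R_cw = np.array([[1.0,  0.0, 0.0],       # camera-to-world rotation: camera looks
                 [0.0,  0.0, 1.0],       # horizontally, image "down" maps to world -z
                 [0.0, -1.0, 0.0]])
C = np.array([0.0, 0.0, 1.6])            # camera centre, 1.6 m above the floor

u, v = 1100.0, 1000.0                    # pixel marked on the base of a wall
ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # viewing ray in the camera frame
ray_world = R_cw @ ray_cam                            # same ray in the world frame

# Intersect C + s * ray_world with the floor plane z = 0.
s = -C[2] / ray_world[2]
corner_xy = (C + s * ray_world)[:2]      # 2D layout coordinate of that wall point
print(corner_xy)
```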
Abstract:Content-based image retrieval (CBIR) is one of the most active research areas in multimedia information retrieval. Given a query image, the task is to search for relevant images in a repository. Low-level features such as color, texture, and shape have always been considered important attributes in a CBIR system, so its performance can be enhanced by combining the corresponding feature vectors. In this paper, we propose a novel CBIR framework that indexes images using a multiclass SVM and automatically finds the appropriate weights of the individual features using the relevance ratio and mean difference. We use four feature descriptors to represent color, texture, and shape. During retrieval, the feature vectors of the query image are weighted, combined, and compared with the feature vectors of the database images to rank-order the results. Experiments were performed on four benchmark datasets, and performance was compared with existing techniques to validate the superiority of the proposed framework.
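A minimal NumPy sketch of ranking by weighted combination of per-descriptor distances follows; the fixed weights are placeholders standing in for the relevance-ratio and mean-difference weights the paper learns automatically, and the random feature matrices are purely illustrative.

```python
# Illustrative ranking by weighted fusion of per-feature distances
# (a simplified stand-in for the paper's automatic weighting scheme).
import numpy as np

def rank_images(query_feats, db_feats, weights):
    """query_feats: dict name -> (d,) vector; db_feats: dict name -> (N, d) matrix."""
    n_images = next(iter(db_feats.values())).shape[0]
    combined = np.zeros(n_images)
    for name, w in weights.items():
        d = np.linalg.norm(db_feats[name] - query_feats[name], axis=1)  # per-image distance
        d = (d - d.min()) / (d.max() - d.min() + 1e-9)                  # normalise each feature
        combined += w * d
    return np.argsort(combined)                                         # best matches first

rng = np.random.default_rng(0)
db = {"color": rng.random((100, 64)),
      "texture": rng.random((100, 32)),
      "shape": rng.random((100, 16))}
query = {k: v[7] for k, v in db.items()}                 # reuse image 7 as the query
weights = {"color": 0.5, "texture": 0.3, "shape": 0.2}   # placeholder weights
print(rank_images(query, db, weights)[:5])               # image 7 should rank first
```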
Abstract:Human beings understand natural language descriptions and can imagine a corresponding visual for them. For example, given a description of the interior of a house, we can imagine its structure and the arrangement of furniture. Automatic synthesis of real-world images from text descriptions has been explored in the computer vision community; however, there has been no such attempt for document images, like floor plans. Floor plan synthesis from sketches, as well as from data-driven models, has been proposed earlier. Ours is the first attempt to render building floor plan images automatically from textual descriptions. Here, the input is a natural language description of the internal structure and furniture arrangement within a house, and the output is the 2D floor plan image of the same. We have experimented on publicly available benchmark floor plan datasets and were able to render realistic synthesized floor plan images from descriptions written in English.
Abstract:In this paper, we propose SUGAMAN (Supervised and Unified framework using Grammar and Annotation Model for Access and Navigation). SUGAMAN is a Hindi word meaning "easy passage from one place to another". SUGAMAN synthesizes a textual description from a given floor plan image for the visually impaired. A visually impaired person can navigate in an indoor environment using the textual description generated by SUGAMAN. With the help of text reader software, the target user can understand the rooms within the building and the arrangement of furniture in order to navigate. SUGAMAN is the first framework for describing a floor plan and giving directions for obstacle-free movement within a building. We learn $5$ classes of room categories from $1355$ room image samples under a supervised learning paradigm. These learned annotations are fed into a description synthesis framework to yield a holistic description of a floor plan image. We demonstrate the performance of various supervised classifiers on room learning and provide a comparative analysis of system-generated and human-written descriptions. SUGAMAN gives state-of-the-art performance on challenging, real-world floor plan images. This work can be applied to areas like understanding floor plans of historical monuments, stability analysis of buildings, and retrieval.
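A hedged scikit-learn sketch of the supervised room-category learning step is given below; the random feature vectors stand in for the $1355$ room image samples and the classifier choices are illustrative, not SUGAMAN's actual descriptors or configuration.

```python
# Illustrative supervised room-category learning (placeholder features and labels):
# compare a few classifiers on 5 room classes, mirroring the paper's setup in spirit.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
X = rng.random((1355, 128))             # stand-in feature vectors for room image samples
y = rng.integers(0, 5, size=1355)       # 5 room categories (labels 0..4)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for name, clf in [("SVM", SVC(kernel="rbf")),
                  ("Random Forest", RandomForestClassifier(n_estimators=100)),
                  ("k-NN", KNeighborsClassifier(n_neighbors=5))]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```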
Abstract:In this paper, we propose a novel deep learning architecture that combines stacked bidirectional LSTMs and LSTMs with a Siamese network architecture for segmentation of brain fibers, obtained from tractography data, into anatomically meaningful clusters. The proposed network learns the structural difference between fibers of different classes, which enables it to classify fibers with high accuracy. Importantly, capturing such deep inter- and intra-class structural relationships also ensures that the segmentation is robust to relative rotation between test and training data, and hence can be used with unregistered data. Our extensive experimentation over hundreds of thousands of fibers shows that the proposed model achieves state-of-the-art results, even in cases of large relative rotations between test and training data.
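A compact PyTorch sketch of the kind of Siamese arrangement over stacked bidirectional LSTMs described above follows, treating each fiber as a sequence of 3D points; the layer sizes and the contrastive loss are assumptions for illustration, not the paper's exact architecture.

```python
# Illustrative Siamese network over stacked bidirectional LSTMs for fiber sequences
# (dimensions and loss are assumptions, not the paper's exact configuration).
import torch
import torch.nn as nn
import torch.nn.functional as F

class FiberEncoder(nn.Module):
    def __init__(self, hidden_dim=64, embed_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=3, hidden_size=hidden_dim,
                            num_layers=2, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden_dim, embed_dim)

    def forward(self, fibers):                    # (B, T, 3): T 3D points per fiber
        out, _ = self.lstm(fibers)
        return self.fc(out[:, -1, :])             # fixed-size embedding per fiber

class SiameseFiberNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = FiberEncoder()              # shared weights for both branches

    def forward(self, fiber_a, fiber_b):
        return self.encoder(fiber_a), self.encoder(fiber_b)

def contrastive_loss(za, zb, same_class, margin=1.0):
    d = F.pairwise_distance(za, zb)
    return (same_class * d.pow(2) +
            (1 - same_class) * F.relu(margin - d).pow(2)).mean()

model = SiameseFiberNet()
a, b = torch.randn(8, 100, 3), torch.randn(8, 100, 3)   # pairs of fibers, 100 points each
labels = torch.randint(0, 2, (8,)).float()               # 1 if both belong to the same bundle
loss = contrastive_loss(*model(a, b), labels)
```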
Abstract:Image segmentation is often performed on medical images for identifying diseases in clinical evaluation and has therefore become a major research area. Conventional image segmentation techniques are unable to provide satisfactory results for medical images, as such images contain irregularities and need to be pre-processed before segmentation. To obtain a method well suited to medical image segmentation, we propose a two-stage algorithm. The first stage automatically generates a binary marker image of the region of interest using mathematical morphology. This marker serves as the mask image for the second stage, which applies GrabCut to the input image, resulting in an efficient segmentation. The obtained result can be further refined through user interaction via a Graphical User Interface (GUI). Experimental results show that the proposed method is accurate and provides satisfactory segmentation results with minimal user interaction on medical as well as natural images.
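A hedged OpenCV sketch of the two-stage idea follows: a morphological marker is derived automatically and then passed to GrabCut as an initial mask. The input path, thresholds, kernel size, and iteration count are illustrative choices, not the paper's tuned parameters.

```python
# Illustrative two-stage pipeline: a morphological marker initialises GrabCut
# (the file name and all parameter values are assumptions, not the paper's).
import cv2
import numpy as np

image = cv2.imread("scan.png")                      # hypothetical input image path
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Stage 1: automatic binary marker of the region of interest via morphology.
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
marker = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)    # remove small artefacts
marker = cv2.morphologyEx(marker, cv2.MORPH_CLOSE, kernel)   # fill small holes

# Stage 2: use the marker as GrabCut's initial mask on the original image.
mask = np.where(marker > 0, cv2.GC_PR_FGD, cv2.GC_PR_BGD).astype(np.uint8)
bgd_model = np.zeros((1, 65), np.float64)
fgd_model = np.zeros((1, 65), np.float64)
cv2.grabCut(image, mask, None, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_MASK)

fg = ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)
segmented = image * fg[:, :, None]                  # keep only the foreground pixels
cv2.imwrite("segmented.png", segmented)
```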