Abstract: In recent years, several Weakly Supervised Semantic Segmentation (WS3) methods have been proposed that use class activation maps (CAMs) generated by a classifier to produce pseudo-ground truths for training segmentation models. While CAMs are good at highlighting the discriminative regions (DR) of an image, they are known to disregard regions of the object that do not contribute to the classifier's prediction, termed non-discriminative regions (NDR). In contrast, attribution methods such as saliency maps assign a score to every pixel based on its contribution to the classification prediction. This paper provides a comprehensive comparison between saliencies and CAMs for WS3. Our study examines their similarities and dissimilarities from multiple perspectives. Moreover, we provide new evaluation metrics that comprehensively assess the WS3 performance of alternative methods relative to CAMs. We demonstrate the effectiveness of saliencies in addressing the limitation of CAMs through empirical studies on benchmark datasets. Furthermore, we propose random cropping as a stochastic aggregation technique that improves the performance of saliency, making it a strong alternative to CAMs for WS3.
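The random-cropping idea can be illustrated with a minimal sketch. The abstract does not specify how the per-crop saliencies are aggregated, so the per-pixel averaging below is an assumption, and the names `aggregate_saliency_over_crops` and `toy_saliency` are hypothetical. In practice `saliency_fn` would wrap a trained classifier's attribution method; here it is a placeholder so that the example runs on its own.

```python
import numpy as np


def aggregate_saliency_over_crops(image, saliency_fn, crop_size, num_crops, seed=0):
    """Average per-crop saliency maps back onto the full image grid.

    Assumption: aggregation is a per-pixel mean over all crops covering a pixel.
    `saliency_fn` takes a 2D crop and returns a 2D map of the same shape.
    """
    rng = np.random.default_rng(seed)
    height, width = image.shape[:2]
    crop_h, crop_w = crop_size
    if crop_h > height or crop_w > width:
        raise ValueError("crop_size must fit inside the image")

    total = np.zeros((height, width), dtype=np.float64)
    counts = np.zeros((height, width), dtype=np.float64)
    for _ in range(num_crops):
        # Sample the top-left corner uniformly among valid positions.
        top = rng.integers(0, height - crop_h + 1)
        left = rng.integers(0, width - crop_w + 1)
        crop = image[top:top + crop_h, left:left + crop_w]
        total[top:top + crop_h, left:left + crop_w] += saliency_fn(crop)
        counts[top:top + crop_h, left:left + crop_w] += 1

    # Pixels never covered by any crop keep a score of zero.
    return np.divide(total, counts, out=np.zeros_like(total), where=counts > 0)


def toy_saliency(crop):
    # Placeholder attribution: absolute deviation from the crop mean.
    return np.abs(crop - crop.mean())


image = np.arange(64, dtype=np.float64).reshape(8, 8)
saliency_map = aggregate_saliency_over_crops(
    image, toy_saliency, crop_size=(4, 4), num_crops=50
)
```

Averaging over many crops lets each pixel be scored in several different contexts, which is one plausible reading of "stochastic aggregation"; other reductions (for example a per-pixel maximum) would fit the same loop.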
Abstract: Bridge health monitoring using machine learning tools has recently become an efficient and cost-effective approach. The present study uses strain data from the members of a railway bridge, available from a previous study conducted by IIT Guwahati. These strain data were collected from an existing bridge while trains were passing over it. An LSTM network is trained to predict the strains in different members of the railway bridge. Actual field data are used to predict the strains in different members from the strain data of a single member, and the predictions are observed to agree well with the ground-truth values. This holds even though the data contain considerable noise, which shows the efficacy of LSTM in training on and predicting from noisy field data. This may open up the possibility of collecting data from the bridge with far fewer sensors and predicting the strains in the other members through an LSTM network.
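As a rough illustration of the single-member-to-other-members setup, the sketch below only prepares training pairs for a recurrent model; it does not implement the LSTM itself. The window length, the array layout (time steps by members), and the function name `make_windows` are assumptions for illustration, and the strain values are synthetic rather than field data.

```python
import numpy as np


def make_windows(strains, source_member, window):
    """Build (input, target) pairs for a sequence model.

    strains: array of shape (time_steps, members).
    Returns inputs of shape (n, window, 1), holding the source member's recent
    strain history, and targets of shape (n, members - 1), holding the strains
    of all other members at the last time step of each window.
    """
    strains = np.asarray(strains, dtype=np.float64)
    time_steps, members = strains.shape
    if not 0 <= source_member < members:
        raise ValueError("source_member is out of range")
    if window > time_steps:
        raise ValueError("window is longer than the record")

    target_members = [m for m in range(members) if m != source_member]
    inputs, targets = [], []
    for end in range(window, time_steps + 1):
        inputs.append(strains[end - window:end, source_member])
        targets.append(strains[end - 1, target_members])
    return np.stack(inputs)[..., np.newaxis], np.stack(targets)


# Synthetic stand-in for a strain record: 100 time steps, 4 members.
strains = np.random.default_rng(0).normal(size=(100, 4))
inputs, targets = make_windows(strains, source_member=0, window=20)
```

The resulting `inputs` array has the (samples, time steps, features) layout that common LSTM implementations accept, so it could be passed to such a model with `targets` as the regression labels.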
Abstract: The current work addresses the problem of predicting the popularity of images before they are even uploaded. The method focuses specifically on Flickr images. Social features of each image, as well as those of the user who uploaded it, have been recorded. The dataset also includes the engagement score of each image, which is the ground-truth value of the views obtained by the image over a period of 30 days. The work aims to predict the popularity of images on Flickr over a period of 30 days using the social features of the user and the image, as well as the visual features of the image. The method assumes that the engagement sequence of an image is determined by two independent quantities, namely its scale and its shape. Once the shape and scale have been predicted, they are combined to obtain the predicted engagement sequence of the image over 30 days. The current work follows a previous work in the same direction, with certain speculations and suggestions for improvement.
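The combination of shape and scale can be sketched as follows. The abstract does not state how the two quantities are combined, so this example assumes that the shape is a 30-day curve normalized to end at 1.0 and that the scale is the final engagement value, making the sequence their product. The function name and the example values are hypothetical.

```python
def combine_shape_and_scale(shape, scale):
    """Rescale a normalized shape curve by a predicted scale.

    Assumption: sequence[d] = scale * shape[d], with shape[-1] == 1.0.
    """
    if not shape:
        raise ValueError("shape must contain at least one day")
    return [scale * value for value in shape]


# Hypothetical predictions for one image: engagement that saturates after
# ten days, and 200 views reached by day 30.
days = 30
shape = [min(1.0, (day + 1) / 10) for day in range(days)]
sequence = combine_shape_and_scale(shape, scale=200.0)
```

Under this assumption the two predictors can be trained separately, one for how large the engagement becomes and one for how it evolves over time, which matches the abstract's description of the quantities as independent.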