Abstract: This paper presents a study of the perceptual acceptability of audio-video desynchronization in sports videos. The study used 45 videos generated by applying 8 audio-video offsets to 5 source contents, and 20 subjects participated. The results show that humans are more sensitive to audio-video offset errors for speech stimuli, while the complex events that occur in sports broadcasts have higher thresholds of acceptability. This suggests that audio-video synchronization requirements in broadcasting should be tuned to the content of the broadcast.
Abstract: We present a method to restore a clear image from a haze-affected image using a Wasserstein generative adversarial network. As the problem is ill-conditioned, previous methods have required a prior on natural images or multiple images of the same scene. We train a generative adversarial network to learn the probability distribution of clear images conditioned on the haze-affected images, using the Wasserstein loss function with a gradient penalty to enforce the Lipschitz constraint. The method is data-adaptive, end-to-end, and requires no further processing or tuning of parameters. We also incorporate a texture-based loss metric and the L1 loss to improve results, and show that our results are better than the current state of the art.
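As an illustration of the Wasserstein loss with a gradient penalty mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes a critic given as a plain Python callable on flat lists of floats, estimates its gradient by central finite differences instead of automatic differentiation, and omits the conditioning on the hazy image, the texture-based loss, and the L1 loss. The function names and the penalty weight `lam=10.0` are assumptions made for the example.

```python
import math
import random


def numerical_gradient(critic, x, eps=1e-5):
    """Central-difference estimate of d critic / d x at the point x."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((critic(hi) - critic(lo)) / (2 * eps))
    return grad


def gradient_penalty(critic, real, fake, lam=10.0, rng=random):
    """lam * mean((||grad critic(x_hat)||_2 - 1)^2) over random interpolates."""
    total = 0.0
    for x_real, x_fake in zip(real, fake):
        t = rng.random()  # one interpolation weight per sample pair
        x_hat = [t * r + (1 - t) * f for r, f in zip(x_real, x_fake)]
        grad = numerical_gradient(critic, x_hat)
        norm = math.sqrt(sum(g * g for g in grad))
        total += (norm - 1.0) ** 2
    return lam * total / len(real)


def critic_loss(critic, real, fake, lam=10.0, rng=random):
    """Wasserstein critic loss (to be minimized) plus the gradient penalty."""
    mean_real = sum(critic(x) for x in real) / len(real)
    mean_fake = sum(critic(x) for x in fake) / len(fake)
    return mean_fake - mean_real + gradient_penalty(critic, real, fake, lam, rng)
```

The penalty is zero when the critic's gradient has unit norm at every interpolated point, which is how it softly enforces the Lipschitz constraint. In a real training loop the gradient would come from an automatic differentiation framework instead of finite differences.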
Abstract: This paper summarizes the method used in our submission to Task 1 of the International Skin Imaging Collaboration's (ISIC) Skin Lesion Analysis Towards Melanoma Detection challenge held in 2018. We use a fully automated method to accurately segment lesion boundaries from dermoscopic images. A U-Net deep learning network is trained on publicly available data from ISIC. We introduce intensity, color, and texture enhancement operations as pre-processing steps, and morphological operations and contour identification as post-processing steps.
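The abstract does not say which morphological operations are applied, so the following is only a hedged sketch of what such a post-processing step could look like. It assumes a binary mask stored as a list of lists of 0/1 values, a 3x3 square structuring element, and a "keep the largest connected region" rule. All function names are hypothetical, and contour extraction is not shown.

```python
from collections import deque

OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _apply(mask, combine):
    """Apply `combine` (all or any) over each pixel's 3x3 neighborhood.

    Pixels outside the image are treated as background (0).
    """
    h, w = len(mask), len(mask[0])
    return [[int(combine(0 <= y + dy < h and 0 <= x + dx < w
                         and mask[y + dy][x + dx] == 1
                         for dy, dx in OFFSETS))
             for x in range(w)] for y in range(h)]


def erode(mask):
    return _apply(mask, all)


def dilate(mask):
    return _apply(mask, any)


def opening(mask):
    """Erosion followed by dilation: removes specks smaller than 3x3."""
    return dilate(erode(mask))


def largest_component(mask):
    """Keep only the largest 4-connected foreground region."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue, comp = deque([(sy, sx)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) > len(best):
                best = comp
    out = [[0] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = 1
    return out
```

A thresholded network output could be cleaned with `largest_component(opening(mask))`, after which the boundary of the remaining region would serve as the lesion contour.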