Abstract:Interest modeling in recommender systems has been a long-standing topic for improving user experience, and typical interest modeling tasks (e.g., multi-interest, long-tail interest, and long-term interest) have been investigated in many existing works. However, most of them consider only one kind of interest in isolation and neglect the interrelationships among them. In this paper, we argue that these tasks suffer from a common "interest amnesia" problem, and that a single solution can mitigate it for all of them simultaneously. We observe that long-term cues can be the cornerstone, since they reveal multi-interest and clarify long-tail interest. Inspired by this observation, we propose a novel and unified framework for the retrieval stage, "Trinity", to solve the interest amnesia problem and improve multiple interest modeling tasks. We construct a real-time clustering system that enables us to project items into enumerable clusters and calculate statistical interest histograms over these clusters. Based on these histograms, Trinity recognizes underdelivered themes and remains stable when facing emerging hot topics. Trinity is well suited to large-scale industry scenarios because of its modest computational overhead. Its derived retrievers have been deployed in the recommender system of Douyin, significantly improving user experience and retention. We believe that such practical experience can generalize well to other scenarios.
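To make the idea of interest histograms over clusters concrete, the following is a minimal sketch and not Trinity's actual implementation. It assumes each interaction has already been mapped to a cluster id, and it flags a cluster as underdelivered when its recent share falls well below its long-term share. The function names, the `min_share` and `ratio` thresholds, and the sample cluster labels are all hypothetical.

```python
from collections import Counter


def interest_histogram(cluster_ids):
    """Normalize cluster-id counts into a histogram whose values sum to 1."""
    counts = Counter(cluster_ids)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def underdelivered(long_term, recent, min_share=0.1, ratio=0.5):
    """Return clusters with a meaningful long-term share whose recent share
    is below `ratio` times that long-term share (assumed heuristic)."""
    return sorted(
        c for c, share in long_term.items()
        if share >= min_share and recent.get(c, 0.0) < ratio * share
    )


# Hypothetical user: long-term behavior spans three clusters,
# while recent behavior is dominated by one of them.
long_term = interest_histogram(["cooking"] * 5 + ["travel"] * 3 + ["chess"] * 2)
recent = interest_histogram(["cooking"] * 9 + ["travel"] * 1)
flagged = underdelivered(long_term, recent)
print(flagged)
```

Under these assumptions, clusters that the user engaged with over the long term but that have nearly vanished from recent behavior are the candidates a retriever could try to recover.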
Abstract:Anterior segment optical coherence tomography (AS-OCT) is a non-invasive imaging technique that is highly valuable for ophthalmic diagnosis. However, speckles in AS-OCT images often degrade image quality and affect clinical analysis, so removing them can greatly benefit automatic ophthalmology analysis. Unfortunately, challenges remain in deploying effective AS-OCT image denoising algorithms, including the difficulty of collecting sufficient paired training data and the requirement to preserve consistent content in medical images. To address these practical issues, we propose an unsupervised AS-OCT despeckling algorithm via a Content Preserving Diffusion Model (CPDM) with statistical knowledge. At the training stage, a Markov chain transforms clean images into white Gaussian noise by repeatedly adding random noise, and a reverse procedure removes the predicted noise. At the inference stage, we first analyze the statistical distribution of speckles and convert it into a Gaussian distribution, aiming to match the fast truncated reverse diffusion process. We then explore the posterior distribution of observed images as a fidelity term to ensure content consistency in the iterative procedure. Our experimental results show that CPDM significantly improves image quality compared to competitive methods. Furthermore, we validate the benefits of CPDM for subsequent clinical analysis, including ciliary muscle (CM) segmentation and scleral spur (SS) localization.
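The training-stage forward process described above can be illustrated with the standard closed-form noising step used in denoising diffusion models. This is a generic sketch, not the CPDM implementation: the linear noise schedule, the step count, and all names are assumptions chosen for illustration, and the reverse process and the fidelity term are not shown.

```python
import math
import random


def alpha_bar(betas, t):
    """Cumulative product of (1 - beta_s) for s = 0..t."""
    prod = 1.0
    for b in betas[: t + 1]:
        prod *= 1.0 - b
    return prod


def forward_noise(x0, betas, t, rng):
    """Sample x_t from N(sqrt(a_bar) * x0, (1 - a_bar) * I) in closed form."""
    a = alpha_bar(betas, t)
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x0]


# Hypothetical linear schedule with 1000 steps (illustrative values only).
T = 1000
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]

rng = random.Random(0)
x0 = [1.0] * 8                                 # stand-in for clean pixel values
x_early = forward_noise(x0, betas, 0, rng)     # almost unchanged
x_late = forward_noise(x0, betas, T - 1, rng)  # close to pure Gaussian noise
```

As `t` grows, `alpha_bar` shrinks toward zero, so the signal term fades and the sample approaches white Gaussian noise, which is the behavior the abstract attributes to the Markov chain.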
Abstract:Interactive Machine Learning (IML) is an iterative learning process that tightly couples a human with a machine learner, and it is widely used by researchers and practitioners to solve a wide variety of real-world application problems. Although recent years have witnessed the proliferation of IML in the field of visual analytics, most recent surveys either focus on a specific area of IML or summarize a visualization field that is too generic for IML. In this paper, we systematically review the recent literature on IML and classify it into a task-oriented taxonomy that we construct. We conclude the survey with a discussion of open challenges and research opportunities that we believe can inspire future work in IML.