Abstract: Recent advancements in Natural Language Processing (NLP) have led to the development of NLP-based recommender systems that have shown superior performance. However, current models commonly treat items as mere IDs and adopt discriminative modeling, which limits their ability to (1) fully leverage the content information of items and the language modeling capabilities of NLP models; (2) interpret user interests to improve relevance and diversity; and (3) adapt to practical circumstances such as growing item inventories. To address these limitations, we present GPT4Rec, a novel and flexible generative framework inspired by search engines. It first generates hypothetical "search queries" given item titles in a user's history, and then retrieves items for recommendation by searching these queries. The framework overcomes previous limitations by learning both user and item embeddings in the language space. To capture user interests across different aspects and granularities, and thereby improve relevance and diversity, we propose a multi-query generation technique with beam search. The generated queries naturally serve as interpretable representations of user interests and can be searched to recommend cold-start items. With the GPT-2 language model and the BM25 search engine, our framework outperforms state-of-the-art methods by $75.7\%$ and $22.2\%$ in Recall@K on two public datasets. Experiments further reveal that multi-query generation with beam search improves both the diversity of retrieved items and the coverage of a user's multiple interests. The adaptiveness and interpretability of generated queries are discussed with qualitative case studies.
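The retrieval half of the generate-then-search idea described above can be illustrated with a minimal sketch. It assumes the queries have already been produced by a language model with beam search (here they are hard-coded stand-ins), uses a small hand-written BM25 scorer rather than any particular search engine, and merges per-query results round-robin. The class `BM25`, the function `recommend`, the parameter values `k1=1.5` and `b=0.75`, and the merging rule are all illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a sketch.
    return text.lower().split()


class BM25:
    """Tiny in-memory BM25 index over item titles (illustrative only)."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [tokenize(d) for d in docs]
        self.doc_len = [len(d) for d in self.docs]
        self.avgdl = sum(self.doc_len) / len(self.docs)
        self.tf = [Counter(d) for d in self.docs]
        df = Counter(t for d in self.docs for t in set(d))
        n = len(self.docs)
        # Smoothed IDF that stays positive for every term.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        s = 0.0
        for t in tokenize(query):
            f = self.tf[i].get(t, 0)
            if f == 0:
                continue
            norm = f + self.k1 * (1 - self.b + self.b * self.doc_len[i] / self.avgdl)
            s += self.idf[t] * f * (self.k1 + 1) / norm
        return s

    def search(self, query, top_n):
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        scored = [(s, i) for s, i in scored if s > 0]
        scored.sort(key=lambda p: (-p[0], p[1]))  # best score first, ties by index
        return [i for _, i in scored[:top_n]]


def recommend(index, queries, k):
    """Merge per-query results round-robin, skipping duplicates (assumed rule)."""
    per_query = [index.search(q, k) for q in queries]
    out, seen = [], set()
    for rank in range(k):
        for results in per_query:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                out.append(results[rank])
                if len(out) == k:
                    return out
    return out


titles = [
    "The Lord of the Rings hardcover",
    "Wireless gaming mouse",
    "Mechanical gaming keyboard",
    "Fantasy novel box set",
]
index = BM25(titles)
queries = ["gaming mouse", "fantasy novel"]  # stand-ins for generated queries
recs = recommend(index, queries, k=3)
print([titles[i] for i in recs])
```

Because each query is searched separately and the results are interleaved, items matching different interests can all reach the final list, and a newly added title becomes retrievable as soon as it is indexed, without retraining.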
Abstract: Most machine learning-based coronary artery segmentation methods represent the vascular lumen surface implicitly, by the centerline and the associated lumen radii, which makes it difficult to subsequently generate a single watertight model of the coronary artery tree. To solve this problem, we propose a modeling method that builds on the learning-based segmentation results by (1) treating mesh vertices as physical particles and using an interaction force model and a particle expansion model to generate a uniformly distributed point cloud on the implicit lumen surface; and (2) performing incremental Delaunay-based triangulation. Our method handles the complex shape of the coronary artery tree as a single piece; hence no extra stitching or intersection removal algorithm is needed to generate a watertight model. Experimental results demonstrate that our method generates high-quality mesh models that are highly consistent with the given implicit vascular lumen surface, with an average error of 0.08 mm.
Abstract: Recently, machine learning has been successfully applied to model-based left ventricle (LV) segmentation. The general framework involves two stages: LV localization followed by boundary delineation. Both are driven by supervised learning techniques. Compared with previous non-learning-based methods, these approaches have shown several advantages, including full automation and improved accuracy. However, they remain slow, on the order of several seconds, for applications that involve a large number of cases or require real-time performance. In this paper, we propose a fast LV segmentation algorithm that performs localization and boundary delineation jointly by training an explicit shape regressor with random pixel difference features. Tested on 3D cardiac computed tomography (CT) image volumes, the average running time of the proposed algorithm is 1.2 milliseconds per case. On a dataset consisting of 139 CT volumes, 5-fold cross-validation shows a segmentation error of $1.21 \pm 0.11$ millimeters for the LV endocardium and $1.23 \pm 0.11$ millimeters for the epicardium. Compared with previous work, the proposed method is more stable (lower standard deviation) without significant compromise in accuracy.
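To make "random pixel difference features" concrete, the sketch below shows one common form of such features, assumed here for illustration: each feature is the intensity difference between two pixels whose positions are fixed random offsets from landmarks of the current shape estimate. It is simplified to 2D images stored as nested lists, whereas the abstract concerns 3D CT volumes; the function names, the offset sampling scheme, and the border clamping are hypothetical choices, not details taken from the paper.

```python
import random


def make_feature_spec(num_features, num_landmarks, max_offset, seed=0):
    """Draw random (landmark, dx, dy) pairs once; reuse them for every image."""
    rng = random.Random(seed)

    def rand_point():
        return (
            rng.randrange(num_landmarks),
            rng.randint(-max_offset, max_offset),
            rng.randint(-max_offset, max_offset),
        )

    return [(rand_point(), rand_point()) for _ in range(num_features)]


def sample(image, x, y):
    """Read a pixel, clamping coordinates to the image border (assumed policy)."""
    h, w = len(image), len(image[0])
    xi = min(max(int(round(x)), 0), w - 1)
    yi = min(max(int(round(y)), 0), h - 1)
    return image[yi][xi]


def pixel_difference_features(image, shape, spec):
    """Intensity differences between pixel pairs indexed by the current shape."""
    feats = []
    for (la, dxa, dya), (lb, dxb, dyb) in spec:
        ax, ay = shape[la]
        bx, by = shape[lb]
        feats.append(sample(image, ax + dxa, ay + dya) - sample(image, bx + dxb, by + dyb))
    return feats


# Toy 5x5 image whose intensity encodes position: value = 10 * row + column.
image = [[10 * r + c for c in range(5)] for r in range(5)]
shape = [(1, 1), (3, 3)]  # current landmark estimates as (x, y)
fixed_spec = [((0, 0, 0), (1, 0, 0))]  # compare landmark 0 with landmark 1
print(pixel_difference_features(image, shape, fixed_spec))
random_spec = make_feature_spec(num_features=4, num_landmarks=2, max_offset=1)
print(pixel_difference_features(image, shape, random_spec))
```

Each feature costs only two memory reads and one subtraction, which is one plausible reason a regressor built on such features can run very quickly.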
Abstract: Recently, a successful pose estimation algorithm called Cascade Pose Regression (CPR) was proposed in the literature. Trained over Pose Index Features, CPR is a regressor ensemble similar to Boosting. In this paper, we show how CPR can be represented as a Neural Network. Specifically, we adopt a Graph Transformer Network (GTN) representation and accordingly train CPR with Back Propagation (BP), which permits global tuning. In contrast, previous CPR literature used only layer-wise training without any subsequent fine-tuning. We empirically show that global training with BP outperforms layer-wise (pre-)training. Our CPR-GTN adopts a Multi-Layer Perceptron as the regressor, which uses sparse connections to learn local image feature representations. We tested the proposed CPR-GTN on the 2D face pose estimation problem, as in previous CPR literature. We also investigated the possibility of extending CPR-GTN to 3D pose estimation through experiments on a 3D Computed Tomography dataset for heart segmentation.
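The cascade structure that the abstract compares to Boosting can be sketched as a simple inference loop: each stage computes features indexed by the current pose estimate and adds a regressed correction to it. The toy below uses a 1D signal, a single hand-set regressor repeated four times, and a hypothetical feature function; in CPR the stage regressors are learned, and in CPR-GTN they would be Multi-Layer Perceptrons tuned jointly with back propagation, none of which is reproduced here.

```python
def cascade_predict(image, init_pose, regressors, feature_fn):
    """Refine a pose estimate by applying a sequence of stage regressors."""
    pose = list(init_pose)
    for regress in regressors:
        feats = feature_fn(image, pose)  # features depend on the current pose
        delta = regress(feats)           # each stage predicts a pose correction
        pose = [p + d for p, d in zip(pose, delta)]
    return pose


def neighbor_difference(signal, pose):
    """Toy pose-indexed feature: intensity right of the pose minus left of it."""
    x = int(pose[0])
    right = signal[min(x + 1, len(signal) - 1)]
    left = signal[max(x - 1, 0)]
    return [right - left]


def step_toward_brighter(feats):
    """Hand-set stand-in for a learned stage regressor (assumption)."""
    return [0.5 * feats[0]]


signal = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0]  # single peak at index 5
stages = [step_toward_brighter] * 4
final_pose = cascade_predict(signal, [2], stages, neighbor_difference)
print(final_pose)
```

Training each stage separately corresponds to the layer-wise scheme mentioned in the abstract; treating the whole loop as one differentiable network is what would allow all stages to be tuned together.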