Jiazhi Xia

Embed Progressive Implicit Preference in Unified Space for Deep Collaborative Filtering

May 28, 2025

Zhongjin Zhang, Yu Liang, Cong Fu, Yuxuan Zhu, Kun Wang, Yabo Ni, Anxiang Zeng, Jiazhi Xia

Abstract: Embedding-based collaborative filtering, often coupled with nearest neighbor search, is widely deployed in large-scale recommender systems for personalized content selection. Modern systems leverage multiple implicit feedback signals (e.g., clicks, add to cart, purchases) to model user preferences comprehensively. However, prevailing approaches adopt a feedback-wise modeling paradigm, which (1) fails to capture the structured progression of user engagement across different feedback types and (2) embeds feedback-specific information into disjoint spaces, making representations incommensurable, increasing system complexity, and leading to suboptimal retrieval performance. A promising alternative is Ordinal Logistic Regression (OLR), which explicitly models discrete ordered relations. However, existing OLR-based recommendation models mainly focus on explicit feedback (e.g., movie ratings) and struggle with implicit, correlated feedback, where ordering is vague and non-linear. Moreover, standard OLR lacks flexibility in handling feedback-dependent covariates, resulting in suboptimal performance in real-world systems. To address these limitations, we propose Generalized Neural Ordinal Logistic Regression (GNOLR), which encodes multiple feature-feedback dependencies into a unified, structured embedding space and enforces feedback-specific dependency learning through a nested optimization framework. Thus, GNOLR enhances predictive accuracy, captures the progression of user engagement, and simplifies the retrieval process. We establish a theoretical comparison with existing paradigms, demonstrating how GNOLR avoids disjoint spaces while maintaining effectiveness. Extensive experiments on ten real-world datasets show that GNOLR significantly outperforms state-of-the-art methods in efficiency and adaptability.
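For readers unfamiliar with the Ordinal Logistic Regression baseline the abstract refers to, the following is a minimal sketch of the standard proportional-odds model, not of GNOLR itself. The feedback levels, the score, and the threshold values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probs(score: float, thresholds: list[float]) -> list[float]:
    """Standard proportional-odds model: P(y <= k) = sigmoid(theta_k - score).

    Returns one probability per ordered level (len(thresholds) + 1 levels).
    """
    if any(a >= b for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities for each cut point, ending at 1.0 for the top level.
    cumulative = [sigmoid(t - score) for t in thresholds] + [1.0]
    probs, prev = [], 0.0
    for c in cumulative:
        probs.append(c - prev)  # P(y = k) = P(y <= k) - P(y <= k - 1)
        prev = c
    return probs


# Hypothetical ordered feedback levels and parameters (assumptions, not from the paper).
levels = ["no action", "click", "add to cart", "purchase"]
for level, p in zip(levels, ordinal_probs(score=0.5, thresholds=[-1.0, 0.5, 2.0])):
    print(f"{level}: {p:.3f}")
```

A single score and a shared set of ordered thresholds yield probabilities for every feedback level at once, which is the property that makes ordinal models attractive when feedback types follow a progression.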

Retrieval Augmented Decision-Making: A Requirements-Driven, Multi-Criteria Framework for Structured Decision Support

May 24, 2025

Hongjia Wu, Hongxin Zhang, Wei Chen, Jiazhi Xia

Abstract: Various industries have produced a large number of documents such as industrial plans, technical guidelines, and regulations that are structurally complex and content-wise fragmented. This poses significant challenges for experts and decision-makers in terms of retrieval and understanding. Although existing LLM-based Retrieval-Augmented Generation methods can provide context-related suggestions, they lack quantitative weighting and traceable reasoning paths, making it difficult to offer multi-level and transparent decision support. To address this issue, this paper proposes the RAD method, which integrates Multi-Criteria Decision Making with the semantic understanding capabilities of LLMs. The method automatically extracts key criteria from industry documents, builds a weighted hierarchical decision model, and generates structured reports under model guidance. The RAD framework introduces explicit weight assignment and reasoning chains in decision generation to ensure accuracy, completeness, and traceability. Experiments show that in various decision-making tasks, the decision reports generated by RAD significantly outperform existing methods in terms of detail, rationality, and structure, demonstrating its application value and potential in complex decision support scenarios.
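As a rough illustration of what a weighted hierarchical decision model can look like, the sketch below scores one alternative by a recursive weighted sum over a criteria tree. This is a generic multi-criteria scheme under stated assumptions, not the RAD implementation; the criteria names, weights, and ratings are hypothetical.

```python
def weighted_score(node: dict, ratings: dict[str, float]) -> float:
    """Score a criteria tree: leaves read a rating, inner nodes take the
    weight-normalized sum of their children's scores."""
    children = node.get("children", [])
    if not children:
        return ratings[node["name"]]
    total = sum(child["weight"] for child in children)
    if total <= 0:
        raise ValueError("child weights must sum to a positive value")
    return sum(
        child["weight"] / total * weighted_score(child, ratings)
        for child in children
    )


# Hypothetical criteria hierarchy and leaf ratings in [0, 1] (assumptions).
criteria = {
    "name": "overall",
    "children": [
        {"name": "cost", "weight": 0.4},
        {
            "name": "safety",
            "weight": 0.6,
            "children": [
                {"name": "compliance", "weight": 0.5},
                {"name": "risk", "weight": 0.5},
            ],
        },
    ],
}
ratings = {"cost": 0.8, "compliance": 0.6, "risk": 1.0}
print(weighted_score(criteria, ratings))
```

Because every inner score is an explicit weighted sum of named children, each final number can be traced back to the criteria that produced it.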

AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings

May 20, 2025

Yilin Ye, Junchao Huang, Xingchen Zeng, Jiazhi Xia, Wei Zeng

Abstract: Cross-modal embeddings form the foundation for multi-modal models. However, visualization methods for interpreting cross-modal embeddings have been primarily confined to traditional dimensionality reduction (DR) techniques like PCA and t-SNE. These DR methods primarily focus on feature distributions within a single modality, whilst failing to incorporate metrics (e.g., CLIPScore) across multiple modalities. This paper introduces AKRMap, a new DR technique designed to visualize cross-modal embedding metrics with enhanced accuracy by learning kernel regression of the metric landscape in the projection space. Specifically, AKRMap constructs a supervised projection network guided by a post-projection kernel regression loss, and employs adaptive generalized kernels that can be jointly optimized with the projection. This approach enables AKRMap to efficiently generate visualizations that capture complex metric distributions, while also supporting interactive features such as zoom and overlay for deeper exploration. Quantitative experiments demonstrate that AKRMap outperforms existing DR methods in generating more accurate and trustworthy visualizations. We further showcase the effectiveness of AKRMap in visualizing and comparing cross-modal embeddings for text-to-image models. Code and demo are available at https://github.com/yilinye/AKRMap.
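To make "kernel regression of the metric landscape in the projection space" concrete, here is a minimal sketch of classic Nadaraya-Watson regression with a fixed Gaussian kernel over 2D points. It is a stand-in, not AKRMap's adaptive generalized kernels or its projection network; the points, metric values, and bandwidth are hypothetical.

```python
import math


def kernel_regress(
    query: tuple[float, float],
    points: list[tuple[float, float]],
    values: list[float],
    bandwidth: float = 0.5,
) -> float:
    """Nadaraya-Watson estimate at `query`: a Gaussian-weighted average of the
    metric values attached to the projected points."""
    weights = []
    for px, py in points:
        d2 = (query[0] - px) ** 2 + (query[1] - py) ** 2
        weights.append(math.exp(-d2 / (2.0 * bandwidth**2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all points for this bandwidth")
    return sum(w * v for w, v in zip(weights, values)) / total


# Hypothetical 2D projections with a per-point metric value (assumptions).
points = [(0.0, 0.0), (1.0, 0.0)]
values = [0.2, 0.8]
print(kernel_regress((0.5, 0.0), points, values))  # midpoint: equal weights
print(kernel_regress((0.0, 0.0), points, values))  # pulled toward the nearer point
```

Evaluating this estimate on a grid of query locations gives a smooth map of the metric over the 2D projection.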

FlexPara: Flexible Neural Surface Parameterization

Apr 27, 2025

Yuming Zhao, Qijian Zhang, Junhui Hou, Jiazhi Xia, Wenping Wang, Ying He

Abstract: Surface parameterization is a fundamental geometry processing task, laying the foundations for the visual presentation of 3D assets and numerous downstream shape analysis scenarios. Conventional parameterization approaches demand high-quality mesh triangulation and are restricted to certain simple topologies unless additional surface cutting and decomposition are provided. In practice, the optimal configurations (e.g., type of parameterization domains, distribution of cutting seams, number of mapping charts) may vary drastically with different surface structures and task characteristics, thus requiring more flexible and controllable processing pipelines. To this end, this paper introduces FlexPara, an unsupervised neural optimization framework to achieve both global and multi-chart surface parameterizations by establishing point-wise mappings between 3D surface points and adaptively-deformed 2D UV coordinates. We ingeniously design and combine a series of geometrically-interpretable sub-networks, with specific functionalities of cutting, deforming, unwrapping, and wrapping, to construct a bi-directional cycle mapping framework for global parameterization without the need for manually specified cutting seams. Furthermore, we construct a multi-chart parameterization framework with adaptively-learned chart assignment. Extensive experiments demonstrate the universality, superiority, and inspiring potential of our neural surface parameterization paradigm. The code will be publicly available at https://github.com/AidenZhao/FlexPara

SketchFlex: Facilitating Spatial-Semantic Coherence in Text-to-Image Generation with Region-Based Sketches

Feb 11, 2025

Haichuan Lin, Yilin Ye, Jiazhi Xia, Wei Zeng

Abstract: Text-to-image models can generate visually appealing images from text descriptions. Efforts have been devoted to improving model controls with prompt tuning and spatial conditioning. However, our formative study highlights the challenges for non-expert users in crafting appropriate prompts and specifying fine-grained spatial conditions (e.g., depth or canny references) to generate semantically cohesive images, especially when multiple objects are involved. In response, we introduce SketchFlex, an interactive system designed to improve the flexibility of spatially conditioned image generation using rough region sketches. The system automatically infers user prompts with rational descriptions within a semantic space enriched by crowd-sourced object attributes and relationships. Additionally, SketchFlex refines users' rough sketches into canny-based shape anchors, ensuring generation quality and alignment with user intentions. Experimental results demonstrate that SketchFlex achieves more cohesive image generations than end-to-end models, while significantly reducing cognitive load and better matching user intentions compared to a region-based generation baseline.

* conference: CHI2025

LiDAR-RT: Gaussian-based Ray Tracing for Dynamic LiDAR Re-simulation

Dec 19, 2024

Chenxu Zhou, Lvchang Fu, Sida Peng, Yunzhi Yan, Zhanhua Zhang, Yong Chen, Jiazhi Xia, Xiaowei Zhou

Abstract: This paper targets the challenge of real-time LiDAR re-simulation in dynamic driving scenarios. Recent approaches utilize neural radiance fields combined with the physical modeling of LiDAR sensors to achieve high-fidelity re-simulation results. Unfortunately, these methods face limitations due to high computational demands in large-scale scenes and cannot perform real-time LiDAR rendering. To overcome these constraints, we propose LiDAR-RT, a novel framework that supports real-time, physically accurate LiDAR re-simulation for driving scenes. Our primary contribution is the development of an efficient and effective rendering pipeline, which integrates Gaussian primitives and hardware-accelerated ray tracing technology. Specifically, we model the physical properties of LiDAR sensors using Gaussian primitives with learnable parameters and incorporate scene graphs to handle scene dynamics. Building upon this scene representation, our framework first constructs a bounding volume hierarchy (BVH), then casts rays for each pixel and generates novel LiDAR views through a differentiable rendering algorithm. Importantly, our framework supports realistic rendering with flexible scene editing operations and various sensor configurations. Extensive experiments across multiple public benchmarks demonstrate that our method outperforms state-of-the-art methods in terms of rendering quality and efficiency. Our project page is at https://zju3dv.github.io/lidar-rt.

* Project page: https://zju3dv.github.io/lidar-rt

Diagnosing Ensemble Few-Shot Classifiers

Jun 09, 2022

Weikai Yang, Xi Ye, Xingxing Zhang, Lanxi Xiao, Jiazhi Xia, Zhongyuan Wang, Jun Zhu, Hanspeter Pfister, Shixia Liu

Abstract: The base learners and labeled samples (shots) in an ensemble few-shot classifier greatly affect the model performance. When the performance is not satisfactory, it is usually difficult to understand the underlying causes and make improvements. To tackle this issue, we propose a visual analysis method, FSLDiagnotor. Given a set of base learners and a collection of samples with a few shots, we consider two problems: 1) finding a subset of base learners that predict the sample collections well; and 2) replacing the low-quality shots with more representative ones to adequately represent the sample collections. We formulate both problems as sparse subset selection and develop two selection algorithms to recommend appropriate learners and shots, respectively. A matrix visualization and a scatterplot are combined to explain the recommended learners and shots in context and help users adjust them. Based on the adjustment, the algorithm updates the recommendation results for another round of improvement. Two case studies are conducted to demonstrate that FSLDiagnotor helps build a few-shot classifier efficiently and increases the accuracy by 12% and 21%, respectively.

* Accepted in IEEE TVCG

Revisiting the Modifiable Areal Unit Problem in Deep Traffic Prediction with Visual Analytics

Sep 07, 2020

Wei Zeng, Chengqiao Lin, Juncong Lin, Jincheng Jiang, Jiazhi Xia, Cagatay Turkay, Wei Chen

Abstract: Deep learning methods are being increasingly used for urban traffic prediction where spatiotemporal traffic data is aggregated into sequentially organized matrices that are then fed into convolution-based residual neural networks. However, the widely known modifiable areal unit problem within such aggregation processes can lead to perturbations in the network inputs. This issue can significantly destabilize the feature embeddings and the predictions, rendering deep networks much less useful for the experts. This paper approaches this challenge by leveraging unit visualization techniques that enable the investigation of many-to-many relationships between dynamically varied multi-scalar aggregations of urban traffic data and neural network predictions. Through regular exchanges with a domain expert, we design and develop a visual analytics solution that integrates 1) a Bivariate Map equipped with an advanced bivariate colormap to simultaneously depict input traffic and prediction errors across space, 2) a Moran's I Scatterplot that provides local indicators of spatial association analysis, and 3) a Multi-scale Attribution View that arranges non-linear dot plots in a tree layout to promote model analysis and comparison across scales. We evaluate our approach through a series of case studies involving a real-world dataset of Shenzhen taxi trips, and through interviews with domain experts. We observe that geographical scale variations have an important impact on prediction performance, and that interactive visual exploration of dynamically varying inputs and outputs benefits experts in the development of deep traffic prediction models.
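For context on the Moran's I statistic behind the scatterplot view, the sketch below computes the standard global Moran's I for values on a small set of spatial units. It illustrates the textbook formula only, not the paper's system, which works with local indicators; the four-cell layout and the values are hypothetical.

```python
def morans_i(values: list[float], weights: list[list[float]]) -> float:
    """Global Moran's I: (n / W) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,
    where d_i is the deviation from the mean and W is the sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    if w_total == 0 or denom == 0:
        raise ValueError("need non-zero weights and non-constant values")
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / w_total) * (cross / denom)


# Hypothetical layout: four cells in a row, neighbors share an edge (assumption).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1.0, 2.0, 3.0, 4.0], adjacency))  # smooth gradient: positive
print(morans_i([1.0, 4.0, 1.0, 4.0], adjacency))  # alternating: negative
```

Positive values indicate that neighboring units carry similar values, negative values indicate that neighbors tend to differ. Changing the aggregation scale changes both the units and the adjacency, so the statistic can shift with scale.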

SMAP: A Joint Dimensionality Reduction Scheme for Secure Multi-Party Visualization

Jul 30, 2020

Jiazhi Xia, Tianxiang Chen, Lei Zhang, Wei Chen, Yang Chen, Xiaolong Zhang, Cong Xie, Tobias Schreck

Abstract: Nowadays, as data becomes increasingly complex and distributed, data analyses often involve several related datasets that are stored on different servers and probably owned by different stakeholders. While there is an emerging need to provide these stakeholders with a full picture of their data under a global context, conventional visual analytical methods, such as dimensionality reduction, could compromise data privacy when multi-party datasets are fused into a single site to build point-level relationships. In this paper, we reformulate the conventional t-SNE method from the single-site mode into a secure distributed infrastructure. We present a secure multi-party scheme for joint t-SNE computation, which can minimize the risk of data leakage. Aggregated visualization can be optionally employed to avoid disclosing point-level relationships. We build a prototype system based on our method, SMAP, to support the organization, computation, and exploration of secure joint embedding. We demonstrate the effectiveness of our approach with three case studies, one of which is based on the deployment of our system in real-world applications.

* 12 pages, 10 figures. Conditionally accepted by VAST 2020

Scale-Invariant Structure Saliency Selection for Fast Image Fusion

Oct 30, 2018

Yixiong Liang, Yuan Mao, Jiazhi Xia, Yao Xiang, Jianfeng Liu

Abstract: In this paper, we present a fast yet effective method for pixel-level scale-invariant image fusion in the spatial domain based on the scale-space theory. Specifically, we propose a scale-invariant structure saliency selection scheme based on the difference-of-Gaussian (DoG) pyramid of images to build the weights or activity map. Due to the scale-invariant structure saliency selection, our method can keep both the details of small objects and the integrity of large objects in images. In addition, our method is very efficient and easy to implement because it involves no complex operations, and can therefore be used for fast fusion of high-resolution images. Experimental results demonstrate that the proposed method yields competitive or even better results compared to state-of-the-art image fusion methods in terms of both visual quality and objective evaluation metrics. Furthermore, the proposed method is very fast and can be used to fuse high-resolution images in real time. Code is available at https://github.com/yiqingmy/Fusion.
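The idea of using difference-of-Gaussian responses as an activity map can be sketched on 1D signals as follows. This is a simplified stand-in, not the paper's method: it uses plain Gaussian blurs at a few hypothetical scales instead of an image pyramid, takes the largest absolute DoG response as saliency, and copies each sample from the more salient source.

```python
import math


def gaussian_blur(signal: list[float], sigma: float) -> list[float]:
    """Blur with a normalized Gaussian kernel, clamping indices at the borders."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    norm = sum(kernel)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, kernel):
            j = min(max(i + k, 0), n - 1)
            acc += w * signal[j]
        out.append(acc / norm)
    return out


def dog_saliency(signal: list[float], sigmas=(1.0, 2.0, 4.0)) -> list[float]:
    """Per-sample saliency: largest |DoG| over adjacent scale pairs."""
    blurred = [gaussian_blur(signal, s) for s in sigmas]
    saliency = [0.0] * len(signal)
    for fine, coarse in zip(blurred, blurred[1:]):
        for i, (f, c) in enumerate(zip(fine, coarse)):
            saliency[i] = max(saliency[i], abs(f - c))
    return saliency


def fuse(a: list[float], b: list[float]) -> list[float]:
    """Pick each sample from whichever input has the higher saliency there."""
    sa, sb = dog_saliency(a), dog_saliency(b)
    return [x if s >= t else y for x, y, s, t in zip(a, b, sa, sb)]


# Hypothetical inputs: one source has structure on the left, the other on the right.
left_detail = [0.0, 1.0, 0.0, 1.0, 0.5, 0.5, 0.5, 0.5]
right_detail = [0.5, 0.5, 0.5, 0.5, 0.0, 1.0, 0.0, 1.0]
print(fuse(left_detail, right_detail))
```

Taking the maximum response across several scales is what lets both fine detail and larger structures register as salient in this sketch.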
