Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yixuan Jiang

Adaptive Spatial Transcriptomics Interpolation via Cross-modal Cross-slice Modeling

May 15, 2025

NingFeng Que, Xiaofei Wang, Jingjing Chen, Yixuan Jiang, Chao Li

Abstract:Spatial transcriptomics (ST) is a promising technique that characterizes the spatial gene profiling patterns within the tissue context. Comprehensive ST analysis depends on consecutive slices for 3D spatial insights, whereas the missing intermediate tissue sections and high costs limit the practical feasibility of generating multi-slice ST. In this paper, we propose C2-STi, the first attempt for interpolating missing ST slices at arbitrary intermediate positions between adjacent ST slices. Despite intuitive, effective ST interpolation presents significant challenges, including 1) limited continuity across heterogeneous tissue sections, 2) complex intrinsic correlation across genes, and 3) intricate cellular structures and biological semantics within each tissue section. To mitigate these challenges, in C2-STi, we design 1) a distance-aware local structural modulation module to adaptively capture cross-slice deformations and enhance positional correlations between ST slices, 2) a pyramid gene co-expression correlation module to capture multi-scale biological associations among genes, and 3) a cross-modal alignment module that integrates the ST-paired hematoxylin and eosin (H&E)-stained images to filter and align the essential cellular features across ST and H\&E images. Extensive experiments on the public dataset demonstrate our superiority over state-of-the-art approaches on both single-slice and multi-slice ST interpolation. Codes are available at https://github.com/XiaofeiWang2018/C2-STi.

* Early accepted by MICCAI 2025

Via

Access Paper or Ask Questions

Redundant feature screening method for human activity recognition based on attention purification mechanism

Mar 30, 2025

Hanyu Liu, Xiaoyang Li, Yixuan Jiang, Haotian Tang, Dongchen Wu, Yameng Guo

Abstract:In the field of sensor-based Human Activity Recognition (HAR), deep neural networks provide advanced technical support. Many studies have proven that recognition accuracy can be improved by increasing the depth or width of the network. However, for wearable devices, the balance between network performance and resource consumption is crucial. With minimum resource consumption as the basic principle, we propose a universal attention feature purification mechanism, called MSAP, which is suitable for multi-scale networks. The mechanism effectively solves the feature redundancy caused by the superposition of multi-scale features by means of inter-scale attention screening and connection method. In addition, we have designed a network correction module that integrates seamlessly between layers of individual network modules to mitigate inherent problems in deep networks. We also built an embedded deployment system that is in line with the current level of wearable technology to test the practical feasibility of the HAR model, and further prove the efficiency of the method. Extensive experiments on four public datasets show that the proposed method model effectively reduces redundant features in filtered data and provides excellent performance with little resource consumption.

* 12 pages,7 figures

Via

Access Paper or Ask Questions

CMD-HAR: Cross-Modal Disentanglement for Wearable Human Activity Recognition

Mar 27, 2025

Hanyu Liu, Siyao Li, Ying Yu, Yixuan Jiang, Hang Xiao, Jingxi Long, Haotian Tang

Abstract:Human Activity Recognition (HAR) is a fundamental technology for numerous human - centered intelligent applications. Although deep learning methods have been utilized to accelerate feature extraction, issues such as multimodal data mixing, activity heterogeneity, and complex model deployment remain largely unresolved. The aim of this paper is to address issues such as multimodal data mixing, activity heterogeneity, and complex model deployment in sensor-based human activity recognition. We propose a spatiotemporal attention modal decomposition alignment fusion strategy to tackle the problem of the mixed distribution of sensor data. Key discriminative features of activities are captured through cross-modal spatio-temporal disentangled representation, and gradient modulation is combined to alleviate data heterogeneity. In addition, a wearable deployment simulation system is constructed. We conducted experiments on a large number of public datasets, demonstrating the effectiveness of the model.

Via

Access Paper or Ask Questions

MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Oct 17, 2024

Zichen Zhu, Hao Tang, Yansi Li, Kunyao Lan, Yixuan Jiang, Hao Zhou, Yixiao Wang, Situo Zhang, Liangtai Sun, Lu Chen(+1 more)

Figure 1 for MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Figure 2 for MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Figure 3 for MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Figure 4 for MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Abstract:Current mobile assistants are limited by dependence on system APIs or struggle with complex user instructions and diverse interfaces due to restricted comprehension and decision-making abilities. To address these challenges, we propose MobA, a novel Mobile phone Agent powered by multimodal large language models that enhances comprehension and planning capabilities through a sophisticated two-level agent architecture. The high-level Global Agent (GA) is responsible for understanding user commands, tracking history memories, and planning tasks. The low-level Local Agent (LA) predicts detailed actions in the form of function calls, guided by sub-tasks and memory from the GA. Integrating a Reflection Module allows for efficient task completion and enables the system to handle previously unseen complex tasks. MobA demonstrates significant improvements in task execution efficiency and completion rate in real-life evaluations, underscoring the potential of MLLM-empowered mobile assistants.

* 27 pages, 6 figures, and 5 tables. We will release our source code in a few days

Via

Access Paper or Ask Questions

Multifaceted Improvements for Conversational Open-Domain Question Answering

Apr 01, 2022

Tingting Liang, Yixuan Jiang, Congying Xia, Ziqiang Zhao, Yuyu Yin, Philip S. Yu

Figure 1 for Multifaceted Improvements for Conversational Open-Domain Question Answering

Figure 2 for Multifaceted Improvements for Conversational Open-Domain Question Answering

Figure 3 for Multifaceted Improvements for Conversational Open-Domain Question Answering

Figure 4 for Multifaceted Improvements for Conversational Open-Domain Question Answering

Abstract:Open-domain question answering (OpenQA) is an important branch of textual QA which discovers answers for the given questions based on a large number of unstructured documents. Effectively mining correct answers from the open-domain sources still has a fair way to go. Existing OpenQA systems might suffer from the issues of question complexity and ambiguity, as well as insufficient background knowledge. Recently, conversational OpenQA is proposed to address these issues with the abundant contextual information in the conversation. Promising as it might be, there exist several fundamental limitations including the inaccurate question understanding, the coarse ranking for passage selection, and the inconsistent usage of golden passage in the training and inference phases. To alleviate these limitations, in this paper, we propose a framework with Multifaceted Improvements for Conversational open-domain Question Answering (MICQA). Specifically, MICQA has three significant advantages. First, the proposed KL-divergence based regularization is able to lead to a better question understanding for retrieval and answer reading. Second, the added post-ranker module can push more relevant passages to the top placements and be selected for reader with a two-aspect constrains. Third, the well designed curriculum learning strategy effectively narrows the gap between the golden passage settings of training and inference, and encourages the reader to find true answer without the golden passage assistance. Extensive experiments conducted on the publicly available dataset OR-QuAC demonstrate the superiority of MICQA over the state-of-the-art model in conversational OpenQA task.

Via

Access Paper or Ask Questions

Adaptive Learning Rate and Momentum for Training Deep Neural Networks

Jun 22, 2021

Zhiyong Hao, Yixuan Jiang, Huihua Yu, Hsiao-Dong Chiang

Figure 1 for Adaptive Learning Rate and Momentum for Training Deep Neural Networks

Figure 2 for Adaptive Learning Rate and Momentum for Training Deep Neural Networks

Figure 3 for Adaptive Learning Rate and Momentum for Training Deep Neural Networks

Figure 4 for Adaptive Learning Rate and Momentum for Training Deep Neural Networks

Abstract:Recent progress on deep learning relies heavily on the quality and efficiency of training algorithms. In this paper, we develop a fast training method motivated by the nonlinear Conjugate Gradient (CG) framework. We propose the Conjugate Gradient with Quadratic line-search (CGQ) method. On the one hand, a quadratic line-search determines the step size according to current loss landscape. On the other hand, the momentum factor is dynamically updated in computing the conjugate gradient parameter (like Polak-Ribiere). Theoretical results to ensure the convergence of our method in strong convex settings is developed. And experiments in image classification datasets show that our method yields faster convergence than other local solvers and has better generalization capability (test set accuracy). One major advantage of the paper method is that tedious hand tuning of hyperparameters like the learning rate and momentum is avoided.

Via

Access Paper or Ask Questions

A Survey on Dialog Management: Recent Advances and Challenges

May 05, 2020

Yinpei Dai, Huihua Yu, Yixuan Jiang, Chengguang Tang, Yongbin Li, Jian Sun

Figure 1 for A Survey on Dialog Management: Recent Advances and Challenges

Figure 2 for A Survey on Dialog Management: Recent Advances and Challenges

Figure 3 for A Survey on Dialog Management: Recent Advances and Challenges

Figure 4 for A Survey on Dialog Management: Recent Advances and Challenges

Abstract:Dialog management (DM) is a crucial component in a task-oriented dialog system. Given the dialog history, DM predicts the dialog state and decides the next action that the dialog agent should take. Recently, dialog policy learning has been widely formulated as a Reinforcement Learning (RL) problem, and more works focus on the applicability of DM. In this paper, we survey recent advances and challenges within three critical topics for DM: (1) improving model scalability to facilitate dialog system modeling in new scenarios, (2) dealing with the data scarcity problem for dialog policy learning, and (3) enhancing the training efficiency to achieve better task-completion performance . We believe that this survey can shed a light on future research in dialog management.

Via

Access Paper or Ask Questions