Rui Jin

ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model

Jun 11, 2025

Jialong Zuo, Yongtai Deng, Mengdan Tan, Rui Jin, Dongyue Wu, Nong Sang, Liang Pan, Changxin Gao

Abstract: In real-world scenarios, person re-identification (ReID) is expected to identify a person of interest from a descriptive query, regardless of whether the query is a single modality or a combination of multiple modalities. However, existing methods and datasets remain constrained to limited modalities, failing to meet this requirement. Therefore, we investigate a new challenging problem called Omni Multi-modal Person Re-identification (OM-ReID), which aims to achieve effective retrieval with varying multi-modal queries. To address dataset scarcity, we construct ORBench, the first high-quality multi-modal dataset comprising 1,000 unique identities across five modalities: RGB, infrared, color pencil, sketch, and textual description. This dataset is also notably diverse, for example in its painting perspectives and textual information. It could serve as an ideal platform for follow-up investigations in OM-ReID. Moreover, we propose ReID5o, a novel multi-modal learning framework for person ReID. It enables synergistic fusion and cross-modal alignment of arbitrary modality combinations in a single model, using a proposed unified encoding and multi-expert routing mechanism. Extensive experiments verify the advancement and practicality of our ORBench. A wide range of possible models have been evaluated and compared on it, and our proposed ReID5o model gives the best performance. The dataset and code will be made publicly available at https://github.com/Zplusdragon/ReID5o_ORBench.

HGS-Planner: Hierarchical Planning Framework for Active Scene Reconstruction Using 3D Gaussian Splatting

Sep 26, 2024

Zijun Xu, Rui Jin, Ke Wu, Yi Zhao, Zhiwei Zhang, Jieru Zhao, Zhongxue Gan, Wenchao Ding

Abstract: In complex missions such as search and rescue, robots must make intelligent decisions in unknown environments, relying on their ability to perceive and understand their surroundings. High-quality and real-time reconstruction enhances situational awareness and is crucial for intelligent robotics. Traditional methods often struggle with poor scene representation or are too slow for real-time use. Inspired by the efficacy of 3D Gaussian Splatting (3DGS), we propose a hierarchical planning framework for fast and high-fidelity active reconstruction. Our method evaluates completion and quality gain to adaptively guide reconstruction, integrating global and local planning for efficiency. Experiments in simulated and real-world environments show our approach outperforms existing real-time methods.

AI-based Automatic Segmentation of Prostate on Multi-modality Images: A Review

Jul 09, 2024

Rui Jin, Derun Li, Dehui Xiang, Lei Zhang, Hailing Zhou, Fei Shi, Weifang Zhu, Jing Cai, Tao Peng, Xinjian Chen

Abstract: Prostate cancer represents a major threat to health. Early detection is vital in reducing the mortality rate among prostate cancer patients. One approach involves using multi-modality (CT, MRI, US, etc.) computer-aided diagnosis (CAD) systems for the prostate region. However, prostate segmentation is challenging due to imperfections in the images and the prostate's complex tissue structure. The advent of precision medicine and a significant increase in clinical capacity have spurred the need for various data-driven tasks in the field of medical imaging. Recently, numerous machine learning and data mining tools have been integrated into various medical areas, including image segmentation. This article proposes a new classification method that differentiates supervision types, either in number or kind, during the training phase. Subsequently, we conducted a survey on artificial intelligence (AI)-based automatic prostate segmentation methods, examining the advantages and limitations of each. Additionally, we introduce variants of evaluation metrics for the verification and performance assessment of the segmentation method and summarize the current challenges. Finally, future research directions and development trends are discussed, reflecting the outcomes of our literature survey, suggesting high-precision detection and treatment of prostate cancer as a promising avenue.
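The abstract does not say which evaluation metrics the review covers. As a hedged illustration, the sketch below computes the Dice coefficient, a widely used overlap measure for segmentation masks; the function name and the tiny masks are made up for this example and are not taken from the paper.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = prostate pixel, 0 = background


def dice_coefficient(pred: Mask, truth: Mask) -> float:
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks."""
    overlap = sum(p & t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    # Convention chosen here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical 2x3 masks, for illustration only.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(predicted, reference)
print(round(score, 4))
```

Here the two masks share 2 foreground pixels out of 3 each, so the score is 2 * 2 / (3 + 3).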

GS-Planner: A Gaussian-Splatting-based Planning Framework for Active High-Fidelity Reconstruction

May 16, 2024

Rui Jin, Yuman Gao, Haojian Lu, Fei Gao

Abstract: Active reconstruction techniques enable robots to autonomously collect scene data for full coverage, relieving users of the tedious and time-consuming data capture process. However, because they are built on unsuitable scene representations, existing methods either produce unrealistic reconstruction results or cannot evaluate quality online. Thanks to recent advances in explicit radiance field technology, online active high-fidelity reconstruction has become achievable. In this paper, we propose GS-Planner, a planning framework for active high-fidelity reconstruction using 3D Gaussian Splatting. With an improvement to 3DGS that lets it recognize unobserved regions, we evaluate the reconstruction quality and completeness of the 3DGS map online to guide the robot. Then we design a sampling-based active reconstruction strategy to explore the unobserved areas and improve the geometric and textural quality of the reconstruction. To establish a complete robot active reconstruction system, we choose a quadrotor as the robotic platform for its high agility. Then we devise a safety constraint with 3DGS to generate executable trajectories for quadrotor navigation in the 3DGS map. To validate the effectiveness of our method, we conduct extensive experiments and ablation studies in highly realistic simulation scenes.

Adaptive Tracking and Perching for Quadrotor in Dynamic Scenarios

Dec 19, 2023

Yuman Gao, Jialin Ji, Qianhao Wang, Rui Jin, Yi Lin, Zhimeng Shang, Yanjun Cao, Shaojie Shen, Chao Xu, Fei Gao

Abstract: Perching on moving platforms is a promising solution to enhance the endurance and operational range of quadrotors, which could benefit the efficiency of a variety of air-ground cooperative tasks. To ensure robust perching, tracking with a steady relative state and reliable perception is a prerequisite. This paper presents an adaptive dynamic tracking and perching scheme for autonomous quadrotors to achieve tight integration with moving platforms. For reliable perception of dynamic targets, we introduce elastic visibility-aware planning to actively avoid occlusion and target loss. Additionally, we propose a flexible terminal adjustment method that adapts to changes in flight duration and the coupled terminal states, ensuring full-state synchronization with the time-varying perching surface at various angles. A relaxation strategy is developed by optimizing the tangential relative speed to address the dynamics and safety violations brought by hard boundary conditions. Moreover, we take SE(3) motion planning into account to ensure no collision between the quadrotor and the platform until the contact moment. Furthermore, we propose an efficient spatiotemporal trajectory optimization framework considering full state dynamics for tracking and perching. The proposed method is extensively tested through benchmark comparisons and ablation studies. To facilitate the application of academic research to industry and to validate the efficiency of our scheme under strictly limited computational resources, we deploy our system on a commercial drone (DJI-MAVIC3) with a full-size sport-utility vehicle (SUV). We conduct extensive real-world experiments, where the drone successfully tracks and perches at 30 km/h (8.3 m/s) on the top of the SUV, and at 3.5 m/s with a 60° incline into the trunk of the SUV.

Crossword: A Semantic Approach to Data Compression via Masking

Apr 03, 2023

Mingxiao Li, Rui Jin, Liyao Xiang, Kaiming Shen, Shuguang Cui

Abstract: Traditional methods for data compression are typically based on symbol-level statistics, with the information source modeled as a long sequence of i.i.d. random variables or a stochastic process, thus establishing the fundamental limit as entropy for lossless compression and as mutual information for lossy compression. However, real-world sources (including text, music, and speech) are often statistically ill-defined because of their close connection to human perception, and thus the model-driven approach can be quite suboptimal. This study focuses on English text and exploits its semantic aspect to enhance the compression efficiency further. The main idea stems from the crossword puzzle, observing that the hidden words can still be precisely reconstructed so long as some key letters are provided. The proposed masking-based strategy resembles this game. In a nutshell, the encoder evaluates the semantic importance of each word according to the semantic loss and then masks the minor ones, while the decoder aims to recover the masked words from the semantic context by means of the Transformer. Our experiments show that the proposed semantic approach can achieve much higher compression efficiency than traditional methods such as Huffman code and UTF-8 code, while preserving the meaning in the target text to a great extent.
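The encoder side of the masking idea can be sketched in a few lines. This is a toy illustration, not the authors' method: the importance score below is a hand-made stopword list standing in for the semantic loss described in the abstract, and the Transformer decoder that recovers masked words is not shown. All names (`STOPWORDS`, `MASK`, `importance`, `mask_minor_words`) are hypothetical.

```python
# Assumption: function words are "minor"; the paper instead scores words
# by a semantic loss, which this toy list only imitates.
STOPWORDS = {"the", "a", "an", "of", "to", "is", "on", "in", "and"}
MASK = "_"  # hypothetical one-character mask symbol


def importance(word: str) -> float:
    """Hypothetical score: 0.0 for function words, 1.0 for everything else."""
    return 0.0 if word.lower() in STOPWORDS else 1.0


def mask_minor_words(text: str, threshold: float = 0.5) -> str:
    """Replace every word whose score is below the threshold with MASK."""
    words = text.split()
    return " ".join(w if importance(w) >= threshold else MASK for w in words)


sentence = "the cat sat on the mat"
masked = mask_minor_words(sentence)
print(masked)
print(len(masked), "characters instead of", len(sentence))
```

A decoder in the spirit of the abstract would then fill each mask from context, for instance with a masked language model; that step is left out here because it depends on a trained model.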

* 6 pages, 8 figures

MixNN: A design for protecting deep learning models

Mar 28, 2022

Chao Liu, Hao Chen, Yusen Wu, Rui Jin

Abstract: In this paper, we propose a novel design, called MixNN, for protecting deep learning model structure and parameters. The layers in a deep learning model of MixNN are fully decentralized. It hides communication addresses, layer parameters and operations, and forward as well as backward message flows among non-adjacent layers using ideas from mix networks. MixNN has the following advantages: 1) an adversary cannot fully control all layers of a model, including the structure and parameters, 2) even if some layers collude, they cannot tamper with other honest layers, 3) model privacy is preserved in the training phase. We provide detailed descriptions for deployment. In one classification experiment, we compared a neural network deployed in a virtual machine with the same one using the MixNN design on AWS EC2. The result shows that our MixNN retains less than 0.001 difference in terms of classification accuracy, while the whole running time of MixNN is about 7.5 times slower than the one running on a single virtual machine.

Accelerating Federated Learning with a Global Biased Optimiser

Aug 20, 2021

Jed Mills, Jia Hu, Geyong Min, Rui Jin, Siwei Zheng, Jin Wang

Abstract: Federated Learning (FL) is a recent development in the field of machine learning that collaboratively trains models without the training data leaving client devices, in order to preserve data-privacy. In realistic settings, the total training set is distributed over clients in a highly non-Independent and Identically Distributed (non-IID) fashion, which has been shown extensively to harm FL convergence speed and final model performance. We propose a novel, generalised approach for applying adaptive optimisation techniques to FL with the Federated Global Biased Optimiser (FedGBO) algorithm. FedGBO accelerates FL by applying a set of global biased optimiser values during the local training phase of FL, which helps to reduce 'client-drift' from non-IID data, whilst also benefiting from adaptive momentum/learning-rate methods. We show that the FedGBO update with a generic optimiser can be viewed as a centralised update with biased gradients and optimiser update, and use this theoretical framework to prove the convergence of FedGBO using momentum-Stochastic Gradient Descent. We also perform extensive experiments using 4 realistic benchmark FL datasets and 3 popular adaptive optimisers to compare the performance of different adaptive-FL approaches, demonstrating that FedGBO has highly competitive performance considering its low communication and computation costs, and providing highly practical insights for the use of adaptive optimisation in FL.
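One plausible reading of "global biased optimiser values during the local training phase" can be sketched with momentum SGD on a toy problem. This is not the FedGBO algorithm as specified in the paper: the scalar model, the quadratic client losses, the constants, and in particular the server-side momentum rule are all assumptions made for illustration.

```python
# Toy setup (assumed): two clients with non-IID losses f_i(x) = 0.5 * (x - c_i)^2.
CLIENT_TARGETS = [-1.0, 3.0]  # the average loss is minimised at x = 1.0
LR, BETA, LOCAL_STEPS, ROUNDS = 0.1, 0.5, 5, 50


def local_training(x: float, global_m: float, target: float) -> float:
    """Local momentum-SGD steps with the global momentum held fixed."""
    for _ in range(LOCAL_STEPS):
        grad = x - target  # gradient of the client's quadratic loss
        x -= LR * (grad + BETA * global_m)
    return x


def run_rounds() -> float:
    x, m = 0.0, 0.0  # global model and global momentum
    for _ in range(ROUNDS):
        client_models = [local_training(x, m, c) for c in CLIENT_TARGETS]
        new_x = sum(client_models) / len(client_models)
        # Assumed server rule: treat the averaged model change per local
        # step as the new global momentum (equals BETA * m + mean gradient).
        m = (x - new_x) / (LR * LOCAL_STEPS)
        x = new_x
    return x


final_x = run_rounds()
print(final_x)
```

Because every client applies the same fixed momentum term, clients upload only their models in this sketch, which is one way to read the low communication cost mentioned in the abstract.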

Subspace-based compressive sensing algorithm for raypath separation in a shallow-water waveguide

Mar 26, 2021

Longyu Jiang, Zhe Zhang, Rui Jin, Xiao Zhou, Philippe Roux

Abstract: Compressive sensing (CS) has been applied to estimate the direction of arrival (DOA) in underwater acoustics. However, the key problem to be resolved in a multipath propagation environment is suppressing the interference between raypaths. Thus, in this paper, a subspace-based compressive sensing algorithm that formulates the statistical information of the signal subspace in a CS framework is proposed. The experimental results show that (1) the proposed algorithm enables the separation of raypaths that arrive closely at the receiver array and (2) the existing algorithms fail, especially in a low signal-to-noise ratio (SNR) environment.
