Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rui Han

Shining Yourself: High-Fidelity Ornaments Virtual Try-on with Diffusion Model

Mar 20, 2025

Yingmao Miao, Zhanpeng Huang, Rui Han, Zibin Wang, Chenhao Lin, Chao Shen

Abstract:While virtual try-on for clothes and shoes with diffusion models has gained attraction, virtual try-on for ornaments, such as bracelets, rings, earrings, and necklaces, remains largely unexplored. Due to the intricate tiny patterns and repeated geometric sub-structures in most ornaments, it is much more difficult to guarantee identity and appearance consistency under large pose and scale variances between ornaments and models. This paper proposes the task of virtual try-on for ornaments and presents a method to improve the geometric and appearance preservation of ornament virtual try-ons. Specifically, we estimate an accurate wearing mask to improve the alignments between ornaments and models in an iterative scheme alongside the denoising process. To preserve structure details, we further regularize attention layers to map the reference ornament mask to the wearing mask in an implicit way. Experimental results demonstrate that our method successfully wears ornaments from reference images onto target models, handling substantial differences in scale and pose while preserving identity and achieving realistic visual effects.

Via

Access Paper or Ask Questions

Let Curves Speak: A Continuous Glucose Monitor based Large Sensor Foundation Model for Diabetes Management

Dec 12, 2024

Junjie Luo, Abhimanyu Kumbara, Mansur Shomali, Rui Han, Anand Iyer, Ritu Agarwal, Gordon Gao

Figure 1 for Let Curves Speak: A Continuous Glucose Monitor based Large Sensor Foundation Model for Diabetes Management

Figure 2 for Let Curves Speak: A Continuous Glucose Monitor based Large Sensor Foundation Model for Diabetes Management

Figure 3 for Let Curves Speak: A Continuous Glucose Monitor based Large Sensor Foundation Model for Diabetes Management

Figure 4 for Let Curves Speak: A Continuous Glucose Monitor based Large Sensor Foundation Model for Diabetes Management

Abstract:While previous studies of AI in diabetes management focus on long-term risk, research on near-future glucose prediction remains limited but important as it enables timely diabetes self-management. Integrating AI with continuous glucose monitoring (CGM) holds promise for near-future glucose prediction. However, existing models have limitations in capturing patterns of blood glucose fluctuations and demonstrate poor generalizability. A robust approach is needed to leverage massive CGM data for near-future glucose prediction. We propose large sensor models (LSMs) to capture knowledge in CGM data by modeling patients as sequences of glucose. CGM-LSM is pretrained on 15.96 million glucose records from 592 diabetes patients for near-future glucose prediction. We evaluated CGM-LSM against state-of-the-art methods using the OhioT1DM dataset across various metrics, prediction horizons, and unseen patients. Additionally, we assessed its generalizability across factors like diabetes type, age, gender, and hour of day. CGM-LSM achieved exceptional performance, with an rMSE of 29.81 mg/dL for type 1 diabetes patients and 23.49 mg/dL for type 2 diabetes patients in a two-hour prediction horizon. For the OhioT1DM dataset, CGM-LSM achieved a one-hour rMSE of 15.64 mg/dL, halving the previous best of 31.97 mg/dL. Robustness analyses revealed consistent performance not only for unseen patients and future periods, but also across diabetes type, age, and gender. The model demonstrated adaptability to different hours of day, maintaining accuracy across periods of various activity intensity levels. CGM-LSM represents a transformative step in diabetes management by leveraging pretraining to uncover latent glucose generation patterns in sensor data. Our findings also underscore the broader potential of LSMs to drive innovation across domains involving complex sensor data.

Via

Access Paper or Ask Questions

RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose

Mar 13, 2023

Tao Jiang, Peng Lu, Li Zhang, Ningsheng Ma, Rui Han, Chengqi Lyu, Yining Li, Kai Chen

Figure 1 for RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose

Figure 2 for RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose

Figure 3 for RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose

Figure 4 for RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose

Abstract:Recent studies on 2D pose estimation have achieved excellent performance on public benchmarks, yet its application in the industrial community still suffers from heavy model parameters and high latency. In order to bridge this gap, we empirically explore key factors in pose estimation including paradigm, model architecture, training strategy, and deployment, and present a high-performance real-time multi-person pose estimation framework, RTMPose, based on MMPose. Our RTMPose-m achieves 75.8% AP on COCO with 90+ FPS on an Intel i7-11700 CPU and 430+ FPS on an NVIDIA GTX 1660 Ti GPU, and RTMPose-l achieves 67.0% AP on COCO-WholeBody with 130+ FPS. To further evaluate RTMPose's capability in critical real-time applications, we also report the performance after deploying on the mobile device. Our RTMPose-s achieves 72.2% AP on COCO with 70+ FPS on a Snapdragon 865 chip, outperforming existing open-source libraries. Code and models are released at https://github.com/open-mmlab/mmpose/tree/1.x/projects/rtmpose.

Via

Access Paper or Ask Questions

FedKNOW: Federated Continual Learning with Signature Task Knowledge Integration at Edge

Dec 04, 2022

Yaxin Luopan, Rui Han, Qinglong Zhang, Chi Harold Liu, Guoren Wang

Figure 1 for FedKNOW: Federated Continual Learning with Signature Task Knowledge Integration at Edge

Figure 2 for FedKNOW: Federated Continual Learning with Signature Task Knowledge Integration at Edge

Figure 3 for FedKNOW: Federated Continual Learning with Signature Task Knowledge Integration at Edge

Figure 4 for FedKNOW: Federated Continual Learning with Signature Task Knowledge Integration at Edge

Abstract:Deep Neural Networks (DNNs) have been ubiquitously adopted in internet of things and are becoming an integral of our daily life. When tackling the evolving learning tasks in real world, such as classifying different types of objects, DNNs face the challenge to continually retrain themselves according to the tasks on different edge devices. Federated continual learning is a promising technique that offers partial solutions but yet to overcome the following difficulties: the significant accuracy loss due to the limited on-device processing, the negative knowledge transfer caused by the limited communication of non-IID data, and the limited scalability on the tasks and edge devices. In this paper, we propose FedKNOW, an accurate and scalable federated continual learning framework, via a novel concept of signature task knowledge. FedKNOW is a client side solution that continuously extracts and integrates the knowledge of signature tasks which are highly influenced by the current task. Each client of FedKNOW is composed of a knowledge extractor, a gradient restorer and, most importantly, a gradient integrator. Upon training for a new task, the gradient integrator ensures the prevention of catastrophic forgetting and mitigation of negative knowledge transfer by effectively combining signature tasks identified from the past local tasks and other clients' current tasks through the global model. We implement FedKNOW in PyTorch and extensively evaluate it against state-of-the-art techniques using popular federated continual learning benchmarks. Extensive evaluation results on heterogeneous edge devices show that FedKNOW improves model accuracy by 63.24% without increasing model training time, reduces communication cost by 34.28%, and achieves more improvements under difficult scenarios such as large numbers of tasks or clients, and training different complex networks.

Via

Access Paper or Ask Questions

Atrial Fibrillation Detection Using Weight-Pruned, Log-Quantised Convolutional Neural Networks

Jun 14, 2022

Xiu Qi Chang, Ann Feng Chew, Benjamin Chen Ming Choong, Shuhui Wang, Rui Han, Wang He, Li Xiaolin, Rajesh C. Panicker, Deepu John

Figure 1 for Atrial Fibrillation Detection Using Weight-Pruned, Log-Quantised Convolutional Neural Networks

Figure 2 for Atrial Fibrillation Detection Using Weight-Pruned, Log-Quantised Convolutional Neural Networks

Figure 3 for Atrial Fibrillation Detection Using Weight-Pruned, Log-Quantised Convolutional Neural Networks

Figure 4 for Atrial Fibrillation Detection Using Weight-Pruned, Log-Quantised Convolutional Neural Networks

Abstract:Deep neural networks (DNN) are a promising tool in medical applications. However, the implementation of complex DNNs on battery-powered devices is challenging due to high energy costs for communication. In this work, a convolutional neural network model is developed for detecting atrial fibrillation from electrocardiogram (ECG) signals. The model demonstrates high performance despite being trained on limited, variable-length input data. Weight pruning and logarithmic quantisation are combined to introduce sparsity and reduce model size, which can be exploited for reduced data movement and lower computational complexity. The final model achieved a 91.1% model compression ratio while maintaining high model accuracy of 91.7% and less than 1% loss.

Via

Access Paper or Ask Questions

LegoDNN: Block-grained Scaling of Deep Neural Networks for Mobile Vision

Dec 18, 2021

Rui Han, Qinglong Zhang, Chi Harold Liu, Guoren Wang, Jian Tang, Lydia Y. Chen

Figure 1 for LegoDNN: Block-grained Scaling of Deep Neural Networks for Mobile Vision

Figure 2 for LegoDNN: Block-grained Scaling of Deep Neural Networks for Mobile Vision

Figure 3 for LegoDNN: Block-grained Scaling of Deep Neural Networks for Mobile Vision

Figure 4 for LegoDNN: Block-grained Scaling of Deep Neural Networks for Mobile Vision

Abstract:Deep neural networks (DNNs) have become ubiquitous techniques in mobile and embedded systems for applications such as image/object recognition and classification. The trend of executing multiple DNNs simultaneously exacerbate the existing limitations of meeting stringent latency/accuracy requirements on resource constrained mobile devices. The prior art sheds light on exploring the accuracy-resource tradeoff by scaling the model sizes in accordance to resource dynamics. However, such model scaling approaches face to imminent challenges: (i) large space exploration of model sizes, and (ii) prohibitively long training time for different model combinations. In this paper, we present LegoDNN, a lightweight, block-grained scaling solution for running multi-DNN workloads in mobile vision systems. LegoDNN guarantees short model training times by only extracting and training a small number of common blocks (e.g. 5 in VGG and 8 in ResNet) in a DNN. At run-time, LegoDNN optimally combines the descendant models of these blocks to maximize accuracy under specific resources and latency constraints, while reducing switching overhead via smart block-level scaling of the DNN. We implement LegoDNN in TensorFlow Lite and extensively evaluate it against state-of-the-art techniques (FLOP scaling, knowledge distillation and model compression) using a set of 12 popular DNN models. Evaluation results show that LegoDNN provides 1,296x to 279,936x more options in model sizes without increasing training time, thus achieving as much as 31.74% improvement in inference accuracy and 71.07% reduction in scaling energy consumptions.

* In MobiCom'21, pages 406-419, 2021. ACM
* 13 pages, 15 figures

Via

Access Paper or Ask Questions

Enhancing Robustness of On-line Learning Models on Highly Noisy Data

Mar 19, 2021

Zilong Zhao, Robert Birke, Rui Han, Bogdan Robu, Sara Bouchenak, Sonia Ben Mokhtar, Lydia Y. Chen

Figure 1 for Enhancing Robustness of On-line Learning Models on Highly Noisy Data

Figure 2 for Enhancing Robustness of On-line Learning Models on Highly Noisy Data

Figure 3 for Enhancing Robustness of On-line Learning Models on Highly Noisy Data

Figure 4 for Enhancing Robustness of On-line Learning Models on Highly Noisy Data

Abstract:Classification algorithms have been widely adopted to detect anomalies for various systems, e.g., IoT, cloud and face recognition, under the common assumption that the data source is clean, i.e., features and labels are correctly set. However, data collected from the wild can be unreliable due to careless annotations or malicious data transformation for incorrect anomaly detection. In this paper, we extend a two-layer on-line data selection framework: Robust Anomaly Detector (RAD) with a newly designed ensemble prediction where both layers contribute to the final anomaly detection decision. To adapt to the on-line nature of anomaly detection, we consider additional features of conflicting opinions of classifiers, repetitive cleaning, and oracle knowledge. We on-line learn from incoming data streams and continuously cleanse the data, so as to adapt to the increasing learning capacity from the larger accumulated data set. Moreover, we explore the concept of oracle learning that provides additional information of true labels for difficult data points. We specifically focus on three use cases, (i) detecting 10 classes of IoT attacks, (ii) predicting 4 classes of task failures of big data jobs, and (iii) recognising 100 celebrities faces. Our evaluation results show that RAD can robustly improve the accuracy of anomaly detection, to reach up to 98.95% for IoT device attacks (i.e., +7%), up to 85.03% for cloud task failures (i.e., +14%) under 40% label noise, and for its extension, it can reach up to 77.51% for face recognition (i.e., +39%) under 30% label noise. The proposed RAD and its extensions are general and can be applied to different anomaly detection algorithms.

* Published in IEEE Transactions on Dependable and Secure Computing. arXiv admin note: substantial text overlap with arXiv:1911.04383

Via

Access Paper or Ask Questions

ExpertNet: Adversarial Learning and Recovery Against Noisy Labels

Jul 13, 2020

Amirmasoud Ghiassi, Robert Birke, Rui Han, Lydia Y. Chen

Figure 1 for ExpertNet: Adversarial Learning and Recovery Against Noisy Labels

Figure 2 for ExpertNet: Adversarial Learning and Recovery Against Noisy Labels

Figure 3 for ExpertNet: Adversarial Learning and Recovery Against Noisy Labels

Figure 4 for ExpertNet: Adversarial Learning and Recovery Against Noisy Labels

Abstract:Today's available datasets in the wild, e.g., from social media and open platforms, present tremendous opportunities and challenges for deep learning, as there is a significant portion of tagged images, but often with noisy, i.e. erroneous, labels. Recent studies improve the robustness of deep models against noisy labels without the knowledge of true labels. In this paper, we advocate to derive a stronger classifier which proactively makes use of the noisy labels in addition to the original images - turning noisy labels into learning features. To such an end, we propose a novel framework, ExpertNet, composed of Amateur and Expert, which iteratively learn from each other. Amateur is a regular image classifier trained by the feedback of Expert, which imitates how human experts would correct the predicted labels from Amateur using the noise pattern learnt from the knowledge of both the noisy and ground truth labels. The trained Amateur and Expert proactively leverage the images and their noisy labels to infer image classes. Our empirical evaluations on noisy versions of CIFAR-10, CIFAR-100 and real-world data of Clothing1M show that the proposed model can achieve robust classification against a wide range of noise ratios and with as little as 20-50% training data, compared to state-of-the-art deep models that solely focus on distilling the impact of noisy labels.

Via

Access Paper or Ask Questions

RAD: On-line Anomaly Detection for Highly Unreliable Data

Nov 11, 2019

Zilong Zhao, Robert Birke, Rui Han, Bogdan Robu, Sara Bouchenak, Sonia Ben Mokhtar, Lydia Y. Chen

Figure 1 for RAD: On-line Anomaly Detection for Highly Unreliable Data

Figure 2 for RAD: On-line Anomaly Detection for Highly Unreliable Data

Figure 3 for RAD: On-line Anomaly Detection for Highly Unreliable Data

Figure 4 for RAD: On-line Anomaly Detection for Highly Unreliable Data

Abstract:Classification algorithms have been widely adopted to detect anomalies for various systems, e.g., IoT, cloud and face recognition, under the common assumption that the data source is clean, i.e., features and labels are correctly set. However, data collected from the wild can be unreliable due to careless annotations or malicious data transformation for incorrect anomaly detection. In this paper, we present a two-layer on-line learning framework for robust anomaly detection (RAD) in the presence of unreliable anomaly labels, where the first layer is to filter out the suspicious data, and the second layer detects the anomaly patterns from the remaining data. To adapt to the on-line nature of anomaly detection, we extend RAD with additional features of repetitively cleaning, conflicting opinions of classifiers, and oracle knowledge. We on-line learn from the incoming data streams and continuously cleanse the data, so as to adapt to the increasing learning capacity from the larger accumulated data set. Moreover, we explore the concept of oracle learning that provides additional information of true labels for difficult data points. We specifically focus on three use cases, (i) detecting 10 classes of IoT attacks, (ii) predicting 4 classes of task failures of big data jobs, (iii) recognising 20 celebrities faces. Our evaluation results show that RAD can robustly improve the accuracy of anomaly detection, to reach up to 98% for IoT device attacks (i.e., +11%), up to 84% for cloud task failures (i.e., +20%) under 40% noise, and up to 74% for face recognition (i.e., +28%) under 30% noisy labels. The proposed RAD is general and can be applied to different anomaly detection algorithms.

Via

Access Paper or Ask Questions

Facial Makeup Transfer Combining Illumination Transfer

Jul 08, 2019

Xin Jin, Rui Han, Ning Ning, Xiaodong Li, Xiaokun Zhang

Figure 1 for Facial Makeup Transfer Combining Illumination Transfer

Figure 2 for Facial Makeup Transfer Combining Illumination Transfer

Figure 3 for Facial Makeup Transfer Combining Illumination Transfer

Figure 4 for Facial Makeup Transfer Combining Illumination Transfer

Abstract:To meet the women appearance needs, we present a novel virtual experience approach of facial makeup transfer, developed into windows platform application software. The makeup effects could present on the user's input image in real time, with an only single reference image. The input image and reference image are divided into three layers by facial feature points landmarked: facial structure layer, facial color layer, and facial detail layer. Except for the above layers are processed by different algorithms to generate output image, we also add illumination transfer, so that the illumination effect of the reference image is automatically transferred to the input image. Our approach has the following three advantages: (1) Black or dark and white facial makeup could be effectively transferred by introducing illumination transfer; (2) Efficiently transfer facial makeup within seconds compared to those methods based on deep learning frameworks; (3) Reference images with the air-bangs could transfer makeup perfectly.

* in IEEE Access, vol. 7, pp. 80928-80936, 2019
* IEEE Access, conference short version: ISAIR2019

Via

Access Paper or Ask Questions