Rui-Yang Ju

ToonifyGB: StyleGAN-based Gaussian Blendshapes for 3D Stylized Head Avatars

May 15, 2025

Rui-Yang Ju, Sheng-Yen Huang, Yi-Ping Hung

Abstract: The introduction of 3D Gaussian blendshapes has enabled the real-time reconstruction of animatable head avatars from monocular video. Toonify, a StyleGAN-based framework, has become widely used for facial image stylization. To extend Toonify for synthesizing diverse stylized 3D head avatars using Gaussian blendshapes, we propose an efficient two-stage framework, ToonifyGB. In Stage 1 (stylized video generation), we employ an improved StyleGAN to generate the stylized video from the input video frames, which overcomes a limitation of the standard StyleGAN, whose preprocessing requires cropping aligned faces at a fixed resolution. This process provides a more stable video, which enables Gaussian blendshapes to better capture the high-frequency details of the video frames and to efficiently generate high-quality animation in the next stage. In Stage 2 (Gaussian blendshapes synthesis), we learn a stylized neutral head model and a set of expression blendshapes from the generated video. By combining the neutral head model with expression blendshapes, ToonifyGB can efficiently render stylized avatars with arbitrary expressions. We validate the effectiveness of ToonifyGB on the benchmark dataset using two styles: Arcane and Pixar.
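The abstract does not spell out how the neutral model and the expression blendshapes are combined. As a rough sketch, assuming the conventional linear blendshape formulation (neutral plus a weighted sum of offsets), the combination could look like the following. The function name `blend` and the flat lists of numbers standing in for Gaussian attributes are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def blend(
    neutral: Sequence[float],
    blendshapes: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Linear blendshape model: neutral + sum_k w_k * (B_k - neutral).

    Each sequence is a flat stand-in for per-Gaussian attributes
    (an assumption made for illustration only).
    """
    if len(blendshapes) != len(weights):
        raise ValueError("need exactly one weight per blendshape")
    result = list(neutral)
    for shape, weight in zip(blendshapes, weights):
        for i, (b, n) in enumerate(zip(shape, neutral)):
            # Add the weighted offset of this blendshape from the neutral model.
            result[i] += weight * (b - n)
    return result


# Hypothetical toy data: three attributes, two expression blendshapes.
neutral = [0.0, 1.0, 2.0]
smile = [1.0, 1.0, 2.0]
blink = [0.0, 0.0, 2.0]
expression = blend(neutral, [smile, blink], [0.5, 1.0])
print(expression)
```

With all weights set to zero the result is the neutral model itself, which is why arbitrary expressions can be rendered by varying only the weight vector.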

FCE-YOLOv8: YOLOv8 with Feature Context Excitation Modules for Fracture Detection in Pediatric Wrist X-ray Images

Oct 01, 2024

Rui-Yang Ju, Chun-Tse Chien, Enkaer Xieerke, Jen-Shiun Chiang

Abstract: Children often suffer wrist trauma in daily life, and radiologists usually need to analyze and interpret their X-ray images before surgeons perform surgical treatment. The development of deep learning has enabled neural networks to serve as computer-assisted diagnosis (CAD) tools to help doctors and experts in medical image diagnostics. Since the You Only Look Once Version-8 (YOLOv8) model has achieved satisfactory success in object detection tasks, it has been applied to various fracture detection tasks. This work introduces four variants of the Feature Context Excitation-YOLOv8 (FCE-YOLOv8) model, each incorporating a different FCE module (i.e., Squeeze-and-Excitation (SE), Global Context (GC), Gather-Excite (GE), or Gaussian Context Transformer (GCT)) to enhance the model performance. Experimental results on the GRAZPEDWRI-DX dataset demonstrate that our proposed YOLOv8+GC-M3 model improves the mAP@50 value from 65.78% to 66.32%, outperforming the state-of-the-art (SOTA) model while reducing inference time. Furthermore, our proposed YOLOv8+SE-M3 model achieves the highest mAP@50 value of 67.07%, exceeding the SOTA performance. The implementation of this work is available at https://github.com/RuiyangJu/FCE-YOLOv8.
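To make the idea of a feature context excitation module concrete, here is a minimal pure-Python sketch of a Squeeze-and-Excitation block as it is commonly described: global average pooling per channel, a small bottleneck with ReLU, a sigmoid gate, and channel-wise rescaling. The weights `w1` and `w2`, the omission of biases, and the flattened channel layout are simplifying assumptions; this is not the code from the linked repository.

```python
import math
from typing import List

Matrix = List[List[float]]


def squeeze_excite(feature_map: Matrix, w1: Matrix, w2: Matrix) -> Matrix:
    """Apply a bias-free Squeeze-and-Excitation gate.

    feature_map: C rows, each holding the flattened spatial values of one channel.
    w1: hidden x C weights of the reduction layer (hypothetical values).
    w2: C x hidden weights of the expansion layer (hypothetical values).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(channel) / len(channel) for channel in feature_map]
    # Excitation, part 1: fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, part 2: fully connected layer followed by a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value of a channel by that channel's gate.
    return [[g * v for v in channel] for g, channel in zip(gates, feature_map)]


# Toy example: 2 channels, 2 spatial positions, 1 hidden unit.
features = [[1.0, 3.0], [2.0, 2.0]]
reduction = [[1.0, 1.0]]
expansion = [[0.0], [0.0]]  # zero weights give a gate of sigmoid(0) = 0.5
print(squeeze_excite(features, reduction, expansion))
```

The gate only rescales channels, so the output keeps the input shape and the block can be dropped between existing layers.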

* arXiv admin note: text overlap with arXiv:2407.03163

YOLOv8-ResCBAM: YOLOv8 Based on An Effective Attention Module for Pediatric Wrist Fracture Detection

Sep 27, 2024

Rui-Yang Ju, Chun-Tse Chien, Jen-Shiun Chiang

Abstract: Wrist trauma and even fractures occur frequently in daily life, particularly among children, who account for a significant proportion of fracture cases. Before performing surgery, surgeons often ask patients to undergo X-ray imaging first and prepare for the surgery based on the analysis of the X-ray images. With the development of neural networks, You Only Look Once (YOLO) series models have been widely used in fracture detection for computer-assisted diagnosis, where the YOLOv8 model has obtained satisfactory results. Applying attention modules to neural networks is an effective way to improve model performance. This paper proposes YOLOv8-ResCBAM, which incorporates the Convolutional Block Attention Module integrated with resblock (ResCBAM) into the original YOLOv8 network architecture. The experimental results on the GRAZPEDWRI-DX dataset demonstrate that the mean Average Precision calculated at an Intersection over Union threshold of 0.5 (mAP 50) of the proposed model increases from 63.6% for the original YOLOv8 model to 65.8%, achieving state-of-the-art performance. The implementation code is available at https://github.com/RuiyangJu/Fracture_Detection_Improved_YOLOv8.
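The mAP 50 metric rests on the Intersection over Union (IoU) test at a threshold of 0.5. The sketch below shows that test for axis-aligned boxes; the `(x1, y1, x2, y2)` box format and the helper names are assumptions for illustration and are not taken from the paper's code.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def matches_at_50(prediction: Box, ground_truth: Box) -> bool:
    """A prediction can count as a match at mAP 50 only if IoU >= 0.5."""
    return iou(prediction, ground_truth) >= 0.5


print(iou((0, 0, 4, 4), (0, 0, 4, 3)))            # overlap 12 over union 16
print(matches_at_50((0, 0, 2, 2), (1, 0, 3, 2)))  # overlap 2 over union 6
```

Average precision is then computed from the ranked list of matched and unmatched predictions per class, and mAP averages that value over classes.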

* Accepted by ICONIP 2024. arXiv admin note: substantial text overlap with arXiv:2402.09329

ORB-SfMLearner: ORB-Guided Self-supervised Visual Odometry with Selective Online Adaptation

Sep 18, 2024

Yanlin Jin, Rui-Yang Ju, Haojun Liu, Yuzhong Zhong

Abstract: Deep visual odometry, despite extensive research, still faces limitations in accuracy and generalizability that prevent its broader application. To address these challenges, we propose an Oriented FAST and Rotated BRIEF (ORB)-guided visual odometry with selective online adaptation named ORB-SfMLearner. We present a novel use of ORB features for learning-based ego-motion estimation, leading to more robust and accurate results. We also introduce a cross-attention mechanism to enhance the explainability of PoseNet and reveal that the driving direction of the vehicle can be explained through attention weights, marking a novel exploration in this area. To improve generalizability, our selective online adaptation allows the network to rapidly and selectively adjust to the optimal parameters across different domains. Experimental results on the KITTI and vKITTI datasets show that our method outperforms previous state-of-the-art deep visual odometry methods in terms of ego-motion accuracy and generalizability.

Efficient GANs for Document Image Binarization Based on DWT and Normalization

Jul 05, 2024

Rui-Yang Ju, KokSheik Wong, Jen-Shiun Chiang

Abstract: For the document image binarization task, generative adversarial networks (GANs) can generate images in which shadows and noise are effectively removed, which allows for text information extraction. The current state-of-the-art (SOTA) method proposes a three-stage network architecture that utilizes six GANs. Despite its excellent model performance, the SOTA network architecture requires long training and inference times. To overcome this problem, this work introduces an efficient GAN method based on the three-stage network architecture that incorporates the Discrete Wavelet Transformation and normalization to reduce the input image size, which in turn decreases both training and inference times. In addition, this work presents novel generators, discriminators, and loss functions to improve the model's performance. Experimental results show that the proposed method reduces the training time by 10% and the inference time by 26% compared to the SOTA method while maintaining the model performance at an Avg-Score of 73.79. Our implementation code is available on GitHub at https://github.com/RuiyangJu/Efficient_Document_Image_Binarization.
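To illustrate how a Discrete Wavelet Transformation can shrink the input, the following sketch computes the low-frequency (LL) subband of a one-level 2D Haar transform, which halves each image dimension, and then rescales it to the range [0, 1]. The choice of the Haar wavelet, the min-max normalization, and both function names are assumptions; the abstract does not state which wavelet or normalization the authors use.

```python
from typing import List

Image = List[List[float]]


def haar_ll(image: Image) -> Image:
    """LL subband of a one-level orthonormal 2D Haar transform.

    Each output value is the sum of a 2x2 block divided by 2, so an
    H x W image becomes an (H/2) x (W/2) image.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def min_max_normalize(band: Image) -> Image:
    """Rescale values to [0, 1] (assumed normalization, for illustration)."""
    flat = [v for row in band for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in band]
    return [[(v - lo) / (hi - lo) for v in row] for row in band]


page = [
    [0, 0, 4, 4],
    [0, 0, 4, 4],
    [8, 8, 12, 12],
    [8, 8, 12, 12],
]
low = haar_ll(page)
print(low)
print(min_max_normalize(low))
```

Because the LL band has a quarter of the pixels of the original image, any network consuming it processes correspondingly less data per image.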

Global Context Modeling in YOLOv8 for Pediatric Wrist Fracture Detection

Jul 03, 2024

Rui-Yang Ju, Chun-Tse Chien, Chia-Min Lin, Jen-Shiun Chiang

Abstract: Children often suffer wrist injuries in daily life, and in fracture cases radiologists usually need to analyze and interpret X-ray images before surgeons perform surgical treatment. The development of deep learning has enabled neural network models to work as computer-assisted diagnosis (CAD) tools to help doctors and experts in diagnosis. Since YOLOv8 models have achieved satisfactory success in object detection tasks, they have been applied to fracture detection. The Global Context (GC) block effectively models the global context in a lightweight way, and incorporating it into YOLOv8 can greatly improve the model performance. This paper proposes the YOLOv8+GC model for fracture detection, which is an improved version of the YOLOv8 model with the GC block. Experimental results demonstrate that, compared to the original YOLOv8 model, the proposed YOLOv8+GC model increases the mean average precision calculated at an intersection over union threshold of 0.5 (mAP 50) from 63.58% to 66.32% on the GRAZPEDWRI-DX dataset, achieving the state-of-the-art (SOTA) level. The implementation code for this work is available on GitHub at https://github.com/RuiyangJu/YOLOv8_Global_Context_Fracture_Detection.

FAD-SAR: A Novel Fishing Activity Detection System via Synthetic Aperture Radar Images Based on Deep Learning Method

Apr 28, 2024

Yanbing Bai, Rui-Yang Ju, Siao Li, Zihao Yang, Jinze Yu

Abstract: Illegal, unreported, and unregulated (IUU) fishing seriously affects various aspects of human life. However, current methods for detecting and monitoring IUU activities at sea have limitations. While Synthetic Aperture Radar (SAR) can complement existing vessel detection systems, extracting useful information from SAR images using traditional methods, especially for IUU fishing identification, poses challenges. This paper proposes a deep learning-based system for detecting fishing activities. We implemented this system on the xView3 dataset using six classical object detection models: Faster R-CNN, Cascade R-CNN, SSD, RetinaNet, FSAF, and FCOS. We applied improvement methods to enhance the performance of the Faster R-CNN model. Specifically, training the Faster R-CNN model using the Online Hard Example Mining (OHEM) strategy improved the Avg-F1 value from 0.212 to 0.216, representing a 1.96% improvement.
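The core idea behind Online Hard Example Mining is to compute the loss for every candidate region and then train only on the ones with the highest loss. The sketch below shows that selection step in isolation; the function names and the toy loss values are hypothetical, and details of a full detector pipeline are left out.

```python
from typing import List, Sequence


def select_hard_examples(losses: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` examples with the largest loss."""
    if keep <= 0:
        raise ValueError("keep must be positive")
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(ranked[:keep])


def ohem_loss(losses: Sequence[float], keep: int) -> float:
    """Average the loss over the selected hard examples only."""
    hard = select_hard_examples(losses, keep)
    return sum(losses[i] for i in hard) / len(hard)


# Hypothetical per-region losses from one forward pass.
region_losses = [0.1, 2.0, 0.5, 1.5]
print(select_hard_examples(region_losses, keep=2))
print(ohem_loss(region_losses, keep=2))
```

Easy examples contribute nothing to the update, so the gradient is dominated by the cases the model currently gets wrong.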

Flood Data Analysis on SpaceNet 8 Using Apache Sedona

Apr 28, 2024

Yanbing Bai, Zihao Yang, Jinze Yu, Rui-Yang Ju, Bin Yang, Erick Mas, Shunichi Koshimura

Abstract: With the escalating frequency of floods posing persistent threats to human life and property, satellite remote sensing has emerged as an indispensable tool for monitoring flood hazards. SpaceNet 8 offers a unique opportunity to leverage cutting-edge artificial intelligence technologies to assess these hazards. A significant contribution of this research is its application of Apache Sedona, an advanced platform specifically designed for the efficient and distributed processing of large-scale geospatial data. This platform aims to enhance the efficiency of error analysis, a critical aspect of improving flood damage detection accuracy. Based on Apache Sedona, we introduce a novel approach that addresses the challenges associated with inaccuracies in flood damage detection. This approach involves the retrieval of cases from historical flood events, the adaptation of these cases to current scenarios, and the revision of the model based on clustering algorithms to refine its performance. By replicating both the SpaceNet 8 baseline and its top-performing models, we carry out a comprehensive error analysis. This analysis reveals several main sources of inaccuracies. To address these issues, we employ data visual interpretation and histogram equalization techniques, resulting in significant improvements in model metrics: precision increases by 5%, F1 score by 2.6%, and IoU by 4.5%. This work highlights the importance of advanced geospatial data processing tools, such as Apache Sedona. By improving the accuracy and efficiency of flood detection, this research contributes to safeguarding public safety and strengthening infrastructure resilience in flood-prone areas, making it a valuable addition to the field of remote sensing and disaster management.
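Histogram equalization, one of the techniques named in the abstract, spreads pixel intensities across the full range using the cumulative distribution of the image. Below is a minimal sketch of the textbook version for a single-channel 8-bit image; how the authors apply it to satellite imagery is not described, so the function name and the toy image are assumptions.

```python
from typing import List

GrayImage = List[List[int]]


def equalize_histogram(image: GrayImage, levels: int = 256) -> GrayImage:
    """Textbook histogram equalization for integer pixels in [0, levels)."""
    flat = [p for row in image for p in row]
    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution of the intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:
        return [row[:] for row in image]  # constant image, nothing to stretch
    scale = (levels - 1) / (total - cdf_min)
    # Lookup table mapping each old intensity to its equalized value.
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# A low-contrast patch: all values sit within a range of 3 gray levels.
patch = [[100, 101], [102, 103]]
print(equalize_histogram(patch))
```

After equalization the four nearly identical gray levels are spread evenly between 0 and 255, which is the contrast gain that can make faint structures easier for a model to separate.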

YOLOv9 for Fracture Detection in Pediatric Wrist Trauma X-ray Images

Mar 17, 2024

Chun-Tse Chien, Rui-Yang Ju, Kuang-Yi Chou, Jen-Shiun Chiang

Abstract: The introduction of YOLOv9, the latest version of the You Only Look Once (YOLO) series, has led to its widespread adoption across various scenarios. This paper is the first to apply the YOLOv9 algorithm model to the fracture detection task as computer-assisted diagnosis (CAD) to help radiologists and surgeons interpret X-ray images. Specifically, this paper trains the model on the GRAZPEDWRI-DX dataset and extends the training set using data augmentation techniques to improve the model performance. Experimental results demonstrate that the YOLOv9 model increases the mAP 50-95 from 42.16% for the current state-of-the-art (SOTA) model to 43.73%, a relative improvement of 3.7%. The implementation code is publicly available at https://github.com/RuiyangJu/YOLOv9-Fracture-Detection.
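The 3.7% figure is a relative gain, not a difference in percentage points. A short check of the arithmetic using the two mAP 50-95 values quoted in the abstract (the function name is ours):

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


baseline_map = 42.16  # mAP 50-95 of the previous SOTA model, in percent
yolov9_map = 43.73    # mAP 50-95 reported for YOLOv9, in percent

absolute_gain = yolov9_map - baseline_map
relative_gain = relative_improvement(baseline_map, yolov9_map)
print(f"absolute gain: {absolute_gain:.2f} percentage points")
print(f"relative gain: {relative_gain:.1f}%")
```

The absolute gain is 1.57 percentage points, and dividing it by the 42.16% baseline gives the relative improvement of about 3.7%.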

YOLOv8-AM: YOLOv8 with Attention Mechanisms for Pediatric Wrist Fracture Detection

Feb 17, 2024

Chun-Tse Chien, Rui-Yang Ju, Kuang-Yi Chou, Jen-Shiun Chiang

Abstract: Wrist trauma and even fractures occur frequently in daily life, particularly among children, who account for a significant proportion of fracture cases. Before performing surgery, surgeons often ask patients to undergo X-ray imaging first and prepare for the surgery based on the radiologist's analysis. With the development of neural networks, You Only Look Once (YOLO) series models have been widely used in fracture detection as computer-assisted diagnosis (CAD). In 2023, Ultralytics presented the latest version of the YOLO models, which has been employed for detecting fractures across various parts of the body. The attention mechanism is one of the most popular methods for improving model performance. This research work proposes YOLOv8-AM, which incorporates the attention mechanism into the original YOLOv8 architecture. Specifically, we employ four attention modules, Convolutional Block Attention Module (CBAM), Global Attention Mechanism (GAM), Efficient Channel Attention (ECA), and Shuffle Attention (SA), to design the improved models and train them on the GRAZPEDWRI-DX dataset. Experimental results demonstrate that the mean Average Precision at IoU 50 (mAP 50) of the YOLOv8-AM model based on ResBlock + CBAM (ResCBAM) increases from 63.6% to 65.8%, achieving state-of-the-art (SOTA) performance. Conversely, the YOLOv8-AM model incorporating GAM obtains an mAP 50 value of 64.2%, which is not a satisfactory enhancement. Therefore, we combine ResBlock and GAM, introducing ResGAM to design another new YOLOv8-AM model, whose mAP 50 value increases to 65.0%.
