Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hui Guo

DCD: A Semantic Segmentation Model for Fetal Ultrasound Four-Chamber View

Jun 10, 2025

Donglian Li, Hui Guo, Minglang Chen, Huizhen Chen, Jialing Chen, Bocheng Liang, Pengchen Liang, Ying Tan

Abstract:Accurate segmentation of anatomical structures in the apical four-chamber (A4C) view of fetal echocardiography is essential for early diagnosis and prenatal evaluation of congenital heart disease (CHD). However, precise segmentation remains challenging due to ultrasound artifacts, speckle noise, anatomical variability, and boundary ambiguity across different gestational stages. To reduce the workload of sonographers and enhance segmentation accuracy, we propose DCD, an advanced deep learning-based model for automatic segmentation of key anatomical structures in the fetal A4C view. Our model incorporates a Dense Atrous Spatial Pyramid Pooling (Dense ASPP) module, enabling superior multi-scale feature extraction, and a Convolutional Block Attention Module (CBAM) to enhance adaptive feature representation. By effectively capturing both local and global contextual information, DCD achieves precise and robust segmentation, contributing to improved prenatal cardiac assessment.

Via

Access Paper or Ask Questions

A Self-Learning Multimodal Approach for Fake News Detection

Dec 08, 2024

Hao Chen, Hui Guo, Baochen Hu, Shu Hu, Jinrong Hu, Siwei Lyu, Xi Wu, Xin Wang

Figure 1 for A Self-Learning Multimodal Approach for Fake News Detection

Figure 2 for A Self-Learning Multimodal Approach for Fake News Detection

Figure 3 for A Self-Learning Multimodal Approach for Fake News Detection

Figure 4 for A Self-Learning Multimodal Approach for Fake News Detection

Abstract:The rapid growth of social media has resulted in an explosion of online news content, leading to a significant increase in the spread of misleading or false information. While machine learning techniques have been widely applied to detect fake news, the scarcity of labeled datasets remains a critical challenge. Misinformation frequently appears as paired text and images, where a news article or headline is accompanied by a related visuals. In this paper, we introduce a self-learning multimodal model for fake news classification. The model leverages contrastive learning, a robust method for feature extraction that operates without requiring labeled data, and integrates the strengths of Large Language Models (LLMs) to jointly analyze both text and image features. LLMs are excel at this task due to their ability to process diverse linguistic data drawn from extensive training corpora. Our experimental results on a public dataset demonstrate that the proposed model outperforms several state-of-the-art classification approaches, achieving over 85% accuracy, precision, recall, and F1-score. These findings highlight the model's effectiveness in tackling the challenges of multimodal fake news detection.

Via

Access Paper or Ask Questions

Learning from Noisy Labels via Conditional Distributionally Robust Optimization

Nov 26, 2024

Hui Guo, Grace Y. Yi, Boyu Wang

Abstract:While crowdsourcing has emerged as a practical solution for labeling large datasets, it presents a significant challenge in learning accurate models due to noisy labels from annotators with varying levels of expertise. Existing methods typically estimate the true label posterior, conditioned on the instance and noisy annotations, to infer true labels or adjust loss functions. These estimates, however, often overlook potential misspecification in the true label posterior, which can degrade model performances, especially in high-noise scenarios. To address this issue, we investigate learning from noisy annotations with an estimated true label posterior through the framework of conditional distributionally robust optimization (CDRO). We propose formulating the problem as minimizing the worst-case risk within a distance-based ambiguity set centered around a reference distribution. By examining the strong duality of the formulation, we derive upper bounds for the worst-case risk and develop an analytical solution for the dual robust risk for each data point. This leads to a novel robust pseudo-labeling algorithm that leverages the likelihood ratio test to construct a pseudo-empirical distribution, providing a robust reference probability distribution in CDRO. Moreover, to devise an efficient algorithm for CDRO, we derive a closed-form expression for the empirical robust risk and the optimal Lagrange multiplier of the dual problem, facilitating a principled balance between robustness and model fitting. Our experimental results on both synthetic and real-world datasets demonstrate the superiority of our method.

Via

Access Paper or Ask Questions

Uncertainty-Aware Explainable Recommendation with Large Language Models

Jan 31, 2024

Yicui Peng, Hao Chen, Chingsheng Lin, Guo Huang, Jinrong Hu, Hui Guo, Bin Kong, Shu Hu, Xi Wu, Xin Wang

Figure 1 for Uncertainty-Aware Explainable Recommendation with Large Language Models

Figure 2 for Uncertainty-Aware Explainable Recommendation with Large Language Models

Figure 3 for Uncertainty-Aware Explainable Recommendation with Large Language Models

Figure 4 for Uncertainty-Aware Explainable Recommendation with Large Language Models

Abstract:Providing explanations within the recommendation system would boost user satisfaction and foster trust, especially by elaborating on the reasons for selecting recommended items tailored to the user. The predominant approach in this domain revolves around generating text-based explanations, with a notable emphasis on applying large language models (LLMs). However, refining LLMs for explainable recommendations proves impractical due to time constraints and computing resource limitations. As an alternative, the current approach involves training the prompt rather than the LLM. In this study, we developed a model that utilizes the ID vectors of user and item inputs as prompts for GPT-2. We employed a joint training mechanism within a multi-task learning framework to optimize both the recommendation task and explanation task. This strategy enables a more effective exploration of users' interests, improving recommendation effectiveness and user satisfaction. Through the experiments, our method achieving 1.59 DIV, 0.57 USR and 0.41 FCR on the Yelp, TripAdvisor and Amazon dataset respectively, demonstrates superior performance over four SOTA methods in terms of explainability evaluation metric. In addition, we identified that the proposed model is able to ensure stable textual quality on the three public datasets.

Via

Access Paper or Ask Questions

Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding

Jan 28, 2024

Jianxiang Lu, Cong Xie, Hui Guo

Abstract:As large-scale text-to-image generation models have made remarkable progress in the field of text-to-image generation, many fine-tuning methods have been proposed. However, these models often struggle with novel objects, especially with one-shot scenarios. Our proposed method aims to address the challenges of generalizability and fidelity in an object-driven way, using only a single input image and the object-specific regions of interest. To improve generalizability and mitigate overfitting, in our paradigm, a prototypical embedding is initialized based on the object's appearance and its class, before fine-tuning the diffusion model. And during fine-tuning, we propose a class-characterizing regularization to preserve prior knowledge of object classes. To further improve fidelity, we introduce object-specific loss, which can also use to implant multiple objects. Overall, our proposed object-driven method for implanting new objects can integrate seamlessly with existing concepts as well as with high fidelity and generalization. Our method outperforms several existing works. The code will be released.

Via

Access Paper or Ask Questions

FGSI: Distant Supervision for Relation Extraction method based on Fine-Grained Semantic Information

Feb 04, 2023

Chenghong Sun, Weidong Ji, Guohui Zhou, Hui Guo, Zengxiang Yin, Yuqi Yue

Abstract:The main purpose of relation extraction is to extract the semantic relationships between tagged pairs of entities in a sentence, which plays an important role in the semantic understanding of sentences and the construction of knowledge graphs. In this paper, we propose that the key semantic information within a sentence plays a key role in the relationship extraction of entities. We propose the hypothesis that the key semantic information inside the sentence plays a key role in entity relationship extraction. And based on this hypothesis, we split the sentence into three segments according to the location of the entity from the inside of the sentence, and find the fine-grained semantic features inside the sentence through the intra-sentence attention mechanism to reduce the interference of irrelevant noise information. The proposed relational extraction model can make full use of the available positive semantic information. The experimental results show that the proposed relation extraction model improves the accuracy-recall curves and P@N values compared with existing methods, which proves the effectiveness of this model.

Via

Access Paper or Ask Questions

Open-Eye: An Open Platform to Study Human Performance on Identifying AI-Synthesized Faces

May 13, 2022

Hui Guo, Shu Hu, Xin Wang, Ming-Ching Chang, Siwei Lyu

Figure 1 for Open-Eye: An Open Platform to Study Human Performance on Identifying AI-Synthesized Faces

Figure 2 for Open-Eye: An Open Platform to Study Human Performance on Identifying AI-Synthesized Faces

Figure 3 for Open-Eye: An Open Platform to Study Human Performance on Identifying AI-Synthesized Faces

Abstract:AI-synthesized faces are visually challenging to discern from real ones. They have been used as profile images for fake social media accounts, which leads to high negative social impacts. Although progress has been made in developing automatic methods to detect AI-synthesized faces, there is no open platform to study the human performance of AI-synthesized faces detection. In this work, we develop an online platform called Open-eye to study the human performance of AI-synthesized face detection. We describe the design and workflow of the Open-eye in this paper.

* Accepted by IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR), 2022

Via

Access Paper or Ask Questions

GAN-generated Faces Detection: A Survey and New Perspectives

Feb 15, 2022

Xin Wang, Hui Guo, Shu Hu, Ming-Ching Chang, Siwei Lyu

Figure 1 for GAN-generated Faces Detection: A Survey and New Perspectives

Figure 2 for GAN-generated Faces Detection: A Survey and New Perspectives

Figure 3 for GAN-generated Faces Detection: A Survey and New Perspectives

Abstract:Generative Adversarial Networks (GAN) have led to the generation of very realistic face images, which have been used in fake social media accounts and other disinformation matters that can generate profound impacts. Therefore, the corresponding GAN-face detection techniques are under active development that can examine and expose such fake faces. In this work, we aim to provide a comprehensive review of recent progress in GAN-face detection. We focus on methods that can detect face images that are generated or synthesized from GAN models. We classify the existing detection works into four categories: (1) deep learning-based, (2) physical-based, (3) physiological-based methods, and (4) evaluation and comparison against human visual performance. For each category, we summarize the key ideas and connect them with method implementations. We also discuss open problems and suggest future research directions.

Via

Access Paper or Ask Questions

Robust Attentive Deep Neural Network for Exposing GAN-generated Faces

Sep 05, 2021

Hui Guo, Shu Hu, Xin Wang, Ming-Ching Chang, Siwei Lyu

Figure 1 for Robust Attentive Deep Neural Network for Exposing GAN-generated Faces

Figure 2 for Robust Attentive Deep Neural Network for Exposing GAN-generated Faces

Figure 3 for Robust Attentive Deep Neural Network for Exposing GAN-generated Faces

Figure 4 for Robust Attentive Deep Neural Network for Exposing GAN-generated Faces

Abstract:GAN-based techniques that generate and synthesize realistic faces have caused severe social concerns and security problems. Existing methods for detecting GAN-generated faces can perform well on limited public datasets. However, images from existing public datasets do not represent real-world scenarios well enough in terms of view variations and data distributions (where real faces largely outnumber synthetic faces). The state-of-the-art methods do not generalize well in real-world problems and lack the interpretability of detection results. Performance of existing GAN-face detection models degrades significantly when facing imbalanced data distributions. To address these shortcomings, we propose a robust, attentive, end-to-end network that can spot GAN-generated faces by analyzing their eye inconsistencies. Specifically, our model learns to identify inconsistent eye components by localizing and comparing the iris artifacts between the two eyes automatically. Our deep network addresses the imbalance learning issues by considering the AUC loss and the traditional cross-entropy loss jointly. Comprehensive evaluations of the FFHQ dataset in terms of both balanced and imbalanced scenarios demonstrate the superiority of the proposed method.

Via

Access Paper or Ask Questions

Eyes Tell All: Irregular Pupil Shapes Reveal GAN-generated Faces

Sep 01, 2021

Hui Guo, Shu Hu, Xin Wang, Ming-Ching Chang, Siwei Lyu

Figure 1 for Eyes Tell All: Irregular Pupil Shapes Reveal GAN-generated Faces

Figure 2 for Eyes Tell All: Irregular Pupil Shapes Reveal GAN-generated Faces

Figure 3 for Eyes Tell All: Irregular Pupil Shapes Reveal GAN-generated Faces

Figure 4 for Eyes Tell All: Irregular Pupil Shapes Reveal GAN-generated Faces

Abstract:Generative adversary network (GAN) generated high-realistic human faces have been used as profile images for fake social media accounts and are visually challenging to discern from real ones. In this work, we show that GAN-generated faces can be exposed via irregular pupil shapes. This phenomenon is caused by the lack of physiological constraints in the GAN models. We demonstrate that such artifacts exist widely in high-quality GAN-generated faces and further describe an automatic method to extract the pupils from two eyes and analysis their shapes for exposing the GAN-generated faces. Qualitative and quantitative evaluations of our method suggest its simplicity and effectiveness in distinguishing GAN-generated faces.

Via

Access Paper or Ask Questions