Yuying Li

MathClean: A Benchmark for Synthetic Mathematical Data Cleaning

Feb 26, 2025

Hao Liang, Meiyi Qiang, Yuying Li, Zefeng He, Yongzhen Guo, Zhengzhou Zhu, Wentao Zhang, Bin Cui

Abstract: With the rapid development of large language models (LLMs), the quality of training data has become crucial. Among the various types of training data, mathematical data plays a key role in enabling LLMs to acquire strong reasoning abilities. While high-quality open-source data is important, it is often insufficient for pre-training, necessitating the addition of synthetic math problems. However, synthetic math questions and answers can introduce inaccuracies, which may degrade both the training data and web data. Therefore, an effective method for cleaning synthetic math data is essential. In this paper, we propose the MathClean benchmark to evaluate the effectiveness of math data cleaning models. The MathClean benchmark consists of 2,000 correct questions and 2,000 erroneous questions, with an additional 2,000 correct and erroneous answers sourced from augmented data based on GSM8K and MATH. Moreover, we annotate error types for each question or answer, which makes it possible to assess whether models can correctly identify error categories and to guide future improvements. Finally, we present comprehensive evaluations using state-of-the-art (SOTA) models. Our results demonstrate that even strong models like GPT-o1 and DeepSeek-R1 perform poorly on this benchmark, highlighting the utility of MathClean. Our code and data are available at https://github.com/YuYingLi0/MathClean.

Seeing Is Believing: Black-Box Membership Inference Attacks Against Retrieval Augmented Generation

Jun 27, 2024

Yuying Li, Gaoyang Liu, Yang Yang, Chen Wang

Abstract: Retrieval-Augmented Generation (RAG) is a state-of-the-art technique that enhances Large Language Models (LLMs) by retrieving relevant knowledge from an external, non-parametric database. This approach aims to mitigate common LLM issues such as hallucinations and outdated knowledge. Although existing research has demonstrated security and privacy vulnerabilities within RAG systems, making them susceptible to attacks like jailbreaks and prompt injections, the security of the RAG system's external databases remains largely underexplored. In this paper, we employ Membership Inference Attacks (MIA) to determine whether a sample is part of the knowledge database of a RAG system, using only black-box API access. Our core hypothesis posits that if a sample is a member, it will exhibit significant similarity to the text generated by the RAG system. To test this, we compute the cosine similarity and the model's perplexity to establish a membership score, thereby building robust features. We then introduce two novel attack strategies: a Threshold-based Attack and a Machine Learning-based Attack, designed to accurately identify membership. In experimental validation, our methods achieve a ROC AUC of 82%.
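
The threshold-based idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words vectors stand in for a real text encoder, the perplexity feature is omitted, and the function names and the 0.8 threshold are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Toy embedding (assumption): lowercase token counts instead of a real encoder."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def threshold_attack(sample, generated, threshold=0.8):
    """Guess 'member' when the sample is similar enough to the generated text.

    `threshold` is a hypothetical value; in practice it would be tuned on
    samples whose membership is known.
    """
    score = cosine_similarity(bag_of_words(sample), bag_of_words(generated))
    return score >= threshold, score


if __name__ == "__main__":
    sample = "the patient was prescribed aspirin daily"
    generated = "the patient was prescribed aspirin daily after surgery"
    print(threshold_attack(sample, generated))
```

A learned classifier over several such features would correspond to the machine-learning variant the abstract mentions; the sketch above covers only the single-score threshold case.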

The Adversarial AI-Art: Understanding, Generation, Detection, and Benchmarking

Apr 22, 2024

Yuying Li, Zeyan Liu, Junyi Zhao, Liangqin Ren, Fengjun Li, Jiebo Luo, Bo Luo

Abstract: Generative AI models can produce high-quality images based on text prompts. The generated images often appear indistinguishable from images generated by conventional optical photography devices or created by human artists (i.e., real images). While the outstanding performance of such generative models is generally well received, security concerns arise. For instance, such image generators could be used to facilitate fraud or scam schemes, generate and spread misinformation, or produce fabricated artworks. In this paper, we present a systematic attempt at understanding and detecting AI-generated images (AI-art) in adversarial scenarios. First, we collect and share a dataset of real images and their corresponding artificial counterparts generated by four popular AI image generators. The dataset, named ARIA, contains over 140K images in five categories: artworks (painting), social media images, news photos, disaster scenes, and anime pictures. This dataset can be used as a foundation to support future research on adversarial AI-art. Next, we present a user study that employs the ARIA dataset to evaluate whether real-world users can distinguish AI-generated images from real ones, with or without reference images. In a benchmarking study, we further evaluate whether state-of-the-art open-source and commercial AI image detectors can effectively identify the images in the ARIA dataset. Finally, we present a ResNet-50 classifier and evaluate its accuracy and transferability on the ARIA dataset.

A Composite T60 Regression and Classification Approach for Speech Dereverberation

Feb 09, 2023

Yuying Li, Yuchen Liu, Donald S. Williamson

Abstract: Dereverberation is often performed directly on the reverberant audio signal, without knowledge of the acoustic environment. Reverberation time, T60, however, is an essential acoustic factor that reflects how reverberation may impact a signal. In this work, we propose to perform dereverberation while leveraging key acoustic information from the environment. More specifically, we develop a joint learning approach that uses a composite T60 module and a separate dereverberation module to simultaneously perform reverberation time estimation and dereverberation. The reverberation time module provides key features to the dereverberation module during fine-tuning. We evaluate our approach in simulated and real environments, and compare against several approaches. The results show that this composite framework improves performance in the evaluated environments.

EREL Selection using Morphological Relation

Jun 10, 2018

Yuying Li, Mehdi Faraji

Abstract: This work concentrates on Extremal Regions of Extremum Level (EREL) selection. EREL is a recently proposed feature detector aiming at detecting regions from a set of extremal regions. This is a branching problem derived from segmentation of arterial wall boundaries from Intravascular Ultrasound (IVUS) images. For each IVUS frame, a set of EREL regions is generated to describe the luminal area of the human coronary artery. Each EREL is then fitted by an ellipse to represent the luminal border. The goal is to assign the most appropriate EREL as the lumen. In this work, EREL selection is carried out in two rounds. In the first round, the pattern in a set of EREL regions is analyzed and used to generate an approximate luminal region. Then, the two-dimensional (2D) correlation coefficients are computed between this approximate region and each EREL to keep the ones most strongly correlated with it. In the second round, a compactness measure is calculated for each EREL and its fitted ellipse to guarantee that the resulting EREL is not affected by common artifacts such as bifurcations, shadows, and side branches. We evaluated the selected ERELs in terms of Hausdorff Distance (HD) and Jaccard Measure (JM) on the training and test sets of a publicly available dataset. The results show that our selection strategy outperforms the current state-of-the-art.
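
The two evaluation metrics named in this abstract, Jaccard Measure and Hausdorff Distance, can be sketched as follows. This is a generic illustration and not the paper's evaluation code: representing a region as a set of (row, column) pixel coordinates is an assumption, and the brute-force Hausdorff computation is only practical for small point sets such as region borders.

```python
import math


def jaccard_measure(region_a, region_b):
    """Intersection over union of two regions given as sets of (row, col) pixels."""
    union = region_a | region_b
    if not union:
        return 1.0  # convention (assumption): two empty regions match perfectly
    return len(region_a & region_b) / len(union)


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets (brute force)."""

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(points_a, points_b), directed(points_b, points_a))


if __name__ == "__main__":
    predicted = {(0, 0), (0, 1), (1, 0), (1, 1)}
    reference = {(0, 1), (1, 1), (0, 2), (1, 2)}
    print(jaccard_measure(predicted, reference))
    print(hausdorff_distance(predicted, reference))
```

A higher Jaccard value and a lower Hausdorff value both indicate closer agreement between a selected region and the reference annotation.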

* 6 pages, 8 figures, accepted to be published in International Conference on SMART MULTIMEDIA, 2018. The final authenticated publication is available online at https://doi.org/

Nonsmooth Frank-Wolfe using Uniform Affine Approximations

Mar 20, 2018

Edward Cheung, Yuying Li

Abstract: Frank-Wolfe (FW) methods have gained significant interest in the machine learning community due to their ability to efficiently solve large problems that admit a sparse structure (e.g., sparse vectors and low-rank matrices). However, the performance of the existing FW method hinges on the quality of the linear approximation. This typically restricts FW to smooth functions for which the approximation quality, indicated by a global curvature measure, is reasonably good. In this paper, we propose a modified FW algorithm amenable to nonsmooth functions by optimizing for approximation quality over all affine approximations given a neighborhood of interest. We analyze theoretical properties of the proposed algorithm and demonstrate that it overcomes many issues associated with existing methods in the context of nonsmooth low-rank matrix estimation.

Projection Free Rank-Drop Steps

Jul 04, 2017

Edward Cheung, Yuying Li

Abstract: The Frank-Wolfe (FW) algorithm has been widely used in solving nuclear norm constrained problems, since it does not require projections. However, FW often yields high-rank intermediate iterates, which can be very expensive in time and space for large problems. To address this issue, we propose a rank-drop method for nuclear norm constrained problems. The goal is to generate descent steps that lead to rank decreases, maintaining low-rank solutions throughout the algorithm. Moreover, the optimization problems are constrained to ensure that the rank-drop step is also feasible and can be readily incorporated into a projection-free minimization method, e.g., Frank-Wolfe. We demonstrate that by incorporating rank-drop steps into the Frank-Wolfe algorithm, the rank of the solution is greatly reduced compared to the original Frank-Wolfe or its common variants.
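
For context, the classic projection-free Frank-Wolfe iteration over a nuclear norm ball can be sketched as below. This is the standard baseline only, not the rank-drop method proposed in the paper; the function name, the 2/(k+2) step size, and the least-squares objective in the usage example are assumptions made for illustration. Each iteration adds one rank-one atom, which is why intermediate iterates can accumulate rank.

```python
import numpy as np


def frank_wolfe_nuclear(grad, shape, delta, num_iters=200):
    """Classic Frank-Wolfe for min f(X) subject to ||X||_* <= delta.

    `grad` is a callable returning the gradient of f at X (assumed smooth f).
    """
    X = np.zeros(shape)
    for k in range(num_iters):
        G = grad(X)
        # Linear minimization oracle: the minimizer of <G, S> over the nuclear
        # norm ball is -delta * u1 v1^T, with (u1, v1) the top singular pair of G.
        U, _, Vt = np.linalg.svd(G, full_matrices=False)
        S = -delta * np.outer(U[:, 0], Vt[0, :])
        gamma = 2.0 / (k + 2.0)  # standard diminishing step size
        # Convex combination keeps X feasible without any projection.
        X = (1.0 - gamma) * X + gamma * S
    return X


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = 0.5 * np.outer(rng.standard_normal(5), rng.standard_normal(4))
    X = frank_wolfe_nuclear(lambda X: X - M, M.shape, delta=5.0)
    print(np.linalg.norm(X - M), np.linalg.matrix_rank(X))
```

A full SVD is used here for brevity; only the leading singular pair is needed, so a truncated solver would be the usual choice for large problems.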
