Yuanyuan Li

Safer Prompts: Reducing IP Risk in Visual Generative AI

May 06, 2025

Lena Reissinger, Yuanyuan Li, Anna-Carolina Haensch, Neeraj Sarna

Abstract: Visual Generative AI models have demonstrated remarkable capability in generating high-quality images from simple inputs like text prompts. However, because these models are trained on images from diverse sources, they risk memorizing and reproducing specific content, raising concerns about intellectual property (IP) infringement. Recent advances in prompt engineering offer a cost-effective way to enhance generative AI performance. In this paper, we evaluate the effectiveness of prompt engineering techniques in mitigating IP infringement risks in image generation. Our findings show that Chain of Thought Prompting and Task Instruction Prompting significantly reduce the similarity between generated images and the training data of diffusion models, thereby lowering the risk of IP infringement.

Quantifying Correlations of Machine Learning Models

Feb 06, 2025

Yuanyuan Li, Neeraj Sarna, Yang Lin

Abstract: Machine learning models are being used extensively in safety-critical applications where errors from these models could cause harm to the user. Such risks are amplified when multiple machine learning models, deployed concurrently, interact and make errors simultaneously. This paper explores three scenarios where error correlations between multiple models arise, resulting in such aggregated risks. Using real-world data, we simulate these scenarios and quantify the correlations in errors of different models. Our findings indicate that aggregated risks are substantial, particularly when models share similar algorithms, training datasets, or foundational models. Overall, we observe that correlations across models are pervasive and likely to intensify with increased reliance on foundational models and widely used public datasets, highlighting the need for effective mitigation strategies to address these challenges.
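
As an illustration of what "correlation in errors" can mean, the following is a minimal sketch that computes the Pearson correlation between the 0/1 error indicators of two classifiers on the same samples. The function name `error_correlation` and the toy labels are assumptions made for this example; the abstract does not state which correlation measure the paper uses.

```python
import numpy as np

def error_correlation(y_true, pred_a, pred_b):
    """Pearson correlation between the error indicators of two models.

    Hypothetical helper for illustration only. Returns NaN when either
    model makes no errors (or only errors), since the correlation is
    undefined for a constant indicator.
    """
    y_true = np.asarray(y_true)
    err_a = (np.asarray(pred_a) != y_true).astype(float)
    err_b = (np.asarray(pred_b) != y_true).astype(float)
    if err_a.std() == 0 or err_b.std() == 0:
        return float("nan")
    return float(np.corrcoef(err_a, err_b)[0, 1])

# Toy data (made up): model A errs on samples 2 and 5, model B on 2 and 7.
y_true = [0, 1, 1, 0, 1, 0, 1, 0]
pred_a = [0, 1, 0, 0, 1, 1, 1, 0]
pred_b = [0, 1, 0, 0, 1, 0, 1, 1]
corr = error_correlation(y_true, pred_a, pred_b)
```

In this toy case both models fail together on one of eight samples (a joint error rate of 0.125), which is twice the 0.0625 expected if their errors were independent, so the correlation is positive.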

An In-Depth Examination of Risk Assessment in Multi-Class Classification Algorithms

Dec 05, 2024

Disha Ghandwani, Neeraj Sarna, Yuanyuan Li, Yang Lin

Abstract: Advanced classification algorithms are increasingly used in safety-critical applications such as healthcare and engineering. In such applications, misclassifications made by ML algorithms can result in substantial financial or health-related losses. To better anticipate and prepare for such losses, the algorithm user seeks an estimate of the probability that the algorithm misclassifies a sample. We refer to this task as risk assessment. For a variety of models and datasets, we numerically analyze the performance of different methods in solving the risk-assessment problem. We consider two solution strategies: a) calibration techniques that calibrate the output probabilities of classification models to provide accurate probability outputs; and b) a novel approach based upon the prediction interval generation technique of conformal prediction. Our conformal prediction based approach is model and data-distribution agnostic, simple to implement, and provides reasonable results for a variety of use-cases. We compare the different methods on a broad variety of models and datasets.
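
When a classifier's top-class probability is used as a risk estimate (risk is roughly one minus the confidence), a common way to check how trustworthy those probabilities are is the expected calibration error (ECE). The sketch below is a standard equal-width-bin ECE, offered as an assumption about how calibration quality might be measured; the abstract does not say which metric the paper reports, and the function name and toy inputs are hypothetical.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each confidence to a bin index in 0..n_bins-1;
    # a confidence of exactly 1.0 lands in the last bin.
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)

# Toy example (made up): four predictions at 90% confidence, three correct.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

Here the model claims 90% confidence but is right 75% of the time, so the ECE is 0.15; a well-calibrated model would drive this gap toward zero.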

The sampling complexity of learning invertible residual neural networks

Nov 08, 2024

Yuanyuan Li, Philipp Grohs, Philipp Petersen

Abstract: Recent work has shown that determining a feedforward ReLU neural network to within high uniform accuracy from point samples suffers from the curse of dimensionality in terms of the number of samples needed. As a consequence, feedforward ReLU neural networks are of limited use for applications where guaranteed high uniform accuracy is required. We consider the question of whether the sampling complexity can be improved by restricting the specific neural network architecture. To this end, we investigate invertible residual neural networks, which are foundational architectures in deep learning and are widely employed in models that power modern generative methods. Our main result shows that the residual neural network architecture and invertibility do not help overcome the complexity barriers encountered with simpler feedforward architectures. Specifically, we demonstrate that the computational complexity of approximating invertible residual neural networks from point samples in the uniform norm suffers from the curse of dimensionality. Similar results are established for invertible convolutional residual neural networks.

Distribution-free risk assessment of regression-based machine learning algorithms

Oct 05, 2023

Sukrita Singh, Neeraj Sarna, Yuanyuan Li, Yang Li, Agni Orfanoudaki, Michael Berger

Abstract: Machine learning algorithms have grown in sophistication over the years and are increasingly deployed for real-life applications. However, when using machine learning techniques in practical settings, particularly in high-risk applications such as medicine and engineering, obtaining the failure probability of the predictive model is critical. We refer to this problem as the risk-assessment task. We focus on regression algorithms and the risk-assessment task of computing the probability of the true label lying inside an interval defined around the model's prediction. We solve the risk-assessment problem using the conformal prediction approach, which provides prediction intervals that are guaranteed to contain the true label with a given probability. Using this coverage property, we prove that our approximated failure probability is conservative in the sense that it is not lower than the true failure probability of the ML algorithm. We conduct extensive experiments to empirically study the accuracy of the proposed method for problems with and without covariate shift. Our analysis focuses on different modeling regimes, dataset sizes, and conformal prediction methodologies.
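
The coverage property mentioned in the abstract can be illustrated with standard split conformal prediction. The sketch below is not the paper's implementation: it is a generic textbook variant using absolute residuals as scores, and the synthetic data, the linear model, and the names `conformal_half_width` and `empirical_coverage` are all assumptions made for this example.

```python
import math
import numpy as np

def conformal_half_width(cal_residuals, alpha):
    """Split-conformal interval half-width from calibration residuals.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest absolute residual,
    or infinity when the calibration set is too small for the requested alpha.
    """
    scores = np.sort(np.abs(np.asarray(cal_residuals, dtype=float)))
    n = scores.size
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return float(scores[rank - 1])

def empirical_coverage(seed=0, alpha=0.1):
    """Fit a line on synthetic data, calibrate, and measure test coverage."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=3000)
    y = 2.0 * x + rng.normal(scale=0.5, size=3000)  # made-up data-generating process
    train, cal, test = slice(0, 500), slice(500, 1000), slice(1000, 3000)
    coef = np.polyfit(x[train], y[train], deg=1)    # simple stand-in for any regressor
    q = conformal_half_width(y[cal] - np.polyval(coef, x[cal]), alpha)
    covered = np.abs(y[test] - np.polyval(coef, x[test])) <= q
    return float(covered.mean())
```

With `alpha=0.1`, the interval `[prediction - q, prediction + q]` should contain the true label for roughly 90% of exchangeable test points, so the fraction falling outside it gives an estimate of the failure probability for that interval width.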

Model-agnostic network inference enhancement from noisy measurements via curriculum learning

Sep 05, 2023

Kai Wu, Yuanyuan Li, Jing Liu

Abstract: Noise is a pervasive element within real-world measurement data, significantly undermining the performance of network inference models. However, the quest for a comprehensive enhancement framework capable of bolstering noise resistance across a diverse array of network inference models has remained elusive. Here, we present an elegant and efficient framework tailored to amplify the capabilities of network inference models in the presence of noise. Leveraging curriculum learning, we mitigate the deleterious impact of noisy samples on network inference models. Our proposed framework is model-agnostic, seamlessly integrable into a plethora of model-based and model-free network inference methods. Notably, we utilize one model-based and three model-free network inference methods as the foundation. Extensive experimentation across various synthetic and real-world networks, encapsulating diverse nonlinear dynamic processes, showcases substantial performance augmentation under varied noise types, particularly thriving in scenarios enriched with clean samples. This framework's adeptness in fortifying both model-free and model-based network inference methodologies paves the avenue towards a comprehensive and unified enhancement framework, encompassing the entire spectrum of network inference models. Available Code: https://github.com/xiaoyuans/MANIE.

Discover governing differential equations from evolving systems

Jan 19, 2023

Yuanyuan Li, Kai Wu, Jing Liu

Abstract: Discovering the governing equations of evolving systems from available observations is essential and challenging. However, current methods do not capture the situation in which the underlying system dynamics change. Evolving systems change over time, and these changes vary with the system status. Thus, finding the exact change points is critical. We propose an online modeling method capable of handling samples one by one sequentially by modeling streaming data instead of processing the entire dataset. The proposed method performs well in discovering ordinary differential equations, partial differential equations (PDEs), and high-dimensional PDEs from streaming data. Measurements generated by a changed system are distributed differently from those generated before the change; hence, the difference can be identified by the proposed method. Our proposal performs well in identifying the change points and discovering governing differential equations in two evolving systems.

* 12 pages, 3 figures

Reinforcement Learning Approaches for the Orienteering Problem with Stochastic and Dynamic Release Dates

Jul 02, 2022

Yuanyuan Li, Claudia Archetti, Ivana Ljubic

Abstract: In this paper, we study a sequential decision-making problem faced by e-commerce carriers: when to send out a vehicle from the central depot to serve customer requests, and in which order to provide the service, under the assumption that the time at which parcels arrive at the depot is stochastic and dynamic. The objective is to maximize the number of parcels that can be delivered during the service hours. We propose two reinforcement learning approaches for solving this problem, one based on a policy function approximation (PFA) and the second on a value function approximation (VFA). Both methods are combined with a look-ahead strategy, in which future release dates are sampled in a Monte-Carlo fashion and a tailored batch approach is used to approximate the value of future states. Our PFA and VFA make good use of branch-and-cut-based exact methods to improve the quality of decisions. We also establish sufficient conditions for a partial characterization of the optimal policy and integrate them into PFA/VFA. In an empirical study based on 720 benchmark instances, we conduct a competitive analysis using upper bounds with perfect information and we show that PFA and VFA greatly outperform two alternative myopic approaches. Overall, PFA provides the best solutions, while VFA (which benefits from a two-stage stochastic optimization model) achieves a better tradeoff between solution quality and computing time.

Revealing the real-world CO2 emission reduction of ridesplitting and its determinants based on machine learning

Apr 02, 2022

Wenxiang Li, Yuanyuan Li, Ziyuan Pu, Long Cheng, Lei Wang, Linchuan Yang

Abstract: Ridesplitting, a form of pooled ridesourcing service, has great potential to alleviate the negative impacts of ridesourcing on the environment. However, most existing studies have only explored its theoretical environmental benefits based on optimization models and simulations. To assess these benefits in practice, this study aims to reveal the real-world emission reduction of ridesplitting and its determinants based on the observed data of ridesourcing in Chengdu, China. Integrating the trip data with the COPERT model, this study calculates the CO2 emissions of shared rides (ridesplitting) and their substituted single rides (regular ridesourcing) to estimate the CO2 emission reduction of each ridesplitting trip. The results show that not all ridesplitting trips reduce emissions from ridesourcing in the real world. The CO2 emission reduction rate of ridesplitting varies from trip to trip, averaging 43.15 g/km. Interpretable machine learning models (gradient boosting machines) are then applied to explore the relationship between the CO2 emission reduction rate of ridesplitting and its determinants. Based on the SHapley Additive exPlanations method, the overlap rate and detour rate of shared rides are identified as the most important factors that determine the CO2 emission reduction rate of ridesplitting. Increasing the overlap rate, the number of shared rides, average speed, and ride distance ratio and decreasing the detour rate, actual trip distance, and ride distance gap can increase the CO2 emission reduction rate of ridesplitting. In addition, nonlinear effects and interactions of several key factors are examined through partial dependence plots. This study provides a scientific method for the government and ridesourcing companies to better assess and optimize the environmental benefits of ridesplitting.

* 33 pages, 12 figures

Visual Semantic Reasoning for Image-Text Matching

Sep 06, 2019

Kunpeng Li, Yulun Zhang, Kai Li, Yuanyuan Li, Yun Fu

Abstract: Image-text matching has been a hot research topic bridging the vision and language areas. It remains challenging because current image representations usually lack the global semantic concepts found in the corresponding text captions. To address this issue, we propose a simple and interpretable reasoning model to generate a visual representation that captures key objects and semantic concepts of a scene. Specifically, we first build up connections between image regions and perform reasoning with Graph Convolutional Networks to generate features with semantic relationships. Then, we propose to use a gate and memory mechanism to perform global semantic reasoning on these relationship-enhanced features, select the discriminative information, and gradually generate the representation for the whole scene. Experiments validate that our method achieves a new state of the art for image-text matching on the MS-COCO and Flickr30K datasets. It outperforms the current best method by 6.8% relatively for image retrieval and 4.8% relatively for caption retrieval on MS-COCO (Recall@1 using 1K test set). On Flickr30K, our model improves image retrieval by 12.6% relatively and caption retrieval by 5.8% relatively (Recall@1). Our code is available at https://github.com/KunpengLi1994/VSRN.

* Accepted to ICCV 2019 (Oral)
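
For readers unfamiliar with the Recall@1 metric quoted in the abstract above, the following is a minimal sketch of Recall@K for retrieval. It assumes a single ground-truth candidate per query, which simplifies the actual benchmarks (MS-COCO pairs each image with several captions); the function name and the toy similarity matrix are made up for illustration and are not taken from the VSRN code.

```python
import numpy as np

def recall_at_k(similarity, gt_index, k=1):
    """Fraction of queries whose ground-truth candidate ranks in the top k.

    similarity[i, j] is the score between query i and candidate j;
    gt_index[i] is the index of the single correct candidate for query i.
    """
    similarity = np.asarray(similarity, dtype=float)
    gt_index = np.asarray(gt_index)
    topk = np.argsort(-similarity, axis=1)[:, :k]  # highest scores first
    hits = (topk == gt_index[:, None]).any(axis=1)
    return float(hits.mean())

# Toy scores (made up): query 1 ranks the wrong candidate first.
sim = [[0.9, 0.1, 0.3],
       [0.2, 0.4, 0.8],
       [0.1, 0.2, 0.7]]
r1 = recall_at_k(sim, [0, 1, 2], k=1)
r2 = recall_at_k(sim, [0, 1, 2], k=2)
```

A "relative" improvement, as reported in the abstract, is the change divided by the baseline value, `(new - old) / old`, rather than the difference in percentage points.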
