Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiaoxiao Sun

CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models

Mar 06, 2025

Shengzhuang Chen, Yikai Liao, Xiaoxiao Sun, Kede Ma, Ying Wei

Figure 1 for CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models

Figure 2 for CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models

Figure 3 for CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models

Figure 4 for CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models

Abstract:The advent of the foundation model era has sparked significant research interest in leveraging pre-trained representations for continual learning (CL), yielding a series of top-performing CL methods on standard evaluation benchmarks. Nonetheless, there are growing concerns regarding potential data contamination during the pre-training stage. Furthermore, standard evaluation benchmarks, which are typically static, fail to capture the complexities of real-world CL scenarios, resulting in saturated performance. To address these issues, we describe CL on dynamic benchmarks (CLDyB), a general computational framework based on Markov decision processes for evaluating CL methods reliably. CLDyB dynamically identifies inherently difficult and algorithm-dependent tasks for the given CL methods, and determines challenging task orders using Monte Carlo tree search. Leveraging CLDyB, we first conduct a joint evaluation of multiple state-of-the-art CL methods, leading to a set of commonly challenging and generalizable task sequences where existing CL methods tend to perform poorly. We then conduct separate evaluations of individual CL methods using CLDyB, discovering their respective strengths and weaknesses. The source code and generated task sequences are publicly accessible at https://github.com/szc12153/CLDyB.

Via

Access Paper or Ask Questions

Physiologically-Informed Predictability of a Teammate's Future Actions Forecasts Team Performance

Jan 25, 2025

Yinuo Qin, Richard T. Lee, Weijia Zhang, Xiaoxiao Sun, Paul Sajda

Figure 1 for Physiologically-Informed Predictability of a Teammate's Future Actions Forecasts Team Performance

Figure 2 for Physiologically-Informed Predictability of a Teammate's Future Actions Forecasts Team Performance

Figure 3 for Physiologically-Informed Predictability of a Teammate's Future Actions Forecasts Team Performance

Figure 4 for Physiologically-Informed Predictability of a Teammate's Future Actions Forecasts Team Performance

Abstract:In collaborative environments, a deep understanding of multi-human teaming dynamics is essential for optimizing performance. However, the relationship between individuals' behavioral and physiological markers and their combined influence on overall team performance remains poorly understood. To explore this, we designed a triadic human collaborative sensorimotor task in virtual reality (VR) and introduced a novel predictability metric to examine team dynamics and performance. Our findings reveal a strong connection between team performance and the predictability of a team member's future actions based on other team members' behavioral and physiological data. Contrary to conventional wisdom that high-performing teams are highly synchronized, our results suggest that physiological and behavioral synchronizations among team members have a limited correlation with team performance. These insights provide a new quantitative framework for understanding multi-human teaming, paving the way for deeper insights into team dynamics and performance.

Via

Access Paper or Ask Questions

Rendering-Refined Stable Diffusion for Privacy Compliant Synthetic Data

Dec 09, 2024

Kartik Patwari, David Schneider, Xiaoxiao Sun, Chen-Nee Chuah, Lingjuan Lyu, Vivek Sharma

Abstract:Growing privacy concerns and regulations like GDPR and CCPA necessitate pseudonymization techniques that protect identity in image datasets. However, retaining utility is also essential. Traditional methods like masking and blurring degrade quality and obscure critical context, especially in human-centric images. We introduce Rendering-Refined Stable Diffusion (RefSD), a pipeline that combines 3D-rendering with Stable Diffusion, enabling prompt-based control over human attributes while preserving posture. Unlike standard diffusion models that fail to retain posture or GANs that lack realism and flexible attribute control, RefSD balances posture preservation, realism, and customization. We also propose HumanGenAI, a framework for human perception and utility evaluation. Human perception assessments reveal attribute-specific strengths and weaknesses of RefSD. Our utility experiments show that models trained on RefSD pseudonymized data outperform those trained on real data in detection tasks, with further performance gains when combining RefSD with real data. For classification tasks, we consistently observe performance improvements when using RefSD data with real data, confirming the utility of our pseudonymized data.

Via

Access Paper or Ask Questions

EEG-estimated functional connectivity, and not behavior, differentiates Parkinson's patients from health controls during the Simon conflict task

Oct 09, 2024

Xiaoxiao Sun, Chongkun Zhao, Sharath Koorathota, Paul Sajda

Figure 1 for EEG-estimated functional connectivity, and not behavior, differentiates Parkinson's patients from health controls during the Simon conflict task

Figure 2 for EEG-estimated functional connectivity, and not behavior, differentiates Parkinson's patients from health controls during the Simon conflict task

Figure 3 for EEG-estimated functional connectivity, and not behavior, differentiates Parkinson's patients from health controls during the Simon conflict task

Figure 4 for EEG-estimated functional connectivity, and not behavior, differentiates Parkinson's patients from health controls during the Simon conflict task

Abstract:Neural biomarkers that can classify or predict disease are of broad interest to the neurological and psychiatric communities. Such biomarkers can be informative of disease state or treatment efficacy, even before there are changes in symptoms and/or behavior. This work investigates EEG-estimated functional connectivity (FC) as a Parkinson's Disease (PD) biomarker. Specifically, we investigate FC mediated via neural oscillations and consider such activity during the Simons conflict task. This task yields sensory-motor conflict, and one might expect differences in behavior between PD patients and healthy controls (HCs). In addition to considering spatially focused approaches, such as FC, as a biomarker, we also consider temporal biomarkers, which are more sensitive to ongoing changes in neural activity. We find that FC, estimated from delta (1-4Hz) and theta (4-7Hz) oscillations, yields spatial FC patterns significantly better at distinguishing PD from HC than temporal features or behavior. This study reinforces that FC in spectral bands is informative of differences in brain-wide processes and can serve as a biomarker distinguishing normal brain function from that seen in disease.

* This work is accepted at IEEE EMBC 2024. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications standards/publications/rights/index.html for more information

Via

Access Paper or Ask Questions

CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis

Oct 09, 2023

Xiaoxiao Sun, Xingjian Leng, Zijian Wang, Yang Yang, Zi Huang, Liang Zheng

Abstract:Analyzing model performance in various unseen environments is a critical research problem in the machine learning community. To study this problem, it is important to construct a testbed with out-of-distribution test sets that have broad coverage of environmental discrepancies. However, existing testbeds typically either have a small number of domains or are synthesized by image corruptions, hindering algorithm design that demonstrates real-world effectiveness. In this paper, we introduce CIFAR-10-Warehouse, consisting of 180 datasets collected by prompting image search engines and diffusion models in various ways. Generally sized between 300 and 8,000 images, the datasets contain natural images, cartoons, certain colors, or objects that do not naturally appear. With CIFAR-10-W, we aim to enhance the evaluation and deepen the understanding of two generalization tasks: domain generalization and model accuracy prediction in various out-of-distribution environments. We conduct extensive benchmarking and comparison experiments and show that CIFAR-10-W offers new and interesting insights inherent to these tasks. We also discuss other fields that would benefit from CIFAR-10-W.

* 9 pages, 5 figures, 3 tables

Via

Access Paper or Ask Questions

Alice Benchmarks: Connecting Real World Object Re-Identification with the Synthetic

Oct 06, 2023

Xiaoxiao Sun, Yue Yao, Shengjin Wang, Hongdong Li, Liang Zheng

Abstract:For object re-identification (re-ID), learning from synthetic data has become a promising strategy to cheaply acquire large-scale annotated datasets and effective models, with few privacy concerns. Many interesting research problems arise from this strategy, e.g., how to reduce the domain gap between synthetic source and real-world target. To facilitate developing more new approaches in learning from synthetic data, we introduce the Alice benchmarks, large-scale datasets providing benchmarks as well as evaluation protocols to the research community. Within the Alice benchmarks, two object re-ID tasks are offered: person and vehicle re-ID. We collected and annotated two challenging real-world target datasets: AlicePerson and AliceVehicle, captured under various illuminations, image resolutions, etc. As an important feature of our real target, the clusterability of its training set is not manually guaranteed to make it closer to a real domain adaptation test scenario. Correspondingly, we reuse existing PersonX and VehicleX as synthetic source domains. The primary goal is to train models from synthetic data that can work effectively in the real world. In this paper, we detail the settings of Alice benchmarks, provide an analysis of existing commonly-used domain adaptation methods, and discuss some interesting future directions. An online server will be set up for the community to evaluate methods conveniently and fairly.

* 9 pages, 4 figures, 4 tables

Via

Access Paper or Ask Questions

Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception?

Sep 22, 2023

Xiaoxiao Sun, Nidham Gazagnadou, Vivek Sharma, Lingjuan Lyu, Hongdong Li, Liang Zheng

Figure 1 for Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception?

Figure 2 for Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception?

Figure 3 for Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception?

Figure 4 for Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception?

Abstract:Hand-crafted image quality metrics, such as PSNR and SSIM, are commonly used to evaluate model privacy risk under reconstruction attacks. Under these metrics, reconstructed images that are determined to resemble the original one generally indicate more privacy leakage. Images determined as overall dissimilar, on the other hand, indicate higher robustness against attack. However, there is no guarantee that these metrics well reflect human opinions, which, as a judgement for model privacy leakage, are more trustworthy. In this paper, we comprehensively study the faithfulness of these hand-crafted metrics to human perception of privacy information from the reconstructed images. On 5 datasets ranging from natural images, faces, to fine-grained classes, we use 4 existing attack methods to reconstruct images from many different classification models and, for each reconstructed image, we ask multiple human annotators to assess whether this image is recognizable. Our studies reveal that the hand-crafted metrics only have a weak correlation with the human evaluation of privacy leakage and that even these metrics themselves often contradict each other. These observations suggest risks of current metrics in the community. To address this potential risk, we propose a learning-based measure called SemSim to evaluate the Semantic Similarity between the original and reconstructed images. SemSim is trained with a standard triplet loss, using an original image as an anchor, one of its recognizable reconstructed images as a positive sample, and an unrecognizable one as a negative. By training on human annotations, SemSim exhibits a greater reflection of privacy leakage on the semantic level. We show that SemSim has a significantly higher correlation with human judgment compared with existing metrics. Moreover, this strong correlation generalizes to unseen datasets, models and attack methods.

* 15 pages, 9 figures and 3 tables

Via

Access Paper or Ask Questions

Circular Clustering with Polar Coordinate Reconstruction

Sep 15, 2023

Xiaoxiao Sun, Paul Sajda

Abstract:There is a growing interest in characterizing circular data found in biological systems. Such data are wide ranging and varied, from signal phase in neural recordings to nucleotide sequences in round genomes. Traditional clustering algorithms are often inadequate due to their limited ability to distinguish differences in the periodic component. Current clustering schemes that work in a polar coordinate system have limitations, such as being only angle-focused or lacking generality. To overcome these limitations, we propose a new analysis framework that utilizes projections onto a cylindrical coordinate system to better represent objects in a polar coordinate system. Using the mathematical properties of circular data, we show our approach always finds the correct clustering result within the reconstructed dataset, given sufficient periodic repetitions of the data. Our approach is generally applicable and adaptable and can be incorporated into most state-of-the-art clustering algorithms. We demonstrate on synthetic and real data that our method generates more appropriate and consistent clustering results compared to standard methods. In summary, our proposed analysis framework overcomes the limitations of existing polar coordinate-based clustering methods and provides a more accurate and efficient way to cluster circular data.

* Manuscript is under review in IEEE Transactions on Computational Biology and Bioinformatics. Copyright holder is credited to IEEE

Via

Access Paper or Ask Questions

Fixating on Attention: Integrating Human Eye Tracking into Vision Transformers

Aug 26, 2023

Sharath Koorathota, Nikolas Papadopoulos, Jia Li Ma, Shruti Kumar, Xiaoxiao Sun, Arunesh Mittal, Patrick Adelman, Paul Sajda

Figure 1 for Fixating on Attention: Integrating Human Eye Tracking into Vision Transformers

Figure 2 for Fixating on Attention: Integrating Human Eye Tracking into Vision Transformers

Figure 3 for Fixating on Attention: Integrating Human Eye Tracking into Vision Transformers

Figure 4 for Fixating on Attention: Integrating Human Eye Tracking into Vision Transformers

Abstract:Modern transformer-based models designed for computer vision have outperformed humans across a spectrum of visual tasks. However, critical tasks, such as medical image interpretation or autonomous driving, still require reliance on human judgments. This work demonstrates how human visual input, specifically fixations collected from an eye-tracking device, can be integrated into transformer models to improve accuracy across multiple driving situations and datasets. First, we establish the significance of fixation regions in left-right driving decisions, as observed in both human subjects and a Vision Transformer (ViT). By comparing the similarity between human fixation maps and ViT attention weights, we reveal the dynamics of overlap across individual heads and layers. This overlap is exploited for model pruning without compromising accuracy. Thereafter, we incorporate information from the driving scene with fixation data, employing a "joint space-fixation" (JSF) attention setup. Lastly, we propose a "fixation-attention intersection" (FAX) loss to train the ViT model to attend to the same regions that humans fixated on. We find that the ViT performance is improved in accuracy and number of training epochs when using JSF and FAX. These results hold significant implications for human-guided artificial intelligence.

* 25 pages, 9 figures, 3 tables

Via

Access Paper or Ask Questions

Label-Free Model Evaluation with Semi-Structured Dataset Representations

Dec 01, 2021

Xiaoxiao Sun, Yunzhong Hou, Hongdong Li, Liang Zheng

Figure 1 for Label-Free Model Evaluation with Semi-Structured Dataset Representations

Figure 2 for Label-Free Model Evaluation with Semi-Structured Dataset Representations

Figure 3 for Label-Free Model Evaluation with Semi-Structured Dataset Representations

Figure 4 for Label-Free Model Evaluation with Semi-Structured Dataset Representations

Abstract:Label-free model evaluation, or AutoEval, estimates model accuracy on unlabeled test sets, and is critical for understanding model behaviors in various unseen environments. In the absence of image labels, based on dataset representations, we estimate model performance for AutoEval with regression. On the one hand, image feature is a straightforward choice for such representations, but it hampers regression learning due to being unstructured (\ie no specific meanings for component at certain location) and of large-scale. On the other hand, previous methods adopt simple structured representations (like average confidence or average feature), but insufficient to capture the data characteristics given their limited dimensions. In this work, we take the best of both worlds and propose a new semi-structured dataset representation that is manageable for regression learning while containing rich information for AutoEval. Based on image features, we integrate distribution shapes, clusters, and representative samples for a semi-structured dataset representation. Besides the structured overall description with distribution shapes, the unstructured description with clusters and representative samples include additional fine-grained information facilitating the AutoEval task. On three existing datasets and 25 newly introduced ones, we experimentally show that the proposed representation achieves competitive results. Code and dataset are available at https://github.com/sxzrt/Semi-Structured-Dataset-Representations.

* 10 pages, 8 figures, 3 tables

Via

Access Paper or Ask Questions