Xuyang Zhao

MR-EEGWaveNet: Multiresolutional EEGWaveNet for Seizure Detection from Long EEG Recordings

May 23, 2025

Kazi Mahmudul Hassan, Xuyang Zhao, Hidenori Sugano, Toshihisa Tanaka

Abstract:Feature engineering for generalized seizure detection models remains a significant challenge. Recently proposed models show variable performance depending on the training data and remain ineffective at accurately distinguishing artifacts from seizure data. In this study, we propose a novel end-to-end model, "Multiresolutional EEGWaveNet (MR-EEGWaveNet)," which efficiently distinguishes seizure events from background electroencephalogram (EEG) and artifacts/noise by capturing both temporal dependencies across different time frames and spatial relationships between channels. The model has three modules: convolution, feature extraction, and predictor. The convolution module extracts features through depth-wise and spatio-temporal convolution. The feature extraction module individually reduces the dimension of the features extracted from EEG segments and their sub-segments. The extracted features are then concatenated into a single vector and classified by a fully connected classifier, the predictor module. In addition, an anomaly score-based post-classification processing technique is introduced to reduce the model's false-positive rate. Experimental results are reported and analyzed for different parameter settings and two datasets, Siena (public) and Juntendo (private). The proposed MR-EEGWaveNet significantly outperformed the conventional non-multiresolution approach, improving the F1 score from 0.177 to 0.336 on Siena and from 0.327 to 0.488 on Juntendo, with precision gains of 15.9% and 20.62%, respectively.

* 26 pages, 6 figures, 12 tables
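
The multiresolution idea in the abstract above (features taken from a segment and from its sub-segments, then concatenated into one vector) can be illustrated with a small sketch. This is not the MR-EEGWaveNet architecture: the function name `multires_features`, the split counts `(1, 2, 4)`, and the mean/standard-deviation features are all assumptions standing in for the learned convolution and feature-extraction modules.

```python
import numpy as np

def multires_features(segment, n_splits=(1, 2, 4)):
    """Concatenate per-channel summary features from a segment and its sub-segments.

    segment: array of shape (channels, samples).
    n_splits: how many equal-length pieces to cut the segment into at each
              resolution (hypothetical values, not taken from the paper).
    """
    feats = []
    for k in n_splits:
        # Split along the time axis into k sub-segments at this resolution.
        for sub in np.array_split(segment, k, axis=1):
            # Placeholder features: mean and standard deviation per channel.
            # These stand in for the learned feature-extraction module.
            feats.append(sub.mean(axis=1))
            feats.append(sub.std(axis=1))
    # One flat vector, ready for a downstream classifier.
    return np.concatenate(feats)

# Example: a 4-channel segment with 256 samples per channel.
segment = np.random.default_rng(0).standard_normal((4, 256))
vector = multires_features(segment)
```

With 4 channels and split counts `(1, 2, 4)`, the vector has 7 sub-segments times 2 features times 4 channels, so 56 entries.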

Unimodular Waveform Design for Integrated Sensing and Communication MIMO System via Manifold Optimization

Apr 08, 2025

Jiangtao Wang, Xuyang Zhao, Muyu Mei, Yongchao Wang

Abstract:Integrated sensing and communication (ISAC) has been widely recognized as one of the key technologies for 6G wireless networks. In this paper, we focus on waveform design for ISAC systems, which realizes radar sensing while also facilitating information transmission. Our contributions are as follows. First, we formulate the waveform design problem as a nonconvex and non-smooth model with a unimodulus constraint, based on the measurement metrics of the radar and communication systems. Second, we transform the model into an unconstrained problem on a Riemannian manifold and construct the corresponding operators by analyzing the unimodulus constraint. Third, to obtain the solution efficiently, we propose a low-complexity non-smooth unimodulus manifold gradient descent (N-UMGD) algorithm with a theoretical convergence guarantee. The simulation results show that the proposed algorithm can concentrate the energy of the sensing signal in the desired direction and realize information transmission with a low bit error rate.
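
To make the "unconstrained problem on a Riemannian manifold" step concrete, here is a minimal sketch of plain Riemannian gradient descent under a unimodulus constraint. It is not the N-UMGD algorithm: the least-squares objective is a smooth stand-in chosen for illustration (the paper's model is non-smooth), and the names `unimodular_gd` and `objective` and the step-size rule are assumptions.

```python
import numpy as np

def objective(A, b, x):
    """Stand-in smooth objective ||A x - b||^2 (not the paper's ISAC model)."""
    return float(np.linalg.norm(A @ x - b) ** 2)

def unimodular_gd(A, b, x0, steps=300):
    """Gradient descent over complex vectors x with |x_i| = 1 for every entry."""
    x = x0 / np.abs(x0)  # make sure the starting point is on the manifold
    # Conservative step size based on the spectral norm of A (an assumption).
    lr = 0.25 / np.linalg.norm(A, 2) ** 2
    for _ in range(steps):
        # Euclidean gradient, up to a constant factor absorbed into lr.
        egrad = A.conj().T @ (A @ x - b)
        # Project onto the tangent space of the unit circle at each entry:
        # remove the component of the gradient that points along x_i.
        rgrad = egrad - np.real(egrad * x.conj()) * x
        # Step in the tangent direction, then retract back to |x_i| = 1.
        x = x - lr * rgrad
        x = x / np.abs(x)
    return x

# Synthetic example: recover a unimodular vector from noiseless measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 6)) + 1j * rng.standard_normal((8, 6))
x_true = np.exp(1j * rng.uniform(0, 2 * np.pi, 6))
b = A @ x_true
x0 = np.exp(1j * rng.uniform(0, 2 * np.pi, 6))
x_hat = unimodular_gd(A, b, x0)
```

The projection and the entrywise normalization are the two manifold operators: the first gives the Riemannian gradient, and the second is the retraction that keeps every iterate feasible, so no explicit constraint handling is needed.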

Scaling Capability in Token Space: An Analysis of Large Vision Language Model

Dec 30, 2024

Tenghui Li, Guoxu Zhou, Xuyang Zhao, Qibin Zhao

Abstract:The scaling capability has been widely validated in neural language models with respect to the number of parameters and the size of training data. One important question is whether a similar scaling capability also exists with respect to the number of vision tokens in large vision language models. This study fills the gap by investigating the relationship between the number of vision tokens and the performance of vision-language models. Our theoretical analysis and empirical evaluations demonstrate that the model exhibits scalable performance $S(N_l)$ with respect to the number of vision tokens $N_l$, characterized by the relationship $S(N_l) \approx (c/N_l)^{\alpha}$. Furthermore, we investigate the impact of a fusion mechanism that integrates the user's question with vision tokens. The results reveal two key findings. First, the scaling capability remains intact with the incorporation of the fusion mechanism. Second, the fusion mechanism enhances model performance, particularly when the user's question is task-specific and relevant. The analysis, conducted on fifteen diverse benchmarks spanning a broad range of tasks and domains, validates the effectiveness of the proposed approach.
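
The relationship $S(N_l) \approx (c/N_l)^{\alpha}$ becomes linear after taking logarithms: $\log S = \alpha \log c - \alpha \log N_l$. The sketch below fits $c$ and $\alpha$ by ordinary least squares in log-log space. The function name `fit_power_law` and the data are assumptions; the numbers are synthetic and are not results from the paper.

```python
import numpy as np

def fit_power_law(n_tokens, scores):
    """Fit S(N) = (c / N) ** alpha by a straight-line fit in log-log space."""
    log_n = np.log(np.asarray(n_tokens, dtype=float))
    log_s = np.log(np.asarray(scores, dtype=float))
    # log S = -alpha * log N + alpha * log c
    slope, intercept = np.polyfit(log_n, log_s, 1)
    alpha = -slope
    c = np.exp(intercept / alpha)
    return c, alpha

# Synthetic data generated from c = 2.0 and alpha = 0.5 (illustrative only).
n_tokens = np.array([16, 64, 256, 1024])
scores = (2.0 / n_tokens) ** 0.5
c_hat, alpha_hat = fit_power_law(n_tokens, scores)
```

On noisy measurements the same fit gives estimates rather than exact values, and a straight line in the log-log plot is a quick visual check that a power law is a reasonable description.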

Weak Scaling Capability in Token Space: An Observation from Large Vision Language Model

Dec 24, 2024

Tenghui Li, Guoxu Zhou, Xuyang Zhao, Qibin Zhao

Abstract:The scaling capability has been widely validated with respect to the number of parameters and the size of training data. One important unexplored question is whether a similar scaling capability also exists with respect to the number of vision tokens. This study fills the gap by investigating the relationship between the number of vision tokens and the performance of vision-language models. Our theoretical analysis and empirical evaluations reveal that the model exhibits weak scaling capabilities with respect to the vision token length $N_l$, with performance $S(N_l) \approx (c/N_l)^{\alpha}$, where $c, \alpha$ are hyperparameters. Interestingly, this scaling behavior remains largely unaffected by the inclusion or exclusion of the user's question in the input. Furthermore, fusing the user's question with the vision tokens can enhance model performance when the question is relevant to the task. To address the computational challenges associated with large-scale vision tokens, we propose a novel architecture that efficiently reduces the token count while integrating user question tokens into the representation. Our findings may offer insights for developing more efficient and effective vision-language models under specific task constraints.

Designing Unimodular Waveforms with Good Correlation Properties for Large-Scale MIMO Radar via Manifold Optimization Method

Oct 10, 2024

Xuyang Zhao, Jiangtao Wang, Yongchao Wang

Abstract:In this paper, we design constant modulus probing waveforms with good correlation properties for large-scale collocated multi-input multi-output (MIMO) radar systems. The main content is as follows. First, we formulate the design problem as a fourth-order polynomial minimization problem with unimodulus constraints. Then, by analyzing the geometric properties of the unimodulus constraints through Riemannian geometry theory and embedding them into the search space, we transform the original non-convex optimization problem into an unconstrained problem on a Riemannian manifold. Second, we convert the objective function into the form of a large but finite number of loss functions and employ a customized R-SVRG algorithm to solve it. Third, we prove that the customized R-SVRG algorithm is theoretically guaranteed to converge if appropriate parameters are chosen. Numerical examples demonstrate the effectiveness of the proposed R-SVRG algorithm.

A Statistical Theory of Regularization-Based Continual Learning

Jun 10, 2024

Xuyang Zhao, Huiyuan Wang, Weiran Huang, Wei Lin

Abstract:We provide a statistical analysis of regularization-based continual learning on a sequence of linear regression tasks, with emphasis on how different regularization terms affect the model performance. We first derive the convergence rate for the oracle estimator obtained as if all data were available simultaneously. Next, we consider a family of generalized $\ell_2$-regularization algorithms indexed by matrix-valued hyperparameters, which includes the minimum norm estimator and continual ridge regression as special cases. As more tasks are introduced, we derive an iterative update formula for the estimation error of generalized $\ell_2$-regularized estimators, from which we determine the hyperparameters resulting in the optimal algorithm. Interestingly, the choice of hyperparameters can effectively balance the trade-off between forward and backward knowledge transfer and adjust for data heterogeneity. Moreover, the estimation error of the optimal algorithm is derived explicitly, which is of the same order as that of the oracle estimator. In contrast, our lower bounds for the minimum norm estimator and continual ridge regression show their suboptimality. A byproduct of our theoretical analysis is the equivalence between early stopping and generalized $\ell_2$-regularization in continual learning, which may be of independent interest. Finally, we conduct experiments to complement our theory.

* Accepted by ICML 2024
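
As a concrete instance of the setting in the abstract above, the sketch below runs continual ridge regression over a sequence of linear regression tasks. It assumes one common reading of the term: each new task is fitted with an $\ell_2$ penalty on the distance to the previous estimate, which has a closed-form solution. The function name `continual_ridge`, the scalar penalty `lam`, and the synthetic tasks are assumptions; the paper's generalized family uses matrix-valued hyperparameters, which this sketch does not cover.

```python
import numpy as np

def continual_ridge(tasks, lam=1.0):
    """Fit tasks one at a time, keeping only the previous estimate in memory.

    tasks: list of (X, y) pairs with X of shape (n_samples, d).
    lam:   scalar penalty on the distance to the previous estimate.
    """
    d = tasks[0][0].shape[1]
    w = np.zeros(d)  # start from the zero vector
    for X, y in tasks:
        # Closed form of: argmin_w ||X w - y||^2 + lam * ||w - w_prev||^2
        w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * w)
    return w

# Synthetic example: five noiseless tasks sharing one true parameter vector.
rng = np.random.default_rng(0)
w_star = np.array([1.0, -2.0, 0.5])
tasks = []
for _ in range(5):
    X = rng.standard_normal((50, 3))
    tasks.append((X, X @ w_star))
w_hat = continual_ridge(tasks, lam=1.0)
```

Only the previous estimate is carried forward, not the earlier data, which is what distinguishes this from the oracle estimator that sees all tasks at once.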

EpilepsyLLM: Domain-Specific Large Language Model Fine-tuned with Epilepsy Medical Knowledge

Jan 11, 2024

Xuyang Zhao, Qibin Zhao, Toshihisa Tanaka

Abstract:With large training datasets and massive amounts of computing resources, large language models (LLMs) achieve remarkable performance in comprehension and generation. Building on these powerful LLMs, a model fine-tuned with domain-specific datasets possesses more specialized knowledge and is thus more practical, as with medical LLMs. However, existing fine-tuned medical LLMs are limited to general medical knowledge in English. For disease-specific problems, the model's response is inaccurate and sometimes even completely irrelevant, especially when using a language other than English. In this work, we focus on the particular disease of epilepsy in Japanese and introduce a customized LLM termed EpilepsyLLM. Our model is obtained by fine-tuning a pre-trained LLM on datasets from the epilepsy domain. The datasets contain basic information about the disease, common treatment methods and drugs, and important notes for life and work. The experimental results demonstrate that EpilepsyLLM can provide more reliable and specialized medical knowledge responses.

TDLE: 2-D LiDAR Exploration With Hierarchical Planning Using Regional Division

Jul 06, 2023

Xuyang Zhao, Chengpu Yu, Erpei Xu, Yixuan Liu

Abstract:Exploration systems are critical for enhancing the autonomy of robots. Due to the unpredictability of the future planning space, existing methods either adopt an inefficient greedy strategy or require a lot of resources to obtain a global solution. In this work, we address the challenge of obtaining global exploration routes with minimal computing resources. A hierarchical planning framework dynamically divides the planning space into subregions and arranges their orders to provide global guidance for exploration. Indicators that are compatible with the subregion order are used to choose specific exploration targets, thereby considering estimates of spatial structure and extending the planning space to unknown regions. Extensive simulations and field tests demonstrate the efficacy of our method in comparison to existing 2D LiDAR-based approaches. Our code has been made public for further investigation.

* Accepted in IEEE International Conference on Automation Science and Engineering (CASE) 2023

ArCL: Enhancing Contrastive Learning with Augmentation-Robust Representations

Mar 02, 2023

Xuyang Zhao, Tianqi Du, Yisen Wang, Jun Yao, Weiran Huang

Abstract:Self-Supervised Learning (SSL) is a paradigm that leverages unlabeled data for model training. Empirical studies show that SSL can achieve promising performance in distribution shift scenarios, where the downstream and training distributions differ. However, the theoretical understanding of its transferability remains limited. In this paper, we develop a theoretical framework to analyze the transferability of self-supervised contrastive learning by investigating the impact of data augmentation on it. Our results reveal that the downstream performance of contrastive learning depends largely on the choice of data augmentation. Moreover, we show that contrastive learning fails to learn domain-invariant features, which limits its transferability. Based on these theoretical insights, we propose a novel method called Augmentation-robust Contrastive Learning (ArCL), which is guaranteed to learn domain-invariant features and can be easily integrated with existing contrastive learning algorithms. We conduct experiments on several datasets and show that ArCL significantly improves the transferability of contrastive learning.

* Accepted by ICLR 2023

Heterogeneous Federated Learning on a Graph

Sep 19, 2022

Huiyuan Wang, Xuyang Zhao, Wei Lin

Abstract:Federated learning, where algorithms are trained across multiple decentralized devices without sharing local data, is increasingly popular in distributed machine learning practice. Typically, a graph structure $G$ exists behind local devices for communication. In this work, we consider parameter estimation in federated learning with data distribution and communication heterogeneity, as well as limited computational capacity of local devices. We encode the distribution heterogeneity by parametrizing distributions on local devices with a set of distinct $p$-dimensional vectors. We then propose to jointly estimate the parameters of all devices under the $M$-estimation framework with fused Lasso regularization, encouraging equal parameter estimates on devices connected in $G$. We provide a general result for our estimator depending on $G$, which can be further calibrated to obtain convergence rates for various specific problem setups. Surprisingly, our estimator attains the optimal rate under a certain graph fidelity condition on $G$, as if we could aggregate all samples sharing the same distribution. If the graph fidelity condition is not met, we propose an edge selection procedure via multiple testing to ensure optimality. To ease the burden of local computation, a decentralized stochastic version of ADMM is provided, with convergence rate $O(T^{-1}\log T)$, where $T$ denotes the number of iterations. We highlight that our algorithm transmits only parameters along edges of $G$ at each iteration, without requiring a central machine, which preserves privacy. We further extend it to the case where devices are randomly inaccessible during the training process, with a similar algorithmic convergence guarantee. The computational and statistical efficiency of our method is evidenced by simulation experiments and the 2020 US presidential election data set.

* 61 pages, 4 figures
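
The joint estimation described in the abstract above can be written schematically as a penalized $M$-estimation problem over the graph. The display below is a sketch under assumed notation, not the paper's exact formulation: $n$ is the number of devices, $\theta_i \in \mathbb{R}^p$ is the parameter on device $i$, $L_i$ is that device's local loss, $E(G)$ is the edge set of $G$, $\lambda \ge 0$ is a tuning parameter, and the norm in the penalty is left unspecified because the abstract does not state it.

```latex
\[
\min_{\theta_1,\dots,\theta_n \in \mathbb{R}^p}
\;\; \sum_{i=1}^{n} L_i(\theta_i)
\;+\; \lambda \sum_{(i,j) \in E(G)} \lVert \theta_i - \theta_j \rVert
\]
```

The penalty is zero only when connected devices share the same estimate, which is how the fused Lasso term encourages equal parameters along the edges of $G$.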
