IEEE
Abstract:Multimodal Large Language Model (MLLM) agents facilitate Graphical User Interface (GUI) automation but struggle with long-horizon, cross-application tasks due to limited context windows. While memory systems provide a viable solution, existing paradigms struggle to adapt to dynamic GUI environments, suffering from a granularity mismatch between high-level intent and low-level execution, and context pollution where the static accumulation of outdated experiences drives agents into hallucination. To address these bottlenecks, we propose the Darwinian Memory System (DMS), a self-evolving architecture that constructs memory as a dynamic ecosystem governed by the law of survival of the fittest. DMS decomposes complex trajectories into independent, reusable units for compositional flexibility, and implements Utility-driven Natural Selection to track survival value, actively pruning suboptimal paths and inhibiting high-risk plans. This evolutionary pressure compels the agent to derive superior strategies. Extensive experiments on real-world multi-app benchmarks validate that DMS boosts general-purpose MLLMs without training costs or architectural overhead, achieving average gains of 18.0% in success rate and 33.9% in execution stability, while reducing task latency, establishing it as an effective self-evolving memory system for GUI tasks.
Abstract:In this paper, we study the problem of uplink channel estimation for near-field orthogonal frequency division multiplexing (OFDM) systems, where a base station (BS), equipped with an extremely large-scale antenna array (ELAA), serves multiple users over the same time-frequency resource block. A non-orthogonal pilot transmission scheme is considered to accommodate the larger number of users that ELAA systems can support without incurring excessive training overhead. To facilitate efficient multi-user channel estimation, we express the received signal as a third-order low-rank tensor, which admits a canonical polyadic decomposition (CPD) model for line-of-sight (LoS) scenarios and a block term decomposition (BTD) model for non-line-of-sight (NLoS) scenarios. An alternating least squares (ALS) algorithm and a non-linear least squares (NLS) algorithm are employed to perform CPD and BTD, respectively. Channel parameters are then efficiently extracted from the recovered factor matrices. By exploiting the geometry of the propagation paths in the estimated channel, users' positions can be precisely determined in LoS scenarios. Moreover, our uniqueness analysis shows that the proposed tensor-based joint multi-user channel estimation framework is effective even when the number of pilot symbols is much smaller than the number of users, revealing its potential for reducing training overhead. Simulation results demonstrate that the proposed method achieves markedly higher channel estimation accuracy than compressed sensing (CS)-based approaches.
Abstract:Despite the intrinsic risk-awareness of Large Language Models (LLMs), current defenses often result in shallow safety alignment, rendering models vulnerable to disguised attacks (e.g., prefilling) while degrading utility. To bridge this gap, we propose SafeThinker, an adaptive framework that dynamically allocates defensive resources via a lightweight gateway classifier. Based on the gateway's risk assessment, inputs are routed through three distinct mechanisms: (i) a Standardized Refusal Mechanism for explicit threats to maximize efficiency; (ii) a Safety-Aware Twin Expert (SATE) module to intercept deceptive attacks masquerading as benign queries; and (iii) a Distribution-Guided Think (DDGT) component that adaptively intervenes during uncertain generation. Experiments show that SafeThinker significantly lowers attack success rates across diverse jailbreak strategies without compromising utility, demonstrating that coordinating intrinsic judgment throughout the generation process effectively balances robustness and practicality.
Abstract:We consider the channel acquisition problem for a wideband terahertz (THz) communication system, where an extremely large-scale array is deployed to mitigate severe path attenuation. In channel modeling, we account for both the near-field spherical wavefront and the wideband beam-splitting phenomena, resulting in a wideband near-field channel. We propose a frequency-independent orthogonal dictionary that generalizes the standard discrete Fourier transform (DFT) matrix by introducing an additional parameter to capture the near-field property. This dictionary enables the wideband near-field channel to be efficiently represented with a two-dimensional (2D) block-sparse structure. Leveraging this specific sparse structure, the wideband near-field channel estimation problem can be effectively addressed within a customized compressive sensing framework. Numerical results demonstrate the significant advantages of our proposed 2D block-sparsity-aware method over conventional polar-domain-based approaches for near-field wideband channel estimation.
Abstract:Reconfigurable antennas, including reconfigurable intelligent surface (RIS), movable antenna (MA), fluid antenna (FA), and other advanced antenna techniques, have been studied extensively in the context of reshaping wireless propagation environments for 6G and beyond wireless communications. Nevertheless, how to reconfigure/optimize the real-time controllable coefficients to achieve a favorable end-to-end wireless channel remains a substantial challenge, as it usually requires accurate modeling of the complex interaction between the reconfigurable devices and the electromagnetic waves, as well as knowledge of implicit channel propagation parameters. In this paper, we introduce a derivative-free optimization technique, also known as zeroth-order (ZO) optimization, to directly optimize reconfigurable coefficients to shape the wireless end-to-end channel, without the need for channel modeling or estimation of the implicit environmental propagation parameters. We present the fundamental principles of ZO optimization and discuss its potential advantages in wireless channel reconfiguration. Two case studies for RIS and movable antenna-enabled single-input single-output (SISO) systems are provided to show the superiority of ZO-based methods over state-of-the-art techniques. Finally, we outline promising future research directions and offer concluding insights on derivative-free optimization for reconfigurable antenna technologies.
Abstract:Movable antennas (MAs) have emerged as a disruptive technology in wireless communications for enhancing spatial degrees of freedom through continuous antenna repositioning within predefined regions, thereby creating favorable channel propagation conditions. In this paper, we study the problem of position optimization for MA-enabled multi-user MISO systems, where a base station (BS), equipped with multiple MAs, communicates with multiple users each equipped with a single fixed-position antenna (FPA). To circumvent the difficulty of acquiring the channel state information (CSI) from the transmitter to the receiver over the entire movable region, we propose a derivative-free approach for MA position optimization. The basic idea is to treat position optimization as a closed-box optimization problem and approximate the gradient of the unknown objective function using zeroth-order (ZO) gradient approximation techniques. Specifically, the proposed method does not need to explicitly estimate the global CSI. Instead, it adaptively refines its next movement based on previous measurements such that it eventually converges to an optimal or stationary solution. Simulation results show that the proposed derivative-free approach achieves higher sample and computational efficiency than the CSI estimation-based position optimization approach, particularly in challenging scenarios where the number of multi-path components (MPCs) is large or the number of pilot signals is limited.
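The zeroth-order idea in the abstract above can be illustrated with a minimal sketch. It assumes a generic two-point estimator with random Gaussian directions, which is one common ZO technique and not necessarily the estimator used in the paper. The names `zo_gradient`, `zo_ascent`, and `toy_objective` are hypothetical. In the MA setting, the objective would be a measured performance metric at the current antenna positions, and here a simple quadratic stands in for it.

```python
import random


def zo_gradient(f, x, mu=1e-3, num_dirs=20, rng=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Only function evaluations are used: f is probed at x + mu*u and
    x - mu*u along random Gaussian directions u, and the resulting
    directional slopes are averaged.
    """
    rng = rng or random.Random(0)
    d = len(x)
    grad = [0.0] * d
    for _ in range(num_dirs):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
        for i in range(d):
            grad[i] += slope * u[i] / num_dirs
    return grad


def zo_ascent(f, x0, step=0.05, iters=200, **kw):
    """Maximize f by gradient ascent driven by the ZO estimate."""
    x = list(x0)
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(iters):
        g = zo_gradient(f, x, rng=rng, **kw)
        x = [xi + step * gi for xi, gi in zip(x, g)]
    return x


def toy_objective(x):
    """Hypothetical stand-in for a measured metric; maximized at (1, -2)."""
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)
```

Calling `zo_ascent(toy_objective, [0.0, 0.0])` moves the iterate toward the maximizer without ever evaluating an analytic gradient, which mirrors how each new movement is refined from previous measurements.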
Abstract:Compressing long chain-of-thought (CoT) from large language models (LLMs) is an emerging strategy to improve the reasoning efficiency of LLMs. Despite its promising benefits, existing studies compress all thoughts within a long CoT equally, hindering more concise and effective reasoning. To this end, we first investigate the importance of different thoughts by examining their effectiveness and efficiency in contributing to reasoning through automatic long CoT chunking and Monte Carlo rollouts. Building upon these insights, we propose a theoretically bounded metric to jointly measure the effectiveness and efficiency of different thoughts. We then propose Long$\otimes$Short, an efficient reasoning framework that enables two LLMs to collaboratively solve the problem: a long-thought LLM for more effectively generating important thoughts, and a short-thought LLM for efficiently generating the remaining thoughts. Specifically, we begin by synthesizing a small amount of cold-start data to fine-tune LLMs for long-thought and short-thought reasoning styles, respectively. Furthermore, we propose a synergizing-oriented multi-turn reinforcement learning approach, focusing on model self-evolution and collaboration between the long-thought and short-thought LLMs. Experimental results show that our method enables Qwen2.5-7B and Llama3.1-8B to achieve performance comparable to DeepSeek-R1-Distill-Qwen-7B and DeepSeek-R1-Distill-Llama-8B, while reducing token length by over 80% across the MATH500, AIME24/25, AMC23, and GPQA Diamond benchmarks. Our data and code are available at https://github.com/yasNing/Long-otimes-Short/.
Abstract:Improving the fundamental performance trade-off in integrated sensing and communication (ISAC) systems is regarded as one of the most significant challenges. To address it, we propose in this letter a novel ISAC system that leverages an unmanned aerial vehicle (UAV)-mounted intelligent reflecting surface (IRS) and the UAV's maneuverability in six-dimensional (6D) space, i.e., three-dimensional (3D) location and 3D rotation, thus referred to as passive 6D movable antenna (6DMA). We aim to maximize the signal-to-noise ratio (SNR) for sensing a single target while ensuring a minimum SNR at a communication user equipment (UE), by jointly optimizing the transmit beamforming at the ISAC base station (BS) and the 3D location, 3D orientation, and reflection coefficients of the IRS. To solve this challenging non-convex optimization problem, we propose a two-stage approach. In the first stage, we optimize the IRS's 3D location, 3D orientation, and reflection coefficients to enhance both the channel correlations and power gains for sensing and communication. In the second stage, given these optimized parameters, the optimal transmit beamforming at the ISAC BS is derived in closed form. Simulation results demonstrate that the proposed passive 6DMA-enabled ISAC system significantly improves the sensing and communication trade-off by simultaneously enhancing channel correlations and power gains, and outperforms other baseline schemes.
Abstract:Analog beamforming holds great potential for future terahertz (THz) communications due to its ability to generate high-gain directional beams with low-cost phase shifters. However, conventional analog beamforming may suffer substantial performance degradation in wideband systems due to beam-squint effects. Instead of relying on high-cost true time delayers, we propose in this paper an efficient three-dimensional (3D) rotatable antenna technology to mitigate the beam-squint effects, motivated by the fact that beam squint disappears along the boresight direction. In particular, we focus on a wideband wide-beam coverage problem, aiming to maximize the minimum beamforming gain within a given angle and frequency range by jointly optimizing the analog beamforming vector and the 3D rotation angles of the antenna array. However, this problem is non-convex and difficult to solve optimally due to the coupling of the spatial and frequency domains and that of the antenna weights and rotation. To tackle this issue, we first reformulate the problem into an equivalent form by merging the spatial and frequency domains into a single composite domain. Next, we combine alternating optimization (AO) and successive convex approximation (SCA) algorithms to optimize the analog beamforming and rotation angles within this composite domain. Simulation results demonstrate that the proposed scheme can significantly outperform conventional schemes without antenna rotation, thus offering a cost-effective solution for wideband transmission over THz bands.
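The observation that beam squint disappears along boresight can be checked with a small sketch. It assumes a simplified one-dimensional uniform linear array with frequency-flat phase shifters designed at the carrier frequency, not the 3D rotatable array studied in the paper. Under that assumption, a beam steered to angle theta0 at carrier fc peaks at the angle theta satisfying sin(theta) = (fc / f) * sin(theta0) at frequency f. The function name `squinted_angle_deg` is hypothetical.

```python
import math


def squinted_angle_deg(theta0_deg, f, fc):
    """Beam peak direction (degrees from boresight) at frequency f.

    The phase shifters are assumed to steer the beam to theta0_deg at the
    carrier frequency fc, so the peak at frequency f satisfies
    sin(theta) = (fc / f) * sin(theta0).
    """
    s = (fc / f) * math.sin(math.radians(theta0_deg))
    # Simplification: clamp values that would fall outside the visible region.
    s = max(-1.0, min(1.0, s))
    return math.degrees(math.asin(s))


def squint_deg(theta0_deg, f, fc):
    """Absolute pointing error caused by operating at f instead of fc."""
    return abs(squinted_angle_deg(theta0_deg, f, fc) - theta0_deg)
```

With these definitions, `squint_deg(0.0, f, fc)` is zero for any frequency, while the error grows as the steering angle moves away from boresight. This is the intuition behind rotating the array so that the directions of interest sit close to boresight.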
Abstract:Time series forecasting (TSF) plays a crucial role in various domains, including web data analysis, energy consumption prediction, and weather forecasting. While Multi-Layer Perceptrons (MLPs) are lightweight and effective for capturing temporal dependencies, they are prone to overfitting when used to model inter-channel dependencies. In this paper, we investigate the overfitting problem in channel-wise MLPs using Rademacher complexity theory, revealing that extreme values in time series data exacerbate this issue. To mitigate it, we introduce a novel Simplex-MLP layer, where the weights are constrained within a standard simplex. This strategy encourages the model to learn simpler patterns, thereby reducing overfitting to extreme values. Based on the Simplex-MLP layer, we propose a novel Frequency Simplex MLP (FSMLP) framework for time series forecasting, comprising two kinds of modules: Simplex Channel-Wise MLP (SCWM) and Frequency Temporal MLP (FTM). The SCWM effectively leverages the Simplex-MLP to capture inter-channel dependencies, while the FTM is a simple yet efficient temporal MLP designed to extract temporal information from the data. Our theoretical analysis shows that the upper bound of the Rademacher complexity for Simplex-MLP is lower than that for standard MLPs. Moreover, we validate our proposed method on seven benchmark datasets, demonstrating significant improvements in forecasting accuracy and efficiency, while also showcasing superior scalability. Additionally, we demonstrate that Simplex-MLP can improve other methods that use channel-wise MLPs, yielding less overfitting and better performance. Code is available at https://github.com/FMLYD/FSMLP.
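A minimal sketch of what constraining weights to the standard simplex can look like follows. The abstract does not say how the constraint is enforced, so the softmax parameterization here is an assumption, and the names `softmax` and `simplex_channel_mix` are hypothetical. Each output channel is a convex combination of the input channels, so it can never exceed the largest input value, which gives some intuition for why extreme values are dampened.

```python
import math


def softmax(row):
    """Map unconstrained parameters to non-negative weights that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def simplex_channel_mix(x, raw_weights):
    """Mix channel values with simplex-constrained weights.

    x           : list of C channel values at one time step
    raw_weights : list of rows, each holding C unconstrained parameters
                  (assumed parameterization, one row per output channel)
    """
    out = []
    for row in raw_weights:
        w = softmax(row)  # this row now lies on the standard simplex
        out.append(sum(wi * xi for wi, xi in zip(w, x)))
    return out
```

For example, mixing `x = [1.0, 2.0, 100.0]` with any raw weights gives outputs that stay within the range of the inputs, whereas an unconstrained linear layer could amplify the outlier arbitrarily.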