Abstract:Orthogonal time frequency space (OTFS) modulation is anticipated to be a promising candidate for supporting integrated sensing and communications (ISAC) systems, which is considered as a pivotal technique for realizing next generation wireless networks. In this paper, we develop a minimum bit error rate (BER) precoder design for an OTFS-based ISAC system. In particular, the BER minimization problem takes into account the maximum available transmission power budget and the required sensing performance. Different from prior studies that considered ISAC in the time-frequency (TF) domain, we devise the precoder from the perspective of the delay-Doppler (DD) domain by exploiting the equivalent DD domain channel due to the fact that the DD domain channel generally tends to be sparse and quasi-static, which can facilitate a low-overhead ISAC system design. To address the non-convex optimization design problem, we resort to optimizing the lower bound of the derived average BER by adopting Jensen's inequality. Subsequently, the formulated problem is decoupled into two independent sub-problems via singular value decomposition (SVD) methodology. We then theoretically analyze the feasibility conditions of the proposed problem and present a low-complexity iterative solution via leveraging the Lagrangian duality approach. Simulation results verify the effectiveness of our proposed precoder compared to the benchmark schemes and reveal the interplay between sensing and communication for dual-functional precoder design, indicating a trade-off where transmission efficiency is sacrificed for increasing transmission reliability and sensing accuracy.
Abstract:Parkour presents a highly challenging task for legged robots, requiring them to traverse various terrains with agile and smooth locomotion. This necessitates comprehensive understanding of both the robot's own state and the surrounding terrain, despite the inherent unreliability of robot perception and actuation. Current state-of-the-art methods either rely on complex pre-trained high-level terrain reconstruction modules or limit the maximum potential of robot parkour to avoid failure due to inaccurate perception. In this paper, we propose a one-stage end-to-end learning-based parkour framework: Parkour with Implicit-Explicit learning framework for legged robots (PIE) that leverages dual-level implicit-explicit estimation. With this mechanism, even a low-cost quadruped robot equipped with an unreliable egocentric depth camera can achieve exceptional performance on challenging parkour terrains using a relatively simple training process and reward function. While the training process is conducted entirely in simulation, our real-world validation demonstrates successful zero-shot deployment of our framework, showcasing superior parkour performance on harsh terrains.
Abstract:Federated neuromorphic learning (FedNL) leverages event-driven spiking neural networks and federated learning frameworks to effectively execute intelligent analysis tasks over amounts of distributed low-power devices but also perform vulnerability to poisoning attacks. The threat of backdoor attacks on traditional deep neural networks typically comes from time-invariant data. However, in FedNL, unknown threats may be hidden in time-varying spike signals. In this paper, we start to explore a novel vulnerability of FedNL-based systems with the concept of time division multiplexing, termed Spikewhisper, which allows attackers to evade detection as much as possible, as multiple malicious clients can imperceptibly poison with different triggers at different timeslices. In particular, the stealthiness of Spikewhisper is derived from the time-domain divisibility of global triggers, in which each malicious client pastes only one local trigger to a certain timeslice in the neuromorphic sample, and also the polarity and motion of each local trigger can be configured by attackers. Extensive experiments based on two different neuromorphic datasets demonstrate that the attack success rate of Spikewispher is higher than the temporally centralized attacks. Besides, it is validated that the effect of Spikewispher is sensitive to the trigger duration.
Abstract:Accurate state estimation plays a critical role in ensuring the robust control of humanoid robots, particularly in the context of learning-based control policies for legged robots. However, there is a notable gap in analytical research concerning estimations. Therefore, we endeavor to further understand how various types of estimations influence the decision-making processes of policies. In this paper, we provide quantitative insight into the effectiveness of learned state estimations, employing saliency analysis to identify key estimation variables and optimize their combination for humanoid locomotion tasks. Evaluations assessing tracking precision and robustness are conducted on comparative groups of policies with varying estimation combinations in both simulated and real-world environments. Results validated that the proposed policy is capable of crossing the sim-to-real gap and demonstrating superior performance relative to alternative policy configurations.
Abstract:We focus on the task of unknown object rearrangement, where a robot is supposed to re-configure the objects into a desired goal configuration specified by an RGB-D image. Recent works explore unknown object rearrangement systems by incorporating learning-based perception modules. However, they are sensitive to perception error, and pay less attention to task-level performance. In this paper, we aim to develop an effective system for unknown object rearrangement amidst perception noise. We theoretically reveal the noisy perception impacts grasp and place in a decoupled way, and show such a decoupled structure is non-trivial to improve task optimality. We propose GSP, a dual-loop system with the decoupled structure as prior. For the inner loop, we learn an active seeing policy for self-confident object matching to improve the perception of place. For the outer loop, we learn a grasp policy aware of object matching and grasp capability guided by task-level rewards. We leverage the foundation model CLIP for object matching, policy learning and self-termination. A series of experiments indicate that GSP can conduct unknown object rearrangement with higher completion rate and less steps.
Abstract:This paper investigates the bit error rate (BER) minimum pre-coder design for an orthogonal time frequency space (OTFS)-based integrated sensing and communications (ISAC) system, which is considered as a promising technique for enabling future wireless networks. In particular, the BER minimum problem takes into account the maximized available transmission power and the required sensing performance. We devise the precoder from the perspective of delay-Doppler (DD) domain by exploiting the equivalent DD channel. To address the non-convex design problem, we resort to minimizing the lower bound of the derived average BER. Afterwards, we propose a computationally iterative method to solve the dual problem at low cost. Simulation results verify the effectiveness of our proposed precoder and reveal the interplay between sensing and communication for dual-functional precoder design.
Abstract:Beyond 5G networks provide solutions for next-generation communications, especially digital twins networks (DTNs) have gained increasing popularity for bridging physical space and digital space. However, current DTNs networking frameworks pose a number of challenges especially when applied in scenarios that require high communication efficiency and multimodal data processing. First, current DTNs frameworks are unavoidable regarding high resource consumption and communication congestion because of original bit-level communication and high-frequency computation, especially distributed learning-based DTNs. Second, current machine learning models for DTNs are domain-specific (e.g. E-health), making it difficult to handle DT scenarios with multimodal data processing requirements. Last but not least, current security schemes for DTNs, such as blockchain, introduce additional overheads that impair the efficiency of DTNs. To address the above challenges, we propose a large language model (LLM) empowered DTNs networking framework, LLM-Twin. First, we design the mini-giant model collaboration scheme to achieve efficient deployment of LLM in DTNs, since LLM are naturally conducive to processing multimodal data. Then, we design a semantic-level high-efficiency, and secure communication model for DTNs. The feasibility of LLM-Twin is demonstrated by numerical experiments and case studies. To our knowledge, this is the first to propose LLM-based semantic-level digital twin networking framework.
Abstract:With the application of high-frequency communication and extremely large MIMO (XL-MIMO), the near-field effect has become increasingly apparent. The near-field beam design now requires consideration not only of the angle of arrival (AoA) information but also the curvature of arrival (CoA) information. However, due to their mutual coupling, orthogonally decomposing the near-field space becomes challenging. In this paper, we propose a Joint Autocorrelation and Cross-correlation (JAC) scheme to address the coupling information between near-field CoA and AoA. First, we analyze the similarity between the near-field problem and the Doppler problem in digital signal processing, revealing that the autocorrelation function can effectively extract CoA information. Subsequently, utilizing the obtained CoA, we transform the near-field problem into a far-field form, enabling the direct application of beam training schemes designed for the far-field in the near-field scenario. Finally, we analyze the characteristics of the far and near-field signal subspaces from the perspective of matrix theory and discuss how the JAC algorithm handles them. Numerical results demonstrate that the JAC scheme outperforms traditional methods in the high signal-to-noise ratio (SNR) regime. Moreover, the time complexity of the JAC algorithm is $\mathcal O(N+1)$, significantly smaller than existing near-field beam training algorithms.
Abstract:Semantic communication has emerged as a new deep learning-based communication paradigm that drives the research of end-to-end data transmission in tasks like image classification, and image reconstruction. However, the security problem caused by semantic attacks has not been well explored, resulting in vulnerabilities within semantic communication systems exposed to potential semantic perturbations. In this paper, we propose a secure semantic communication system, DiffuSeC, which leverages the diffusion model and deep reinforcement learning (DRL) to address this issue. With the diffusing module in the sender end and the asymmetric denoising module in the receiver end, the DiffuSeC mitigates the perturbations added by semantic attacks, including data source attacks and channel attacks. To further improve the robustness under unstable channel conditions caused by semantic attacks, we developed a DRL-based channel-adaptive diffusion step selection scheme to achieve stable performance under fluctuating environments. A timestep synchronization scheme is designed for diffusion timestep coordination between the two ends. Simulation results demonstrate that the proposed DiffuSeC shows higher robust accuracy than previous works under a wide range of channel conditions, and can quickly adjust the model state according to signal-to-noise ratios (SNRs) in unstable environments.
Abstract:Decomposing a target object from a complex background while reconstructing is challenging. Most approaches acquire the perception for object instances through the use of manual labels, but the annotation procedure is costly. The recent advancements in 2D self-supervised learning have brought new prospects to object-aware representation, yet it remains unclear how to leverage such noisy 2D features for clean decomposition. In this paper, we propose a Decomposed Object Reconstruction (DORec) network based on neural implicit representations. Our key idea is to transfer 2D self-supervised features into masks of two levels of granularity to supervise the decomposition, including a binary mask to indicate the foreground regions and a K-cluster mask to indicate the semantically similar regions. These two masks are complementary to each other and lead to robust decomposition. Experimental results show the superiority of DORec in segmenting and reconstructing the foreground object on various datasets.