Abstract:Recently, the diffusion model has emerged as a powerful generative technique for robotic policy learning, capable of modeling multi-mode action distributions. Leveraging its capability for end-to-end autonomous driving is a promising direction. However, the numerous denoising steps in the robotic diffusion policy and the more dynamic, open-world nature of traffic scenes pose substantial challenges for generating diverse driving actions at real-time speed. To address these challenges, we propose a novel truncated diffusion policy that incorporates prior multi-mode anchors and truncates the diffusion schedule, enabling the model to learn denoising from an anchored Gaussian distribution to the multi-mode driving action distribution. Additionally, we design an efficient cascade diffusion decoder for enhanced interaction with conditional scene context. The proposed model, DiffusionDrive, demonstrates a 10$\times$ reduction in denoising steps compared to the vanilla diffusion policy, delivering superior diversity and quality in just 2 steps. On the planning-oriented NAVSIM dataset, with the aligned ResNet-34 backbone, DiffusionDrive achieves 88.1 PDMS without bells and whistles, setting a new record, while running at a real-time speed of 45 FPS on an NVIDIA 4090. Qualitative results on challenging scenarios further confirm that DiffusionDrive can robustly generate diverse plausible driving actions. Code and model will be available at https://github.com/hustvl/DiffusionDrive.
Abstract:Multi-connectivity (MC) in satellite-terrestrial integrated networks (STINs), included in 3GPP standards, is regarded as a promising technology for future networks. The significant advantages of MC in improving coverage, communication, and sensing through satellite-terrestrial collaboration have sparked widespread interest. In this article, we first introduce three fundamental deployment architectures of MC systems in STINs, including multi-satellite, single-satellite single-base-station, and multi-satellite multi-base-station configurations. Considering that satellite networking is emerging but still evolving, we explore system design challenges such as satellite networking schemes, e.g., cell-free and multi-tier satellite networks. Key technical challenges that severely influence the quality of mutual communications, including beamforming, channel estimation, and synchronization, are then discussed. Furthermore, typical applications such as coverage enhancement, traffic offloading, collaborative sensing, and low-altitude communication are demonstrated, followed by a case study comparing coverage performance in MC and single-connectivity (SC) configurations. Several essential future research directions for MC in STINs are presented to facilitate further exploration.
Abstract:Sensing is anticipated to have wider extensions in communication systems with the boom of non-terrestrial networks (NTNs) in recent years. In this paper, we study a bistatic sensing system by maximizing the signal-to-interference-plus-noise ratio (SINR) from the target aircraft in the space-air-ground integrated network (SAGIN). We formulate a joint optimization problem for the transmit beamforming of a low-earth orbit (LEO) satellite and the receive filtering of a ground base station. To tackle this problem, we decompose the original problem into two sub-problems and use alternating optimization to solve them iteratively. Using techniques of fractional programming and the generalized Rayleigh quotient, we obtain a closed-form solution for each sub-problem. Simulation results show that the proposed algorithm has good convergence performance. Moreover, the optimization of receive filtering dominates the optimality, especially when the satellite altitude becomes higher, which provides valuable network design insights.
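As an illustration of the generalized Rayleigh quotient step mentioned in the abstract above, the following is a minimal sketch and not the paper's algorithm. It assumes a single receive steering vector `a` and a known interference-plus-noise covariance `R`, both generated at random here; the names `max_sinr_filter` and `sinr` are hypothetical. Under these assumptions, the SINR |w^H a|^2 / (w^H R w) is maximized by any w proportional to R^{-1} a, and the maximum equals a^H R^{-1} a.

```python
import numpy as np

def max_sinr_filter(a, R):
    """Closed-form maximizer of |w^H a|^2 / (w^H R w): w proportional to R^{-1} a."""
    w = np.linalg.solve(R, a)
    return w / np.linalg.norm(w)  # scaling does not change the quotient

def sinr(w, a, R):
    """Generalized Rayleigh quotient for the rank-one signal term a a^H."""
    num = np.abs(np.vdot(w, a)) ** 2       # np.vdot conjugates its first argument
    den = np.real(np.vdot(w, R @ w))
    return num / den

# Illustrative random data (assumed, not taken from the paper).
rng = np.random.default_rng(0)
n = 8
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R = B @ B.conj().T + np.eye(n)             # Hermitian positive definite

w_opt = max_sinr_filter(a, R)
best = sinr(w_opt, a, R)
bound = np.real(np.vdot(a, np.linalg.solve(R, a)))  # a^H R^{-1} a

# No random filter should exceed the closed-form optimum.
trials = [rng.standard_normal(n) + 1j * rng.standard_normal(n) for _ in range(200)]
worst_gap = min(best - sinr(w, a, R) for w in trials)
```

In an alternating scheme of the kind the abstract describes, a step like this would update the receive filter while the transmit beamformer is held fixed; how the paper builds `R` and the beamforming sub-problem is not specified here.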
Abstract:Fact-checking is the task of verifying the factuality of a given claim by examining the available evidence. High-quality evidence plays a vital role in enhancing fact-checking systems and facilitating the generation of explanations that are understandable to humans. However, the provision of both sufficient and relevant evidence for explainable fact-checking systems poses a challenge. To tackle this challenge, we propose a method based on a Large Language Model to automatically retrieve and summarize evidence from the Web. Furthermore, we construct RU22Fact, a novel multilingual explainable fact-checking dataset on the Russia-Ukraine conflict in 2022 of 16K samples, each containing real-world claims, optimized evidence, and referenced explanation. To establish a baseline for our dataset, we also develop an end-to-end explainable fact-checking system to verify claims and generate explanations. Experimental results demonstrate the prospect of optimized evidence in increasing fact-checking performance and also indicate the possibility of further progress in the end-to-end claim verification and explanation generation tasks.
Abstract:The task of stock earnings forecasting has received considerable attention due to demand from investors in real-world scenarios. However, compared with financial institutions, it is not easy for ordinary investors to mine factors and analyze news. On the other hand, although large language models in the financial field can serve users in the form of dialogue robots, they still require users to have financial knowledge to ask reasonable questions. To improve the user experience, we aim to build an automatic system, FinReport, for ordinary investors to collect information, analyze it, and generate reports after summarizing. Specifically, FinReport is based on financial news announcements and a multi-factor model to ensure the professionalism of the report. FinReport consists of three modules: a news factorization module, a return forecasting module, and a risk assessment module. The news factorization module involves understanding news information and combining it with stock factors, the return forecasting module aims to analyze the impact of news on market sentiment, and the risk assessment module is adopted to control investment risk. Extensive experiments on real-world datasets verify the effectiveness and explainability of the proposed FinReport. Our code and datasets are available at https://github.com/frinkleko/FinReport.
Abstract:Since the advent of personal computing devices, intelligent personal assistants (IPAs) have been one of the key technologies that researchers and engineers have focused on, aiming to help users efficiently obtain information and execute tasks, and provide users with more intelligent, convenient, and rich interaction experiences. With the development of smartphones and IoT, computing and sensing devices have become ubiquitous, greatly expanding the boundaries of IPAs. However, due to the lack of capabilities such as user intent understanding, task planning, tool use, and personal data management, existing IPAs still have limited practicality and scalability. Recently, the emergence of foundation models, represented by large language models (LLMs), has brought new opportunities for the development of IPAs. With their powerful semantic understanding and reasoning capabilities, LLMs can enable intelligent agents to solve complex problems autonomously. In this paper, we focus on Personal LLM Agents, which are LLM-based agents that are deeply integrated with personal data and personal devices and used for personal assistance. We envision that Personal LLM Agents will become a major software paradigm for end-users in the upcoming era. To realize this vision, we take the first step to discuss several important questions about Personal LLM Agents, including their architecture, capability, efficiency and security. We start by summarizing the key components and design choices in the architecture of Personal LLM Agents, followed by an in-depth analysis of the opinions collected from domain experts. Next, we discuss several key challenges to achieve intelligent, efficient and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges.
Abstract:Low-earth orbit (LEO) satellite communication is one of the key enabling technologies in next-generation (6G) networks. However, downlink communication supported by a single satellite may not meet users' needs due to limited signal strength, especially in emergency scenarios. In this letter, we investigate an architecture of cell-free (CF) LEO satellite (CFLS) networks from a system-level perspective, where a user can be served by multiple satellites to improve its quality-of-service (QoS). Furthermore, we analyze the coverage and rate of a typical user in the CFLS network. Simulation and numerical results show that the CFLS network achieves a higher coverage probability than the traditional single-satellite-supported network. Moreover, the user's ergodic rate is maximized by selecting an appropriate number of serving satellites.
Abstract:The increasing attention given to AI Generated Content (AIGC) has brought a profound impact on various aspects of daily life, industrial manufacturing, and the academic sector. Recognizing the global trends and competitiveness in AIGC development, this study aims to analyze China's current status in the field. The investigation begins with an overview of the foundational technologies and current applications of AIGC. Subsequently, the study delves into the market status, policy landscape, and development trajectory of AIGC in China, utilizing keyword searches to identify relevant scholarly papers. Furthermore, the paper provides a comprehensive examination of AIGC products and their corresponding ecosystem, emphasizing the ecological construction of AIGC. Finally, this paper discusses the challenges and risks faced by the AIGC industry while presenting a forward-looking perspective on the industry's future based on competitive insights in AIGC.
Abstract:The False Base Station (FBS) attack has been a severe security problem for cellular networks since the 2G era. During handover, the user equipment (UE) periodically receives state information from surrounding base stations (BSs) and uploads it to the source BS. The source BS compares the uploaded signal powers and shifts the UE to the BS that can provide the strongest signal. An FBS can transmit a signal with the proper power and attract the UE to connect to it. In this paper, based on the 3GPP standard, a Precheck Sequence-based Detection (PSD) Scheme is proposed to secure the UE's transition to a legal base station (LBS). This scheme first analyzes the structure of received signals in blocks and symbols. Several additional symbols are added to the current signal sequence for verification. By designing a long table of symbol sequences, every UE that needs handover is allocated a specific sequence from this table. The simulation results show that the PSD Scheme performs better than existing schemes, even when a specific transmit power is designed for the FBS.
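The handover weakness described in the abstract above can be sketched as follows. This is a toy illustration under stated assumptions, not the PSD Scheme itself: the station names, power values, the 8-symbol sequence, and the helpers `pick_strongest` and `pick_verified` are all hypothetical, and how the real scheme embeds and checks its verification symbols is not specified here.

```python
def pick_strongest(reports):
    """Baseline handover: choose the BS with the highest reported power (dBm)."""
    return max(reports, key=reports.get)

def pick_verified(reports, carried, expected):
    """Assumed precheck: only consider BSs whose signal carries the expected sequence."""
    verified = {bs: p for bs, p in reports.items() if carried.get(bs) == expected}
    return max(verified, key=verified.get) if verified else None

# Hypothetical measurement report uploaded by the UE (values are made up).
reports = {"LBS-A": -95.0, "LBS-B": -90.0, "FBS": -70.0}

# Hypothetical verification sequence allocated to this UE from a table.
expected = (1, 0, 1, 1, 0, 0, 1, 0)

# Legal stations carry the allocated sequence; the false station does not know it.
carried = {"LBS-A": expected, "LBS-B": expected, "FBS": (0,) * 8}

naive_choice = pick_strongest(reports)                      # attracted by FBS power
checked_choice = pick_verified(reports, carried, expected)  # FBS filtered out
```

With power comparison alone, the strongest transmitter wins regardless of legitimacy; adding a per-UE check removes candidates that cannot present the allocated sequence before the power comparison is made.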
Abstract:One of the bottlenecks of modern communication is how to enable sensing and communication simultaneously without causing scheduling conflicts, and how sensing may be leveraged to improve directional communication accuracy. To this end, we propose and implement a novel peer-to-peer mmWave communication system to achieve joint beamforming and sensing. A radar- and IMU-assisted tracking and beamforming algorithm is designed and tested, and the results show robust tracking capability with higher overall throughput. The results point to promising future extensions in which, with refinements, the design and implementation can be deployed in a more scalable manner.