Abstract:Theory of Mind (ToM) is the ability to reason about one's own and others' mental states. ToM plays a critical role in the development of intelligence, language understanding, and cognitive processes. While previous work has primarily focused on first and second-order ToM, we explore higher-order ToM, which involves recursive reasoning on others' beliefs. We introduce HI-TOM, a Higher Order Theory of Mind benchmark. Our experimental evaluation using various Large Language Models (LLMs) indicates a decline in performance on higher-order ToM tasks, demonstrating the limitations of current LLMs. We conduct a thorough analysis of different failure cases of LLMs, and share our thoughts on the implications of our findings on the future of NLP.
Abstract:In this paper, we study the problem of robust sparse mean estimation, where the goal is to estimate a $k$-sparse mean from a collection of partially corrupted samples drawn from a heavy-tailed distribution. Existing estimators face two critical challenges in this setting. First, they are limited by a conjectured computational-statistical tradeoff, implying that any computationally efficient algorithm needs $\tilde\Omega(k^2)$ samples, while its statistically-optimal counterpart only requires $\tilde O(k)$ samples. Second, the existing estimators fall short of practical use as they scale poorly with the ambient dimension. This paper presents a simple mean estimator that overcomes both challenges under moderate conditions: it runs in near-linear time and memory (both with respect to the ambient dimension) while requiring only $\tilde O(k)$ samples to recover the true mean. At the core of our method lies an incremental learning phenomenon: we introduce a simple nonconvex framework that can incrementally learn the top-$k$ nonzero elements of the mean while keeping the zero elements arbitrarily small. Unlike existing estimators, our method does not need any prior knowledge of the sparsity level $k$. We prove the optimality of our estimator by providing a matching information-theoretic lower bound. Finally, we conduct a series of simulations to corroborate our theoretical findings. Our code is available at https://github.com/huihui0902/Robust_mean_estimation.
Abstract:Integrated sensing and communication (ISAC) has recently been considered as a promising approach to save spectrum resources and reduce hardware cost. Meanwhile, as information security becomes increasingly more critical issue, government agencies urgently need to legitimately monitor suspicious communications via proactive eavesdropping. Thus, in this paper, we investigate a wireless legitimate surveillance system with radar function. We seek to jointly optimize the receive and transmit beamforming vectors to maximize the eavesdropping success probability which is transformed into the difference of signal-to-interference-plus-noise ratios (SINRs) subject to the performance requirements of radar and surveillance. The formulated problem is challenging to solve. By employing the Rayleigh quotient and fully exploiting the structure of the problem, we apply the divide-and-conquer principle to divide the formulated problem into two subproblems for two different cases. For the first case, we aim at minimizing the total transmit power, and for the second case we focus on maximizing the jamming power. For both subproblems, with the aid of orthogonal decomposition, we obtain the optimal solution of the receive and transmit beamforming vectors in closed-form. Performance analysis and discussion of some insightful results are also carried out. Finally, extensive simulation results demonstrate the effectiveness of our proposed algorithm in terms of eavesdropping success probability.
Abstract:Integrated sensing and communication (ISAC) has been regarded as one of the most promising technologies for future wireless communications. However, the mutual interference in the communication radar coexistence system cannot be ignored. Inspired by the studies of reconfigurable intelligent surface (RIS), we propose a double-RIS-assisted coexistence system where two RISs are deployed for enhancing communication signals and suppressing mutual interference. We aim to jointly optimize the beamforming of RISs and radar to maximize communication performance while maintaining radar detection performance. The investigated problem is challenging, and thus we transform it into an equivalent but more tractable form by introducing auxiliary variables. Then, we propose a penalty dual decomposition (PDD)-based algorithm to solve the resultant problem. Moreover, we consider two special cases: the large radar transmit power scenario and the low radar transmit power scenario. For the former, we prove that the beamforming design is only determined by the communication channel and the corresponding optimal joint beamforming strategy can be obtained in closed-form. For the latter, we minimize the mutual interference via the block coordinate descent (BCD) method. By combining the solutions of these two cases, a low-complexity algorithm is also developed. Finally, simulation results show that both the PDD-based and low-complexity algorithms outperform benchmark algorithms.
Abstract:In cellular federated edge learning (FEEL), multiple edge devices holding local data jointly train a learning algorithm by communicating learning updates with an access point without exchanging their data samples. With limited communication resources, it is beneficial to schedule the most informative local learning update. In this paper, a novel scheduling policy is proposed to exploit both diversity in multiuser channels and diversity in the importance of the edge devices' learning updates. First, a new probabilistic scheduling framework is developed to yield unbiased update aggregation in FEEL. The importance of a local learning update is measured by gradient divergence. If one edge device is scheduled in each communication round, the scheduling policy is derived in closed form to achieve the optimal trade-off between channel quality and update importance. The probabilistic scheduling framework is then extended to allow scheduling multiple edge devices in each communication round. Numerical results obtained using popular models and learning datasets demonstrate that the proposed scheduling policy can achieve faster model convergence and higher learning accuracy than conventional scheduling policies that only exploit a single type of diversity.