Abstract:Orthogonal Time Frequency Space (OTFS) modulation has recently attracted significant interest due to its potential for enabling reliable communication in high-mobility environments. One of the challenges for OTFS receivers is the fractional Doppler that occurs in practical systems, resulting in decreased channel sparsity, and then inaccurate channel estimation and high-complexity equalization. In this paper, we propose a novel unsupervised deep learning (DL)-based OTFS channel estimation and symbol detection scheme, capable of handling different channel conditions, even in the presence of fractional Doppler. In particular, we design a unified plug-and-play (PnP) framework, which can jointly exploit the flexibility of optimization-based methods and utilize the powerful data-driven capability of DL. A lightweight Unet is integrated into the framework as a powerful implicit channel prior for channel estimation, leading to better exploitation of the channel sparsity and the characteristic of the noise simultaneously. Furthermore, to mitigate the channel estimation errors, we realize the PnP framework with a fully connected (FC) network for symbol detection at different noise levels, thereby enhancing robustness. Finally, numerical results demonstrate the effectiveness and robustness of the algorithm.
Abstract:Chemistry and materials science are complex. Recently, there have been great successes in addressing this complexity using data-driven or computational techniques. Yet, the necessity of input structured in very specific forms and the fact that there is an ever-growing number of tools creates usability and accessibility challenges. Coupled with the reality that much data in these disciplines is unstructured, the effectiveness of these tools is limited. Motivated by recent works that indicated that large language models (LLMs) might help address some of these issues, we organized a hackathon event on the applications of LLMs in chemistry, materials science, and beyond. This article chronicles the projects built as part of this hackathon. Participants employed LLMs for various applications, including predicting properties of molecules and materials, designing novel interfaces for tools, extracting knowledge from unstructured data, and developing new educational applications. The diverse topics and the fact that working prototypes could be generated in less than two days highlight that LLMs will profoundly impact the future of our fields. The rich collection of ideas and projects also indicates that the applications of LLMs are not limited to materials science and chemistry but offer potential benefits to a wide range of scientific disciplines.
Abstract:We propose two improvements to target-speaker voice activity detection (TS-VAD), the core component in our proposed speaker diarization system that was submitted to the 2022 Multi-Channel Multi-Party Meeting Transcription (M2MeT) challenge. These techniques are designed to handle multi-speaker conversations in real-world meeting scenarios with high speaker-overlap ratios and under heavy reverberant and noisy condition. First, for data preparation and augmentation in training TS-VAD models, speech data containing both real meetings and simulated indoor conversations are used. Second, in refining results obtained after TS-VAD based decoding, we perform a series of post-processing steps to improve the VAD results needed to reduce diarization error rates (DERs). Tested on the ALIMEETING corpus, the newly released Mandarin meeting dataset used in M2MeT, we demonstrate that our proposed system can decrease the DER by up to 66.55/60.59% relatively when compared with classical clustering based diarization on the Eval/Test set.
Abstract:Convolutional Neural Networks(CNNs) has achieved remarkable performance breakthrough in Euclidean structure data. Recently, aggregation-transformation based Graph Neural networks(GNNs) gradually produce a powerful performance on non-Euclidean data. In this paper, we propose a cross-correlation based graph convolution method allowing to naturally generalize CNNs to non-Euclidean domains and inherit the excellent natures of CNNs, such as local filters, parameter sharing, flexible receptive field, etc. Meanwhile, it leverages dynamically generated convolution kernel and cross-correlation operators to address the shortcomings of prior methods based on aggregation-transformation or their approximations. Our method has achieved or matched popular state-of-the-art results across three established graph benchmarks: the Cora, Citeseer, and Pubmed citation network datasets.