Department of Electrical and Computer Engineering, The University of Texas at Dallas, Richardson, 75080, Texas, USA
Abstract:The identification and restoration of ancient watermarks have long been a major topic in codicology and history. Classifying historical documents based on watermarks can be difficult due to the diversity of watermarks, crowded and noisy samples, multiple modes of representation, and minor distinctions between classes and intra-class changes. This paper proposes a U-net-based conditional generative adversarial network (GAN) to translate noisy raw historical watermarked images into clean, handwriting-free images with just watermarks. Considering its ability to perform image translation from degraded (noisy) pixels to clean pixels, the proposed network is termed as Npix2Cpix. Instead of employing directly degraded watermarked images, the proposed network uses image-to-image translation using adversarial learning to create clutter and handwriting-free images for restoring and categorizing the watermarks for the first time. In order to learn the mapping from input noisy image to output clean image, the generator and discriminator of the proposed U-net-based GAN are trained using two separate loss functions, each of which is based on the distance between images. After using the proposed GAN to pre-process noisy watermarked images, Siamese-based one-shot learning is used to classify watermarks. According to experimental results on a large-scale historical watermark dataset, extracting watermarks from tainted images can result in high one-shot classification accuracy. The qualitative and quantitative evaluation of the retrieved watermarks illustrates the effectiveness of the proposed approach.
Abstract:As a next-generation wireless technology, the in-band full-duplex (IBFD) transmission enables simultaneous transmission and reception of signals over the same frequency, thereby doubling spectral efficiency. Further, a continuous up-scaling of wireless network carrier frequencies arising from ever-increasing data traffic is driving research on integrated sensing and communications (ISAC) systems. In this context, we study the co-design of common waveforms, precoders, and filters for an IBFD multi-user (MU) multiple-input multiple-output (MIMO) communications with a distributed MIMO radar. This paper, along with companion papers (Part I and II), proposes a comprehensive MRMC framework that addresses all these challenges. In the companion papers, we developed signal processing and joint design algorithms for this distributed system. In this paper, we tackle multi-target detection, localization, and tracking. This co-design problem that includes practical MU-MIMO constraints on power and quality-of-service is highly non-convex. We propose a low-complexity procedure based on Barzilai-Borwein gradient algorithm to obtain the design parameters and mixed-integer linear program for distributed target localization. Numerical experiments demonstrate the feasibility and accuracy of multi-target sensing of the distributed FD ISAC system. Finally, we localize and track multiple targets by adapting the joint probabilistic data association and extended Kalman filter for this system.
Abstract:We address the challenge of spectral sharing between a statistical multiple-input multiple-output (MIMO) radar and an in-band full-duplex (IBFD) multi-user MIMO (MU-MIMO) communications system operating simultaneously in the same frequency band. Existing research on joint MIMO-radar-MIMO-communications (MRMC) systems has limitations, such as focusing on colocated MIMO radars, half-duplex MIMO communications, single-user scenarios, neglecting practical constraints, or employing separate transmit/receive units for MRMC coexistence. This paper, along with companion papers (Part I and III), proposes a comprehensive MRMC framework that addresses all these challenges. In the previous companion paper (Part I), we presented signal processing techniques for a distributed IBFD MRMC system. In this paper, we introduce joint design of statistical MIMO radar codes, uplink/downlink precoders, and corresponding receive filters using a novel metric called compounded-and-weighted sum mutual information. To solve the resulting highly non-convex problem, we employ a combination of block coordinate descent (BCD) and alternating projection methods. Numerical experiments show convergence of our algorithm, mitigation of uplink interference, and stable data rates under varying noise levels, channel estimate imperfections, and self-interference. The subsequent companion paper (Part III) extends the discussion to multiple targets and evaluates the tracking performance of our MRMC system.
Abstract:This paper addresses the beam-selection challenges in Multi-User Multiple Input Multiple Output (MU-MIMO) beamforming for mm-wave and THz channels, focusing on the pivotal aspect of spectral efficiency (SE) and computational efficiency. We introduce a novel approach, the Greedy Interference-Optimized Singular Vector Beam-selection (G-IOSVB) algorithm, which offers a strategic balance between high SE and low computational complexity. Our study embarks on a comparative analysis of G-IOSVB against the traditional IOSVB and the exhaustive Singular-Vector Beamspace Search (SVBS) algorithms. The findings reveal that while SVBS achieves the highest SE, it incurs significant computational costs, approximately 162 seconds per channel realization. In contrast, G-IOSVB aligns closely with IOSVB in SE performance yet is markedly more computationally efficient. Heatmaps vividly demonstrate this efficiency, highlighting G-IOSVB's reduced computation time without sacrificing SE. We also delve into the mathematical intricacies of G-IOSVB, demonstrating its theoretical and practical superiority through rigorous expressions and detailed algorithmic analysis. The numerical results illustrate that G-IOSVB stands out as an efficient, practical solution for MU-MIMO systems, making it a promising candidate for high-speed, high-efficiency wireless communication networks.
Abstract:This paper investigates the potential of contrastive learning in 6G ultra-massive multiple-input multiple-output (UM-MIMO) communication systems, specifically focusing on hybrid beamforming under imperfect channel state information (CSI) conditions at THz. UM-MIMO systems are promising for future 6G wireless communication networks due to their high spectral efficiency and capacity. The accuracy of CSI significantly influences the performance of UM-MIMO systems. However, acquiring perfect CSI is challenging due to various practical constraints such as channel estimation errors, feedback delays, and hardware imperfections. To address this issue, we propose a novel self-supervised contrastive learning-based approach for hybrid beamforming, which is robust against imperfect CSI. We demonstrate the power of contrastive learning to tackle the challenges posed by imperfect CSI and show that our proposed method results in improved system performance in terms of achievable rate compared to traditional methods.
Abstract:A person's movement or relative positioning effectively generates raw electrical signals that can be read by computing machines to apply various manipulative techniques for the classification of different human activities. In this paper, a stratified multi-structural approach based on a Residual network ensembled with Residual MobileNet is proposed, termed as FusionActNet. The proposed method involves using carefully designed Residual blocks for classifying the static and dynamic activities separately because they have clear and distinct characteristics that set them apart. These networks are trained independently, resulting in two specialized and highly accurate models. These models excel at recognizing activities within a specific superclass by taking advantage of the unique algorithmic benefits of architectural adjustments. Afterward, these two ResNets are passed through a weighted ensemble-based Residual MobileNet. Subsequently, this ensemble proficiently discriminates between a specific static and a specific dynamic activity, which were previously identified based on their distinct feature characteristics in the earlier stage. The proposed model is evaluated using two publicly accessible datasets; namely, UCI HAR and Motion-Sense. Therein, it successfully handled the highly confusing cases of data overlap. Therefore, the proposed approach achieves a state-of-the-art accuracy of 96.71% and 95.35% in the UCI HAR and Motion-Sense datasets respectively.
Abstract:With the huge technological advances introduced by deep learning in audio & speech processing, many novel synthetic speech techniques achieved incredible realistic results. As these methods generate realistic fake human voices, they can be used in malicious acts such as people imitation, fake news, spreading, spoofing, media manipulations, etc. Hence, the ability to detect synthetic or natural speech has become an urgent necessity. Moreover, being able to tell which algorithm has been used to generate a synthetic speech track can be of preeminent importance to track down the culprit. In this paper, a novel strategy is proposed to attribute a synthetic speech track to the generator that is used to synthesize it. The proposed detector transforms the audio into log-mel spectrogram, extracts features using CNN, and classifies it between five known and unknown algorithms, utilizing semi-supervision and ensemble to improve its robustness and generalizability significantly. The proposed detector is validated on two evaluation datasets consisting of a total of 18,000 weakly perturbed (Eval 1) & 10,000 strongly perturbed (Eval 2) synthetic speeches. The proposed method outperforms other top teams in accuracy by 12-13% on Eval 2 and 1-2% on Eval 1, in the IEEE SP Cup challenge at ICASSP 2022.
Abstract:Multiple-input multiple-output (MIMO) systems play an essential role in direction-of-arrival (DOA) estimation. A large number of antennas used in a MIMO system imposes a huge complexity burden on the popular DOA estimation algorithms, such as MUSIC and ESPRIT due to the implementation of eigenvalue decomposition. This renders those algorithms impractical in applications requiring quick DOA estimation. Consequently, we theoretically derive several useful noise subspace vectors when the number of signal sources is less than the number of elements in both the transmitter and receiver sides. Those noise subspace vectors are then utilized to formulate a 2D-constrained minimization problem, solved iteratively to obtain the DOAs of all the sources in a scene. The convergence of the proposed iterative algorithm has been mathematically as well as numerically demonstrated. Depending on the number of iterations, our algorithm can provide significant complexity gain over the existing high-resolution 2D DOA estimation algorithms in MIMO systems, such as MUSIC, while exhibiting comparable performance for a moderate to high signal-to-noise ratio (SNR).
Abstract:The recent pandemic has refocused the medical world's attention on the diagnostic techniques associated with cardiovascular disease. Heart rate provides a real-time snapshot of cardiovascular health. A more precise heart rate reading provides a better understanding of cardiac muscle activity. Although many existing diagnostic techniques are approaching the limits of perfection, there remains potential for further development. In this paper, we propose MIBINET, a convolutional neural network for real-time proctoring of heart rate via inter-beat-interval (IBI) from millimeter wave (mm-wave) radar ballistocardiography signals. This network can be used in hospitals, homes, and passenger vehicles due to its lightweight and contactless properties. It employs classical signal processing prior to fitting the data into the network. Although MIBINET is primarily designed to work on mm-wave signals, it is found equally effective on signals of various modalities such as PCG, ECG, and PPG. Extensive experimental results and a thorough comparison with the current state-of-the-art on mm-wave signals demonstrate the viability and versatility of the proposed methodology. Keywords: Cardiovascular disease, contactless measurement, heart rate, IBI, mm-wave radar, neural network
Abstract:Apart from the conventional parameters (such as signal-to-noise ratio, array geometry and size, sample size), several other factors (e.g. alignment of the antenna elements, polarization parameters) influence the performance of direction of arrival (DOA) estimating algorithms. When all the antenna elements are identically aligned, the polarization parameters do not affect the steering vectors, which is the underlying assumption of all the conventional DOA algorithms. Unfortunately, in this case, for a given set of DOA angles there exists a range of polarization parameters which could result in a very low signal-to-noise ratio (SNR) across all the antenna elements in the array. To avoid this type of catastrophic event, different antenna element needs to be aligned differently. However, this fact will make almost all commonly used DOA estimation algorithms non-operable, since the steering vectors are contaminated by the polarization parameters. To the best of our knowledge, no work in the literature addresses this issue even for a single user environment. In this paper, that line of inquiry is pursued. We consider a circular array with the minimum number of antenna elements and propose an antenna alignment scheme to ensure that at any given point no more than one element will suffer from significantly low SNR due to the contribution of polarization. A low complexity algorithm that estimates the DOA angles in a closed-form manner is developed. We treat MUSIC as the baseline algorithm and demonstrate how it can reliably operate in all possible DOA and polarization environments. Finally, a thorough performance and complexity analysis are illustrated for the above two algorithms.