Sherman
Abstract:Segmentation of organs of interest in medical CT images is beneficial for the diagnosis of diseases. Although recent methods based on Fully Convolutional Neural Networks (F-CNNs) have shown success in many segmentation tasks, fusing features from images with different scales remains a challenge for two reasons: (1) lacking spatial awareness, F-CNNs share the same weights at different spatial locations; (2) F-CNNs can only obtain surrounding information through local receptive fields. To address this challenge, we propose a new segmentation framework based on attention mechanisms, named MFA-Net (Multi-Scale Feature Fusion Attention Network). The proposed framework learns more meaningful feature maps across multiple scales and yields more accurate automatic segmentation. We compare the proposed MFA-Net with state-of-the-art (SOTA) methods on two 2D liver CT datasets. The experimental results show that MFA-Net produces more precise segmentation on images with different scales.
Abstract:Fifth generation (5G) mobile communication systems have entered the stage of commercial development, providing users with new services and improved user experiences as well as offering a host of novel opportunities to various industries. However, 5G still faces many challenges. To address these challenges, international industrial, academic, and standards organizations have commenced research on sixth generation (6G) wireless communication systems. A series of white papers and survey papers have been published, which aim to define 6G in terms of requirements, application scenarios, key technologies, etc. Although ITU-R has been working on the 6G vision and it is expected to reach a consensus on what 6G will be by mid-2023, the related global discussions are still wide open and the existing literature has identified numerous open issues. This paper first provides a comprehensive portrayal of the 6G vision, technical requirements, and application scenarios, covering the current common understanding of 6G. Then, a critical appraisal of the 6G network architecture and key technologies is presented. Furthermore, existing testbeds and advanced 6G verification platforms are detailed for the first time. In addition, future research directions and open challenges are identified for stimulating the on-going global debate. Finally, lessons learned to date concerning 6G networks are discussed.
Abstract:Human beings express emotions in rich ways, including facial actions, voice, and natural language. Because individuals are diverse and complex, the emotions expressed through different modalities may be semantically irrelevant to one another. Directly fusing information from different modalities therefore exposes the model to noise from semantically irrelevant modalities. To tackle this problem, we propose a multimodal relevance estimation network to capture the relevant semantics among modalities in multimodal emotions. Specifically, we use an attention mechanism to reflect the semantic relevance weight of each modality. Moreover, we propose a relevant semantic estimation loss to weakly supervise the semantics of each modality. Furthermore, we use contrastive learning to optimize the similarity of category-level modality-relevant semantics across different modalities in feature space, thereby bridging the semantic gap between heterogeneous modalities. To better reflect the emotional state in real interactive scenarios and to perform semantic relevance analysis, we collect a single-label discrete multimodal emotion dataset named SDME, which enables researchers to conduct multimodal semantic relevance research with large category bias. Experiments on continuous and discrete emotion datasets show that our model can effectively capture the relevant semantics, especially when modal semantics deviate substantially. The code and SDME dataset will be publicly available.
Abstract:In the design of wireless systems, quantization plays a critical role in hardware and directly affects both area efficiency and energy efficiency. The wide application of multiple-input multiple-output (MIMO), an enabling technique, relies heavily on efficient implementations that balance performance and complexity. However, most existing detectors uniformly quantize all variables, resulting in high redundancy and low flexibility. In-depth tailored quantization requires both expertise and effort, usually incurs prohibitive costs, and is therefore not considered by conventional MIMO detectors. In this paper, a general framework named automatic hybrid-precision quantization (AHPQ) is proposed with two parts: integral quantization determined by the probability density function (PDF), and fractional quantization determined by deep reinforcement learning (DRL). Being automatic, AHPQ is highly efficient at finding good quantizations for a set of algorithmic parameters. For the approximate message passing (AMP) detector, AHPQ achieves up to $58.7\%$ lower average bitwidth than its unified quantization (UQ) counterpart with almost no performance sacrifice. The feasibility of AHPQ has been verified by implementation in $65$ nm CMOS technology. Compared with its UQ counterpart, AHPQ exhibits a $2.97\times$ higher throughput-to-area ratio (TAR) with $19.3\%$ lower energy dissipation. Moreover, through node compression and strength reduction, the AHPQ detector outperforms the state-of-the-art (SOA) in both throughput ($17.92$ Gb/s) and energy efficiency ($7.93$ pJ/b). The proposed AHPQ framework is also applicable to other digital signal processing algorithms.
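To make the split between integer and fractional bitwidths in the abstract above concrete, the following is a minimal sketch of signed fixed-point quantization. It is not the AHPQ procedure itself: the function names `quantize` and `integer_bits_for` are hypothetical, the integer bitwidth is chosen from the empirical peak of a sample set as a crude stand-in for the PDF-based rule, and the fractional bitwidth is simply passed in rather than searched by DRL.

```python
def quantize(x, int_bits, frac_bits):
    """Round x to a signed fixed-point grid and saturate on overflow.

    Assumed format (illustrative only): one sign bit, int_bits integer
    bits, and frac_bits fractional bits.
    """
    step = 2.0 ** -frac_bits           # resolution set by the fractional bits
    max_val = 2.0 ** int_bits - step   # largest representable value
    min_val = -(2.0 ** int_bits)       # smallest representable value
    q = round(x / step) * step         # round to the nearest grid point
    return max(min_val, min(max_val, q))


def integer_bits_for(samples):
    """Smallest int_bits whose range covers the peak magnitude of samples.

    This uses the empirical peak as a simple proxy for a rule derived
    from the variable's distribution.
    """
    peak = max(abs(s) for s in samples)
    bits = 0
    while 2.0 ** bits <= peak:
        bits += 1
    return bits


if __name__ == "__main__":
    samples = [0.5, -3.2, 1.1]
    ib = integer_bits_for(samples)
    print(ib, [quantize(s, ib, 2) for s in samples])
```

Giving each variable its own `(int_bits, frac_bits)` pair, rather than one shared pair, is what allows the average bitwidth to drop below that of a uniform format.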
Abstract:Compared with linear MIMO detectors, the Belief Propagation (BP) detector has shown a greater capability to achieve near-optimal performance and is better suited to iterative cooperation with channel decoders. Aiming at real applications, recent works mainly reduce complexity through simplified calculations, at the expense of performance. However, the complexity remains unsatisfactory, as it either grows exponentially or requires exponentiation operations. Furthermore, due to their inherent loopy structure, existing BP detectors persistently encounter an error floor in the high signal-to-noise ratio (SNR) region, which becomes even worse with calculation approximations. This work proposes a revised BP detector, named the Belief-selective Propagation (BsP) detector, which selectively uses for updates only the \emph{trusted} incoming messages with sufficiently large \textit{a priori} probabilities. Two proposed strategies, symbol-based truncation (ST) and edge-based simplification (ES), reduce the complexity to orders of magnitude below that of the Original-BP while greatly relieving the error floor issue over a wide range of antenna and modulation combinations. For the $16$-QAM $8 \times 4$ MIMO system, the $\mathcal{B}(1,1)$ BsP detector achieves more than $4$\,dB performance gain (at $\text{BER}=10^{-4}$) with roughly $4$ orders of magnitude lower complexity than the Original-BP detector. The trade-off between performance and complexity for different application requirements can be conveniently adjusted by configuring the ST and ES parameters.
Abstract:A similarity metric is crucial for massive MIMO positioning that utilizes channel state information (CSI). In this letter, we propose a novel massive MIMO CSI similarity learning method based on a deep convolutional neural network (DCNN) and contrastive learning. A contrastive loss function is designed that considers multiple positive and negative CSI samples drawn from a training dataset. The DCNN encoder is trained with this loss so that positive samples are mapped to points close to the anchor's encoding, while the encodings of negative samples are kept away from the anchor's in the representation space. Evaluation results of fingerprint-based positioning on a real-world CSI dataset show that the learned similarity metric significantly improves positioning accuracy compared with other known state-of-the-art methods.
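The pull-together, push-apart behaviour described in the abstract above can be illustrated with a generic contrastive loss over several positives and negatives. This is a sketch under stated assumptions, not the loss designed in the letter: it uses cosine similarity, a softmax over all samples, and a hypothetical `temperature` parameter, and it operates on plain lists standing in for encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Average negative log-softmax score of the positives.

    The loss is small when positives are more similar to the anchor
    than negatives are (an assumed, generic formulation).
    """
    pos_logits = [cosine(anchor, p) / temperature for p in positives]
    neg_logits = [cosine(anchor, n) / temperature for n in negatives]
    all_logits = pos_logits + neg_logits
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(all_logits)
    log_denominator = m + math.log(sum(math.exp(z - m) for z in all_logits))
    return -sum(z - log_denominator for z in pos_logits) / len(pos_logits)


if __name__ == "__main__":
    anchor = [1.0, 0.0]
    near = [[0.9, 0.1], [1.0, 0.2]]
    far = [[0.0, 1.0], [-1.0, 0.1]]
    print(contrastive_loss(anchor, near, far))
    print(contrastive_loss(anchor, far, near))
```

Swapping the roles of the near and far samples raises the loss, which is the gradient signal that moves positive encodings toward the anchor during training.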
Abstract:Synthetic aperture radar (SAR) is considered a good option for earth observation owing to its unique advantages. In this paper, we propose an adaptive ship detector using full-polarization SAR images. First, by thoroughly investigating the scattering characteristics of ships and their background, together with the wave polarization anisotropy, we propose a novel ship detector that combines the two characteristics, named Scattering-Anisotropy joint (joint-SA). Based on theoretical analysis, we show that joint-SA is an effective physical quantity for distinguishing a ship from its background, and thus can be used for ship detection in full-polarization image data. Second, the generalized Gamma distribution is used to characterize the joint-SA statistics of sea clutter over a large range of homogeneity. As a result, an adaptive constant false alarm rate (CFAR) method is implemented based on joint-SA. Finally, RADARSAT-2 and GF-3 data in C-band and ALOS data in L-band are used for verification. We test on five datasets, and the experimental results verify the correctness and superiority of the CFAR method based on joint-SA. In addition, the experimental results show that the signal-clutter ratio (SCR) of the proposed joint-SA detector (33.17 dB, 35.98 dB, 57.25 dB) is better than that of DBSP (8.92 dB, 3.43 dB, 25.40 dB) and RsDVH (17.28 dB, 11.17 dB, 54.55 dB). More importantly, the proposed joint-SA detector has higher detection accuracy and a lower false alarm rate.
Abstract:We propose a novel soft-output joint channel estimation and data detection (JED) algorithm for multiuser (MU) multiple-input multiple-output (MIMO) wireless communication systems. Our algorithm approximately solves a maximum a-posteriori JED optimization problem using deep unfolding and generates soft-output information for the transmitted bits in every iteration. The parameters of the unfolded algorithm are computed by a hyper-network that is trained with a binary cross entropy (BCE) loss. We evaluate the performance of our algorithm in a coded MU-MIMO system with 8 basestation antennas and 4 user equipments and compare it to state-of-the-art algorithms that separate channel estimation from soft-output data detection. Our results demonstrate that our JED algorithm outperforms such data detectors with as few as 10 iterations.
Abstract:We propose a joint channel estimation and data detection (JED) algorithm for densely-populated cell-free massive multiuser (MU) multiple-input multiple-output (MIMO) systems, which reduces the channel training overhead caused by the presence of hundreds of simultaneously transmitting user equipments (UEs). Our algorithm iteratively solves a relaxed version of a maximum a-posteriori JED problem and simultaneously exploits the sparsity of cell-free massive MU-MIMO channels as well as the boundedness of QAM constellations. In order to improve the performance and convergence of the algorithm, we propose methods that permute the access point and UE indices to form so-called virtual cells, which leads to better initial solutions. We assess the performance of our algorithm in terms of root-mean-squared-symbol error, bit error rate, and mutual information, and we demonstrate that JED significantly reduces the pilot overhead compared to orthogonal training, which enables reliable communication with short packets to a large number of UEs.
Abstract:In this paper, we present a sparse neural network decoder (SNND) for polar codes based on belief propagation (BP) and deep learning. First, the conventional factor graph of polar BP decoding is converted to a bipartite Tanner graph similar to that of low-density parity-check (LDPC) codes. The Tanner graph is then unfolded and translated into the graphical representation of a deep neural network (DNN). The complex sum-product algorithm (SPA) is replaced by the low-complexity min-sum (MS) approximation. We dramatically reduce the number of weights by using a single weight to parameterize the networks. Optimized with the training techniques of deep learning, the proposed SNND achieves decoding performance comparable to that of SPA and obtains about $0.5$ dB gain over MS decoding on ($128,64$) and ($256,128$) codes. Moreover, a $60 \%$ complexity reduction is achieved, and the decoding latency is significantly lower than that of conventional polar BP.
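The replacement of SPA by the min-sum approximation mentioned in the abstract above can be illustrated on a single check-node update over log-likelihood ratios (LLRs). This is a minimal sketch, not the SNND itself: the function names are hypothetical, and the way the single `weight` enters (as a scaling factor on the min-sum output, in the style of normalized min-sum) is an assumption, with its value chosen by hand here rather than learned.

```python
import math


def spa_check_update(llrs):
    """Sum-product check-node update: 2 * atanh(prod(tanh(L / 2)))."""
    prod = 1.0
    for llr in llrs:
        prod *= math.tanh(llr / 2.0)
    # Clip so that saturated inputs do not push atanh outside its domain.
    prod = max(min(prod, 1.0 - 1e-12), -1.0 + 1e-12)
    return 2.0 * math.atanh(prod)


def min_sum_check_update(llrs, weight=1.0):
    """Min-sum approximation: product of signs times minimum magnitude.

    `weight` is an assumed scaling factor; weight=1.0 gives plain min-sum.
    """
    sign = 1.0
    for llr in llrs:
        if llr < 0:
            sign = -sign
    return weight * sign * min(abs(llr) for llr in llrs)


if __name__ == "__main__":
    incoming = [1.5, -2.0, 3.0]
    print(spa_check_update(incoming))
    print(min_sum_check_update(incoming))
    print(min_sum_check_update(incoming, weight=0.7))
```

Plain min-sum avoids the `tanh` and `atanh` evaluations but overestimates the magnitude of the outgoing message; a scaling weight below one shrinks it back toward the SPA value, which is the kind of correction a trained weight can provide.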