Abstract:Although single-task medical image restoration (MedIR) has witnessed remarkable success, the limited generalizability of these methods poses a substantial obstacle to wider application. In this paper, we focus on the task of all-in-one medical image restoration, aiming to address multiple distinct MedIR tasks with a single universal model. Nonetheless, due to significant differences between different MedIR tasks, training a universal model often encounters task interference issues, where different tasks with shared parameters may conflict with each other in the gradient update direction. This task interference leads to deviation of the model update direction from the optimal path, thereby affecting the model's performance. To tackle this issue, we propose a task-adaptive routing strategy, allowing conflicting tasks to select different network paths in spatial and channel dimensions, thereby mitigating task interference. Experimental results demonstrate that our proposed \textbf{A}ll-in-one \textbf{M}edical \textbf{I}mage \textbf{R}estoration (\textbf{AMIR}) network achieves state-of-the-art performance in three MedIR tasks: MRI super-resolution, CT denoising, and PET synthesis, both in single-task and all-in-one settings. The code and data will be available at \href{https://github.com/Yaziwel/All-In-One-Medical-Image-Restoration-via-Task-Adaptive-Routing.git}{https://github.com/Yaziwel/AMIR}.
Abstract:Robotic technology has been widely used in nowadays society, which has made great progress in various fields such as agriculture, manufacturing and entertainment. In this paper, we focus on the topic of drumming robots in entertainment. To this end, we introduce an improving drumming robot that can automatically complete music transcription based on the popular vision transformer network based on the attention mechanism. Equipped with the attention transformer network, our method can efficiently handle the sequential audio embedding input and model their global long-range dependencies. Massive experimental results demonstrate that the improving algorithm can help the drumming robot promote drum classification performance, which can also help the robot to enjoy a variety of smart applications and services.
Abstract:Knowledge graph embedding (KGE), which maps entities and relations in a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult to inductively infer by KGEs. To address this challenge, we resort to analogical inference and propose a novel and general self-supervised framework AnKGE to enhance KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects from entity-level, relation-level, and triple-level. And in AnKGE, we train an analogy function for each level of analogical inference with the original element embedding from a well-trained KGE model as input, which outputs the analogical object embedding. In order to combine inductive inference capability from the original KGE model and analogical inference capability enhanced by AnKGE, we interpolate the analogy score with the base model score and introduce the adaptive weights in the score function for prediction. Through extensive experiments on FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on link prediction task and well performs analogical inference.
Abstract:Spectral graph convolutional neural networks (GCNNs) have been producing encouraging results in graph classification tasks. However, most spectral GCNNs utilize fixed graphs when aggregating node features, while omitting edge feature learning and failing to get an optimal graph structure. Moreover, many existing graph datasets do not provide initialized edge features, further restraining the ability of learning edge features via spectral GCNNs. In this paper, we try to address this issue by designing an edge feature scheme and an add-on layer between every two stacked graph convolution layers in GCNN. Both are lightweight while effective in filling the gap between edge feature learning and performance enhancement of graph classification. The edge feature scheme makes edge features adapt to node representations at different graph convolution layers. The add-on layers help adjust the edge features to an optimal graph structure. To test the effectiveness of our method, we take Euclidean positions as initial node features and extract graphs with semantic information from point cloud objects. The node features of our extracted graphs are more scalable for edge feature learning than most existing graph datasets (in one-hot encoded label format). Three new graph datasets are constructed based on ModelNet40, ModelNet10 and ShapeNet Part datasets. Experimental results show that our method outperforms state-of-the-art graph classification methods on the new datasets by reaching 96.56% overall accuracy on Graph-ModelNet40, 98.79% on Graph-ModelNet10 and 97.91% on Graph-ShapeNet Part. The constructed graph datasets will be released to the community.
Abstract:This article introduces a neural network-based signal processing framework for intelligent reflecting surface (IRS) aided wireless communications systems. By modeling radio-frequency (RF) impairments inside the "meta-atoms" of IRS (including nonlinearity and memory effects), we present an approach that generalizes the entire IRS-aided system as a reservoir computing (RC) system, an efficient recurrent neural network (RNN) operating in a state near the "edge of chaos". This framework enables us to take advantage of the nonlinearity of this "fabricated" wireless environment to overcome link degradation due to model mismatch. Accordingly, the randomness of the wireless channel and RF imperfections are naturally embedded into the RC framework, enabling the internal RC dynamics lying on the edge of chaos. Furthermore, several practical issues, such as channel state information acquisition, passive beamforming design, and physical layer reference signal design, are discussed.
Abstract:The performance of federated learning systems is bottlenecked by communication costs and training variance. The communication overhead problem is usually addressed by three communication-reduction techniques, namely, model compression, partial device participation, and periodic aggregation, at the cost of increased training variance. Different from traditional distributed learning systems, federated learning suffers from data heterogeneity (since the devices sample their data from possibly different distributions), which induces additional variance among devices during training. Various variance-reduced training algorithms have been introduced to combat the effects of data heterogeneity, while they usually cost additional communication resources to deliver necessary control information. Additionally, data privacy remains a critical issue in FL, and thus there have been attempts at bringing Differential Privacy to this framework as a mediator between utility and privacy requirements. This paper investigates the trade-offs between communication costs and training variance under a resource-constrained federated system theoretically and experimentally, and how communication reduction techniques interplay in a differentially private setting. The results provide important insights into designing practical privacy-aware federated learning systems.
Abstract:Deep reinforcement learning (DRL) has been shown to be successful in many application domains. Combining recurrent neural networks (RNNs) and DRL further enables DRL to be applicable in non-Markovian environments by capturing temporal information. However, training of both DRL and RNNs is known to be challenging requiring a large amount of training data to achieve convergence. In many targeted applications, such as those used in the fifth generation (5G) cellular communication, the environment is highly dynamic while the available training data is very limited. Therefore, it is extremely important to develop DRL strategies that are capable of capturing the temporal correlation of the dynamic environment requiring limited training overhead. In this paper, we introduce the deep echo state Q-network (DEQN) that can adapt to the highly dynamic environment in a short period of time with limited training data. We evaluate the performance of the introduced DEQN method under the dynamic spectrum sharing (DSS) scenario, which is a promising technology in 5G and future 6G networks to increase the spectrum utilization. Compared to conventional spectrum management policy that grants a fixed spectrum band to a single system for exclusive access, DSS allows the secondary system to share the spectrum with the primary system. Our work sheds light on the application of an efficient DRL framework in highly dynamic environments with limited available training data.
Abstract:Relevance and diversity are both important to the success of recommender systems, as they help users to discover from a large pool of items a compact set of candidates that are not only interesting but exploratory as well. The challenge is that relevance and diversity usually act as two competing objectives in conventional recommender systems, which necessities the classic trade-off between exploitation and exploration. Traditionally, higher diversity often means sacrifice on relevance and vice versa. We propose a new approach, heterogeneous inference, which extends the general collaborative filtering (CF) by introducing a new way of CF inference, negative-to-positive. Heterogeneous inference achieves divergent relevance, where relevance and diversity support each other as two collaborating objectives in one recommendation model, and where recommendation diversity is an inherent outcome of the relevance inference process. Benefiting from its succinctness and flexibility, our approach is applicable to a wide range of recommendation scenarios/use-cases at various sophistication levels. Our analysis and experiments on public datasets and real-world production data show that our approach outperforms existing methods on relevance and diversity simultaneously.
Abstract:Fog nodes in the vicinity of IoT devices are promising to provision low latency services by offloading tasks from IoT devices to them. Mobile IoT is composed by mobile IoT devices such as vehicles, wearable devices and smartphones. Owing to the time-varying channel conditions, traffic loads and computing loads, it is challenging to improve the quality of service (QoS) of mobile IoT devices. As task delay consists of both the transmission delay and computing delay, we investigate the resource allocation (i.e., including both radio resource and computation resource) in both the wireless channel and fog node to minimize the delay of all tasks while their QoS constraints are satisfied. We formulate the resource allocation problem into an integer non-linear problem, where both the radio resource and computation resource are taken into account. As IoT tasks are dynamic, the resource allocation for different tasks are coupled with each other and the future information is impractical to be obtained. Therefore, we design an on-line reinforcement learning algorithm to make the sub-optimal decision in real time based on the historical data. The performance of the designed algorithm has been demonstrated by extensive simulation results.
Abstract:In this paper, we investigate learning-based MIMO-OFDM symbol detection strategies focusing on a special recurrent neural network (RNN) -- reservoir computing (RC). We first introduce the Time-Frequency RC to take advantage of the structural information inherent in OFDM signals. Using the time domain RC and the time-frequency RC as the building blocks, we provide two extensions of the shallow RC to RCNet: 1) Stacking multiple time domain RCs; 2) Stacking multiple time-frequency RCs into a deep structure. The combination of RNN dynamics, the time-frequency structure of MIMO-OFDM signals, and the deep network enables RCNet to handle the interference and nonlinear distortion of MIMO-OFDM signals to outperform existing methods. Unlike most existing NN-based detection strategies, RCNet is also shown to provide a good generalization performance even with a limited training set (i.e, similar amount of reference signals/training as standard model-based approaches). Numerical experiments demonstrate that the introduced RCNet can offer a faster learning convergence and as much as 20% gain in bit error rate over a shallow RC structure by compensating for the nonlinear distortion of the MIMO-OFDM signal, such as due to power amplifier compression in the transmitter or due to finite quantization resolution in the receiver.