Abstract:Model memorization has implications for both the generalization capacity of machine learning models and the privacy of their training data. This paper investigates label memorization in binary classification models through two novel passive label inference attacks (BLIA). These attacks operate passively, relying solely on the outputs of pre-trained models, such as confidence scores and log-loss values, without interacting with or modifying the training process. By intentionally flipping 50% of the labels in controlled subsets, termed "canaries," we evaluate the extent of label memorization under two conditions: models trained without label differential privacy (Label-DP) and those trained with randomized response-based Label-DP. Despite the application of varying degrees of Label-DP, the proposed attacks consistently achieve success rates exceeding 50%, surpassing the baseline of random guessing and conclusively demonstrating that models memorize training labels, even when these labels are deliberately uncorrelated with the features.
Abstract:The integrated sensing and communication (ISAC) has been envisioned as one representative usage scenario of sixth-generation (6G) network. However, the unprecedented characteristics of 6G, especially the doubly dispersive channel, make classical ISAC waveforms rather challenging to guarantee a desirable performance level. The recently proposed affine frequency division multiplexing (AFDM) can attain full diversity even under doubly dispersive effects, thus becoming a competitive candidate for next-generation ISAC waveforms. Relevant investigations are still at an early stage, which involve only straightforward design lacking explicit theoretical analysis. This paper provides an in-depth investigation on AFDM waveform design for ISAC applications. Specifically, the closed-form Cr\'{a}mer-Rao bounds of target detection for AFDM are derived, followed by a demonstration on its merits over existing counterparts. Furthermore, we formulate the ambiguity function of the pilot-assisted AFDM waveform for the first time, revealing conditions for stable sensing performance. To further enhance both the communication and sensing performance of the AFDM waveform, we propose a novel pilot design by exploiting the characteristics of AFDM signals. The proposed design is analytically validated to be capable of optimizing the ambiguity function property and channel estimation accuracy simultaneously as well as overcoming the sensing and channel estimation range limitation originated from the pilot spacing. Numerical results have verified the superiority of the proposed pilot design in terms of dual-functional performance.
Abstract:The emerging analog matrix computing technology based on memristive crossbar array (MCA) constitutes a revolutionary new computational paradigm applicable to a wide range of domains. Despite the proven applicability of MCA for massive multiple-input multiple-output (MIMO) detection, existing schemes do not take into account the unique characteristics of massive MIMO channel matrix. This oversight makes their computational accuracy highly sensitive to conductance errors of memristive devices, which is unacceptable for massive MIMO receivers. In this paper, we propose an MCA-based circuit design for massive MIMO zero forcing and minimum mean-square error detectors. Unlike the existing MCA-based detectors, we decompose the channel matrix into the product of small-scale and large-scale fading coefficient matrices, thus employing an MCA-based matrix computing module and amplifier circuits to process the two matrices separately. We present two conductance mapping schemes which are crucial but have been overlooked in all prior studies on MCA-based detector circuits. The proposed detector circuit exhibits significantly superior performance to the conventional MCA-based detector circuit, while only incurring negligible additional power consumption. Our proposed detector circuit maintains its advantage in energy efficiency over traditional digital approach by tens to hundreds of times.
Abstract:The memristive crossbar array (MCA) has been successfully applied to accelerate matrix computations of signal detection in massive multiple-input multiple-output (MIMO) systems. However, the unique property of massive MIMO channel matrix makes the detection performance of existing MCA-based detectors sensitive to conductance deviations of memristive devices, and the conductance deviations are difficult to be avoided. In this paper, we propose an MCA-based detector circuit, which is robust to conductance deviations, to compute massive MIMO zero forcing and minimum mean-square error algorithms. The proposed detector circuit comprises an MCA-based matrix computing module, utilized for processing the small-scale fading coefficient matrix, and amplifier circuits based on operational amplifiers (OAs), utilized for processing the large-scale fading coefficient matrix. We investigate the impacts of the open-loop gain of OAs, conductance mapping scheme, and conductance deviation level on detection performance and demonstrate the performance superiority of the proposed detector circuit over the conventional MCA-based detector circuit. The energy efficiency of the proposed detector circuit surpasses that of a traditional digital processor by several tens to several hundreds of times.
Abstract:In recent years, Artificial Intelligence Generated Content (AIGC) has advanced from text-to-image generation to text-to-video and multimodal video synthesis. However, generating playable games presents significant challenges due to the stringent requirements for real-time interaction, high visual quality, and accurate simulation of game mechanics. Existing approaches often fall short, either lacking real-time capabilities or failing to accurately simulate interactive mechanics. To tackle the playability issue, we propose a novel method called \emph{PlayGen}, which encompasses game data generation, an autoregressive DiT-based diffusion model, and a comprehensive playability-based evaluation framework. Validated on well-known 2D and 3D games, PlayGen achieves real-time interaction, ensures sufficient visual quality, and provides accurate interactive mechanics simulation. Notably, these results are sustained even after over 1000 frames of gameplay on an NVIDIA RTX 2060 GPU. Our code is publicly available: https://github.com/GreatX3/Playable-Game-Generation. Our playable demo generated by AI is: http://124.156.151.207.
Abstract:The large models, as predicted by scaling raw forecasts, have made groundbreaking progress in many fields, particularly in natural language generation tasks, where they have approached or even surpassed human levels. However, the unprecedented scale of their parameters brings significant computational and storage costs. These large models require substantial computational resources and GPU memory to operate. When adapting large models to specific downstream tasks, their massive parameter scale poses a significant challenge in fine-tuning on hardware platforms with limited computational power and GPU memory. To address this issue, Parameter-Efficient Fine-Tuning (PEFT) offers a practical solution by efficiently adjusting the parameters of large pre-trained models to suit various downstream tasks. Specifically, PEFT adjusts the parameters of pre-trained large models to adapt to specific tasks or domains, minimizing the introduction of additional parameters and the computational resources required. This review mainly introduces the preliminary knowledge of PEFT, the core ideas and principles of various PEFT algorithms, the applications of PEFT, and potential future research directions. By reading this review, we believe that interested parties can quickly grasp the PEFT methodology, thereby accelerating its development and innovation.
Abstract:Diffusion magnetic resonance imaging (dMRI) is a crucial technique in neuroimaging studies, allowing for the non-invasive probing of the underlying structures of brain tissues. Clinical dMRI data is susceptible to various artifacts during acquisition, which can lead to unreliable subsequent analyses. Therefore, dMRI preprocessing is essential for improving image quality, and manual inspection is often required to ensure that the preprocessed data is sufficiently corrected. However, manual inspection requires expertise and is time-consuming, especially with large-scale dMRI datasets. Given these challenges, an automated dMRI artifact detection tool is necessary to increase the productivity and reliability of dMRI data analysis. To this end, we propose a novel unsupervised deep learning framework called $\textbf{U}$nsupervised $\textbf{d}$MRI $\textbf{A}$rtifact $\textbf{D}$etection via $\textbf{A}$ngular Resolution Enhancement and $\textbf{C}$ycle Consistency Learning (UdAD-AC). UdAD-AC leverages dMRI angular resolution enhancement and cycle consistency learning to capture the effective representation of artifact-free dMRI data during training, and it identifies data containing artifacts using designed confidence score during inference. To assess the capability of UdAD-AC, several commonly reported dMRI artifacts, including bias field, susceptibility distortion, and corrupted volume, were added to the testing data. Experimental results demonstrate that UdAD-AC achieves the best performance compared to competitive methods in unsupervised dMRI artifact detection.
Abstract:Diffusion-weighted imaging (DWI) is a type of Magnetic Resonance Imaging (MRI) technique sensitised to the diffusivity of water molecules, offering the capability to inspect tissue microstructures and is the only in-vivo method to reconstruct white matter fiber tracts non-invasively. The DWI signal can be analysed with the diffusion tensor imaging (DTI) model to estimate the directionality of water diffusion within voxels. Several scalar metrics, including axial diffusivity (AD), mean diffusivity (MD), radial diffusivity (RD), and fractional anisotropy (FA), can be further derived from DTI to quantitatively summarise the microstructural integrity of brain tissue. These scalar metrics have played an important role in understanding the organisation and health of brain tissue at a microscopic level in clinical studies. However, reliable DTI metrics rely on DWI acquisitions with high gradient directions, which often go beyond the commonly used clinical protocols. To enhance the utility of clinically acquired DWI and save scanning time for robust DTI analysis, this work proposes DirGeo-DTI, a deep learning-based method to estimate reliable DTI metrics even from a set of DWIs acquired with the minimum theoretical number (6) of gradient directions. DirGeo-DTI leverages directional encoding and geometric constraints to facilitate the training process. Two public DWI datasets were used for evaluation, demonstrating the effectiveness of the proposed method. Extensive experimental results show that the proposed method achieves the best performance compared to existing DTI enhancement methods and potentially reveals further clinical insights with routine clinical DWI scans.
Abstract:Despite recent advances in stereo matching, the extension to intricate underwater settings remains unexplored, primarily owing to: 1) the reduced visibility, low contrast, and other adverse effects of underwater images; 2) the difficulty in obtaining ground truth data for training deep learning models, i.e. simultaneously capturing an image and estimating its corresponding pixel-wise depth information in underwater environments. To enable further advance in underwater stereo matching, we introduce a large synthetic dataset called UWStereo. Our dataset includes 29,568 synthetic stereo image pairs with dense and accurate disparity annotations for left view. We design four distinct underwater scenes filled with diverse objects such as corals, ships and robots. We also induce additional variations in camera model, lighting, and environmental effects. In comparison with existing underwater datasets, UWStereo is superior in terms of scale, variation, annotation, and photo-realistic image quality. To substantiate the efficacy of the UWStereo dataset, we undertake a comprehensive evaluation compared with nine state-of-the-art algorithms as benchmarks. The results indicate that current models still struggle to generalize to new domains. Hence, we design a new strategy that learns to reconstruct cross domain masked images before stereo matching training and integrate a cross view attention enhancement module that aggregates long-range content information to enhance the generalization ability.
Abstract:Cooperative satellite-aerial-terrestrial networks (CSATNs), where unmanned aerial vehicles (UAVs) are utilized as nomadic aerial relays (A), are highly valuable for many important applications, such as post-disaster urban reconstruction. In this scenario, direct communication between terrestrial terminals (T) and satellites (S) is often unavailable due to poor propagation conditions for satellite signals, and users tend to congregate in regions of finite size. There is a current dearth in the open literature regarding the uplink performance analysis of CSATN operating under the above constraints, and the few contributions on the uplink model terrestrial terminals by a Poisson point process (PPP) relying on the unrealistic assumption of an infinite area. This paper aims to fill the above research gap. First, we propose a stochastic geometry based innovative model to characterize the impact of the finite-size distribution region of terrestrial terminals in the CSATN by jointly using a binomial point process (BPP) and a type-II Mat{\'e}rn hard-core point process (MHCPP). Then, we analyze the relationship between the spatial distribution of the coverage areas of aerial nodes and the finite-size distribution region of terrestrial terminals, thereby deriving the distance distribution of the T-A links. Furthermore, we consider the stochastic nature of the spatial distributions of terrestrial terminals and UAVs, and conduct a thorough analysis of the coverage probability and average ergodic rate of the T-A links under Nakagami fading and the A-S links under shadowed-Rician fading. Finally, the accuracy of our theoretical derivations are confirmed by Monte Carlo simulations. Our research offers fundamental insights into the system-level performance optimization for the realistic CSATNs involving nomadic aerial relays and terrestrial terminals confined in a finite-size region.