Abstract:Graph-structured data is foundational to numerous web applications, and watermarking is crucial for protecting its intellectual property and ensuring data provenance. Existing watermarking methods primarily operate on graph structures or entangled graph representations, which compromises the transparency and robustness of watermarks due to information coupling in graph representations and uncontrollable discretization when transforming continuous numerical representations into graph structures. This motivates us to propose DRGW, the first graph watermarking framework that addresses these issues through disentangled representation learning. Specifically, we design an adversarially trained encoder that learns a structural representation invariant to diverse perturbations and derives a statistically independent watermark carrier, ensuring both the robustness and transparency of watermarks. Meanwhile, we devise a graph-aware invertible neural network that provides a lossless channel for watermark embedding and extraction, guaranteeing high detectability and transparency. Additionally, we develop a structure-aware editor that translates latent modifications into discrete graph edits, ensuring robustness against structural perturbations. Experiments on diverse benchmark datasets demonstrate the superior effectiveness of DRGW.
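The abstract does not give the network's internals; as a minimal illustration of why a coupling-based invertible network yields a lossless embedding/extraction channel, here is a toy additive coupling layer in pure Python (all names and the transform `t` are hypothetical, not the paper's design):

```python
# Toy additive coupling layer: split a latent vector into halves (x1, x2)
# and shift x2 by a function of x1. The map is invertible by construction,
# so anything written into the latent can be recovered exactly.

def coupling_forward(x, t):
    """y1 = x1, y2 = x2 + t(x1)."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    shift = t(x1)
    return x1 + [a + b for a, b in zip(x2, shift)]

def coupling_inverse(y, t):
    """Exact inverse: x2 = y2 - t(y1)."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    shift = t(y1)
    return y1 + [a - b for a, b in zip(y2, shift)]

# Hypothetical "network" t: any function of the first half works.
t = lambda v: [3.0 * a + 1.0 for a in v]

latent = [0.5, -1.0, 2.0, 4.0]
embedded = coupling_forward(latent, t)   # watermarked latent
recovered = coupling_inverse(embedded, t)
print(recovered)  # identical to `latent`: the channel is lossless
```

The key property, shared by real invertible neural networks, is that no information is discarded in the forward pass, so extraction incurs zero reconstruction error.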
Abstract:The fine-tuning technique in deep learning gives rise to an emerging lineage relationship among models. This lineage provides a promising perspective for addressing security concerns such as unauthorized model redistribution and false claims of model provenance, which are particularly pressing in open-weight model libraries where robust lineage verification mechanisms are often lacking. Existing approaches to model lineage detection primarily rely on static architectural similarities, which are insufficient to capture the dynamic evolution of knowledge that underlies true lineage relationships. Drawing inspiration from the genetic mechanisms of human evolution, we tackle the problem of model lineage attestation by verifying the joint trajectory of knowledge evolution and parameter modification. To this end, we propose a novel model lineage attestation framework. In our framework, model editing is first leveraged to quantify parameter-level changes introduced by fine-tuning. Subsequently, we introduce a novel knowledge vectorization mechanism that refines the evolved knowledge within the edited models into compact representations with the assistance of probe samples, adapting the probing strategy to different types of model families. These embeddings serve as the foundation for verifying the arithmetic consistency of knowledge relationships across models, thereby enabling robust attestation of model lineage. Extensive experimental evaluations demonstrate the effectiveness and resilience of our approach across a variety of real-world adversarial scenarios. Our method consistently achieves reliable lineage verification across a broad spectrum of model types, including classifiers, diffusion models, and large language models.
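The abstract does not specify how "arithmetic consistency" is checked; one plausible reading, sketched below with hypothetical vectors and threshold, is that if a child model's knowledge vector equals the parent's plus a fine-tuning offset, then offsets of two descendants of the same parent should point in similar directions:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def lineage_consistent(parent, child_a, child_b, threshold=0.8):
    """Compare the knowledge-evolution offsets of two candidate descendants."""
    off_a = [c - p for c, p in zip(child_a, parent)]
    off_b = [c - p for c, p in zip(child_b, parent)]
    return cosine(off_a, off_b) >= threshold

parent  = [1.0, 0.0, 2.0]
child_a = [1.5, 0.1, 2.4]  # hypothetical fine-tuned descendant
child_b = [1.4, 0.0, 2.5]  # another descendant of the same parent
print(lineage_consistent(parent, child_a, child_b))
```

This is only the flavor of the verification step; the paper's actual mechanism derives the vectors from model edits and probe samples.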
Abstract:TeleChat3-MoE is the latest series of TeleChat large language models, featuring a Mixture-of-Experts (MoE) architecture with parameter counts ranging from 105 billion to over one trillion, trained end-to-end on an Ascend NPU cluster. This technical report mainly presents the underlying training infrastructure that enables reliable and efficient scaling to frontier model sizes. We detail systematic methodologies for operator-level and end-to-end numerical accuracy verification, ensuring consistency across hardware platforms and distributed parallelism strategies. Furthermore, we introduce a suite of performance optimizations, including interleaved pipeline scheduling, attention-aware data scheduling for long-sequence training, hierarchical and overlapped communication for expert parallelism, and DVM-based operator fusion. A systematic parallelization framework, leveraging analytical estimation and integer linear programming, is also proposed to optimize multi-dimensional parallelism configurations. Additionally, we present methodological approaches to cluster-level optimizations, addressing host- and device-bound bottlenecks during large-scale training tasks. These infrastructure advancements yield significant throughput improvements and near-linear scaling on clusters comprising thousands of devices, providing a robust foundation for large-scale language model development on the Ascend hardware ecosystem.
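To make the parallelization-search idea concrete, here is an illustrative sketch (not the paper's framework): enumerate tensor/pipeline/data-parallel degrees whose product matches the device count and pick the minimizer of a hypothetical analytic cost model. The real system replaces brute force with analytical estimation plus integer linear programming at scale:

```python
# Search multi-dimensional parallelism configurations (tp, pp, dp)
# under a toy analytic cost model. All coefficients are hypothetical.

def candidate_configs(num_devices):
    """All (tp, pp, dp) integer factorizations of num_devices."""
    for tp in range(1, num_devices + 1):
        if num_devices % tp:
            continue
        rest = num_devices // tp
        for pp in range(1, rest + 1):
            if rest % pp:
                continue
            yield tp, pp, rest // pp

def estimated_cost(tp, pp, dp):
    """Hypothetical cost: TP communication dominates, PP adds bubble
    overhead, DP adds gradient all-reduce cost."""
    return 0.07 * tp + 0.05 * pp + 0.01 * dp

best = min(candidate_configs(16), key=lambda c: estimated_cost(*c))
print(best)  # (1, 2, 8) under this toy model
```

For thousands of devices and additional dimensions (expert, sequence, context parallelism), the search space motivates the ILP formulation mentioned in the abstract.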
Abstract:A subset of Human Activity Classification (HAC) systems is based on AI algorithms that use passively collected wireless signals. This paper presents the micro-Doppler attack targeting HAC from wireless orthogonal frequency division multiplexing (OFDM) signals. The attack is executed by inserting artificial variations into a transmitted OFDM waveform to alter its micro-Doppler signature when it reflects off a human target. We investigate two variants of our scheme that manipulate the waveform at different time scales, resulting in altered receiver spectrograms. We show that the HAC accuracy of a deep convolutional neural network (CNN) can be reduced to less than 10%.
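The core physical idea can be sketched in a few lines (a toy illustration, not the paper's waveform design): multiplying transmitted samples by a time-varying phase imposes an artificial Doppler shift, which distorts the micro-Doppler signature observed in the receiver's spectrogram.

```python
import cmath
import math

def inject_doppler(samples, f_doppler, sample_rate):
    """Apply e^{j 2*pi*f_d*n/fs} to each sample, shifting its apparent Doppler."""
    return [s * cmath.exp(2j * math.pi * f_doppler * n / sample_rate)
            for n, s in enumerate(samples)]

tx = [1.0 + 0.0j] * 4  # toy constant waveform (hypothetical values)
out = inject_doppler(tx, f_doppler=100.0, sample_rate=400.0)
# each successive sample advances in phase by 2*pi*100/400 = pi/2 radians
```

Varying the injected shift over time, at the symbol or sub-symbol scale, is what changes the spectrogram that the HAC classifier sees.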
Abstract:Holographic-type communication brings an immersive tele-holography experience by delivering holographic contents to users. As the direct representation of holographic contents, hologram videos are naturally three-dimensional representations that consist of a huge volume of data. Advanced multi-connectivity (MC) millimeter-wave (mmWave) networks are now available to transmit hologram videos by providing the necessary bandwidth. However, existing link selection schemes in MC-based mmWave networks neglect the source content characteristics of hologram videos and the coordination among the parameters of different protocol layers in each link, leading to sub-optimal streaming performance. To address this issue, we propose a cross-layer-optimized link selection scheme for hologram video streaming over mmWave networks. This scheme optimizes link selection by jointly adjusting the video coding bitrate, the modulation and coding scheme (MCS), and link power allocation to minimize the end-to-end hologram distortion while guaranteeing synchronization and quality balance between the real and imaginary components of the hologram. Results show that the proposed scheme can effectively improve hologram video streaming performance in terms of PSNR by 1.2 dB to 6.4 dB over the non-cross-layer scheme.
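For reference, PSNR, the quality metric the abstract reports, is computed from the mean squared error between reconstructed and original frames; a 1.2-6.4 dB gain corresponds to a substantially lower MSE. A minimal sketch (the 1-D "frames" below are hypothetical, just to exercise the formula):

```python
import math

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)

ref = [10.0, 20.0, 30.0, 40.0]
rec = [11.0, 19.0, 31.0, 39.0]  # MSE = 1.0
print(round(psnr(ref, rec), 2))  # 10*log10(255^2 / 1) = 48.13
```

For holograms, the scheme additionally has to balance this metric across the real and imaginary components, since both contribute to the reconstructed view.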
Abstract:Eavesdroppers of wireless signals aim to infer as much as possible about the transmitter (Tx). Popular methods to minimize information leakage to the eavesdropper include covert communication, directional modulation, and beamforming with nulling. In this paper, we do not attempt to prevent information leakage to the eavesdropper as these methods do. Instead, we propose to beamform the wireless signal at the Tx in such a way that it incorporates deceptive information. The beamformed orthogonal frequency division multiplexing (OFDM) signal includes a deceptive value for the Doppler (velocity) and range of the Tx. To design the optimal baseband waveform with these characteristics, we define and solve an optimization problem for power-efficient deceptive wireless beamforming (DWB). The relaxed convex Quadratic Program (QP) is solved using a heuristic algorithm. Our simulation results indicate that our DWB scheme can successfully inject deceptive information with low power consumption, while preserving the shape of the created beam.
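To show the flavor of such a power-constrained QP (this is a toy stand-in, not the paper's formulation), consider choosing beamforming weights w close to a desired deceptive response d subject to a transmit-power budget P: min ||w - d||^2 s.t. ||w||^2 <= P. For this toy problem the optimum is simply d projected onto the power ball:

```python
import math

def solve_toy_dwb(d, power_budget):
    """Minimize ||w - d||^2 subject to ||w||^2 <= power_budget."""
    norm2 = sum(x * x for x in d)
    if norm2 <= power_budget:
        return list(d)                       # already feasible
    scale = math.sqrt(power_budget / norm2)  # project onto the power ball
    return [scale * x for x in d]

d = [3.0, 4.0]  # hypothetical desired deceptive response, ||d||^2 = 25
w = solve_toy_dwb(d, 4.0)
print(w)  # ~ [1.2, 1.6], scaled back to power 4
```

The paper's actual QP additionally constrains the beam shape and the injected Doppler/range values, which is why a heuristic solver is used rather than a closed form.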
Abstract:For general users, training a neural network from scratch is usually challenging and labor-intensive. Fortunately, neural network zoos enable them to find a well-performing model for direct use or for fine-tuning in their local environments. Although current model retrieval solutions attempt to convert neural network models into vectors to avoid the complex, repeated inference required for model selection, it is still difficult to choose a suitable model due to inaccurate vectorization and biased correlation alignment between the query dataset and models. From the perspective of knowledge consistency, i.e., whether the knowledge possessed by a model can meet the needs of the query task, we propose a model retrieval scheme, named Know2Vec, that acts as a black-box retrieval proxy for a model zoo. Know2Vec first accesses models via a black-box interface in advance, capturing vital decision knowledge from models while preserving their privacy. Next, it employs an effective encoding technique to transform this knowledge into precise model vectors. It then maps the user's query task to a knowledge vector by probing the semantic relationships within query samples. The proxy ensures knowledge consistency between the query vector and model vectors within their alignment space, which is optimized through supervised learning with diverse loss functions, and finally identifies the most suitable model for a given task at inference time. Extensive experiments show that Know2Vec achieves superior retrieval accuracy compared with state-of-the-art methods across diverse neural network retrieval tasks.
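Once models and the query task are embedded in a shared alignment space, the final retrieval step reduces to a nearest-neighbor search by similarity. A minimal sketch (the model names and vectors below are hypothetical, not from the paper):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def retrieve(query_vec, model_vecs):
    """Return the name of the model vector most similar to the query."""
    return max(model_vecs, key=lambda name: cosine(query_vec, model_vecs[name]))

model_vecs = {
    "digits_cnn":     [0.9, 0.1, 0.0],
    "sentiment_lstm": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # embedding of a digit-classification task
print(retrieve(query, model_vecs))  # "digits_cnn"
```

The difficulty the scheme addresses lies upstream of this step: producing model and query vectors accurate enough that similarity in the alignment space actually reflects knowledge consistency.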