Abstract:Recently, with the development of Neural Radiance Fields and Gaussian Splatting, 3D reconstruction techniques have achieved remarkably high fidelity. However, the latent representations learnt by these methods are highly entangled and lack interpretability. In this paper, we propose a novel part-aware compositional reconstruction method, called GaussianBlock, that enables semantically coherent and disentangled representations, allowing for precise and physical editing akin to building blocks, while simultaneously maintaining high fidelity. GaussianBlock introduces a hybrid representation that combines the advantages of primitives, known for their flexible actionability and editability, with those of 3D Gaussians, which excel in reconstruction quality. Specifically, we achieve semantically coherent primitives through a novel attention-guided centering loss derived from 2D semantic priors, complemented by a dynamic splitting and fusion strategy. Furthermore, we utilize 3D Gaussians that hybridize with primitives to refine structural details and enhance fidelity. Additionally, a binding inheritance strategy is employed to strengthen and maintain the connection between the two. Experiments across diverse benchmarks show that our reconstructed scenes are disentangled, compositional, and compact, enabling seamless, direct, and precise editing while maintaining high quality.
Abstract:Unlimited sampling, in which a modulo operator is applied before sampling, was recently introduced to deal with the clipping or saturation of measurements. In this paper, we investigate the identifiability of the model in which measurements are first acquired with a discrete Fourier transform (DFT) sensing matrix and then passed through a modulo operator (modulo-DFT). First, based on theorems on cyclotomic polynomials, we derive a sufficient condition for uniquely identifying the original signal in modulo-DFT. Additionally, for periodic bandlimited signals (PBSs) under unlimited sampling, which can be viewed as a special case of modulo-DFT, the necessary and sufficient condition for the unique recovery of the original signal is provided. Moreover, we show that when the oversampling factor exceeds $3(1+1/P)$, a PBS is always identifiable from the modulo samples, where $P$ is the number of harmonics, including the fundamental component, in the positive frequency part.
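As a rough illustration of the modulo-DFT measurement model described in the abstract above, the following Python sketch applies a DFT and then folds the result. The centered folding range $[-\lambda, \lambda)$, the separate folding of real and imaginary parts, and all function names are assumptions made for illustration; the abstract does not fix these conventions.

```python
import cmath
import math


def centered_mod(value, lam):
    """Fold a real value into [-lam, lam) (assumed centered-modulo convention)."""
    return (value + lam) % (2 * lam) - lam


def dft(x):
    """Plain O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


def modulo_dft(x, lam):
    """DFT first, then fold real and imaginary parts separately (assumption)."""
    return [
        complex(centered_mod(z.real, lam), centered_mod(z.imag, lam))
        for z in dft(x)
    ]


def min_oversampling_factor(num_harmonics):
    """Threshold 3(1 + 1/P) quoted in the abstract; P is the number of harmonics."""
    return 3 * (1 + 1 / num_harmonics)
```

For instance, the impulse `[1, 0, 0, 0]` has a flat DFT of ones, so `modulo_dft([1, 0, 0, 0], 0.75)` folds every bin to `-0.5`. The identifiability question studied in the abstract is whether such folded values determine the original signal uniquely.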
Abstract:Contrastive learning has been proven to be effective in learning better sentence representations. However, to train a contrastive learning model, large numbers of labeled sentences are required to construct positive and negative pairs explicitly, such as those in natural language inference (NLI) datasets. Unfortunately, acquiring sufficient high-quality labeled data can be both time-consuming and resource-intensive, leading researchers to focus on developing methods for learning unsupervised sentence representations. As there is no clear relationship between these unstructured randomly-sampled sentences, building positive and negative pairs over them is tricky and problematic. To tackle these challenges, in this paper, we propose SemCSR, a semantic-aware contrastive sentence representation framework. By leveraging the generation and evaluation capabilities of large language models (LLMs), we can automatically construct a high-quality NLI-style corpus without any human annotation, and further incorporate the generated sentence pairs into learning a contrastive sentence representation model. Extensive experiments and comprehensive analyses demonstrate the effectiveness of our proposed framework for learning a better sentence representation with LLMs.
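To make the role of positive and negative pairs concrete, here is a minimal sketch of an InfoNCE-style contrastive loss over one NLI-style triplet (anchor, entailment as positive, contradiction as negative). The abstract above does not state which loss SemCSR uses, so the loss form, the temperature value, and the function names are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, temperature=0.05):
    """InfoNCE-style loss: negative log-softmax score of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_denominator - logits[0]
```

The loss is small when the anchor embedding is closer to the positive than to any negative, and large otherwise, which is why the quality of the constructed pairs matters.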
Abstract:Modulo sampling, or unlimited sampling, has recently drawn a great deal of attention for cutting-edge applications because it overcomes the information loss caused by sensor saturation and clipping. Such loss is a significant problem, especially when the range of signal amplitudes is unknown or in the near-far case. To overcome this fundamental bottleneck, we propose a one-bit-aided (1bit-aided) modulo sampling scheme for direction-of-arrival (DOA) estimation. On the one hand, one-bit quantization, which involves only a simple comparator, offers the advantages of low-cost and low-complexity implementation. On the other hand, one-bit quantization provides an estimate of the normalized covariance matrix of the unquantized measurements via the arcsin law. This estimate is used to implement a blind integer-forcing (BIF) decoder that unwraps the modulo samples to construct the covariance matrix, after which subspace methods can be used to perform the DOA estimation. Our approach, named 1bit-aided-BIF, addresses the near-far problem well and overcomes the intrinsic low dynamic range of one-bit quantization. Numerical experiments validate the excellent performance of the proposed algorithm compared to using a high-precision ADC directly in the given setup.
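The arcsin-law step mentioned in the abstract above can be sketched for the simplest case. For two zero-mean, real-valued, jointly Gaussian variables with correlation $\rho$, the sign correlation satisfies $E[\mathrm{sign}(x)\,\mathrm{sign}(y)] = (2/\pi)\arcsin(\rho)$, so $\rho$ can be estimated from one-bit data by inverting this relation. The real-valued setting, the synthetic data generator, and all function names are assumptions for illustration; the BIF decoder and the DOA stage are not shown.

```python
import math
import random


def one_bit(value):
    """One-bit quantizer: a comparator against zero."""
    return 1.0 if value >= 0 else -1.0


def sign_correlation(xs, ys):
    """Sample correlation of the one-bit quantized sequences."""
    return sum(one_bit(a) * one_bit(b) for a, b in zip(xs, ys)) / len(xs)


def arcsine_law_inverse(r_sign):
    """Map a sign correlation back to the normalized correlation rho."""
    return math.sin(math.pi / 2 * r_sign)


def correlated_gaussian_pairs(rho, count, seed=0):
    """Synthetic zero-mean, unit-variance Gaussian pairs with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(count):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(u)
        ys.append(rho * u + math.sqrt(1.0 - rho * rho) * v)
    return xs, ys
```

Because `one_bit` discards amplitude, scaling either sequence by a positive constant leaves `sign_correlation` unchanged. This is why the one-bit branch only yields the normalized covariance, and it also hints at why it is insensitive to large amplitude differences between sources.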
Abstract:Recently, data augmentation (DA) methods have been proven to be effective for pre-trained language models (PLMs) in low-resource settings, including few-shot named entity recognition (NER). However, conventional NER DA methods are mostly aimed at sequence labeling models, i.e., token-level classification, and few are compatible with unified autoregressive generation frameworks, which can handle a wider range of NER tasks, such as nested NER. Furthermore, these generation frameworks have a strong assumption that the entities will appear in the target sequence with the same left-to-right order as the source sequence. In this paper, we claim that there is no need to keep this strict order, and more diversified but reasonable target entity sequences can be provided during the training stage as a novel DA method. Nevertheless, a naive mixture of augmented data can confuse the model since one source sequence will then be paired with different target sequences. Therefore, we propose a simple but effective Prompt Ordering based Data Augmentation (PODA) method to improve the training of unified autoregressive generation frameworks under few-shot NER scenarios. Experimental results on three public NER datasets and further analyses demonstrate the effectiveness of our approach.
Abstract:Line spectral estimation (LSE) is a fundamental problem in signal processing due to its wide applications. For signals that are orders of magnitude larger than the dynamic range of the analog-to-digital converter (ADC), a conventional ADC will clip or saturate, leading to significant information loss. The Unlimited Sensing Framework (USF) was introduced to avoid saturation through modulo sampling of the signal. Motivated by the USF, we study LSE from modulo samples. By exploiting oversampling and the bounded spectral property, we propose US-LSE to recover the folding instants and perform LSE. Our numerical simulations show that for an oversampling factor $\gamma\geq 10$, US-LSE is more stable than the existing algorithm at lower signal-to-noise ratios (SNRs), in the range of $20\sim30$ dB. In addition, we process real data generated by an AWR1642 and show that US-LSE estimates the ranges of two corners with SNRs of $12$ dB and $23$ dB for an oversampling factor $\gamma= 10$ and a normalized dynamic range ${\rm DR}=\beta_g/\lambda\approx 3$, where $\beta_g$ is the infinity norm of the signal.
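To illustrate why oversampling helps with modulo samples, the sketch below folds a sinusoid and then undoes the folds from first differences. This is a generic textbook-style unwrapping step, not the US-LSE algorithm of the abstract above. It assumes a centered modulo with range $[-\lambda, \lambda)$, noise-free samples, a first sample that is not folded, and consecutive samples that differ by less than $\lambda$; the function names are hypothetical.

```python
import math


def centered_mod(value, lam):
    """Fold a real value into [-lam, lam) (assumed centered-modulo convention)."""
    return (value + lam) % (2 * lam) - lam


def fold(samples, lam):
    """Modulo samples as a self-reset ADC with threshold lam would record them."""
    return [centered_mod(s, lam) for s in samples]


def unwrap_by_differences(folded, lam):
    """Undo the folds, assuming |x[n] - x[n-1]| < lam and an unfolded first sample.

    Under that assumption, folding the difference of two folded samples
    returns the true difference, so a running sum rebuilds the signal.
    """
    recovered = [folded[0]]
    for previous, current in zip(folded, folded[1:]):
        recovered.append(recovered[-1] + centered_mod(current - previous, lam))
    return recovered
```

With `lam = 1.0` and the signal `3 * sin(2 * pi * n / 100)`, the peak amplitude is three times the threshold while successive samples differ by less than 0.19, so the running sum recovers the unfolded samples. Noise and lower oversampling break the small-difference assumption, which is the regime where more careful methods are needed.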
Abstract:Joint network topology inference represents a canonical problem of jointly learning multiple graph Laplacian matrices from heterogeneous graph signals. In such a problem, a widely employed assumption is that of a simple common component shared among multiple networks. However, in practice, multiple networks exhibit a more intricate topological pattern that simultaneously comprises sparse, homogeneous, and heterogeneous components. In this paper, we propose a general graph estimator based on a novel structured fusion regularization that enables us to jointly learn multiple graph Laplacian matrices with such complex topological patterns, and that enjoys both high computational efficiency and rigorous theoretical guarantees. Moreover, in the proposed regularization term, the topological pattern among networks is characterized by a Gram matrix, which allows our graph estimator to flexibly model different types of topological patterns through different choices of the Gram matrix. Computationally, the regularization term couples the parameters together, which makes the formulated optimization problem intractable to solve directly; we therefore develop a computationally scalable algorithm based on the alternating direction method of multipliers (ADMM) to solve it efficiently. Theoretically, we analyze the proposed graph estimator and establish a non-asymptotic bound on the estimation error in the high-dimensional setting, which reflects the effect of several key factors on the convergence rate of our algorithm. Finally, the superior performance of the proposed method is illustrated through simulated and real data examples.