Abstract:Learning the unknown interactions that govern a quantum system is crucial for quantum information processing, device benchmarking, and quantum sensing. The problem, known as Hamiltonian learning, is well understood under the assumption that interactions are local, but this assumption may not hold for arbitrary Hamiltonians. Previous methods all have a high-order inverse-polynomial dependence on precision, so they cannot surpass the standard quantum limit and reach the gold-standard Heisenberg-limited scaling. Whether Heisenberg-limited Hamiltonian learning is possible without prior assumptions about the interaction structures, a challenge we term \emph{ansatz-free Hamiltonian learning}, remains an open question. In this work, we present a quantum algorithm that learns arbitrary sparse Hamiltonians without any structure constraints, using only black-box queries of the system's real-time evolution and minimal digital controls, and attains Heisenberg-limited scaling in estimation error. Our method is also resilient to state-preparation-and-measurement errors, enhancing its practical feasibility. Moreover, we establish a fundamental trade-off between total evolution time and quantum control in learning arbitrary interactions, revealing the intrinsic interplay between controllability and total evolution time complexity for any learning algorithm. These results pave the way for further exploration into Heisenberg-limited Hamiltonian learning in complex quantum systems under minimal assumptions, potentially enabling new benchmarking and verification protocols.
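The gap between the two scalings named in the abstract above can be made concrete with a small sketch. It assumes the usual convention that the standard quantum limit corresponds to a total evolution time growing as 1/epsilon^2 and Heisenberg-limited scaling to 1/epsilon, with all constants and logarithmic factors dropped. The function name and the `scaling` labels are hypothetical and are not taken from the paper; this is not the paper's algorithm.

```python
def total_evolution_time(epsilon: float, scaling: str = "heisenberg") -> float:
    """Illustrative total evolution time needed to reach precision epsilon.

    Assumption: constants and logarithmic factors are dropped, so only the
    growth rate in 1/epsilon is meaningful.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if scaling == "heisenberg":
        # Heisenberg-limited scaling: time grows linearly in 1/epsilon.
        return 1.0 / epsilon
    if scaling == "sql":
        # Standard quantum limit: time grows quadratically in 1/epsilon.
        return 1.0 / epsilon ** 2
    raise ValueError("scaling must be 'heisenberg' or 'sql'")


if __name__ == "__main__":
    for eps in (0.5, 0.125, 0.03125):
        print(eps, total_evolution_time(eps, "heisenberg"), total_evolution_time(eps, "sql"))
```

Under these assumptions, halving the target error doubles the Heisenberg-limited cost but quadruples the cost at the standard quantum limit.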
Abstract:Medical quality control indicators are essential to assess the qualifications of healthcare institutions for medical services. With the impressive performance of large language models (LLMs) like GPT-4 in the medical field, leveraging these technologies for the Medical Quality Control Indicator Calculation (MQCIC) presents a promising approach. In this work, (1) we introduce a real-world task, MQCIC, and propose an open-source Chinese electronic medical records (EMRs)-based dataset (CMQCIC-Bench) comprising 785 instances and 76 indicators. (2) We propose a semi-automatic method to enhance the rule representation. Then we propose the Clinical Facts-based Inferential Rule (CF-IR) method that disentangles the clinical fact verification and inferential rule reasoning actions. (3) We conduct comprehensive experiments on 20 representative LLMs, covering general and medical models. Our findings reveal that CF-IR outperforms Chain-of-Thought methods in MQCIC tasks. (4) We conduct an error analysis and investigate the capabilities of clinical fact verification and inferential rule reasoning, providing insights to further improve performance on MQCIC. The dataset and code are available at https://anonymous.4open.science/r/C-MQCIC-1151.
Abstract:Machine learning is widely believed to be one of the most promising practical applications of quantum computing. Existing quantum machine learning schemes typically employ a quantum-classical hybrid approach that relies crucially on gradients of model parameters. Such an approach lacks provable convergence to global minima and will become infeasible as quantum learning models scale up. Here, we introduce quantum automated learning, where no variational parameter is involved and the training process is converted to quantum state preparation. In particular, we encode training data into unitary operations and iteratively evolve a random initial state under these unitaries and their inverses, with a target-oriented perturbation towards higher prediction accuracy sandwiched in between. Under reasonable assumptions, we rigorously prove that the evolution converges exponentially to the desired state corresponding to the global minimum of the loss function. We show that such a training process can be understood from the perspective of preparing quantum states by imaginary time evolution, where the data-encoded unitaries together with target-oriented perturbations would train the quantum learning model in an automated fashion. We further prove that the quantum automated learning paradigm features good generalization ability with the generalization error upper bounded by the ratio between a logarithmic function of the Hilbert space dimension and the number of training samples. In addition, we carry out extensive numerical simulations on real-life images and quantum data to demonstrate the effectiveness of our approach and validate the assumptions. Our results establish an unconventional quantum learning strategy that is gradient-free with provable and explainable trainability, which would be crucial for large-scale practical applications of quantum computing in machine learning scenarios.
Abstract:Bimanual dexterous manipulation remains a significant challenge in robotics due to the high DoFs of each hand and the need to coordinate them. Existing single-hand manipulation techniques often leverage human demonstrations to guide RL methods but fail to generalize to complex bimanual tasks involving multiple sub-skills. In this paper, we introduce VTAO-BiManip, a novel framework that combines visual-tactile-action pretraining with object understanding to facilitate curriculum RL and enable human-like bimanual manipulation. We improve prior learning by incorporating hand motion data, providing more effective guidance for dual-hand coordination than binary tactile feedback. Our pretraining model predicts future actions as well as object pose and size using masked multimodal inputs, facilitating cross-modal regularization. To address the multi-skill learning challenge, we introduce a two-stage curriculum RL approach to stabilize training. We evaluate our method on a bottle-cap unscrewing task, demonstrating its effectiveness in both simulated and real-world environments. Our approach achieves a success rate that surpasses existing visual-tactile pretraining methods by over 20%.
Abstract:Large language models (LLMs) exhibit remarkable generative capabilities but often suffer from hallucinations. Retrieval-augmented generation (RAG) offers an effective solution by incorporating external knowledge, but existing methods still face several limitations: additional deployment costs of separate retrievers, redundant input tokens from retrieved text chunks, and the lack of joint optimization of retrieval and generation. To address these issues, we propose \textbf{RetroLLM}, a unified framework that integrates retrieval and generation into a single, cohesive process, enabling LLMs to directly generate fine-grained evidence from the corpus with constrained decoding. Moreover, to mitigate false pruning in the process of constrained evidence generation, we introduce (1) hierarchical FM-Index constraints, which generate corpus-constrained clues to identify a subset of relevant documents before evidence generation, reducing irrelevant decoding space; and (2) a forward-looking constrained decoding strategy, which considers the relevance of future sequences to improve evidence accuracy. Extensive experiments on five open-domain QA datasets demonstrate RetroLLM's superior performance across both in-domain and out-of-domain tasks. The code is available at \url{https://github.com/sunnynexus/RetroLLM}.
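The idea of corpus-constrained decoding described in the abstract above can be illustrated with a deliberately naive sketch. It assumes whitespace-style string tokens and replaces the FM-Index with a brute-force scan over the corpus, so it shows only the constraint itself, not the efficient index, the hierarchical clue stage, or the forward-looking strategy. The names `allowed_next_tokens`, `constrained_greedy_decode`, and `score` are hypothetical and do not come from RetroLLM's code.

```python
from typing import Callable, List, Sequence, Set


def allowed_next_tokens(corpus: Sequence[Sequence[str]], prefix: Sequence[str]) -> Set[str]:
    """Return every token that extends `prefix` to a contiguous span of some document.

    Naive stand-in for an index lookup: scans every position of every document.
    """
    prefix = list(prefix)
    m = len(prefix)
    allowed: Set[str] = set()
    for doc in corpus:
        # i ranges over start positions that leave room for one more token.
        for i in range(len(doc) - m):
            if list(doc[i:i + m]) == prefix:
                allowed.add(doc[i + m])
    return allowed


def constrained_greedy_decode(
    corpus: Sequence[Sequence[str]],
    score: Callable[[List[str], str], float],
    max_len: int,
) -> List[str]:
    """Greedy decoding in which only corpus-consistent tokens may be emitted.

    `score` is a hypothetical stand-in for a language model's next-token score.
    """
    out: List[str] = []
    for _ in range(max_len):
        candidates = allowed_next_tokens(corpus, out)
        if not candidates:
            break  # the generated span already ends a document
        # Sort first so ties are broken deterministically.
        out.append(max(sorted(candidates), key=lambda t: score(out, t)))
    return out


if __name__ == "__main__":
    corpus = [["paris", "is", "the", "capital"], ["rome", "is", "old"]]
    print(allowed_next_tokens(corpus, ["is"]))
```

Because every emitted token must extend a span that occurs in the corpus, the output is always a verbatim piece of evidence. A greedy choice can still commit to a prefix whose continuations are irrelevant, which is the false-pruning problem the abstract's two mechanisms are meant to mitigate.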
Abstract:Tensor network machine learning models have shown remarkable versatility in tackling complex data-driven tasks, ranging from quantum many-body problems to classical pattern recognition. Despite their promising performance, a comprehensive understanding of the underlying assumptions and limitations of these models is still lacking. In this work, we focus on the rigorous formulation of their no-free-lunch theorem -- essential yet notoriously challenging to formalize for specific tensor network machine learning models. In particular, we rigorously analyze the generalization risks of learning target output functions from input data encoded in tensor network states. We first prove a no-free-lunch theorem for machine learning models based on matrix product states, i.e., the one-dimensional tensor network states. Furthermore, we circumvent the challenging issue of calculating the partition function for the two-dimensional Ising model, and prove the no-free-lunch theorem for the case of two-dimensional projected entangled-pair states by introducing a combinatorial method associated with the "puzzle of polyominoes". Our findings reveal the intrinsic limitations of tensor network-based learning models in a rigorous fashion, and open up an avenue for future analytical exploration of both the strengths and limitations of quantum-inspired machine learning frameworks.
Abstract:Implicit neural representations and 3D Gaussian splatting (3DGS) have shown great potential for scene reconstruction. Recent studies have expanded their applications in autonomous reconstruction through task assignment methods. However, these methods are mainly limited to a single robot, and rapid reconstruction of large-scale scenes remains challenging. Additionally, task-driven planning based on surface uncertainty is prone to being trapped in local optima. To this end, we propose the first 3DGS-based centralized multi-robot autonomous 3D reconstruction framework. To further reduce the time cost of task generation and improve reconstruction quality, we integrate online open-vocabulary semantic segmentation with the surface uncertainty of 3DGS, focusing view sampling on regions with high instance uncertainty. Finally, we develop a multi-robot collaboration strategy with mode and task assignments that improves reconstruction quality while ensuring planning efficiency. Our method demonstrates the highest reconstruction quality among all planning methods and superior planning efficiency compared to existing multi-robot methods. We deploy our method on multiple robots, and results show that it can effectively plan view paths and reconstruct scenes with high quality.
Abstract:We study the sample complexity of two prototypical tasks: quantum purity estimation and quantum inner product estimation. In purity estimation, we are to estimate $\mathrm{tr}(\rho^2)$ of an unknown quantum state $\rho$ to additive error $\epsilon$. Meanwhile, for quantum inner product estimation, Alice and Bob are to estimate $\mathrm{tr}(\rho\sigma)$ to additive error $\epsilon$ given copies of unknown quantum states $\rho$ and $\sigma$ using classical communication and restricted quantum communication. In this paper, we show a strong connection between the sample complexity of purity estimation with bounded quantum memory and inner product estimation with bounded quantum communication and unentangled measurements. We propose a protocol that solves quantum inner product estimation with $k$-qubit one-way quantum communication and unentangled local measurements using $O(\mathrm{median}\{1/\epsilon^2,2^{n/2}/\epsilon,2^{n-k}/\epsilon^2\})$ copies of $\rho$ and $\sigma$. Our protocol can be modified to estimate the purity of an unknown quantum state $\rho$ using $k$-qubit quantum memory with the same complexity. We prove that arbitrary protocols with $k$-qubit quantum memory that estimate purity to error $\epsilon$ require $\Omega(\mathrm{median}\{1/\epsilon^2,2^{n/2}/\sqrt{\epsilon},2^{n-k}/\epsilon^2\})$ copies of $\rho$. This implies the same lower bound for quantum inner product estimation with one-way $k$-qubit quantum communication, classical communication, and unentangled local measurements. For purity estimation, we further improve the lower bound to $\Omega(\max\{1/\epsilon^2,2^{n/2}/\epsilon\})$ for any protocol using an identical single-copy projection-valued measurement. Additionally, we investigate a decisional variant of quantum distributed inner product estimation without quantum communication for mixed states and provide a lower bound on the sample complexity.
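To see how the median-of-three bounds in the abstract above behave as the memory or communication budget $k$ grows, the expressions inside $O(\cdot)$ and $\Omega(\cdot)$ can be evaluated directly. The sketch below drops all hidden constants, so only relative growth is meaningful; the function names are hypothetical.

```python
import math
from statistics import median


def upper_bound_copies(n: int, k: int, eps: float) -> float:
    """Expression inside O(.): median{1/eps^2, 2^(n/2)/eps, 2^(n-k)/eps^2}."""
    return median([1 / eps**2, 2 ** (n / 2) / eps, 2 ** (n - k) / eps**2])


def lower_bound_copies(n: int, k: int, eps: float) -> float:
    """Expression inside Omega(.): median{1/eps^2, 2^(n/2)/sqrt(eps), 2^(n-k)/eps^2}."""
    return median([1 / eps**2, 2 ** (n / 2) / math.sqrt(eps), 2 ** (n - k) / eps**2])


if __name__ == "__main__":
    n, eps = 4, 0.5
    for k in range(n + 1):
        # Constants are omitted, so compare trends across k, not absolute values.
        print(k, upper_bound_copies(n, k, eps), lower_bound_copies(n, k, eps))
```

With $k = 0$ the middle term $2^{n/2}/\epsilon$ is the median, while with $k = n$ the expression collapses to $1/\epsilon^2$, matching the intuition that more quantum memory or communication removes the dependence on dimension.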
Abstract:Recent advancements in sensor technology and deep learning have led to significant progress in 3D human body reconstruction. However, most existing approaches rely on data from a specific sensor, which can be unreliable due to the inherent limitations of individual sensing modalities. On the other hand, existing multi-modal fusion methods generally require customized designs based on the specific sensor combinations or setups, which limits the flexibility and generality of these methods. Furthermore, conventional point-image projection-based and Transformer-based fusion networks are susceptible to the influence of noisy modalities and sensor poses. To address these limitations and achieve robust 3D human body reconstruction in various conditions, we propose AdaptiveFusion, a generic adaptive multi-modal multi-view fusion framework that can effectively incorporate arbitrary combinations of uncalibrated sensor inputs. By treating different modalities from various viewpoints as equal tokens and using a handcrafted modality sampling module that leverages the inherent flexibility of Transformer models, AdaptiveFusion is able to cope with arbitrary numbers of inputs and accommodate noisy modalities with only a single training network. Extensive experiments on large-scale human datasets demonstrate the effectiveness of AdaptiveFusion in achieving high-quality 3D human body reconstruction in various environments. In addition, our method achieves superior accuracy compared to state-of-the-art fusion methods.
Abstract:We study the task of agnostic tomography: given copies of an unknown $n$-qubit state $\rho$ which has fidelity $\tau$ with some state in a given class $C$, find a state which has fidelity $\ge \tau - \epsilon$ with $\rho$. We give a new framework, stabilizer bootstrapping, for designing computationally efficient protocols for this task, and use this to get new agnostic tomography protocols for the following classes. (1) Stabilizer states: We give a protocol that runs in time $\mathrm{poly}(n,1/\epsilon)\cdot (1/\tau)^{O(\log(1/\tau))}$, answering an open question posed by Grewal, Iyer, Kretschmer, Liang [40] and Anshu and Arunachalam [6]. Previous protocols ran in time $\mathrm{exp}(\Theta(n))$ or required $\tau>\cos^2(\pi/8)$. (2) States with stabilizer dimension $n - t$: We give a protocol that runs in time $n^3\cdot(2^t/\tau)^{O(\log(1/\epsilon))}$, extending recent work on learning quantum states prepared by circuits with few non-Clifford gates, which only applied in the realizable setting where $\tau = 1$ [30, 37, 46, 61]. (3) Discrete product states: If $C = K^{\otimes n}$ for some $\mu$-separated discrete set $K$ of single-qubit states, we give a protocol that runs in time $(n/\mu)^{O((1 + \log (1/\tau))/\mu)}/\epsilon^2$. This strictly generalizes a prior guarantee which applied to stabilizer product states [39]. For stabilizer product states, we give a further improved protocol that runs in time $(n^2/\epsilon^2)\cdot (1/\tau)^{O(\log(1/\tau))}$. As a corollary, we give the first protocol for estimating stabilizer fidelity, a standard measure of magic for quantum states, to error $\epsilon$ in $n^3 \mathrm{quasipoly}(1/\epsilon)$ time.