Abstract: With the help of Score Distillation Sampling (SDS) and the rapid development of trainable 3D representations, Text-to-Image (T2I) diffusion models have been applied to 3D generation tasks with considerable success. There have also been attempts to edit 3D objects using this Text-to-3D pipeline. However, most current methods focus on adding geometry, overwriting textures, or both, and few can perform non-rigid transformations of 3D objects. Those that can perform non-rigid editing suffer from low resolution, a lack of fidelity, and poor flexibility. To address these issues, we present Plasticine3D, a general, high-fidelity, photo-realistic, and controllable non-rigid editing pipeline. First, our work divides the editing process into a geometry editing stage and a texture editing stage to achieve more detailed and photo-realistic results. Second, to perform non-rigid transformations with controllable results while maintaining fidelity to the original 3D model, we propose a multi-view-embedding (MVE) optimization strategy, which ensures that the diffusion model learns the overall features of the original object, and an embedding-fusion (EF) scheme, which controls the degree of editing by adjusting the fusing rate. We also design a geometry processing step, applied before optimizing on the base geometry, to cope with the different needs of various editing tasks. Furthermore, to fully leverage the geometric prior from the original 3D object, we provide an optional replacement for score distillation sampling, named score projection sampling (SPS), which enables us to perform optimization directly from the original 3D mesh in most common medium-scale non-rigid editing scenarios. We demonstrate the effectiveness of our method on both the non-rigid 3D editing task and the general 3D editing task.
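As a rough illustration of how a fusing rate could control the degree of editing, the sketch below linearly interpolates between an embedding of the original object and an embedding of the target prompt. The linear form, the function name `fuse_embeddings`, and the parameter `rate` are assumptions made for illustration only; the abstract does not specify how EF combines the embeddings.

```python
from typing import List, Sequence


def fuse_embeddings(source: Sequence[float],
                    target: Sequence[float],
                    rate: float) -> List[float]:
    """Blend two embeddings of equal length (hypothetical helper).

    rate = 0.0 returns the source embedding (no edit);
    rate = 1.0 returns the target embedding (full edit).
    Linear interpolation is an assumption, not the paper's stated method.
    """
    if len(source) != len(target):
        raise ValueError("embeddings must have the same length")
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie in [0, 1]")
    return [(1.0 - rate) * s + rate * t for s, t in zip(source, target)]
```

Under this assumption, a small `rate` keeps the result close to the original object, while a larger `rate` pushes it toward the edit described by the target prompt.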
Abstract: Positioning and sensing over wireless networks are imperative for many emerging applications. However, traditional wireless channel models cannot be used for sensing the attitude of the user equipment (UE), since they over-simplify the UE as a point target. In this paper, a comprehensive electromagnetic propagation modeling (EPM) framework based on electromagnetic theory is developed to precisely model the near-field channel. For the noise-free case, the EPM establishes the non-linear functional dependence of the observed signals on both the position and the attitude of the UE. To address the difficulty posed by this non-linear coupling, we first propose to divide the distance domain into three regions, separated by the defined phase ambiguity distance and spacing constraint distance. Then, for each region, we obtain low-complexity closed-form solutions for joint position and attitude estimation. Next, to investigate the impact of random noise on the joint estimation performance, the Ziv-Zakai bound (ZZB) is derived to yield useful insights. The expected Cramér-Rao bound (ECRB) is further provided to obtain simplified closed-form expressions for the performance lower bounds. Our numerical results demonstrate that the derived ZZB can accurately predict estimator performance in all signal-to-noise ratio (SNR) regimes. More importantly, we achieve millimeter-level accuracy in position estimation and attain 0.1-level accuracy in attitude estimation.
Abstract: Recent advances in TCP congestion control (CC) have achieved tremendous success with deep reinforcement learning (RL) approaches, which use feedforward neural networks (NN) to learn complex environment conditions and make better decisions. However, such "black-box" policies lack interpretability and reliability, and they often need to operate outside the traditional TCP datapath because of their complex NNs. This paper proposes a novel two-stage solution to achieve the best of both worlds: first train a deep RL agent, then distill its (over-)parameterized NN policy into white-box, light-weight rules in the form of symbolic expressions, which are much easier to understand and to implement in constrained environments. At the core of our proposal is a novel symbolic branching algorithm that makes the rule aware of the context in terms of various network conditions, eventually converting the NN policy into a symbolic tree. The distilled symbolic rules preserve and often improve performance over state-of-the-art NN policies while being faster and simpler than a standard neural network. We validate the performance of our distilled symbolic rules in both simulation and emulation environments. Our code is available at https://github.com/VITA-Group/SymbolicPCC.
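To make the notion of a symbolic tree more concrete, the sketch below shows one possible shape for such a rule: internal nodes branch on a network condition, and each leaf holds a short symbolic expression that returns a rate adjustment. The feature names (`loss_rate`, `latency_gradient`), thresholds, and expressions are all hypothetical and are not taken from the paper or its code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

Obs = Dict[str, float]  # observed network statistics (hypothetical features)


@dataclass
class Leaf:
    expr: Callable[[Obs], float]  # symbolic expression evaluated on observations
    text: str                     # human-readable form of the expression


@dataclass
class Branch:
    feature: str        # network condition tested at this node
    threshold: float
    below: "Node"       # subtree used when obs[feature] < threshold
    above: "Node"       # subtree used otherwise


Node = Union[Leaf, Branch]


def evaluate(node: Node, obs: Obs) -> float:
    """Walk the tree until a leaf is reached, then evaluate its expression."""
    while isinstance(node, Branch):
        node = node.below if obs[node.feature] < node.threshold else node.above
    return node.expr(obs)


# Illustrative rule: back off under loss, otherwise react to latency trend.
tree = Branch(
    feature="loss_rate",
    threshold=0.01,
    below=Leaf(lambda o: 0.5 - o["latency_gradient"], "0.5 - latency_gradient"),
    above=Leaf(lambda o: -2.0 * o["loss_rate"], "-2 * loss_rate"),
)
```

A rule of this form can be read and audited directly, and evaluating it needs only a few comparisons and arithmetic operations, which is the kind of property the abstract attributes to the distilled rules.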
Abstract: The use of larger antenna arrays at higher frequency bands is envisioned in beyond-5G wireless networks. This takes advantage of the near-field propagation regime, where the wavefront is no longer planar but spherical, bringing both new opportunities and challenges for high-precision positioning. In this paper, a generic near-field positioning model with different observation capabilities for three electric field types (vector, scalar, and overall scalar electric field) is proposed. For these three electric field types, the Cramér-Rao bound (CRB) is adopted to evaluate the achievable estimation accuracy. The expressions of the CRBs using different electric field observations are derived by combining electromagnetic theory with estimation theory. Closed-form expressions can be further obtained if the terminal is located on the central perpendicular line (CPL) of the receiving antenna surface. In addition, the above discussions are extended to a system with multiple distributed receiving antennas under the CPL assumption. The CRBs using the various electric field types in this case are derived, and the effect of the number of receiving antennas on estimation accuracy is investigated. Numerical results are provided to quantify the CRBs and validate the analytical results. The impact of various system parameters, including the electric field type and the number of antennas, on the near-field positioning performance is also evaluated.
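For reference, the CRB in its generic textbook form is recalled below. The symbols are generic notation assumed here, not taken from the paper: $\boldsymbol{\theta}$ is the unknown parameter vector (for example, the terminal position), $\mathbf{y}$ the observation, $p(\mathbf{y};\boldsymbol{\theta})$ its likelihood, and $\hat{\boldsymbol{\theta}}$ any unbiased estimator.

```latex
\mathbf{J}(\boldsymbol{\theta})
  = \mathbb{E}\!\left[
      \nabla_{\boldsymbol{\theta}} \ln p(\mathbf{y};\boldsymbol{\theta})\,
      \nabla_{\boldsymbol{\theta}} \ln p(\mathbf{y};\boldsymbol{\theta})^{\mathsf{T}}
    \right],
\qquad
\operatorname{Cov}\!\left(\hat{\boldsymbol{\theta}}\right)
  \succeq \mathbf{J}(\boldsymbol{\theta})^{-1}.
```

Here $\mathbf{J}(\boldsymbol{\theta})$ is the Fisher information matrix, and the diagonal entries of its inverse lower-bound the variance of each estimated parameter. The paper's specific CRB expressions for each electric field type are not reproduced here.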
Abstract: Flying vertebrates exhibit sophisticated wingbeat kinematics. Their specialized forelimbs allow the wing morphing motion to couple with the flapping motion during level flight. Previous flyable bionic platforms have successfully applied bio-inspired wing morphing but cannot yet be propelled by a morphing-coupled wingbeat pattern. Spurred by this, we develop a bio-inspired flapping-wing aerial vehicle (FWAV) named RoboFalcon, which is equipped with a novel mechanism to drive its bat-style morphing wings, performs a morphing-coupled wingbeat pattern, and overall achieves appealing flight performance. The novel mechanism of RoboFalcon allows the morphing and flapping to be coupled during level flight and decoupled when maneuvering is required, producing a bilaterally asymmetric downstroke that affords high rolling agility. The bat-style morphing wing is designed with a tilted mounting angle around the radius at the wrist joint to mimic the wrist supination and pronation of flying vertebrates' forelimbs. The agility of RoboFalcon is assessed through several rolling maneuver flight tests, and we demonstrate that its agility compares well with that of flying creatures and current flapping-wing platforms. Wind tunnel tests indicate that the roll moment of the asymmetric downstroke is correlated with the flapping frequency, and that the wrist mounting angle can be used to tune the angle of attack and the lift-thrust configuration of the equilibrium flight state. We believe that this work yields a well-performing bionic platform and provides a new actuation strategy for morphing-coupled flapping flight.
Abstract: Today's auto-tuners (e.g., AutoTVM, Ansor) generate efficient tensor programs by navigating a large search space to identify effective implementations, but they do so while treating hardware details as opaque. Thus, their performance can fall behind that of hardware-native libraries (e.g., cuBLAS, cuDNN), which are hand-optimized by device vendors to extract high performance. On the other hand, these vendor libraries have a fixed set of supported functions and lack the customization and automation support afforded by auto-tuners. Bolt builds on the recent trend that vendor libraries are increasingly modularized and reconfigurable via declarative control (e.g., CUTLASS). It enables a novel approach that bridges this gap and achieves the best of both worlds via hardware-native templated search. Bolt provides new opportunities to rethink end-to-end tensor optimizations at the graph, operator, and model levels. Bolt demonstrates this concept by prototyping on a popular auto-tuner in TVM and a class of widely used platforms (i.e., NVIDIA GPUs), both of which are widely deployed in our production environment. Bolt improves the inference speed of common convolutional neural networks by 2.5x on average over the state of the art, and it auto-tunes these models within 20 minutes.