Abstract:We derive universal approximation results for the class of (countably) $m$-rectifiable measures. Specifically, we prove that $m$-rectifiable measures can be approximated as push-forwards of the one-dimensional Lebesgue measure on $[0,1]$ using ReLU neural networks with arbitrarily small approximation error in terms of Wasserstein distance. What is more, the weights in the networks under consideration are quantized and bounded and the number of ReLU neural networks required to achieve an approximation error of $\varepsilon$ is no larger than $2^{b(\varepsilon)}$ with $b(\varepsilon)=\mathcal{O}(\varepsilon^{-m}\log^2(\varepsilon))$. This result improves Lemma IX.4 in Perekrestenko et al. as it shows that the rate at which $b(\varepsilon)$ tends to infinity as $\varepsilon$ tends to zero equals the rectifiability parameter $m$, which can be much smaller than the ambient dimension. We extend this result to countably $m$-rectifiable measures and show that this rate still equals the rectifiability parameter $m$ provided that, among other technical assumptions, the measure decays exponentially on the individual components of the countably $m$-rectifiable support set.
Abstract:The automated extraction of rural roads is pivotal for rural development and transportation planning, serving as a cornerstone for socio-economic progress. Current research primarily focuses on road extraction in urban areas. However, rural roads present unique challenges due to their narrow and irregular nature, posing significant difficulties for road extraction. In this article, a reverse refinement network (R2-Net) is proposed to extract narrow rural roads, enhancing their connectivity and distinctiveness from the background. Specifically, to preserve the fine details of roads within high-resolution feature maps, R2-Net utilizes an axis context aware module (ACAM) to capture the long-distance spatial context information in various layers. Subsequently, the multi-level features are aggregated through a global aggregation module (GAM). Moreover, in the decoder stage, R2-Net employs a reverse-aware module (RAM) to direct the attention of the network to the complex background, thus amplifying its separability. In experiments, we compare R2-Net with several state-of-the-art methods using the DeepGlobe road extraction dataset and the WHU-RuR+ global large-scale rural road dataset. R2-Net achieved superior performance and especially excelled in accurately detecting narrow roads. Furthermore, we explored the applicability of R2-Net for large-scale rural road mapping. The results show that the proposed R2-Net has significant performance advantages for large-scale rural road mapping applications.
Abstract:This paper is concerned with the fundamental limits of nonlinear dynamical system learning from input-output traces. Specifically, we show that recurrent neural networks (RNNs) are capable of learning nonlinear systems that satisfy a Lipschitz property and forget past inputs fast enough in a metric-entropy optimal manner. As the sets of sequence-to-sequence maps realized by the dynamical systems we consider are significantly more massive than function classes generally considered in deep neural network approximation theory, a refined metric-entropy characterization is needed, namely in terms of order, type, and generalized dimension. We compute these quantities for the classes of exponentially-decaying and polynomially-decaying Lipschitz fading-memory systems and show that RNNs can achieve them.
Abstract:A lensless camera is an imaging system that uses a mask in place of a lens, making it thinner, lighter, and less expensive than a lensed camera. However, additional complex computation and time are required for image reconstruction. This work proposes a deep learning model named Raw3dNet that recognizes hand gestures directly on raw videos captured by a lensless camera without the need for image restoration. In addition to conserving computational resources, the reconstruction-free method provides privacy protection. Raw3dNet is a novel end-to-end deep neural network model for the recognition of hand gestures in lensless imaging systems. It is created specifically for raw video captured by a lensless camera and has the ability to properly extract and combine temporal and spatial features. The network is composed of two stages: 1. spatial feature extractor (SFE), which enhances the spatial features of each frame prior to temporal convolution; 2. 3D-ResNet, which implements spatial and temporal convolution of video streams. The proposed model achieves 98.59% accuracy on the Cambridge Hand Gesture dataset in the lensless optical experiment, which is comparable to the lensed-camera result. Additionally, the feasibility of physical object recognition is assessed. Furtherly, we show that the recognition can be achieved with respectable accuracy using only a tiny portion of the original raw data, indicating the potential for reducing data traffic in cloud computing scenarios.
Abstract:Soft actuators have shown great advantages in compliance and morphology matched for manipulation of delicate objects and inspection in a confined space. There is an unmet need for a soft actuator that can provide torsional motion to e.g. enlarge working space and increase degrees of freedom. Towards this goal, we present origami-inspired soft pneumatic actuators (OSPAs) made from silicone. The prototype can output a rotation of more than one revolution (up to 435{\deg}), larger than previous counterparts. We describe the design and fabrication method, build the kinematics models and simulation models, and analyze and optimize the parameters. Finally, we demonstrate the potentially extensive utility of OSPAs through their integration into a gripper capable of simultaneously grasping and lifting fragile or flat objects, a versatile robot arm capable of picking and placing items at the right angle with the twisting actuators, and a soft snake robot capable of changing attitude and directions by torsion of the twisting actuators.
Abstract:In this paper, we study a novel problem: "automatic prescription recommendation for PD patients." To realize this goal, we first build a dataset by collecting 1) symptoms of PD patients, and 2) their prescription drug provided by neurologists. Then, we build a novel computer-aided prescription model by learning the relation between observed symptoms and prescription drug. Finally, for the new coming patients, we could recommend (predict) suitable prescription drug on their observed symptoms by our prescription model. From the methodology part, our proposed model, namely Prescription viA Learning lAtent Symptoms (PALAS), could recommend prescription using the multi-modality representation of the data. In PALAS, a latent symptom space is learned to better model the relationship between symptoms and prescription drug, as there is a large semantic gap between them. Moreover, we present an efficient alternating optimization method for PALAS. We evaluated our method using the data collected from 136 PD patients at Nanjing Brain Hospital, which can be regarded as a large dataset in PD research community. The experimental results demonstrate the effectiveness and clinical potential of our method in this recommendation task, if compared with other competing methods.