Abstract:In this paper, we formulate the precoding problem of the integrated sensing and communication (ISAC) waveform as a non-convex quadratically constrained quadratic program (QCQP), in which the weighted sum of the communication multi-user interference (MUI) and the gap between the dual-use waveform and the ideal radar waveform is minimized under peak-to-average power ratio (PAPR) constraints. We propose an efficient algorithm based on the alternating direction method of multipliers (ADMM), which decouples the variables and provides a closed-form solution for each subproblem. In addition, to improve the sensing performance in both the spatial and temporal domains, we propose a new criterion for designing the ideal radar waveform, in which the beam pattern is made similar to the ideal one and the integrated sidelobe level of the ambiguity function in each target direction is minimized over the region of interest. The limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm is applied to the design of the ideal radar waveform, which serves as a reference in the design of the dual-function waveform. Numerical results indicate that the designed dual-function waveform offers good communication quality of service (QoS) and sensing performance.
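The two quantities that the ISAC abstract above trades off can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes a single transmit vector `x` rather than a full waveform matrix, and the names `H` (channel matrix), `s` (desired symbols), `x0` (reference radar waveform), and `rho` (trade-off weight) are hypothetical. It only evaluates the weighted objective and the PAPR of a candidate waveform; it does not implement the ADMM solver.

```python
def papr(x):
    """Peak-to-average power ratio of a sequence of (complex) samples."""
    powers = [abs(v) ** 2 for v in x]
    mean_power = sum(powers) / len(powers)
    if mean_power == 0:
        raise ValueError("PAPR is undefined for an all-zero waveform")
    return max(powers) / mean_power


def weighted_objective(H, x, s, x0, rho):
    """Weighted sum of MUI energy and distance to a reference radar waveform.

    Assumed form (illustrative only):
        rho * ||H x - s||^2 + (1 - rho) * ||x - x0||^2
    """
    # Signal seen by each user: one row of H per user.
    received = [sum(h * v for h, v in zip(row, x)) for row in H]
    # Multi-user interference energy: deviation from the desired symbols.
    mui = sum(abs(r - d) ** 2 for r, d in zip(received, s))
    # Gap between the dual-use waveform and the reference radar waveform.
    gap = sum(abs(v - r) ** 2 for v, r in zip(x, x0))
    return rho * mui + (1 - rho) * gap
```

A constant-modulus waveform has a PAPR of exactly 1, so a PAPR constraint of the form `papr(x) <= eta` with `eta` close to 1 keeps the waveform close to constant modulus.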
Abstract:Proprioception is the "sixth sense" that detects limb postures with motor neurons. It requires a natural integration between the musculoskeletal systems and sensory receptors, which is challenging for modern robots that aim for lightweight, adaptive, and sensitive designs at a low cost. Here, we present the Soft Polyhedral Network with embedded vision for physical interactions, capable of adaptive kinesthesia and viscoelastic proprioception by learning kinetic features. This design enables passive adaptations to omni-directional interactions, visually captured by a miniature high-speed motion tracking system embedded inside for proprioceptive learning. The results show that the soft network can infer real-time 6D forces and torques with accuracies of 0.25/0.24/0.35 N and 0.025/0.034/0.006 Nm in dynamic interactions. We also incorporate viscoelasticity in proprioception during static adaptation by adding a creep and relaxation modifier to refine the predicted results. The proposed soft network combines simplicity in design, omni-adaptation, and proprioceptive sensing with high accuracy, making it a versatile, low-cost solution for robotics with more than 1 million use cycles for tasks such as sensitive and competitive grasping and touch-based geometry reconstruction. This study offers new insights into vision-based proprioception for soft robots in adaptive grasping, soft manipulation, and human-robot interaction.
Abstract:This paper reports on the NTIRE 2023 Quality Assessment of Video Enhancement Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2023. The challenge addresses a major problem in the field of video processing, namely, video quality assessment (VQA) for enhanced videos. The challenge uses the VQA Dataset for Perceptual Video Enhancement (VDPVE), which has a total of 1211 enhanced videos, including 600 videos with color, brightness, and contrast enhancements, 310 videos with deblurring, and 301 deshaked videos. The challenge has a total of 167 registered participants. During the development phase, 61 participating teams submitted their prediction results, with a total of 3168 submissions. During the final testing phase, 37 participating teams made a total of 176 submissions. Finally, 19 participating teams submitted their models and fact sheets and detailed the methods they used. Some methods have achieved better results than the baseline methods, and the winning methods have demonstrated superior prediction performance.
Abstract:Occluded person re-identification (Re-ID) remains a challenging task because people are frequently obscured by other people or obstacles, especially in crowded scenes. In this paper, we propose a self-supervised deep learning method that improves the localization of human parts for occluded person Re-ID. Unlike previous works, we find that motion information derived from photos of various human postures can help identify the major components of the human body. First, a motion-aware transformer encoder-decoder architecture is designed to obtain keypoint heatmaps and part-segmentation maps. Second, an affine transformation module is used to acquire motion information from the keypoint detection branch. The motion information then supports the segmentation branch in producing refined human part segmentation maps and in effectively dividing the human body into reasonable groups. Finally, several cases demonstrate the effectiveness of the proposed model in distinguishing different representative parts of the human body, which avoids disturbances from the background and occlusions. Our method consistently achieves state-of-the-art results on several popular datasets, including occluded, partial, and holistic datasets.
Abstract:Convolutional neural networks have significantly improved the accuracy of object detection. As convolutional neural networks become deeper, detection accuracy improves markedly, but more floating-point calculations are needed. In object detection, many researchers use knowledge distillation to improve the accuracy of a small student network by transferring knowledge from a deeper and larger teacher network. Most knowledge distillation methods require complex cost functions to be designed, and they are aimed at two-stage object detection algorithms. This paper proposes a clean and effective knowledge distillation method for one-stage object detection. The feature maps generated by the teacher network and the student network are used as true samples and fake samples, respectively, and adversarial training is applied to both to improve the performance of the student network in one-stage object detection.
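The adversarial setup described in the knowledge distillation abstract above, with teacher features as true samples and student features as fake samples, can be sketched with the standard GAN losses. This is an assumption-laden illustration, not the paper's method: the abstract does not specify the discriminator or the loss, so the linear logistic discriminator and the function names below are hypothetical, and the feature maps are assumed to be flattened to plain lists of floats.

```python
import math


def discriminator_score(features, weights, bias):
    """Hypothetical linear discriminator: probability that features come from the teacher."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def discriminator_loss(d_teacher, d_student, eps=1e-12):
    """Standard GAN discriminator loss: teacher features are 'true', student features are 'fake'."""
    return -math.log(d_teacher + eps) - math.log(1.0 - d_student + eps)


def student_adversarial_loss(d_student, eps=1e-12):
    """Non-saturating generator loss: the student is rewarded when its features look like the teacher's."""
    return -math.log(d_student + eps)
```

In a training loop, the two losses would be minimized alternately: the discriminator learns to separate teacher and student features, while the student learns to produce features that the discriminator cannot tell apart from the teacher's.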