Abstract:Holographic-type communication brings an immersive tele-holography experience by delivering holographic content to users. As the direct representation of holographic content, hologram videos are inherently three-dimensional and therefore comprise a huge volume of data. Advanced multi-connectivity (MC) millimeter-wave (mmWave) networks can now provide the bandwidth needed to transmit hologram videos. However, existing link selection schemes in MC-based mmWave networks neglect the source content characteristics of hologram videos and the coordination among the parameters of different protocol layers in each link, leading to sub-optimal streaming performance. To address this issue, we propose a cross-layer-optimized link selection scheme for hologram video streaming over mmWave networks. This scheme optimizes link selection by jointly adjusting the video coding bitrate, the modulation and channel coding schemes (MCS), and the link power allocation to minimize the end-to-end hologram distortion while guaranteeing synchronization and quality balance between the real and imaginary components of the hologram. Results show that the proposed scheme improves hologram video streaming performance by 1.2 dB to 6.4 dB in PSNR over the non-cross-layer scheme.
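As a rough illustration of what jointly choosing a link, a coding bitrate, an MCS, and a transmit power can look like, the following sketch runs an exhaustive search over a tiny candidate set. It is not the scheme proposed in the abstract: the MCS table, the candidate bitrates and powers, the bandwidth, the toy distortion model, and the function names are all hypothetical, and the synchronization and quality-balance constraints between the real and imaginary components are omitted.

```python
from itertools import product

# Hypothetical MCS table: name -> (spectral efficiency in bit/s/Hz, minimum SNR in dB).
MCS_TABLE = {
    "QPSK-1/2": (1.0, 5.0),
    "16QAM-1/2": (2.0, 11.0),
    "64QAM-3/4": (4.5, 19.0),
}
BITRATES_MBPS = [100, 200, 400]  # hypothetical candidate video coding bitrates
POWERS_DBM = [10, 15, 20]        # hypothetical candidate transmit powers
BANDWIDTH_MHZ = 100.0            # hypothetical per-link bandwidth


def distortion(bitrate_mbps):
    """Toy source model (an assumption): distortion falls as bitrate rises."""
    return 1000.0 / bitrate_mbps


def best_configuration(links):
    """Search every (link, bitrate, MCS, power) tuple for minimum distortion.

    `links` maps a link name to a combined loss-plus-noise term in dB, so the
    received SNR is modeled as power_dBm - loss_dB. Returns a tuple
    (distortion, power, link, bitrate, mcs), or None if nothing is feasible.
    """
    best = None
    for (link, loss_db), rate, (mcs, (eff, min_snr)), power in product(
        links.items(), BITRATES_MBPS, MCS_TABLE.items(), POWERS_DBM
    ):
        if power - loss_db < min_snr:
            continue  # this MCS is assumed undecodable at the resulting SNR
        if rate > eff * BANDWIDTH_MHZ:
            continue  # link throughput (Mbit/s) cannot carry this bitrate
        # Ties on distortion are broken in favor of lower transmit power.
        candidate = (distortion(rate), power, link, rate, mcs)
        if best is None or candidate < best:
            best = candidate
    return best
```

Calling `best_configuration({"A": 5.0, "B": 0.0})` picks the link whose SNR supports the highest-order MCS, since only that MCS can carry the highest bitrate. A practical scheme would replace the exhaustive search with a structured optimization.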
Abstract:Deformable object manipulation in robotics presents significant challenges due to uncertainties in component properties, diverse configurations, visual interference, and ambiguous prompts. These factors complicate both perception and control tasks. To address these challenges, we propose a novel method for One-Shot Affordance Grounding of Deformable Objects (OS-AGDO) in egocentric organizing scenes, enabling robots to recognize previously unseen deformable objects with varying colors and shapes using minimal samples. Specifically, we first introduce the Deformable Object Semantic Enhancement Module (DefoSEM), which enhances hierarchical understanding of the internal structure and improves the ability to accurately identify local features, even under conditions of weak component information. Next, we propose the ORB-Enhanced Keypoint Fusion Module (OEKFM), which optimizes feature extraction of key components by leveraging geometric constraints and improves adaptability to diversity and visual interference. Additionally, we propose an instance-conditional prompt based on image data and task context, which effectively mitigates the region ambiguity caused by prompt words. To validate these methods, we construct a diverse real-world dataset, AGDDO15, which includes 15 common types of deformable objects and their associated organizational actions. Experimental results demonstrate that our approach significantly outperforms state-of-the-art methods, achieving improvements of 6.2%, 3.2%, and 2.9% in the KLD, SIM, and NSS metrics, respectively, while exhibiting high generalization performance. Source code and the benchmark dataset will be made publicly available at https://github.com/Dikay1/OS-AGDO.
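For readers unfamiliar with the three reported metrics, the sketch below shows one common way to compute KLD, SIM, and NSS between a predicted heatmap and a ground-truth map, both flattened to lists. Lower KLD is better, while higher SIM and NSS are better. The function names, the `EPS` constant, and the normalization details are assumptions for illustration; the exact evaluation protocol behind the reported numbers may differ.

```python
import math

EPS = 1e-12  # small constant to avoid division by zero and log(0)


def _normalize(values):
    """Scale non-negative values so they sum to (approximately) one."""
    total = sum(values)
    return [v / (total + EPS) for v in values]


def kld(pred, gt):
    """Kullback-Leibler divergence of the prediction from the ground truth."""
    p, q = _normalize(pred), _normalize(gt)
    return sum(qi * math.log(EPS + qi / (pi + EPS)) for pi, qi in zip(p, q))


def sim(pred, gt):
    """Similarity: histogram intersection of the two normalized maps."""
    p, q = _normalize(pred), _normalize(gt)
    return sum(min(pi, qi) for pi, qi in zip(p, q))


def nss(pred, fixations):
    """Normalized scanpath saliency: mean z-scored prediction at marked cells.

    `fixations` is a binary mask (truthy where the ground truth is active).
    """
    mean = sum(pred) / len(pred)
    std = math.sqrt(sum((v - mean) ** 2 for v in pred) / len(pred))
    hits = [(v - mean) / (std + EPS) for v, f in zip(pred, fixations) if f]
    if not hits:
        raise ValueError("fixation mask has no active cells")
    return sum(hits) / len(hits)
```

A perfect prediction gives a KLD near 0 and a SIM near 1, which makes a convenient sanity check before evaluating on real heatmaps.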
Abstract:Distributed estimation in a blockchain-aided Internet of Things (BIoT) is considered, where the integrated blockchain secures data exchanges across the BIoT and the storage of data at BIoT agents. This paper focuses on developing a performance guarantee for distributed estimation in a BIoT in the presence of malicious attacks that jointly exploit vulnerabilities in both the IoT devices and the blockchain employed within the BIoT. To achieve this, we adopt the Cramér-Rao Bound (CRB) as the performance metric and maximize the CRB for estimating the parameter of interest over the attack domain. However, the maximization problem is inherently non-convex, making it infeasible to obtain the globally optimal solution in general. To address this issue, we develop a relaxation method capable of transforming the original non-convex optimization problem into a convex optimization problem. Moreover, we derive the analytical expression for the optimal solution to the relaxed optimization problem. The optimal value of the relaxed optimization problem can be used to provide a valid estimation performance guarantee for the BIoT in the presence of attacks.
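The logic of the guarantee can be sketched for a scalar parameter as follows. All symbols here are illustrative assumptions rather than the paper's notation, and the last line assumes the relaxation upper-bounds the original maximization (for example, by enlarging the feasible set).

```latex
% Illustrative notation (not from the paper): \theta is the parameter of
% interest, a the attack parameter vector, \mathcal{A} the attack domain,
% and J(\theta; a) the Fisher information under attack a.
\begin{align}
  \mathrm{Var}\big(\hat{\theta}\big) &\ge \mathrm{CRB}(\theta; a) = J(\theta; a)^{-1}
    && \text{(any unbiased } \hat{\theta}\text{, fixed } a\text{)} \\
  C^{\star} &= \max_{a \in \mathcal{A}} \mathrm{CRB}(\theta; a)
    && \text{(worst case over attacks, non-convex)} \\
  C^{\star} &\le C_{\mathrm{relax}}
    && \text{(optimal value of the relaxed convex problem)}
\end{align}
```

Under these assumptions, $\mathrm{CRB}(\theta; a) \le C_{\mathrm{relax}}$ for every admissible attack $a \in \mathcal{A}$, so the relaxed optimum caps the attack-induced degradation of the bound even when $C^{\star}$ itself cannot be computed.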
Abstract:Adam-type algorithms have become a preferred choice for optimisation in the deep learning setting; however, despite their success, their convergence is still not well understood. To this end, we introduce a unified framework for Adam-type algorithms (called UAdam). It is equipped with a general form of the second-order moment, which makes it possible to include Adam and its variants, such as NAdam, AMSGrad, AdaBound, AdaFom, and Adan, as special cases. This is supported by a rigorous convergence analysis of UAdam in the non-convex stochastic setting, showing that UAdam converges to a neighborhood of stationary points at a rate of $\mathcal{O}(1/T)$. Furthermore, the size of the neighborhood decreases as $\beta$ increases. Importantly, our analysis only requires the first-order momentum factor to be close enough to 1, without any restrictions on the second-order momentum factor. Theoretical results also show that vanilla Adam can converge by selecting appropriate hyperparameters, which provides a theoretical guarantee for the analysis, applications, and further developments of the whole class of Adam-type algorithms.
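The idea of treating the second-order moment as a pluggable component can be illustrated with a minimal scalar sketch. This is not the UAdam algorithm itself: the function names, the `state` dictionary, the hyperparameter values, and the omission of bias correction are all assumptions made for brevity. Only the two rules shown, the exponential moving average of Adam and the running maximum of AMSGrad, follow the standard definitions of those methods.

```python
import math


def adam_rule(state, grad, beta2=0.999):
    """Adam: exponential moving average of squared gradients."""
    state["v"] = beta2 * state.get("v", 0.0) + (1 - beta2) * grad * grad
    return state["v"]


def amsgrad_rule(state, grad, beta2=0.999):
    """AMSGrad: running maximum of Adam's moving average."""
    v = adam_rule(state, grad, beta2)
    state["v_max"] = max(state.get("v_max", 0.0), v)
    return state["v_max"]


def minimize(grad_fn, x0, second_moment_rule,
             lr=0.1, beta1=0.9, eps=1e-8, steps=500):
    """Generic Adam-type loop; only the second-moment rule varies.

    Bias correction is omitted to keep the sketch short.
    """
    x, m, state = x0, 0.0, {}
    for _ in range(steps):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g       # first-order momentum
        v = second_moment_rule(state, g)      # pluggable second-order moment
        x -= lr * m / (math.sqrt(v) + eps)    # adaptive step
    return x


def quadratic_grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)
```

For example, `minimize(quadratic_grad, 0.0, adam_rule)` and `minimize(quadratic_grad, 0.0, amsgrad_rule)` run the same loop and differ only in how the denominator is formed.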
Abstract:In this work, we focus on improving the dexterous capability of robots by exploiting visual sensing and adaptive force control. TeachNet, a vision-based teleoperation learning framework, is used to map human hand postures to a multi-fingered robot hand. We augment TeachNet, which was originally based on an imprecise kinematic mapping and position-only servoing, with a biomimetic learning-based compliance control algorithm for dexterous manipulation tasks. This compliance controller takes the mapped robotic joint angles from TeachNet as the desired goal and computes the desired joint torques. It is derived from a computational model of the biomimetic control strategy in human motor learning, which allows the control variables (impedance and feedforward force) to be adapted online during the execution of the reference joint angle trajectories. The simultaneous adaptation of the impedance and feedforward profiles enables the robot to interact with the environment in a compliant manner. Our approach has been verified on multiple tasks in physics simulation, i.e., grasping, opening-a-door, turning-a-cap, and touching-a-mouse, and has shown more reliable performance than existing position control and fixed-gain force control approaches.
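To make the idea of online impedance and feedforward adaptation concrete, the following single-joint sketch uses an error-driven update with a forgetting term, a form commonly seen in biomimetic impedance learning. It is a simplified assumption, not the controller from the paper: the class name, the gains `kappa`, `learn_rate`, and `forgetting`, and the clamping of the gains to non-negative values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveComplianceJoint:
    """Single-joint compliance controller with online gain adaptation."""

    stiffness: float = 0.0     # adapted impedance: position gain
    damping: float = 0.0       # adapted impedance: velocity gain
    feedforward: float = 0.0   # adapted feedforward torque
    kappa: float = 0.5         # weight of velocity error (hypothetical)
    learn_rate: float = 5.0    # adaptation gain (hypothetical)
    forgetting: float = 0.01   # decay that relaxes unused effort (hypothetical)

    def torque(self, q_des, q, dq_des, dq):
        """Adapt the control variables, then return the joint torque."""
        e = q_des - q            # joint angle error
        de = dq_des - dq         # joint velocity error
        eps = e + self.kappa * de  # combined tracking error

        # Each variable grows with the tracking error and decays otherwise.
        self.stiffness += self.learn_rate * (eps * e - self.forgetting * self.stiffness)
        self.damping += self.learn_rate * (eps * de - self.forgetting * self.damping)
        self.feedforward += self.learn_rate * (eps - self.forgetting * self.feedforward)

        # Keep the impedance gains physically meaningful.
        self.stiffness = max(self.stiffness, 0.0)
        self.damping = max(self.damping, 0.0)

        return self.feedforward + self.stiffness * e + self.damping * de
```

In such a setup the desired angle `q_des` would come from the vision-based mapping, while `q` and `dq` would be read from the simulated joint. With persistent tracking error the stiffness and feedforward terms rise, and with small error they decay, which is what keeps the interaction compliant.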