Abstract: Prevalent lossy image compression schemes can be divided into: 1) explicit image compression (EIC), including traditional standards and neural end-to-end algorithms; 2) implicit image compression (IIC) based on implicit neural representations (INR). The former has reached an impasse, with bitrate reduction leveling off at a cost of tremendous complexity, while the latter suffers from excessively smoothed reconstructions as well as lengthy decoder models. In this paper, we propose an innovative paradigm, which we dub \textbf{Unicorn} (\textbf{U}nified \textbf{N}eural \textbf{I}mage \textbf{C}ompression with \textbf{O}ne \textbf{N}umber \textbf{R}econstruction). By conceptualizing images as index-image pairs and learning the inherent distribution of these pairs in a subtle neural network model, Unicorn can reconstruct a visually pleasing image from randomly generated noise with only one index number. The neural model serves as the unified decoder of images, while the noises and indexes correspond to explicit representations. As a proof of concept, we propose an effective and efficient prototype of Unicorn based on latent diffusion models with tailored model designs. Quantitative and qualitative experimental results demonstrate that our prototype achieves significant bitrate reduction compared with EIC and IIC algorithms. More impressively, benefiting from the unified decoder, our compression ratio escalates as the quantity of images increases. We envision that more advanced model designs will endow Unicorn with greater potential in image compression. We will release our code at \url{https://github.com/uniqzheng/Unicorn-Laduree}.
Abstract: Video quality assessment (VQA) is an important processing task that aims to predict the quality of videos in a manner highly consistent with human judgments of perceived quality. Traditional VQA models based on natural image and/or video statistics, which are inspired both by models of projected images of the real world and by dual models of the human visual system, deliver only limited prediction performance on real-world user-generated content (UGC), as exemplified in recent large-scale VQA databases containing large numbers of diverse video contents crawled from the web. Fortunately, recent advances in deep neural networks and Large Multimodality Models (LMMs) have enabled significant progress in solving this problem, yielding better results than prior handcrafted models. Numerous deep learning-based VQA models have been developed, with progress in this direction driven by the creation of content-diverse, large-scale human-labeled databases that supply ground truth psychometric video quality data. Here, we present a comprehensive survey of recent progress in the development of VQA algorithms and of the benchmarking studies and databases that make them possible. We also analyze open research directions in study design and VQA algorithm architectures.
Abstract: Recent advancements in nanoscale 3D printing and microfabrication techniques have reinvigorated research on microrobotics and nanomachines. However, precise control of robot motion and navigation in biological environments has remained challenging to date. This work presents the first demonstration of a magnetic microscale rocker robot (microrocker bot) capable of bidirectional movement on flat as well as biological surfaces when actuated by a single compact electromagnet. The 100 µm by 113 µm by 36 µm robot was 3D printed via two-photon lithography and subsequently coated with a nickel (Ni) thin film. When actuated by an externally applied magnetic sawtooth field, the robot demonstrated stick-slip motion enabled by its rockers. The controllable bidirectional motion is enabled by adjusting the DC offset of the waveform, which tilts the robot and biases it towards either forward or backward motion. The microrocker bots are further equipped with sharp tips that can be engaged via application of DC-only or low-frequency magnetic fields. This novel control method offers an attractive alternative to the multiple bulky coils traditionally used for magnetic actuation and control, and allows for a more flexible and simpler approach to microrobot motion control. When the frequency and offset of the sawtooth waveform are optimized, the robot travels at up to 87 µm/s (0.87 body lengths per second) forward and backward with minor deviation from linear trajectories. Finally, to prove the robot's capabilities in direct contact with biological environments, we demonstrate the microbot's ability to traverse forward and backward on the surface of a Dracaena fragrans (corn plant), as well as to upend on its mechanical tip.
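As a rough illustration of the actuation waveform described in this abstract, the following is a minimal sketch of a sawtooth signal with an adjustable DC offset, together with the body-length speed conversion. The function names, the rising-ramp shape, and the amplitude, frequency, and offset values are assumptions made for illustration only; they are not taken from the paper, which does not state them in the abstract.

```python
def sawtooth_field(t, frequency_hz, amplitude, dc_offset):
    """Rising sawtooth in arbitrary field units (hypothetical waveform shape).

    Ramps linearly from -amplitude to +amplitude once per period,
    then the whole waveform is shifted by dc_offset.
    """
    phase = (t * frequency_hz) % 1.0  # position within the current period, in [0, 1)
    return dc_offset + amplitude * (2.0 * phase - 1.0)


def body_lengths_per_second(speed_um_per_s, body_length_um):
    """Convert a speed in micrometers per second to body lengths per second."""
    return speed_um_per_s / body_length_um


# Illustrative values only: a 10 Hz sawtooth with unit amplitude.
# Changing the sign of dc_offset shifts the mean field, which is the
# knob the abstract describes for biasing forward vs. backward motion.
forward_bias = [sawtooth_field(t / 100.0, 10.0, 1.0, +0.5) for t in range(10)]
backward_bias = [sawtooth_field(t / 100.0, 10.0, 1.0, -0.5) for t in range(10)]

# The reported 87 µm/s for a 100 µm long robot corresponds to 0.87 body lengths/s.
speed_in_body_lengths = body_lengths_per_second(87.0, 100.0)
```

The sketch only reproduces the shape of the drive signal and the unit conversion; it does not model the stick-slip dynamics or the tilt of the robot.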
Abstract: This work presents the theoretical analysis and experimental observations of bidirectional motion of a millimeter-scale bristle robot (milli-bristle-bot) with an on-board piezoelectric actuator. First, the theory of the motion, based on the dry-friction model, is developed, and the frequency regions of the forward and backward motion, along with the resonant frequencies of the system, are predicted. Second, milli-bristle-bots with two different bristle tilt angles are fabricated, and their bidirectional motions are experimentally investigated. The dependency of the robot speed on the actuation frequency is studied, which reveals two distinct frequency regions for the forward and backward motions that match our theoretical predictions well. Furthermore, the dependencies of the resonance frequency and robot speed on the bristle tilt angle are experimentally studied and tied to the theoretical model. This work marks the first demonstration of bidirectional motion at the millimeter scale, achieved for bristle-bots with a single on-board actuator.