Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zhiyang Guo

A Novel Deep Learning-Based Coarse-to-Fine Frame Synchronization Method for OTFS Systems

Jan 09, 2026

Meiwen Men, Tao Zhou, Kaifeng Bao, Zhiyang Guo, Yongning Qi, Liu Liu, Bo Ai

Abstract:Orthogonal time frequency space (OTFS) modulation is a robust candidate waveform for future wireless systems, particularly in high-mobility scenarios, as it effectively mitigates the impact of rapidly time-varying channels by mapping symbols in the delay-Doppler (DD) domain. However, accurate frame synchronization in OTFS systems remains a challenge due to the performance limitations of conventional algorithms. To address this, we propose a low-complexity synchronization method based on a coarse-to-fine deep residual network (ResNet) architecture. Unlike traditional approaches relying on high-overhead preamble structures, our method exploits the intrinsic periodic features of OTFS pilots in the delay-time (DT) domain to formulate synchronization as a hierarchical classification problem. Specifically, the proposed architecture employs a two-stage strategy to first narrow the search space and then pinpoint the precise symbol timing offset (STO), thereby significantly reducing computational complexity while maintaining high estimation accuracy. We construct a comprehensive simulation dataset incorporating diverse channel models and randomized STO to validate the method. Extensive simulation results demonstrate that the proposed method achieves robust signal start detection and superior accuracy compared to conventional benchmarks, particularly in low signal-to-noise ratio (SNR) regimes and high-mobility scenarios.

Via

Access Paper or Ask Questions

Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation

Dec 18, 2025

Zhiyang Guo, Ori Zhang, Jax Xiang, Alan Zhao, Wengang Zhou, Houqiang Li

Figure 1 for Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation

Figure 2 for Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation

Figure 3 for Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation

Figure 4 for Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation

Abstract:Posing 3D characters is a fundamental task in computer graphics and vision. However, existing methods like auto-rigging and pose-conditioned generation often struggle with challenges such as inaccurate skinning weight prediction, topological imperfections, and poor pose conformance, limiting their robustness and generalizability. To overcome these limitations, we introduce Make-It-Poseable, a novel feed-forward framework that reformulates character posing as a latent-space transformation problem. Instead of deforming mesh vertices as in traditional pipelines, our method reconstructs the character in new poses by directly manipulating its latent representation. At the core of our method is a latent posing transformer that manipulates shape tokens based on skeletal motion. This process is facilitated by a dense pose representation for precise control. To ensure high-fidelity geometry and accommodate topological changes, we also introduce a latent-space supervision strategy and an adaptive completion module. Our method demonstrates superior performance in posing quality. It also naturally extends to 3D editing applications like part replacement and refinement.

* Project page: https://jasongzy.github.io/Make-It-Poseable/

Via

Access Paper or Ask Questions

Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters

Nov 27, 2024

Zhiyang Guo, Jinxu Xiang, Kai Ma, Wengang Zhou, Houqiang Li, Ran Zhang

Figure 1 for Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters

Figure 2 for Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters

Figure 3 for Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters

Figure 4 for Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters

Abstract:3D characters are essential to modern creative industries, but making them animatable often demands extensive manual work in tasks like rigging and skinning. Existing automatic rigging tools face several limitations, including the necessity for manual annotations, rigid skeleton topologies, and limited generalization across diverse shapes and poses. An alternative approach is to generate animatable avatars pre-bound to a rigged template mesh. However, this method often lacks flexibility and is typically limited to realistic human shapes. To address these issues, we present Make-It-Animatable, a novel data-driven method to make any 3D humanoid model ready for character animation in less than one second, regardless of its shapes and poses. Our unified framework generates high-quality blend weights, bones, and pose transformations. By incorporating a particle-based shape autoencoder, our approach supports various 3D representations, including meshes and 3D Gaussian splats. Additionally, we employ a coarse-to-fine representation and a structure-aware modeling strategy to ensure both accuracy and robustness, even for characters with non-standard skeleton structures. We conducted extensive experiments to validate our framework's effectiveness. Compared to existing methods, our approach demonstrates significant improvements in both quality and speed.

* Project Page: https://jasongzy.github.io/Make-It-Animatable/

Via

Access Paper or Ask Questions

EG4D: Explicit Generation of 4D Object without Score Distillation

May 28, 2024

Qi Sun, Zhiyang Guo, Ziyu Wan, Jing Nathan Yan, Shengming Yin, Wengang Zhou, Jing Liao, Houqiang Li

Abstract:In recent years, the increasing demand for dynamic 3D assets in design and gaming applications has given rise to powerful generative pipelines capable of synthesizing high-quality 4D objects. Previous methods generally rely on score distillation sampling (SDS) algorithm to infer the unseen views and motion of 4D objects, thus leading to unsatisfactory results with defects like over-saturation and Janus problem. Therefore, inspired by recent progress of video diffusion models, we propose to optimize a 4D representation by explicitly generating multi-view videos from one input image. However, it is far from trivial to handle practical challenges faced by such a pipeline, including dramatic temporal inconsistency, inter-frame geometry and texture diversity, and semantic defects brought by video generation results. To address these issues, we propose DG4D, a novel multi-stage framework that generates high-quality and consistent 4D assets without score distillation. Specifically, collaborative techniques and solutions are developed, including an attention injection strategy to synthesize temporal-consistent multi-view videos, a robust and efficient dynamic reconstruction method based on Gaussian Splatting, and a refinement stage with diffusion prior for semantic restoration. The qualitative results and user preference study demonstrate that our framework outperforms the baselines in generation quality by a considerable margin. Code will be released at \url{https://github.com/jasongzy/EG4D}.

Via

Access Paper or Ask Questions

Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction

Mar 18, 2024

Zhiyang Guo, Wengang Zhou, Li Li, Min Wang, Houqiang Li

Figure 1 for Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction

Figure 2 for Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction

Figure 3 for Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction

Figure 4 for Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction

Abstract:3D Gaussian Splatting (3DGS) has become an emerging tool for dynamic scene reconstruction. However, existing methods focus mainly on extending static 3DGS into a time-variant representation, while overlooking the rich motion information carried by 2D observations, thus suffering from performance degradation and model redundancy. To address the above problem, we propose a novel motion-aware enhancement framework for dynamic scene reconstruction, which mines useful motion cues from optical flow to improve different paradigms of dynamic 3DGS. Specifically, we first establish a correspondence between 3D Gaussian movements and pixel-level flow. Then a novel flow augmentation method is introduced with additional insights into uncertainty and loss collaboration. Moreover, for the prevalent deformation-based paradigm that presents a harder optimization problem, a transient-aware deformation auxiliary module is proposed. We conduct extensive experiments on both multi-view and monocular scenes to verify the merits of our work. Compared with the baselines, our method shows significant superiority in both rendering quality and efficiency.

Via

Access Paper or Ask Questions

HandNeRF: Neural Radiance Fields for Animatable Interacting Hands

Mar 24, 2023

Zhiyang Guo, Wengang Zhou, Min Wang, Li Li, Houqiang Li

Abstract:We propose a novel framework to reconstruct accurate appearance and geometry with neural radiance fields (NeRF) for interacting hands, enabling the rendering of photo-realistic images and videos for gesture animation from arbitrary views. Given multi-view images of a single hand or interacting hands, an off-the-shelf skeleton estimator is first employed to parameterize the hand poses. Then we design a pose-driven deformation field to establish correspondence from those different poses to a shared canonical space, where a pose-disentangled NeRF for one hand is optimized. Such unified modeling efficiently complements the geometry and texture cues in rarely-observed areas for both hands. Meanwhile, we further leverage the pose priors to generate pseudo depth maps as guidance for occlusion-aware density learning. Moreover, a neural feature distillation method is proposed to achieve cross-domain alignment for color optimization. We conduct extensive experiments to verify the merits of our proposed HandNeRF and report a series of state-of-the-art results both qualitatively and quantitatively on the large-scale InterHand2.6M dataset.

* CVPR 2023

Via

Access Paper or Ask Questions