Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jingbo Zhang

A Comprehensive Survey on Joint Resource Allocation Strategies in Federated Edge Learning

Oct 10, 2024

Jingbo Zhang, Qiong Wu, Pingyi Fan, Qiang Fan

Abstract:Federated Edge Learning (FEL), an emerging distributed Machine Learning (ML) paradigm, enables model training in a distributed environment while ensuring user privacy by using physical separation for each user data. However, with the development of complex application scenarios such as the Internet of Things (IoT) and Smart Earth, the conventional resource allocation schemes can no longer effectively support these growing computational and communication demands. Therefore, joint resource optimization may be the key solution to the scaling problem. This paper simultaneously addresses the multifaceted challenges of computation and communication, with the growing multiple resource demands. We systematically review the joint allocation strategies for different resources (computation, data, communication, and network topology) in FEL, and summarize the advantages in improving system efficiency, reducing latency, enhancing resource utilization and enhancing robustness. In addition, we present the potential ability of joint optimization to enhance privacy preservation by reducing communication requirements, indirectly. This work not only provides theoretical support for resource management in federated learning (FL) systems, but also provides ideas for potential optimal deployment in multiple real-world scenarios. By thoroughly discussing the current challenges and future research directions, it also provides some important insights into multi-resource optimization in complex application environments.

* This paper has been submitted to CMC-Computers Materials & Continua

Via

Access Paper or Ask Questions

Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model

Sep 25, 2024

Hongliang Zhong, Can Wang, Jingbo Zhang, Jing Liao

Abstract:Generating and inserting new objects into 3D content is a compelling approach for achieving versatile scene recreation. Existing methods, which rely on SDS optimization or single-view inpainting, often struggle to produce high-quality results. To address this, we propose a novel method for object insertion in 3D content represented by Gaussian Splatting. Our approach introduces a multi-view diffusion model, dubbed MVInpainter, which is built upon a pre-trained stable video diffusion model to facilitate view-consistent object inpainting. Within MVInpainter, we incorporate a ControlNet-based conditional injection module to enable controlled and more predictable multi-view generation. After generating the multi-view inpainted results, we further propose a mask-aware 3D reconstruction technique to refine Gaussian Splatting reconstruction from these sparse inpainted views. By leveraging these fabricate techniques, our approach yields diverse results, ensures view-consistent and harmonious insertions, and produces better object quality. Extensive experiments demonstrate that our approach outperforms existing methods.

* Project Page: https://github.com/JiuTongBro/MultiView_Inpaint

Via

Access Paper or Ask Questions

Hyper-Compression: Model Compression via Hyperfunction

Sep 01, 2024

Fenglei Fan, Juntong Fan, Dayang Wang, Jingbo Zhang, Zelin Dong, Shijun Zhang, Ge Wang, Tieyong Zeng

Figure 1 for Hyper-Compression: Model Compression via Hyperfunction

Figure 2 for Hyper-Compression: Model Compression via Hyperfunction

Figure 3 for Hyper-Compression: Model Compression via Hyperfunction

Figure 4 for Hyper-Compression: Model Compression via Hyperfunction

Abstract:The rapid growth of large models' size has far outpaced that of GPU memory. To bridge this gap, inspired by the succinct relationship between genotype and phenotype, we turn the model compression problem into the issue of parameter representation to propose the so-called hyper-compression. The hyper-compression uses a hyperfunction to represent the parameters of the target network, and notably, here the hyperfunction is designed per ergodic theory that relates to a problem: if a low-dimensional dynamic system can fill the high-dimensional space eventually. Empirically, the proposed hyper-compression enjoys the following merits: 1) \textbf{P}referable compression ratio; 2) \textbf{N}o post-hoc retraining; 3) \textbf{A}ffordable inference time; and 4) \textbf{S}hort compression time. It compresses LLaMA2-7B in an hour and achieves close-to-int4-quantization performance, without retraining and with a performance drop of less than 1\%. Our work has the potential to invigorate the field of model compression, towards a harmony between the scaling law and the stagnation of hardware upgradation.

Via

Access Paper or Ask Questions

Advances in 3D Generation: A Survey

Jan 31, 2024

Xiaoyu Li, Qi Zhang, Di Kang, Weihao Cheng, Yiming Gao, Jingbo Zhang, Zhihao Liang, Jing Liao, Yan-Pei Cao, Ying Shan

Figure 1 for Advances in 3D Generation: A Survey

Figure 2 for Advances in 3D Generation: A Survey

Figure 3 for Advances in 3D Generation: A Survey

Figure 4 for Advances in 3D Generation: A Survey

Abstract:Generating 3D models lies at the core of computer graphics and has been the focus of decades of research. With the emergence of advanced neural representations and generative models, the field of 3D content generation is developing rapidly, enabling the creation of increasingly high-quality and diverse 3D models. The rapid growth of this field makes it difficult to stay abreast of all recent developments. In this survey, we aim to introduce the fundamental methodologies of 3D generation methods and establish a structured roadmap, encompassing 3D representation, generation methods, datasets, and corresponding applications. Specifically, we introduce the 3D representations that serve as the backbone for 3D generation. Furthermore, we provide a comprehensive overview of the rapidly growing literature on generation methods, categorized by the type of algorithmic paradigms, including feedforward generation, optimization-based generation, procedural generation, and generative novel view synthesis. Lastly, we discuss available datasets, applications, and open challenges. We hope this survey will help readers explore this exciting topic and foster further advancements in the field of 3D content generation.

* 33 pages, 12 figures

Via

Access Paper or Ask Questions

HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion

Nov 28, 2023

Jingbo Zhang, Xiaoyu Li, Qi Zhang, Yanpei Cao, Ying Shan, Jing Liao

Abstract:Generating a 3D human model from a single reference image is challenging because it requires inferring textures and geometries in invisible views while maintaining consistency with the reference image. Previous methods utilizing 3D generative models are limited by the availability of 3D training data. Optimization-based methods that lift text-to-image diffusion models to 3D generation often fail to preserve the texture details of the reference image, resulting in inconsistent appearances in different views. In this paper, we propose HumanRef, a 3D human generation framework from a single-view input. To ensure the generated 3D model is photorealistic and consistent with the input image, HumanRef introduces a novel method called reference-guided score distillation sampling (Ref-SDS), which effectively incorporates image guidance into the generation process. Furthermore, we introduce region-aware attention to Ref-SDS, ensuring accurate correspondence between different body regions. Experimental results demonstrate that HumanRef outperforms state-of-the-art methods in generating 3D clothed humans with fine geometry, photorealistic textures, and view-consistent appearances.

* Homepage: https://eckertzhang.github.io/HumanRef.github.io/

Via

Access Paper or Ask Questions

VQ-NeRF: Neural Reflectance Decomposition and Editing with Vector Quantization

Nov 05, 2023

Hongliang Zhong, Jingbo Zhang, Jing Liao

Figure 1 for VQ-NeRF: Neural Reflectance Decomposition and Editing with Vector Quantization

Figure 2 for VQ-NeRF: Neural Reflectance Decomposition and Editing with Vector Quantization

Figure 3 for VQ-NeRF: Neural Reflectance Decomposition and Editing with Vector Quantization

Figure 4 for VQ-NeRF: Neural Reflectance Decomposition and Editing with Vector Quantization

Abstract:We propose VQ-NeRF, a two-branch neural network model that incorporates Vector Quantization (VQ) to decompose and edit reflectance fields in 3D scenes. Conventional neural reflectance fields use only continuous representations to model 3D scenes, despite the fact that objects are typically composed of discrete materials in reality. This lack of discretization can result in noisy material decomposition and complicated material editing. To address these limitations, our model consists of a continuous branch and a discrete branch. The continuous branch follows the conventional pipeline to predict decomposed materials, while the discrete branch uses the VQ mechanism to quantize continuous materials into individual ones. By discretizing the materials, our model can reduce noise in the decomposition process and generate a segmentation map of discrete materials. Specific materials can be easily selected for further editing by clicking on the corresponding area of the segmentation outcomes. Additionally, we propose a dropout-based VQ codeword ranking strategy to predict the number of materials in a scene, which reduces redundancy in the material segmentation process. To improve usability, we also develop an interactive interface to further assist material editing. We evaluate our model on both computer-generated and real-world scenes, demonstrating its superior performance. To the best of our knowledge, our model is the first to enable discrete material editing in 3D scenes.

* Accepted by TVCG. Project Page: https://jtbzhl.github.io/VQ-NeRF.github.io/

Via

Access Paper or Ask Questions

Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields

May 19, 2023

Jingbo Zhang, Xiaoyu Li, Ziyu Wan, Can Wang, Jing Liao

Figure 1 for Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields

Figure 2 for Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields

Figure 3 for Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields

Figure 4 for Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields

Abstract:Text-driven 3D scene generation is widely applicable to video gaming, film industry, and metaverse applications that have a large demand for 3D scenes. However, existing text-to-3D generation methods are limited to producing 3D objects with simple geometries and dreamlike styles that lack realism. In this work, we present Text2NeRF, which is able to generate a wide range of 3D scenes with complicated geometric structures and high-fidelity textures purely from a text prompt. To this end, we adopt NeRF as the 3D representation and leverage a pre-trained text-to-image diffusion model to constrain the 3D reconstruction of the NeRF to reflect the scene description. Specifically, we employ the diffusion model to infer the text-related image as the content prior and use a monocular depth estimation method to offer the geometric prior. Both content and geometric priors are utilized to update the NeRF model. To guarantee textured and geometric consistency between different views, we introduce a progressive scene inpainting and updating strategy for novel view synthesis of the scene. Our method requires no additional training data but only a natural language description of the scene as the input. Extensive experiments demonstrate that our Text2NeRF outperforms existing methods in producing photo-realistic, multi-view consistent, and diverse 3D scenes from a variety of natural language prompts.

* Homepage: https://eckertzhang.github.io/Text2NeRF.github.io/

Via

Access Paper or Ask Questions

AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control

Mar 30, 2023

Ruixiang Jiang, Can Wang, Jingbo Zhang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao

Abstract:Neural implicit fields are powerful for representing 3D scenes and generating high-quality novel views, but it remains challenging to use such implicit representations for creating a 3D human avatar with a specific identity and artistic style that can be easily animated. Our proposed method, AvatarCraft, addresses this challenge by using diffusion models to guide the learning of geometry and texture for a neural avatar based on a single text prompt. We carefully design the optimization framework of neural implicit fields, including a coarse-to-fine multi-bounding box training strategy, shape regularization, and diffusion-based constraints, to produce high-quality geometry and texture. Additionally, we make the human avatar animatable by deforming the neural implicit field with an explicit warping field that maps the target human mesh to a template human mesh, both represented using parametric human models. This simplifies animation and reshaping of the generated avatar by controlling pose and shape parameters. Extensive experiments on various text descriptions show that AvatarCraft is effective and robust in creating human avatars and rendering novel views, poses, and shapes. Our project page is: \url{https://avatar-craft.github.io/}.

* Project page is: https://avatar-craft.github.io/

Via

Access Paper or Ask Questions

Adaptive Joint Optimization for 3D Reconstruction with Differentiable Rendering

Aug 15, 2022

Jingbo Zhang, Ziyu Wan, Jing Liao

Figure 1 for Adaptive Joint Optimization for 3D Reconstruction with Differentiable Rendering

Figure 2 for Adaptive Joint Optimization for 3D Reconstruction with Differentiable Rendering

Figure 3 for Adaptive Joint Optimization for 3D Reconstruction with Differentiable Rendering

Figure 4 for Adaptive Joint Optimization for 3D Reconstruction with Differentiable Rendering

Abstract:Due to inevitable noises introduced during scanning and quantization, 3D reconstruction via RGB-D sensors suffers from errors both in geometry and texture, leading to artifacts such as camera drifting, mesh distortion, texture ghosting, and blurriness. Given an imperfect reconstructed 3D model, most previous methods have focused on the refinement of either geometry, texture, or camera pose. Or different optimization schemes and objectives for optimizing each component have been used in previous joint optimization methods, forming a complicated system. In this paper, we propose a novel optimization approach based on differentiable rendering, which integrates the optimization of camera pose, geometry, and texture into a unified framework by enforcing consistency between the rendered results and the corresponding RGB-D inputs. Based on the unified framework, we introduce a joint optimization approach to fully exploit the inter-relationships between geometry, texture, and camera pose, and describe an adaptive interleaving strategy to improve optimization stability and efficiency. Using differentiable rendering, an image-level adversarial loss is applied to further improve the 3D model, making it more photorealistic. Experiments on synthetic and real data using quantitative and qualitative evaluation demonstrated the superiority of our approach in recovering both fine-scale geometry and high-fidelity texture.

* 12 pages, published on TVCG

Via

Access Paper or Ask Questions

FDNeRF: Few-shot Dynamic Neural Radiance Fields for Face Reconstruction and Expression Editing

Aug 11, 2022

Jingbo Zhang, Xiaoyu Li, Ziyu Wan, Can Wang, Jing Liao

Figure 1 for FDNeRF: Few-shot Dynamic Neural Radiance Fields for Face Reconstruction and Expression Editing

Figure 2 for FDNeRF: Few-shot Dynamic Neural Radiance Fields for Face Reconstruction and Expression Editing

Figure 3 for FDNeRF: Few-shot Dynamic Neural Radiance Fields for Face Reconstruction and Expression Editing

Figure 4 for FDNeRF: Few-shot Dynamic Neural Radiance Fields for Face Reconstruction and Expression Editing

Abstract:We propose a Few-shot Dynamic Neural Radiance Field (FDNeRF), the first NeRF-based method capable of reconstruction and expression editing of 3D faces based on a small number of dynamic images. Unlike existing dynamic NeRFs that require dense images as input and can only be modeled for a single identity, our method enables face reconstruction across different persons with few-shot inputs. Compared to state-of-the-art few-shot NeRFs designed for modeling static scenes, the proposed FDNeRF accepts view-inconsistent dynamic inputs and supports arbitrary facial expression editing, i.e., producing faces with novel expressions beyond the input ones. To handle the inconsistencies between dynamic inputs, we introduce a well-designed conditional feature warping (CFW) module to perform expression conditioned warping in 2D feature space, which is also identity adaptive and 3D constrained. As a result, features of different expressions are transformed into the target ones. We then construct a radiance field based on these view-consistent features and use volumetric rendering to synthesize novel views of the modeled faces. Extensive experiments with quantitative and qualitative evaluation demonstrate that our method outperforms existing dynamic and few-shot NeRFs on both 3D face reconstruction and expression editing tasks. Our code and model will be available upon acceptance.

* 8 pages

Via

Access Paper or Ask Questions