Yingjie Xu

BuildSTG: A Multi-building Energy Load Forecasting Method using Spatio-Temporal Graph Neural Network

Jul 28, 2025

Yongzheng Liu, Yiming Wang, Po Xu, Yingjie Xu, Yuntian Chen, Dongxiao Zhang

Abstract:Due to the extensive availability of operation data, data-driven methods show strong capabilities in predicting building energy loads. Buildings with similar features often share energy patterns, reflected by spatial dependencies in their operational data, which conventional prediction methods struggle to capture. To overcome this, we propose a multi-building prediction approach using spatio-temporal graph neural networks, comprising graph representation, graph learning, and interpretation. First, a graph is built based on building characteristics and environmental factors. Next, a multi-level graph convolutional architecture with attention is developed for energy prediction. Lastly, a method interpreting the optimized graph structure is introduced. Experiments on the Building Data Genome Project 2 dataset confirm superior performance over baselines such as XGBoost, SVR, FCNN, GRU, and Naive, highlighting the method's robustness, generalization, and interpretability in capturing meaningful building similarities and spatial relationships.
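
The abstract does not say how the initial graph is built from building characteristics. As a rough illustration only, the sketch below links buildings whose feature vectors have high cosine similarity. The feature values, the building names, the `threshold` parameter, and the choice of cosine similarity are all assumptions made for this example, not the authors' method.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_similarity_graph(features, threshold=0.9):
    """Return a weighted adjacency dict linking sufficiently similar buildings.

    `features` maps a building name to its feature vector (hypothetical data).
    An undirected edge is added when the similarity reaches `threshold`.
    """
    names = sorted(features)
    adjacency = {name: {} for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            weight = cosine_similarity(features[a], features[b])
            if weight >= threshold:
                adjacency[a][b] = weight
                adjacency[b][a] = weight
    return adjacency


# Hypothetical, already-normalized features: (floor area, age, occupancy).
buildings = {
    "office_a": [1.0, 0.2, 0.8],
    "office_b": [0.9, 0.25, 0.75],
    "warehouse": [0.1, 1.0, 0.05],
}
graph = build_similarity_graph(buildings, threshold=0.9)
```

In this toy data the two offices end up connected and the warehouse stays isolated. A learned graph, as described in the abstract, would refine such an initial structure during training.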

Advancing high-fidelity 3D and Texture Generation with 2.5D latents

May 28, 2025

Xin Yang, Jiantao Lin, Yingjie Xu, Haodong Li, Yingcong Chen

Abstract:Despite the availability of large-scale 3D datasets and advancements in 3D generative models, the complexity and uneven quality of 3D geometry and texture data continue to hinder the performance of 3D generation techniques. In most existing approaches, 3D geometry and texture are generated in separate stages using different models and non-unified representations, frequently leading to unsatisfactory coherence between geometry and texture. To address these challenges, we propose a novel framework for joint generation of 3D geometry and texture. Specifically, we focus on generating versatile 2.5D representations that can be seamlessly transformed between 2D and 3D. Our approach begins by integrating multiview RGB, normal, and coordinate images into a unified representation, termed 2.5D latents. Next, we adapt pre-trained 2D foundation models for high-fidelity 2.5D generation, utilizing both text and image conditions. Finally, we introduce a lightweight 2.5D-to-3D refiner-decoder framework that efficiently generates detailed 3D representations from 2.5D images. Extensive experiments demonstrate that our model not only excels in generating high-quality 3D objects with coherent structure and color from text and image inputs but also significantly outperforms existing methods in geometry-conditioned texture generation.

Sparse Arrays Enable Near-Field Constant-Distance Focusing with Reduced Focal Shift

May 12, 2025

Jiawang Li, Yingjie Xu, Hanieh Aliakbari

Abstract:In near-field beam focusing for finite-sized arrays, focal shift is a non-negligible issue. The actual focal point often appears closer to the array than the predefined focal distance, significantly degrading the focusing performance of finite aperture arrays. Moreover, when the focus point is scanned across different locations, the degradation becomes even more pronounced, leading not only to positional deviation but also to substantial energy loss. To address this issue, we revisit the problem from the perspective of communication degrees of freedom. We demonstrate that a properly designed sparse array with optimized element spacing can effectively mitigate focal shift while enabling stable control of the focusing height during beam scanning. Simulation results based on dipole antennas with different polarizations and patch antennas validate our findings. Notably, with optimized inter-element distances, the energy distribution across focal points becomes nearly uniform, and highly accurate focusing positions are achieved.

Interacting Object-Enabled Clustering and Characterization of Distributed MIMO Channels

Apr 16, 2025

Yingjie Xu, Michiel Sandra, Xuesong Cai, Sara Willhammar, Fredrik Tufvesson

Abstract:Distributed multiple-input multiple-output (MIMO), also known as cell-free massive MIMO, emerges as a promising technology for sixth-generation (6G) systems to support uniform coverage and reliable communication. For the design and optimization of such systems, measurement-based investigations of real-world distributed MIMO channels are essential. In this paper, we present an indoor channel measurement campaign, featuring eight distributed antenna arrays with 128 elements in total. Multi-link channels are measured at 50 positions along a 12-meter user route. A clustering algorithm enabled by interacting objects is proposed to identify clusters in the measured channels. The algorithm jointly clusters the multipath components for all links, effectively capturing the dynamic contributions of common clusters to different links. In addition, a Kalman filter-based tracking framework is introduced for cluster prediction, tracking, and updating along the user movement. Using the clustering and tracking results, cluster-level characterization of the measured channels is performed. First, the number of clusters and their visibility at both link ends are analyzed. Next, a maximum-likelihood estimator is utilized to determine the entire cluster visibility region length. Finally, key cluster-level properties, including the common cluster ratio, cluster power, shadowing, spread, among others, are statistically investigated. The results provide valuable insights into cluster behavior in typical multi-link channels, necessary for accurate modeling of distributed MIMO channels.

* This paper has been submitted to IEEE Transactions on Wireless Communications. 13 pages, 13 figures, 2 tables
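
The abstract mentions a Kalman filter-based framework for cluster prediction, tracking, and updating, but does not give the state model. The sketch below shows only the generic predict/update cycle for a single scalar quantity (for example, one cluster parameter) under a random-walk assumption. The noise values `q` and `r` and the measurement list are hypothetical, and the real tracker presumably uses a multi-dimensional state.

```python
def kalman_step(x, p, z, q=0.01, r=0.25):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    z    : new measurement
    q, r : process and measurement noise variances (illustrative values)
    """
    # Predict: a random-walk model keeps the state and inflates the variance.
    x_pred = x
    p_pred = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# Hypothetical noisy observations of one cluster parameter along the route.
measurements = [10.2, 9.8, 10.5, 10.1, 9.9]
estimate, variance = measurements[0], 1.0
for z in measurements[1:]:
    estimate, variance = kalman_step(estimate, variance, z)
```

The variance shrinks as measurements accumulate, which is what lets a tracker decide whether a new observation plausibly belongs to an existing cluster.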

Experimental Analysis of Multipath Characteristics in Indoor Distributed Massive MIMO Channels

Apr 16, 2025

Yingjie Xu, Xuesong Cai, Sara Willhammar, Fredrik Tufvesson

Abstract:Distributed massive multiple-input multiple-output (MIMO), also known as cell-free massive MIMO, has emerged as a promising technology for sixth-generation (6G) wireless networks. This letter introduces an indoor channel measurement campaign designed to explore the behavior of multipath components (MPCs) in distributed massive MIMO channels. Fully coherent channels were measured between eight distributed uniform planar arrays (128 elements in total) and a 12-meter user equipment route. Furthermore, a method is introduced to determine the order (single- or multi-bounce) of MPC interaction by leveraging map information and MPC parameters. In addition, a Kalman filter-based framework is used for identifying the MPC interaction mechanisms (reflection or scattering/diffraction/mixed). Finally, a comprehensive MPC-level characterization is performed based on the measured channels, including the significance of the single-bounce MPCs, the spherical wavefront features, the birth-and-death processes of the MPCs, and the spatial distribution of reflections. The findings serve as a valuable reference for understanding MPC propagation behavior, which is necessary for accurate modeling of indoor distributed massive MIMO channels.

* This paper has been submitted to IEEE Antennas and Wireless Propagation Letters. 5 pages, 7 figures, 1 table

Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation

Mar 03, 2025

Jiantao Lin, Xin Yang, Meixi Chen, Yingjie Xu, Dongyu Yan, Leyi Wu, Xinli Xu, Lie XU, Shunsi Zhang, Ying-Cong Chen

Abstract:Diffusion models have achieved great success in generating 2D images. However, the quality and generalizability of 3D content generation remain limited. State-of-the-art methods often require large-scale 3D assets for training, which are challenging to collect. In this work, we introduce Kiss3DGen (Keep It Simple and Straightforward in 3D Generation), an efficient framework for generating, editing, and enhancing 3D objects by repurposing a well-trained 2D image diffusion model for 3D generation. Specifically, we fine-tune a diffusion model to generate a "3D Bundle Image", a tiled representation composed of multi-view images and their corresponding normal maps. The normal maps are then used to reconstruct a 3D mesh, and the multi-view images provide texture mapping, resulting in a complete 3D model. This simple method effectively transforms the 3D generation problem into a 2D image generation task, maximizing the utilization of knowledge in pretrained diffusion models. Furthermore, we demonstrate that our Kiss3DGen model is compatible with various diffusion model techniques, enabling advanced features such as 3D editing, mesh and texture enhancement, etc. Through extensive experiments, we demonstrate the effectiveness of our approach, showcasing its ability to produce high-quality 3D models efficiently.

* The first three authors contributed equally to this work

Spatial separation of closely-spaced users in measured distributed massive MIMO channels

Nov 27, 2024

Yingjie Xu, Michiel Sandra, Xuesong Cai, Sara Willhammar, Fredrik Tufvesson

Abstract:Aiming for the sixth generation (6G) wireless communications, distributed massive multiple-input multiple-output (MIMO) systems hold significant potential for spatial multiplexing. In order to evaluate the ability of a distributed massive MIMO system to spatially separate closely spaced users, this paper presents an indoor channel measurement campaign. The measurements are carried out at a carrier frequency of 5.6 GHz with a bandwidth of 400 MHz, employing distributed antenna arrays with a total of 128 elements. Multiple scalar metrics are selected to evaluate spatial separability in line-of-sight, non line-of-sight, and mixed conditions. Firstly, through studying the singular value spread, it is shown that in line-of-sight conditions, better user orthogonality is achieved with a distributed MIMO setup compared to a co-located MIMO array. Furthermore, the dirty-paper coding (DPC) capacity and zero forcing (ZF) precoding sum-rate capacities are investigated across varying numbers of antennas and their topologies. The results show that in all three conditions, the less complex ZF precoder can be applied in distributed massive MIMO systems while still achieving a large fraction of the DPC capacity. Additionally, in line-of-sight conditions, both sum-rate capacities and user fairness benefit from more antennas and a more distributed antenna topology. However, in the given NLoS condition, the improvement in spatial separability through distributed antenna topologies is limited.
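
The singular value spread mentioned in the abstract is the ratio of the largest to the smallest singular value of the multi-user channel matrix; a value near 1 indicates nearly orthogonal users. As a minimal sketch for the special case of two users, the spread can be computed in closed form from the 2x2 Gram matrix of the two channel vectors. The function name and the example channel vectors are hypothetical and are not taken from the measurement data.

```python
import math


def singular_value_spread_two_users(h1, h2):
    """Singular value spread of the 2 x M channel matrix with rows h1, h2.

    The squared singular values are the eigenvalues of the Gram matrix
    [[a, b], [conj(b), d]], which has a closed-form solution for 2 x 2.
    """
    a = sum(abs(x) ** 2 for x in h1)
    d = sum(abs(x) ** 2 for x in h2)
    b = sum(x * complex(y).conjugate() for x, y in zip(h1, h2))
    disc = math.sqrt((a - d) ** 2 + 4.0 * abs(b) ** 2)
    lam_max = (a + d + disc) / 2.0
    lam_min = (a + d - disc) / 2.0
    if lam_min <= 0.0:
        # Linearly dependent channels: the users cannot be separated.
        return math.inf
    return math.sqrt(lam_max / lam_min)


# Orthogonal user channels give the ideal spread of 1.
orthogonal = singular_value_spread_two_users([1, 0], [0, 1])
# Partially overlapping channels give a larger spread.
correlated = singular_value_spread_two_users([1, 0], [1, 1])
# Identical channels are inseparable.
identical = singular_value_spread_two_users([1, 1j], [1, 1j])
```

For more than two users the same quantity would normally be obtained from a full singular value decomposition of the channel matrix.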

Online Collision Risk Estimation via Monocular Depth-Aware Object Detectors and Fuzzy Inference

Nov 09, 2024

Brian Hsuan-Cheng Liao, Yingjie Xu, Chih-Hong Cheng, Hasan Esen, Alois Knoll

Abstract:This paper presents a monitoring framework that infers the level of autonomous vehicle (AV) collision risk based on its object detector's performance using only monocular camera images. Essentially, the framework takes two sets of predictions produced by different algorithms and associates their inconsistencies with the collision risk via fuzzy inference. The first set of predictions is obtained through retrieving safety-critical 2.5D objects from a depth map, and the second set comes from the AV's 3D object detector. We experimentally validate that, based on Intersection-over-Union (IoU) and a depth discrepancy measure, the inconsistencies between the two sets of predictions strongly correlate with the safety-related error of the 3D object detector against ground truths. This correlation allows us to construct a fuzzy inference system and map the inconsistency measures to an existing collision risk indicator. In particular, we apply various knowledge- and data-driven techniques and find that using particle swarm optimization to learn general fuzzy rules gives the best mapping result. Lastly, we validate our monitor's capability to produce relevant risk estimates with the large-scale nuScenes dataset and show it can safeguard an AV in closed-loop simulations.

* 7 pages (IEEE double column format), 5 figures, 3 tables, submitted to ICRA 2025
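
The two inconsistency measures named in the abstract can be illustrated with a small sketch. The IoU below is the standard definition for axis-aligned 2D boxes. The abstract does not define its depth discrepancy measure, so `relative_depth_gap` is a hypothetical stand-in used only for illustration, and the box coordinates and depths are made up.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def relative_depth_gap(depth_a, depth_b):
    """Hypothetical depth discrepancy: absolute gap relative to the larger depth."""
    denom = max(depth_a, depth_b)
    if denom <= 0:
        return 0.0
    return abs(depth_a - depth_b) / denom


# Made-up predictions for the same object from the two pipelines.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
depth_gap = relative_depth_gap(10.0, 8.0)
```

A low IoU or a large depth gap between the two prediction sets would be the kind of inconsistency that the fuzzy inference system maps to a risk level.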

ARIC: An Activity Recognition Dataset in Classroom Surveillance Images

Oct 16, 2024

Linfeng Xu, Fanman Meng, Qingbo Wu, Lili Pan, Heqian Qiu, Lanxiao Wang, Kailong Chen, Kanglei Geng, Yilei Qian, Haojie Wang (+9 more)

Abstract:The application of activity recognition in the "AI + Education" field is gaining increasing attention. However, current work mainly focuses on the recognition of activities in manually captured videos and a limited number of activity types, with little attention given to recognizing activities in surveillance images from real classrooms. Activity recognition in classroom surveillance images faces multiple challenges, such as class imbalance and high activity similarity. To address this gap, we constructed a novel multimodal dataset focused on classroom surveillance image activity recognition called ARIC (Activity Recognition In Classroom). The ARIC dataset has advantages of multiple perspectives, 32 activity categories, three modalities, and real-world classroom scenarios. In addition to the general activity recognition tasks, we also provide settings for continual learning and few-shot continual learning. We hope that the ARIC dataset can act as a facilitator for future analysis and research for open teaching scenarios. You can download preliminary data from https://ivipclab.github.io/publication_ARIC/ARIC.

* arXiv admin note: text overlap with arXiv:2409.03354

Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency

Mar 26, 2024

Yingjie Xu, Bangzhen Liu, Hao Tang, Bailin Deng, Shengfeng He

Abstract:We propose a voxel-based optimization framework, ReVoRF, for few-shot radiance fields that strategically addresses the unreliability in pseudo novel view synthesis. Our method pivots on the insight that relative depth relationships within neighboring regions are more reliable than the absolute color values in disoccluded areas. Consequently, we devise a bilateral geometric consistency loss that carefully navigates the trade-off between color fidelity and geometric accuracy in the context of depth consistency for uncertain regions. Moreover, we present a reliability-guided learning strategy to discern and utilize the variable quality across synthesized views, complemented by a reliability-aware voxel smoothing algorithm that smooths the transition between reliable and unreliable data patches. Our approach allows for a more nuanced use of all available data, promoting enhanced learning from regions previously considered unsuitable for high-quality reconstruction. Extensive experiments across diverse datasets reveal that our approach attains significant gains in efficiency and accuracy, delivering rendering speeds of 3 FPS, 7 mins to train a 360° scene, and a 5% improvement in PSNR over existing few-shot methods. Code is available at https://github.com/HKCLynn/ReVoRF.

* CVPR 2024 final version
