Ning Zhou

DepthSeg: Depth prompting in remote sensing semantic segmentation

Jun 17, 2025

Ning Zhou, Shanxiong Chen, Mingting Zhou, Haigang Sui, Lieyun Hu, Han Li, Li Hua, Qiming Zhou

Abstract:Remote sensing semantic segmentation is crucial for extracting detailed land surface information, enabling applications such as environmental monitoring, land use planning, and resource assessment. In recent years, advancements in artificial intelligence have spurred the development of automatic remote sensing semantic segmentation methods. However, the existing semantic segmentation methods focus on distinguishing spectral characteristics of different objects while ignoring the differences in the elevation of the different targets. This results in land cover misclassification in complex scenarios involving shadow occlusion and spectral confusion. In this paper, we introduce a depth prompting two-dimensional (2D) remote sensing semantic segmentation framework (DepthSeg). It automatically models depth/height information from 2D remote sensing images and integrates it into the semantic segmentation framework to mitigate the effects of spectral confusion and shadow occlusion. During the feature extraction phase of DepthSeg, we introduce a lightweight adapter to enable cost-effective fine-tuning of the large-parameter vision transformer encoder pre-trained by natural images. In the depth prompting phase, we propose a depth prompter to model depth/height features explicitly. In the semantic prediction phase, we introduce a semantic classification decoder that couples the depth prompts with high-dimensional land-cover features, enabling accurate extraction of land-cover types. Experiments on the LiuZhou dataset validate the advantages of the DepthSeg framework in land cover mapping tasks. Detailed ablation studies further highlight the significance of the depth prompts in remote sensing semantic segmentation.

Pose as a Modality: A Psychology-Inspired Network for Personality Recognition with a New Multimodal Dataset

Mar 17, 2025

Bin Tang, Keqi Pan, Miao Zheng, Ning Zhou, Jialu Sui, Dandan Zhu, Cheng-Long Deng, Shu-Guang Kuai

Abstract:In recent years, predicting Big Five personality traits from multimodal data has received significant attention in artificial intelligence (AI). However, existing computational models often fail to achieve satisfactory performance. Psychological research has shown a strong correlation between pose and personality traits, yet previous research has largely ignored pose data in computational models. To address this gap, we develop a novel multimodal dataset that incorporates full-body pose data. The dataset includes video recordings of 287 participants completing a virtual interview with 36 questions, along with self-reported Big Five personality scores as labels. To effectively utilize this multimodal data, we introduce the Psychology-Inspired Network (PINet), which consists of three key modules: Multimodal Feature Awareness (MFA), Multimodal Feature Interaction (MFI), and Psychology-Informed Modality Correlation Loss (PIMC Loss). The MFA module leverages the Vision Mamba Block to capture comprehensive visual features related to personality, while the MFI module efficiently fuses the multimodal features. The PIMC Loss, grounded in psychological theory, guides the model to emphasize different modalities for different personality dimensions. Experimental results show that the PINet outperforms several state-of-the-art baseline models. Furthermore, the three modules of PINet contribute almost equally to the model's overall performance. Incorporating pose data significantly enhances the model's performance, with the pose modality ranking mid-level in importance among the five modalities. These findings address the existing gap in personality-related datasets that lack full-body pose data and provide a new approach for improving the accuracy of personality prediction models, highlighting the importance of integrating psychological insights into AI frameworks.

* 9 pages, 6 figures, AAAI 2025 Oral

Encoding Argumentation Frameworks to Propositional Logic Systems

Mar 10, 2025

Shuai Tang, Jiachao Wu, Ning Zhou

Abstract:The theory of argumentation frameworks ($AF$s) has been a useful tool for artificial intelligence. The research of the connection between $AF$s and logic is an important branch. This paper generalizes the encoding method by encoding $AF$s as logical formulas in different propositional logic systems. It studies the relationship between models of an AF by argumentation semantics, including Dung's classical semantics and Gabbay's equational semantics, and models of the encoded formulas by semantics of propositional logic systems. Firstly, we supplement the proof of the regular encoding function in the case of encoding $AF$s to the 2-valued propositional logic system. Then we encode $AF$s to 3-valued propositional logic systems and fuzzy propositional logic systems and explore the model relationship. This paper enhances the connection between $AF$s and propositional logic systems. It also provides a new way to construct new equational semantics by choosing different fuzzy logic operations.

* 31 pages
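For orientation, the sketch below illustrates the general idea of encoding an argumentation framework into 2-valued propositional logic. It uses one well-known encoding for stable semantics, in which each argument is true exactly when all of its attackers are false; this is an assumption chosen for illustration and is not necessarily the encoding function studied in the paper. The function name `stable_models` and the brute-force truth-table search are likewise hypothetical.

```python
from itertools import product
from typing import Iterable, List, Set, Tuple


def stable_models(arguments: List[str],
                  attacks: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Enumerate the models of the formula AND_a (a <-> AND_{b attacks a} not b).

    Each model is returned as the set of arguments assigned True.
    Under the assumed encoding these sets are the stable extensions.
    """
    attacks = list(attacks)
    # Map every argument to the list of arguments that attack it.
    attackers = {a: [b for (b, target) in attacks if target == a]
                 for a in arguments}
    models = []
    # Brute force over all 2-valued assignments (fine for tiny frameworks).
    for values in product([False, True], repeat=len(arguments)):
        v = dict(zip(arguments, values))
        if all(v[a] == all(not v[b] for b in attackers[a]) for a in arguments):
            models.append({a for a in arguments if v[a]})
    return models


if __name__ == "__main__":
    # a attacks b, b attacks c
    print(stable_models(["a", "b", "c"], [("a", "b"), ("b", "c")]))
```

The truth-table search is exponential in the number of arguments; a SAT solver would be the usual choice for larger frameworks.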

Distributed Federated Learning-Based Deep Learning Model for Privacy MRI Brain Tumor Detection

Apr 15, 2024

Lisang Zhou, Meng Wang, Ning Zhou

Abstract:Distributed training can facilitate the processing of large medical image datasets, and improve the accuracy and efficiency of disease diagnosis while protecting patient privacy, which is crucial for achieving efficient medical image analysis and accelerating medical research progress. This paper presents an innovative approach to medical image classification, leveraging Federated Learning (FL) to address the dual challenges of data privacy and efficient disease diagnosis. Traditional Centralized Machine Learning models, despite their widespread use in medical imaging for tasks such as disease diagnosis, raise significant privacy concerns due to the sensitive nature of patient data. As an alternative, FL emerges as a promising solution by allowing the training of a collective global model across local clients without centralizing the data, thus preserving privacy. Focusing on the application of FL in Magnetic Resonance Imaging (MRI) brain tumor detection, this study demonstrates the effectiveness of the Federated Learning framework coupled with EfficientNet-B0 and the FedAvg algorithm in enhancing both privacy and diagnostic accuracy. Through a meticulous selection of preprocessing methods, algorithms, and hyperparameters, and a comparative analysis of various Convolutional Neural Network (CNN) architectures, the research uncovers optimal strategies for image classification. The experimental results reveal that EfficientNet-B0 outperforms other models like ResNet in handling data heterogeneity and achieving higher accuracy and lower loss, highlighting the potential of FL in overcoming the limitations of traditional models. The study underscores the significance of addressing data heterogeneity and proposes further research directions for broadening the applicability of FL in medical image analysis.

* Journal of Information, Technology and Policy (2023): 1-12
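As a rough illustration of the FedAvg aggregation step named in the abstract, the sketch below averages client parameters weighted by local dataset size. The flat dictionary-of-lists weight representation and the function name `fedavg` are assumptions made for this example; the paper's actual training setup with EfficientNet-B0 is not reproduced here.

```python
from typing import Dict, List, Sequence

Weights = Dict[str, List[float]]  # hypothetical flat parameter layout


def fedavg(client_weights: Sequence[Weights],
           client_sizes: Sequence[int]) -> Weights:
    """Return the average of client parameters, weighted by sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        # Each coordinate is a sum of client values scaled by n_k / n.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


if __name__ == "__main__":
    clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
    print(fedavg(clients, [1, 3]))
```

Only the parameters leave each client in this scheme; the raw images stay local, which is the privacy argument the abstract makes.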

Fine-grained Action Analysis: A Multi-modality and Multi-task Dataset of Figure Skating

Jul 06, 2023

Sheng-Lan Liu, Yu-Ning Ding, Si-Fan Zhang, Wen-Yue Chen, Ning Zhou, Hao Liu, Gui-Hong Lao

Abstract: Fine-grained action analysis on existing action datasets is challenged by insufficient action categories, coarse granularity, and limited modalities and tasks. In this paper, we propose a Multi-modality and Multi-task dataset of Figure Skating (MMFS), which was collected from the World Figure Skating Championships. MMFS supports both action recognition and action quality assessment; it captures RGB and skeleton modalities and collects action scores from 11671 clips with 256 categories, including spatial and temporal labels. The key contributions of our dataset fall into three aspects. (1) Independent spatial and temporal categories are proposed for the first time to further explore fine-grained action recognition and quality assessment. (2) MMFS is the first to introduce the skeleton modality for complex fine-grained action quality assessment. (3) Our multi-modality and multi-task dataset encourages more action analysis models. To benchmark our dataset, we adopt RGB-based and skeleton-based baseline methods for action recognition and action quality assessment.

Adaptive Fractional Dilated Convolution Network for Image Aesthetics Assessment

Apr 06, 2020

Qiuyu Chen, Wei Zhang, Ning Zhou, Peng Lei, Yi Xu, Yu Zheng, Jianping Fan

Abstract:To leverage deep learning for image aesthetics assessment, one critical but unsolved issue is how to seamlessly incorporate the information of image aspect ratios to learn more robust models. In this paper, an adaptive fractional dilated convolution (AFDC), which is aspect-ratio-embedded, composition-preserving and parameter-free, is developed to tackle this issue natively in convolutional kernel level. Specifically, the fractional dilated kernel is adaptively constructed according to the image aspect ratios, where the interpolation of nearest two integers dilated kernels is used to cope with the misalignment of fractional sampling. Moreover, we provide a concise formulation for mini-batch training and utilize a grouping strategy to reduce computational overhead. As a result, it can be easily implemented by common deep learning libraries and plugged into popular CNN architectures in a computation-efficient manner. Our experimental results demonstrate that our proposed method achieves state-of-the-art performance on image aesthetics assessment over the AVA dataset.

* Accepted by CVPR 2020
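The interpolation idea in the abstract can be sketched in one dimension: a fractional dilation rate is handled by blending the outputs of the two nearest integer-dilated kernels. The linear blending weights, the zero padding, and the function names below are assumptions made for illustration; the paper's 2D formulation, its aspect-ratio-dependent rates, and its mini-batch grouping strategy are not reproduced.

```python
import math
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Centered, zero-padded 1D cross-correlation with an integer dilation."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * dilation  # tap position in the input
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def fractional_dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                              rate: float) -> List[float]:
    """Blend the two nearest integer dilations (assumed linear weights)."""
    if rate < 1:
        raise ValueError("rate must be at least 1 in this sketch")
    lower, upper = math.floor(rate), math.ceil(rate)
    if lower == upper:
        return dilated_conv1d(signal, kernel, lower)
    w_upper = rate - lower
    w_lower = 1.0 - w_upper
    low_out = dilated_conv1d(signal, kernel, lower)
    high_out = dilated_conv1d(signal, kernel, upper)
    return [w_lower * a + w_upper * b for a, b in zip(low_out, high_out)]


if __name__ == "__main__":
    ramp = [0, 1, 2, 3, 4, 5, 6]
    print(fractional_dilated_conv1d(ramp, [1, 0, -1], 1.5))
```

Because the blend reuses the same kernel weights at both dilations, it adds no learnable parameters, which matches the abstract's description of the operation as parameter-free.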

Five lessons from building a deep neural network recommender

Oct 07, 2018

Simen Eide, Audun M. Øygard, Ning Zhou

Abstract: Recommendation algorithms are widely adopted in marketplaces to help users find the items they are looking for. The sparsity of the item-by-user matrix and the cold-start issue in marketplaces pose challenges for off-the-shelf matrix-factorization-based recommender systems. To understand user intent and tailor recommendations to their needs, we use deep learning to explore various heterogeneous data available in marketplaces. This paper summarizes five lessons we learned from experimenting with state-of-the-art deep learning recommenders at the leading Norwegian marketplace FINN.no. We design a hybrid recommender system that takes the user-generated content of a marketplace (including text, images and meta attributes) and combines it with user behavior data such as page views and messages to provide recommendations for marketplace items. Among the various tactics we experimented with, the following five show the best impact: staged training instead of end-to-end training, leveraging rich user behaviors beyond page views, using user behaviors as noisy labels to train embeddings, using transfer learning to solve the unbalanced data problem, and using attention mechanisms in the hybrid model. This system is currently running with around 20% click-through-rate in production at FINN.no and serves over one million visitors every day.

* Fixed typos. Removed "staged training strategy" result, as it will vary a lot depending on how the stages are designed

Deep neural network marketplace recommenders in online experiments

Sep 06, 2018

Simen Eide, Ning Zhou

Abstract: Recommendations are broadly used in marketplaces to match users with items relevant to their interests and needs. To understand user intent and tailor recommendations to their needs, we use deep learning to explore various heterogeneous data available in marketplaces. This paper focuses on the challenge of measuring recommender performance and summarizes the online experiment results with several promising types of deep neural network recommenders: hybrid item representation models combining features from user engagement and content, sequence-based models, and multi-armed bandit models that optimize user engagement by re-ranking proposals from multiple submodels. The recommenders are currently running in production at the leading Norwegian marketplace FINN.no and serve over one million visitors every day.

Weather Forecasting Error in Solar Energy Forecasting

Sep 24, 2017

Hossein Sangrody, Morteza Sarailoo, Ning Zhou, Nhu Tran, Mahdi Motalleb, Elham Foruzan

Abstract:As renewable distributed energy resources (DERs) penetrate the power grid at an accelerating speed, it is essential for operators to have accurate solar photovoltaic (PV) energy forecasting for efficient operations and planning. Generally, observed weather data are applied in the solar PV generation forecasting model while in practice the energy forecasting is based on forecasted weather data. In this paper, a study on the uncertainty in weather forecasting for the most commonly used weather variables is presented. The forecasted weather data for six days ahead is compared with the observed data and the results of analysis are quantified by statistical metrics. In addition, the most influential weather predictors in energy forecasting model are selected. The performance of historical and observed weather data errors is assessed using a solar PV generation forecasting model. Finally, a sensitivity test is performed to identify the influential weather variables whose accurate values can significantly improve the results of energy forecasting.

On the Performance of Forecasting Models in the Presence of Input Uncertainty

Jul 15, 2017

Hossein Sangrody, Morteza Sarailoo, Ning Zhou, Ahmad Shokrollahi, Elham Foruzan

Abstract: With the unprecedented penetration of renewable distributed energy resources (DERs), an efficient energy forecasting model is more necessary than ever. Generally, forecasting models are trained using observed weather data, while the trained models are applied for energy forecasting using forecasted weather data. In this study, the performance of several commonly used forecasting methods in the presence of weather predictors with uncertainty is assessed and compared. Accordingly, both observed and forecasted weather data are collected, and the influential predictors for the solar PV generation forecasting model are then selected using several measures. Using observed and forecasted weather data, an analysis of the uncertainty of weather variables is presented using MAE and bootstrapping. The energy forecasting model is trained using observed weather data, and finally, the performance of several commonly used forecasting methods in solar energy forecasting is simulated and compared for a real case study.
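To make the "MAE and bootstrapping" step concrete, here is a minimal sketch that computes the mean absolute error between observed and forecasted values and a percentile bootstrap interval around it. The percentile method, the resample count, the fixed seed, and the function names are all assumptions for this example; the abstract does not specify which bootstrap variant the authors used.

```python
import random
from typing import Sequence, Tuple


def mae(observed: Sequence[float], forecasted: Sequence[float]) -> float:
    """Mean absolute error between paired observed and forecasted values."""
    if len(observed) != len(forecasted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - f) for o, f in zip(observed, forecasted)) / len(observed)


def bootstrap_mae_interval(observed: Sequence[float],
                           forecasted: Sequence[float],
                           n_resamples: int = 1000,
                           alpha: float = 0.05,
                           seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for the MAE (assumed variant)."""
    if len(observed) != len(forecasted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    rng = random.Random(seed)
    errors = [abs(o - f) for o, f in zip(observed, forecasted)]
    n = len(errors)
    # Resample the absolute errors with replacement and recompute the mean.
    stats = sorted(sum(rng.choices(errors, k=n)) / n for _ in range(n_resamples))
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


if __name__ == "__main__":
    obs = [21.0, 19.5, 23.0, 25.5, 18.0]   # made-up observed temperatures
    fc = [20.0, 21.0, 22.5, 27.0, 18.5]    # made-up forecasted temperatures
    print(mae(obs, fc), bootstrap_mae_interval(obs, fc))
```

The sample values in the usage block are invented purely to exercise the functions and do not come from the paper.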
