Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Chenwei Tang

Zero-Shot Aerial Object Detection with Visual Description Regularization

Mar 01, 2024

Zhengqing Zang, Chenyu Lin, Chenwei Tang, Tao Wang, Jiancheng Lv

Figure 1 for Zero-Shot Aerial Object Detection with Visual Description Regularization

Figure 2 for Zero-Shot Aerial Object Detection with Visual Description Regularization

Figure 3 for Zero-Shot Aerial Object Detection with Visual Description Regularization

Figure 4 for Zero-Shot Aerial Object Detection with Visual Description Regularization

Abstract:Existing object detection models are mainly trained on large-scale labeled datasets. However, annotating data for novel aerial object classes is expensive since it is time-consuming and may require expert knowledge. Thus, it is desirable to study label-efficient object detection methods on aerial images. In this work, we propose a zero-shot method for aerial object detection named visual Description Regularization, or DescReg. Concretely, we identify the weak semantic-visual correlation of the aerial objects and aim to address the challenge with prior descriptions of their visual appearance. Instead of directly encoding the descriptions into class embedding space which suffers from the representation gap problem, we propose to infuse the prior inter-class visual similarity conveyed in the descriptions into the embedding learning. The infusion process is accomplished with a newly designed similarity-aware triplet loss which incorporates structured regularization on the representation space. We conduct extensive experiments with three challenging aerial object detection datasets, including DIOR, xView, and DOTA. The results demonstrate that DescReg significantly outperforms the state-of-the-art ZSD methods with complex projection designs and generative frameworks, e.g., DescReg outperforms best reported ZSD method on DIOR by 4.5 mAP on unseen classes and 8.1 in HM. We further show the generalizability of DescReg by integrating it into generative ZSD methods as well as varying the detection architecture.

* 13 pages, 3 figures

Via

Access Paper or Ask Questions

Analysis and Applications of Deep Learning with Finite Samples in Full Life-Cycle Intelligence of Nuclear Power Generation

Nov 07, 2023

Chenwei Tang, Wenqiang Zhou, Dong Wang, Caiyang Yu, Zhenan He, Jizhe Zhou, Shudong Huang, Yi Gao, Jianming Chen, Wentao Feng(+1 more)

Figure 1 for Analysis and Applications of Deep Learning with Finite Samples in Full Life-Cycle Intelligence of Nuclear Power Generation

Figure 2 for Analysis and Applications of Deep Learning with Finite Samples in Full Life-Cycle Intelligence of Nuclear Power Generation

Figure 3 for Analysis and Applications of Deep Learning with Finite Samples in Full Life-Cycle Intelligence of Nuclear Power Generation

Figure 4 for Analysis and Applications of Deep Learning with Finite Samples in Full Life-Cycle Intelligence of Nuclear Power Generation

Abstract:The advent of Industry 4.0 has precipitated the incorporation of Artificial Intelligence (AI) methods within industrial contexts, aiming to realize intelligent manufacturing, operation as well as maintenance, also known as industrial intelligence. However, intricate industrial milieus, particularly those relating to energy exploration and production, frequently encompass data characterized by long-tailed class distribution, sample imbalance, and domain shift. These attributes pose noteworthy challenges to data-centric Deep Learning (DL) techniques, crucial for the realization of industrial intelligence. The present study centers on the intricate and distinctive industrial scenarios of Nuclear Power Generation (NPG), meticulously scrutinizing the application of DL techniques under the constraints of finite data samples. Initially, the paper expounds on potential employment scenarios for AI across the full life-cycle of NPG. Subsequently, we delve into an evaluative exposition of DL's advancement, grounded in the finite sample perspective. This encompasses aspects such as small-sample learning, few-shot learning, zero-shot learning, and open-set recognition, also referring to the unique data characteristics of NPG. The paper then proceeds to present two specific case studies. The first revolves around the automatic recognition of zirconium alloy metallography, while the second pertains to open-set recognition for signal diagnosis of machinery sensors. These cases, spanning the entirety of NPG's life-cycle, are accompanied by constructive outcomes and insightful deliberations. By exploring and applying DL methodologies within the constraints of finite sample availability, this paper not only furnishes a robust technical foundation but also introduces a fresh perspective toward the secure and efficient advancement and exploitation of this advanced energy source.

Via

Access Paper or Ask Questions

VCL Challenges 2023 at ICCV 2023 Technical Report: Bi-level Adaptation Method for Test-time Adaptive Object Detection

Oct 13, 2023

Chenyu Lin, Yusheng He, Zhengqing Zang, Chenwei Tang, Tao Wang, Jiancheng Lv

Abstract:This report outlines our team's participation in VCL Challenges B Continual Test_time Adaptation, focusing on the technical details of our approach. Our primary focus is Testtime Adaptation using bi_level adaptations, encompassing image_level and detector_level adaptations. At the image level, we employ adjustable parameterbased image filters, while at the detector level, we leverage adjustable parameterbased mean teacher modules. Ultimately, through the utilization of these bi_level adaptations, we have achieved a remarkable 38.3% mAP on the target domain of the test set within VCL Challenges B. It is worth noting that the minimal drop in mAP, is mearly 4.2%, and the overall performance is 32.5% mAP.

Via

Access Paper or Ask Questions

Attribute Localization and Revision Network for Zero-Shot Learning

Oct 11, 2023

Junzhe Xu, Suling Duan, Chenwei Tang, Zhenan He, Jiancheng Lv

Abstract:Zero-shot learning enables the model to recognize unseen categories with the aid of auxiliary semantic information such as attributes. Current works proposed to detect attributes from local image regions and align extracted features with class-level semantics. In this paper, we find that the choice between local and global features is not a zero-sum game, global features can also contribute to the understanding of attributes. In addition, aligning attribute features with class-level semantics ignores potential intra-class attribute variation. To mitigate these disadvantages, we present Attribute Localization and Revision Network in this paper. First, we design Attribute Localization Module (ALM) to capture both local and global features from image regions, a novel module called Scale Control Unit is incorporated to fuse global and local representations. Second, we propose Attribute Revision Module (ARM), which generates image-level semantics by revising the ground-truth value of each attribute, compensating for performance degradation caused by ignoring intra-class variation. Finally, the output of ALM will be aligned with revised semantics produced by ARM to achieve the training process. Comprehensive experimental results on three widely used benchmarks demonstrate the effectiveness of our model in the zero-shot prediction task.

Via

Access Paper or Ask Questions

GPT-NAS: Neural Architecture Search with the Generative Pre-Trained Model

May 09, 2023

Caiyang Yu, Xianggen Liu, Chenwei Tang, Wentao Feng, Jiancheng Lv

Abstract:Neural Architecture Search (NAS) has emerged as one of the effective methods to design the optimal neural network architecture automatically. Although neural architectures have achieved human-level performances in several tasks, few of them are obtained from the NAS method. The main reason is the huge search space of neural architectures, making NAS algorithms inefficient. This work presents a novel architecture search algorithm, called GPT-NAS, that optimizes neural architectures by Generative Pre-Trained (GPT) model. In GPT-NAS, we assume that a generative model pre-trained on a large-scale corpus could learn the fundamental law of building neural architectures. Therefore, GPT-NAS leverages the generative pre-trained (GPT) model to propose reasonable architecture components given the basic one. Such an approach can largely reduce the search space by introducing prior knowledge in the search process. Extensive experimental results show that our GPT-NAS method significantly outperforms seven manually designed neural architectures and thirteen architectures provided by competing NAS methods. In addition, our ablation study indicates that the proposed algorithm improves the performance of finely tuned neural architectures by up to about 12% compared to those without GPT, further demonstrating its effectiveness in searching neural architectures.

Via

Access Paper or Ask Questions

Partial Differential Equations Meet Deep Neural Networks: A Survey

Nov 18, 2022

Shudong Huang, Wentao Feng, Chenwei Tang, Jiancheng Lv

Figure 1 for Partial Differential Equations Meet Deep Neural Networks: A Survey

Figure 2 for Partial Differential Equations Meet Deep Neural Networks: A Survey

Figure 3 for Partial Differential Equations Meet Deep Neural Networks: A Survey

Figure 4 for Partial Differential Equations Meet Deep Neural Networks: A Survey

Abstract:Many problems in science and engineering can be represented by a set of partial differential equations (PDEs) through mathematical modeling. Mechanism-based computation following PDEs has long been an essential paradigm for studying topics such as computational fluid dynamics, multiphysics simulation, molecular dynamics, or even dynamical systems. It is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. At the same time, solving PDEs efficiently has been a long-standing challenge. Generally, except for a few differential equations for which analytical solutions are directly available, many more equations must rely on numerical approaches such as the finite difference method, finite element method, finite volume method, and boundary element method to be solved approximately. These numerical methods usually divide a continuous problem domain into discrete points and then concentrate on solving the system at each of those points. Though the effectiveness of these traditional numerical methods, the vast number of iterative operations accompanying each step forward significantly reduces the efficiency. Recently, another equally important paradigm, data-based computation represented by deep learning, has emerged as an effective means of solving PDEs. Surprisingly, a comprehensive review for this interesting subfield is still lacking. This survey aims to categorize and review the current progress on Deep Neural Networks (DNNs) for PDEs. We discuss the literature published in this subfield over the past decades and present them in a common taxonomy, followed by an overview and classification of applications of these related methods in scientific research and engineering scenarios. The origin, developing history, character, sort, as well as the future trends in each potential direction of this subfield are also introduced.

Via

Access Paper or Ask Questions

Cluster-based Contrastive Disentangling for Generalized Zero-Shot Learning

Mar 05, 2022

Yi Gao, Chenwei Tang, Jiancheng Lv

Figure 1 for Cluster-based Contrastive Disentangling for Generalized Zero-Shot Learning

Figure 2 for Cluster-based Contrastive Disentangling for Generalized Zero-Shot Learning

Figure 3 for Cluster-based Contrastive Disentangling for Generalized Zero-Shot Learning

Figure 4 for Cluster-based Contrastive Disentangling for Generalized Zero-Shot Learning

Abstract:Generalized Zero-Shot Learning (GZSL) aims to recognize both seen and unseen classes by training only the seen classes, in which the instances of unseen classes tend to be biased towards the seen class. In this paper, we propose a Cluster-based Contrastive Disentangling (CCD) method to improve GZSL by alleviating the semantic gap and domain shift problems. Specifically, we first cluster the batch data to form several sets containing similar classes. Then, we disentangle the visual features into semantic-unspecific and semantic-matched variables, and further disentangle the semantic-matched variables into class-shared and class-unique variables according to the clustering results. The disentangled learning module with random swapping and semantic-visual alignment bridges the semantic gap. Moreover, we introduce contrastive learning on semantic-matched and class-unique variables to learn high intra-set and intra-class similarity, as well as inter-set and inter-class discriminability. Then, the generated visual features conform to the underlying characteristics of general images and have strong discriminative information, which alleviates the domain shift problem well. We evaluate our proposed method on four datasets and achieve state-of-the-art results in both conventional and generalized settings.

Via

Access Paper or Ask Questions

Learning Inverse Mapping by Autoencoder based Generative Adversarial Nets

Sep 12, 2017

Junyu Luo, Yong Xu, Chenwei Tang, Jiancheng Lv

Figure 1 for Learning Inverse Mapping by Autoencoder based Generative Adversarial Nets

Figure 2 for Learning Inverse Mapping by Autoencoder based Generative Adversarial Nets

Figure 3 for Learning Inverse Mapping by Autoencoder based Generative Adversarial Nets

Figure 4 for Learning Inverse Mapping by Autoencoder based Generative Adversarial Nets

Abstract:The inverse mapping of GANs'(Generative Adversarial Nets) generator has a great potential value.Hence, some works have been developed to construct the inverse function of generator by directly learning or adversarial learning.While the results are encouraging, the problem is highly challenging and the existing ways of training inverse models of GANs have many disadvantages, such as hard to train or poor performance.Due to these reasons, we propose a new approach based on using inverse generator ($IG$) model as encoder and pre-trained generator ($G$) as decoder of an AutoEncoder network to train the $IG$ model. In the proposed model, the difference between the input and output, which are both the generated image of pre-trained GAN's generator, of AutoEncoder is directly minimized. The optimizing method can overcome the difficulty in training and inverse model of an non one-to-one function.We also applied the inverse model of GANs' generators to image searching and translation.The experimental results prove that the proposed approach works better than the traditional approaches in image searching.

* 10 pages, 5 figures

Via

Access Paper or Ask Questions