Zhiwei Chen

PRIME: Physics-Related Intelligent Mixture of Experts for Transistor Characteristics Prediction

May 10, 2025

Zhenxing Dou, Yijiao Wang, Tao Zou, Zhiwei Chen, Fei Liu, Peng Wang, Weisheng Zhao

Abstract:In recent years, machine learning has been extensively applied to data prediction during process ramp-up, with a particular focus on transistor characteristics for circuit design and manufacture. However, capturing the nonlinear current response across multiple operating regions remains a challenge for neural networks. To address this challenge, a novel machine learning framework, PRIME (Physics-Related Intelligent Mixture of Experts), is proposed to capture and integrate complex regional characteristics. In essence, our framework combines physics-based knowledge with data-driven intelligence. By leveraging a dynamic weighting mechanism in its gating network, PRIME adaptively activates the suitable expert model based on distinct input data features. Extensive evaluations are conducted on various gate-all-around (GAA) structures to examine the effectiveness of PRIME, and considerable improvements (60%-84%) in prediction accuracy are shown over state-of-the-art models.

* 8 pages, 6 figures
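
The gating idea in the PRIME abstract can be illustrated with a generic soft mixture of experts. The sketch below is not the authors' model: the two expert formulas, the gate coefficients, and the names `low_bias_expert`, `high_bias_expert`, `gate`, and `moe_predict` are all hypothetical, chosen only to show how a gate turns input features into weights that blend region-specific predictors.

```python
import math

def softmax(scores):
    """Turn raw gating scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy experts, one per operating region (not the paper's models).
def low_bias_expert(vgs, vds):
    return 1e-3 * math.exp(5.0 * vgs) * math.tanh(vds)

def high_bias_expert(vgs, vds):
    return max(vgs - 0.3, 0.0) ** 2 * math.tanh(vds)

EXPERTS = [low_bias_expert, high_bias_expert]

def gate(vgs, vds):
    # Hypothetical hand-set linear scores; a trained gating network would learn
    # these from data. vds is accepted but unused in this toy gate.
    return softmax([4.0 - 10.0 * vgs, 10.0 * vgs - 4.0])

def moe_predict(vgs, vds):
    """Blend the expert outputs using the input-dependent gate weights."""
    weights = gate(vgs, vds)
    outputs = [expert(vgs, vds) for expert in EXPERTS]
    return sum(w * o for w, o in zip(weights, outputs))
```

At low gate voltage the first expert receives nearly all of the weight, and at high gate voltage the second one does, so the blended prediction moves smoothly between the two regional models.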

Purifying, Labeling, and Utilizing: A High-Quality Pipeline for Small Object Detection

Apr 29, 2025

Siwei Wang, Zhiwei Chen, Liujuan Cao, Rongrong Ji

Abstract:Small object detection is a broadly investigated research task and is commonly conceptualized as a "pipeline-style" engineering process. In the upstream, images serve as raw materials for processing in the detection pipeline, where pre-trained models are employed to generate initial feature maps. In the midstream, an assigner selects training positive and negative samples. Subsequently, these samples and features are fed into the downstream for classification and regression. Previous small object detection methods often focused on improving isolated stages of the pipeline, thereby neglecting holistic optimization and consequently constraining overall performance gains. To address this issue, we have optimized three key aspects, namely Purifying, Labeling, and Utilizing, in this pipeline, proposing a high-quality Small object detection framework termed PLUSNet. Specifically, PLUSNet comprises three sequential components: the Hierarchical Feature Purifier (HFP) for purifying upstream features, the Multiple Criteria Label Assignment (MCLA) for improving the quality of midstream training samples, and the Frequency Decoupled Head (FDHead) for more effectively exploiting information to accomplish downstream tasks. The proposed PLUS modules are readily integrable into various object detectors, thus enhancing their detection capabilities in multi-scale scenarios. Extensive experiments demonstrate that the proposed PLUSNet achieves significant and consistent improvements across multiple datasets for small object detection.

FineCIR: Explicit Parsing of Fine-Grained Modification Semantics for Composed Image Retrieval

Mar 27, 2025

Zixu Li, Zhiheng Fu, Yupeng Hu, Zhiwei Chen, Haokun Wen, Liqiang Nie

Abstract:Composed Image Retrieval (CIR) facilitates image retrieval through a multimodal query consisting of a reference image and modification text. The reference image defines the retrieval context, while the modification text specifies desired alterations. However, existing CIR datasets predominantly employ coarse-grained modification text (CoarseMT), which inadequately captures fine-grained retrieval intents. This limitation introduces two key challenges: (1) ignoring detailed differences leads to imprecise positive samples, and (2) greater ambiguity arises when retrieving visually similar images. These issues degrade retrieval accuracy, necessitating manual result filtering or repeated queries. To address these limitations, we develop a robust fine-grained CIR data annotation pipeline that minimizes imprecise positive samples and enhances CIR systems' ability to discern modification intents accurately. Using this pipeline, we refine the FashionIQ and CIRR datasets to create two fine-grained CIR datasets: Fine-FashionIQ and Fine-CIRR. Furthermore, we introduce FineCIR, the first CIR framework explicitly designed to parse the modification text. FineCIR effectively captures fine-grained modification semantics and aligns them with ambiguous visual entities, enhancing retrieval precision. Extensive experiments demonstrate that FineCIR consistently outperforms state-of-the-art CIR baselines on both fine-grained and traditional CIR benchmark datasets. Our FineCIR code and fine-grained CIR datasets are available at https://github.com/SDU-L/FineCIR.git.

NAFRSSR: a Lightweight Recursive Network for Efficient Stereo Image Super-Resolution

May 14, 2024

Yihong Chen, Zhen Fan, Shuai Dong, Zhiwei Chen, Wenjie Li, Minghui Qin, Min Zeng, Xubing Lu, Guofu Zhou, Xingsen Gao(+1 more)

Abstract:Stereo image super-resolution (SR) refers to the reconstruction of a high-resolution (HR) image from a pair of low-resolution (LR) images as typically captured by a dual-camera device. To enhance the quality of SR images, most previous studies focused on increasing the number and size of feature maps and introducing complex and computationally intensive structures, resulting in models with high computational complexity. Here, we propose a simple yet efficient stereo image SR model called NAFRSSR, which is modified from the previous state-of-the-art model NAFSSR by introducing recursive connections and lightweighting the constituent modules. Our NAFRSSR model is composed of nonlinear activation free and group convolution-based blocks (NAFGCBlocks) and depth-separated stereo cross attention modules (DSSCAMs). The NAFGCBlock improves feature extraction and reduces the number of parameters by removing the simple channel attention mechanism from NAFBlock and using group convolution. The DSSCAM enhances feature fusion and reduces the number of parameters by replacing the 1x1 pointwise convolution in SCAM with a weight-shared 3x3 depthwise convolution. In addition, we propose incorporating a trainable edge detection operator into NAFRSSR to further improve the model performance. Four variants of NAFRSSR with different sizes, namely, NAFRSSR-Mobile (NAFRSSR-M), NAFRSSR-Tiny (NAFRSSR-T), NAFRSSR-Super (NAFRSSR-S) and NAFRSSR-Base (NAFRSSR-B) are designed, and they all exhibit fewer parameters, higher PSNR/SSIM, and faster speed than the previous state-of-the-art models. In particular, to the best of our knowledge, NAFRSSR-M is the lightest (0.28M parameters) and fastest (50 ms inference time) model achieving an average PSNR/SSIM as high as 24.657 dB/0.7622 on the benchmark datasets. Codes and models will be released at https://github.com/JNUChenYiHong/NAFRSSR.
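
The parameter saving from swapping a 1x1 pointwise convolution for a 3x3 depthwise convolution follows from standard weight-count formulas, sketched below. The channel width of 64 is a hypothetical value, not taken from the paper, and the additional effect of weight sharing described in the abstract is not modelled here.

```python
def pointwise_conv_params(c_in, c_out, bias=True):
    """Weights of a 1x1 convolution: one scalar per (input, output) channel pair."""
    return c_in * c_out + (c_out if bias else 0)

def depthwise_conv_params(channels, kernel_size=3, bias=True):
    """Weights of a depthwise convolution: one k x k filter per channel."""
    return channels * kernel_size * kernel_size + (channels if bias else 0)

channels = 64  # hypothetical width, for illustration only
pointwise = pointwise_conv_params(channels, channels)  # 64 * 64 + 64
depthwise = depthwise_conv_params(channels)            # 64 * 9 + 64
print(pointwise, depthwise)
```

Ignoring biases, the depthwise layer is smaller whenever the channel count exceeds the kernel area (9 for a 3x3 kernel), because its cost grows linearly with the number of channels while the pointwise cost grows quadratically.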

Biomedical image analysis competitions: The state of current participation practice

Dec 16, 2022

Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali(+344 more)

Abstract:The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

Re-Attention Transformer for Weakly Supervised Object Localization

Aug 03, 2022

Hui Su, Yue Ye, Zhiwei Chen, Mingli Song, Lechao Cheng

Abstract:Weakly supervised object localization is a challenging task which aims to localize objects with coarse annotations such as image categories. Existing deep network approaches are mainly based on class activation maps, which focus on highlighting discriminative local regions while ignoring the full object. In addition, the emerging transformer-based techniques often place too much emphasis on the background, which impedes the ability to identify complete objects. To address these issues, we present a re-attention mechanism termed token refinement transformer (TRT) that captures the object-level semantics to guide the localization well. Specifically, TRT introduces a novel module named token priority scoring module (TPSM) to suppress the effects of background noise while focusing on the target object. Then, we incorporate the class activation map as the semantically aware input to restrain the attention map to the target object. Extensive experiments on two benchmarks showcase the superiority of our proposed method against existing methods with image category annotations. Source code is available at https://github.com/su-hui-zz/ReAttentionTransformer.

* 11 pages, 5 figures
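
For readers unfamiliar with the class activation map (CAM) baseline that the abstract builds on, here is a minimal sketch of the classic formulation: a class-specific weighted sum of the final feature maps, thresholded to obtain a box. It is not the TRT method. The toy 3x3 feature maps, the weights, and the 0.5 threshold ratio are hypothetical values for illustration.

```python
def class_activation_map(feature_maps, class_weights):
    """CAM for one class: weighted sum over channels of the feature maps."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam

def bounding_box(cam, threshold_ratio=0.5):
    """Tightest (x_min, y_min, x_max, y_max) box around cells above the threshold."""
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat map
    xs, ys = [], []
    for y, row in enumerate(cam):
        for x, value in enumerate(row):
            if (value - lo) / span >= threshold_ratio:
                xs.append(x)
                ys.append(y)
    return (min(xs), min(ys), max(xs), max(ys))

# Hypothetical toy input: two 3x3 feature maps and their class weights.
feature_maps = [
    [[0, 0, 0], [0, 1, 1], [0, 1, 1]],
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],
]
class_weights = [1.0, 0.2]
cam = class_activation_map(feature_maps, class_weights)
box = bounding_box(cam)
```

Because the box is drawn around only the strongest responses, weakly activated parts of the object fall outside it, which is the partial-localization problem the abstract describes.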

Robust Transmission Design for RIS-Aided Communications with Both Transceiver Hardware Impairments and Imperfect CSI

Dec 12, 2021

Zhangjie Peng, Zhiwei Chen, Cunhua Pan, Gui Zhou, Hong Ren

Abstract:Reconfigurable intelligent surface (RIS) or intelligent reflecting surface (IRS) has recently been envisioned as one of the most promising technologies for future sixth-generation (6G) communications. In this paper, we consider the joint optimization of the transmit beamforming at the base station (BS) and the phase shifts at the RIS for an RIS-aided wireless communication system with both hardware impairments and imperfect channel state information (CSI). Specifically, we assume both the BS-user channel and the BS-RIS-user channel are imperfect due to the channel estimation error, and we consider the channel estimation error under the statistical CSI error model. Then, the transmit power of the BS is minimized, subject to the outage probability constraint and the unit-modulus constraints on the reflecting elements. By using the Bernstein-type inequality and semidefinite relaxation (SDR) to reformulate the constraints, we transform the optimization problem into a semidefinite programming (SDP) problem. Numerical results show that the proposed robust design algorithm can ensure communication quality of the user in the presence of both hardware impairments and imperfect CSI.

* Accepted by IEEE Wireless Communications Letters. Keywords: Reconfigurable intelligent surface (RIS), intelligent reflecting surface (IRS)
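
The unit-modulus constraint mentioned in the abstract means each reflecting element can only rotate the phase of the incident signal, not amplify it. The sketch below illustrates that constraint with a much simpler setting than the paper's: a single-antenna BS, perfect CSI, no hardware impairments, and textbook co-phasing instead of the robust SDR-based design. All channel values and function names are hypothetical.

```python
import cmath

def ris_coefficients(phases):
    """Reflection coefficients exp(j*theta); each has magnitude exactly 1."""
    return [cmath.exp(1j * theta) for theta in phases]

def effective_channel(h_direct, h_bs_ris, h_ris_user, coefficients):
    """Direct path plus the sum of the phase-shifted reflected paths."""
    reflected = sum(g * phi * h
                    for g, phi, h in zip(h_bs_ris, coefficients, h_ris_user))
    return h_direct + reflected

def co_phasing_phases(h_direct, h_bs_ris, h_ris_user):
    """Rotate every reflected path onto the phase of the direct path."""
    target = cmath.phase(h_direct)
    return [target - cmath.phase(g * h) for g, h in zip(h_bs_ris, h_ris_user)]

# Hypothetical channel coefficients for a 3-element RIS.
h_direct = 0.3 + 0.4j
h_bs_ris = [0.5 - 0.2j, -0.1 + 0.7j, 0.4 + 0.4j]
h_ris_user = [0.2 + 0.6j, 0.3 - 0.3j, -0.5 + 0.1j]

coefficients = ris_coefficients(co_phasing_phases(h_direct, h_bs_ris, h_ris_user))
h_eff = effective_channel(h_direct, h_bs_ris, h_ris_user, coefficients)
```

With co-phasing, the magnitude of the effective channel equals the sum of the individual path magnitudes, which is the largest gain achievable under the unit-modulus constraint in this idealized setting.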

LCTR: On Awakening the Local Continuity of Transformer for Weakly Supervised Object Localization

Dec 10, 2021

Zhiwei Chen, Changan Wang, Yabiao Wang, Guannan Jiang, Yunhang Shen, Ying Tai, Chengjie Wang, Wei Zhang, Liujuan Cao

Abstract:Weakly supervised object localization (WSOL) aims to learn an object localizer solely by using image-level labels. Convolutional neural network (CNN) based techniques often highlight the most discriminative part of objects while ignoring the entire object extent. Recently, the transformer architecture has been applied to WSOL to capture long-range feature dependencies with its self-attention mechanism and multilayer perceptron structure. Nevertheless, transformers lack the locality inductive bias inherent to CNNs and therefore may deteriorate local feature details in WSOL. In this paper, we propose a novel framework built upon the transformer, termed LCTR (Local Continuity TRansformer), which aims to enhance the local perception capability of global features among long-range feature dependencies. To this end, we propose a relational patch-attention module (RPAM), which considers cross-patch information on a global basis. We further design a cue digging module (CDM), which utilizes local features to guide the learning trend of the model for highlighting the weak local responses. Finally, comprehensive experiments are carried out on two widely used datasets, i.e., CUB-200-2011 and ILSVRC, to verify the effectiveness of our method.
