Abstract: The success of large-scale language models such as GPT can be attributed to their ability to efficiently predict the next token in a sequence. However, these models expend a constant amount of computation per token regardless of how difficult that token is to predict, and they lack any capacity for iterative refinement. In this paper, we introduce a novel Loop-Residual Neural Network, which achieves better performance by spending more computation time without increasing the model size. Our approach revisits the input multiple times, refining the prediction by iteratively looping over a subset of the model with residual connections. We demonstrate the effectiveness of this method through experiments comparing GPT-2 variants with our Loop-Residual models, showing improved performance on language modeling tasks while maintaining similar parameter counts. Importantly, these improvements are achieved without extra training data.
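To make the loop-residual idea concrete, the following is a minimal sketch of how such a block could look in PyTorch. The class name, layer choices, and loop count are illustrative assumptions rather than the paper's exact GPT-2-based architecture; the point is only that a single shared sub-network is applied repeatedly, with each pass added back through a residual connection, so extra computation is spent without adding parameters.

```python
import torch
import torch.nn as nn

class LoopResidualBlock(nn.Module):
    """Illustrative sketch: one shared sub-network f is applied n_loops times,
    refining the hidden state as h <- h + f(h) on every pass."""
    def __init__(self, d_model: int, n_loops: int):
        super().__init__()
        self.n_loops = n_loops
        self.norm = nn.LayerNorm(d_model)
        # a simple MLP stands in for the looped subset of the model
        self.f = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        for _ in range(self.n_loops):
            # more loops -> more compute, but the parameter count stays fixed
            h = h + self.f(self.norm(h))
        return h

# usage: refine a batch of hidden states with four passes of the same weights
block = LoopResidualBlock(d_model=768, n_loops=4)
out = block(torch.randn(2, 16, 768))
```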
Abstract: A recent study on the interpretability of real-valued convolutional neural networks (CNNs) [Stankovic_Mandic_2023CNN] has revealed a direct and physically meaningful link with the task of finding features in data through matched filters. However, applying this paradigm to illuminate the interpretability of complex-valued CNNs meets a formidable obstacle: the extension of matched filtering to a general class of noncircular complex-valued data, referred to here as the widely linear matched filter (WLMF), has been only implicit in the literature. To this end, and to establish the interpretability of the operation of complex-valued CNNs, we introduce a general WLMF paradigm, provide its solution, and analyze its performance. For rigor, our WLMF solution is derived without imposing any assumption on the probability density of the noise. The theoretical advantages of the WLMF over its standard strictly linear counterpart (SLMF) are established in terms of their output signal-to-noise ratios (SNRs), with the WLMF consistently exhibiting an enhanced SNR. Moreover, we derive a lower bound on the SNR gain of the WLMF, together with the condition under which this bound is attained. This allows us to revisit the convolution-activation-pooling chain in complex-valued CNNs through the lens of matched filtering, revealing the potential of WLMFs to provide physical interpretability and enhance the explainability of general complex-valued CNNs. Simulations demonstrate the agreement between the theoretical and numerical results.
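As background for the widely linear setting the abstract describes, the display below sketches the standard augmented-complex formulation of strictly linear versus widely linear filtering; the symbols (x, h, g, R, P) follow the usual widely linear estimation convention and are assumptions for illustration, not the paper's own derivation.

```latex
% Strictly linear vs. widely linear filtering of a complex vector x
% (standard augmented-complex notation; illustrative, not the paper's derivation)
\begin{align}
  y_{\mathrm{SL}} &= \mathbf{h}^{H}\mathbf{x}, \\
  y_{\mathrm{WL}} &= \mathbf{h}^{H}\mathbf{x} + \mathbf{g}^{H}\mathbf{x}^{*}
                   = \underline{\mathbf{w}}^{H}\underline{\mathbf{x}},
  \qquad
  \underline{\mathbf{x}} = \begin{bmatrix}\mathbf{x}\\ \mathbf{x}^{*}\end{bmatrix},\;
  \underline{\mathbf{w}} = \begin{bmatrix}\mathbf{h}\\ \mathbf{g}\end{bmatrix}, \\
  \underline{\mathbf{R}} &= E\{\underline{\mathbf{x}}\,\underline{\mathbf{x}}^{H}\}
   = \begin{bmatrix}\mathbf{R} & \mathbf{P}\\ \mathbf{P}^{*} & \mathbf{R}^{*}\end{bmatrix},
  \qquad
  \mathbf{R} = E\{\mathbf{x}\mathbf{x}^{H}\},\;
  \mathbf{P} = E\{\mathbf{x}\mathbf{x}^{T}\}.
\end{align}
```

In the circular case (P = 0) the augmented formulation reduces to the strictly linear one, so any SNR gain of a widely linear filter over its strictly linear counterpart stems from exploiting the noncircularity captured by the pseudo-covariance P.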
Abstract: We present Self Meta Pseudo Labels, a novel semi-supervised learning method similar to Meta Pseudo Labels but without the teacher model. We introduce a new way to use a single model both to generate pseudo labels and to classify, allowing us to store only one model in memory instead of two. Our method attains performance similar to that of Meta Pseudo Labels while drastically reducing memory usage.
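For intuition, the snippet below sketches what a single-model pseudo-labeling update could look like; the function name and the plain self-training update are assumptions for illustration, and the meta-gradient feedback step of the original Meta Pseudo Labels scheme is deliberately omitted. It only shows the memory-relevant point: one network both generates the pseudo labels and is trained on them.

```python
import torch
import torch.nn.functional as F

def self_pseudo_label_step(model, optimizer, x_labeled, y_labeled, x_unlabeled):
    """Illustrative single-model update: the same network produces pseudo labels
    for the unlabeled batch and is then trained on both batches.
    (The meta-learning feedback of the original method is not reproduced here.)"""
    # 1) generate hard pseudo labels with the current model, without tracking gradients
    with torch.no_grad():
        pseudo_labels = model(x_unlabeled).argmax(dim=-1)

    # 2) update the same model on the labeled and pseudo-labeled data
    optimizer.zero_grad()
    loss_labeled = F.cross_entropy(model(x_labeled), y_labeled)
    loss_unlabeled = F.cross_entropy(model(x_unlabeled), pseudo_labels)
    (loss_labeled + loss_unlabeled).backward()
    optimizer.step()
    return loss_labeled.item(), loss_unlabeled.item()
```

Only a single set of weights is kept in memory here, whereas teacher-student schemes such as Meta Pseudo Labels maintain two models throughout training.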