Abstract:Deep neural networks for image super-resolution (ISR) have shown significant advantages over traditional approaches like the interpolation. However, they are often criticized as 'black boxes' compared to traditional approaches with solid mathematical foundations. In this paper, we attempt to interpret the behavior of deep neural networks in ISR using theories from the field of signal processing. First, we report an intriguing phenomenon, referred to as `the sinc phenomenon.' It occurs when an impulse input is fed to a neural network. Then, building on this observation, we propose a method named Hybrid Response Analysis (HyRA) to analyze the behavior of neural networks in ISR tasks. Specifically, HyRA decomposes a neural network into a parallel connection of a linear system and a non-linear system and demonstrates that the linear system functions as a low-pass filter while the non-linear system injects high-frequency information. Finally, to quantify the injected high-frequency information, we introduce a metric for image-to-image tasks called Frequency Spectrum Distribution Similarity (FSDS). FSDS reflects the distribution similarity of different frequency components and can capture nuances that traditional metrics may overlook. Code, videos and raw experimental results for this paper can be found in: https://github.com/RisingEntropy/LPFInISR.
Abstract:Brain-inspired spiking neural networks (SNNs) have gained prominence in the field of neuromorphic computing owing to their low energy consumption during feedforward inference on neuromorphic hardware. However, it remains an open challenge how to effectively benefit from the sparse event-driven property of SNNs to minimize backpropagation learning costs. In this paper, we conduct a comprehensive examination of the existing event-driven learning algorithms, reveal their limitations, and propose novel solutions to overcome them. Specifically, we introduce two novel event-driven learning methods: the spike-timing-dependent event-driven (STD-ED) and membrane-potential-dependent event-driven (MPD-ED) algorithms. These proposed algorithms leverage precise neuronal spike timing and membrane potential, respectively, for effective learning. The two methods are extensively evaluated on static and neuromorphic datasets to confirm their superior performance. They outperform existing event-driven counterparts by up to 2.51% for STD-ED and 6.79% for MPD-ED on the CIFAR-100 dataset. In addition, we theoretically and experimentally validate the energy efficiency of our methods on neuromorphic hardware. On-chip learning experiments achieved a remarkable 30-fold reduction in energy consumption over time-step-based surrogate gradient methods. The demonstrated efficiency and efficacy of the proposed event-driven learning methods emphasize their potential to significantly advance the fields of neuromorphic computing, offering promising avenues for energy-efficiency applications.
Abstract:This paper presents a novel approach to human image colorization by fine-tuning the InstructPix2Pix model, which integrates a language model (GPT-3) with a text-to-image model (Stable Diffusion). Despite the original InstructPix2Pix model's proficiency in editing images based on textual instructions, it exhibits limitations in the focused domain of colorization. To address this, we fine-tuned the model using the IMDB-WIKI dataset, pairing black-and-white images with a diverse set of colorization prompts generated by ChatGPT. This paper contributes by (1) applying fine-tuning techniques to stable diffusion models specifically for colorization tasks, and (2) employing generative models to create varied conditioning prompts. After finetuning, our model outperforms the original InstructPix2Pix model on multiple metrics quantitatively, and we produce more realistically colored images qualitatively. The code for this project is provided on the GitHub Repository https://github.com/AllenAnZifeng/DeepLearning282.