Abstract:Traditional methods for optimizing light source emissions rely on a time-consuming trial-and-error approach. While in-situ optimization of light source gain media emission during growth is ideal, it has yet to be realized. In this work, we integrate in-situ reflection high-energy electron diffraction (RHEED) with machine learning (ML) to correlate the surface reconstruction with the photoluminescence (PL) of InAs/GaAs quantum dots (QDs), which serve as the active region of lasers. A lightweight ResNet-GLAM model is employed for the real-time processing of RHEED data as input, enabling effective identification of optical performance. This approach guides the dynamic optimization of growth parameters, allowing real-time feedback control to adjust the QDs emission for lasers. We successfully optimized InAs QDs on GaAs substrates, with a 3.2-fold increase in PL intensity and a reduction in full width at half maximum (FWHM) from 36.69 meV to 28.17 meV under initially suboptimal growth conditions. Our automated, in-situ self-optimized lasers with 5-layer InAs QDs achieved electrically pumped continuous-wave operation at 1240 nm with a low threshold current of 150 A/cm2 at room temperature, an excellent performance comparable to samples grown through traditional manual multi-parameter optimization methods. These results mark a significant step toward intelligent, low-cost, and reproductive light emitters production.
Abstract:Meta-learning has emerged as a prominent technology for few-shot text classification and has achieved promising performance. However, existing methods often encounter difficulties in drawing accurate class prototypes from support set samples, primarily due to probable large intra-class differences and small inter-class differences within the task. Recent approaches attempt to incorporate external knowledge or pre-trained language models to augment data, but this requires additional resources and thus does not suit many few-shot scenarios. In this paper, we propose a novel solution to address this issue by adequately leveraging the information within the task itself. Specifically, we utilize label information to construct a task-adaptive metric space, thereby adaptively reducing the intra-class differences and magnifying the inter-class differences. We further employ the optimal transport technique to estimate class prototypes with query set samples together, mitigating the problem of inaccurate and ambiguous support set samples caused by large intra-class differences. We conduct extensive experiments on eight benchmark datasets, and our approach shows obvious advantages over state-of-the-art models across all the tasks on all the datasets. For reproducibility, all the datasets and codes are available at https://github.com/YvoGao/LAQDA.
Abstract:This paper has been accepted in the NeurIPS 2024 D & B Track. Harmful memes have proliferated on the Chinese Internet, while research on detecting Chinese harmful memes significantly lags behind due to the absence of reliable datasets and effective detectors. To this end, we focus on the comprehensive detection of Chinese harmful memes. We construct ToxiCN MM, the first Chinese harmful meme dataset, which consists of 12,000 samples with fine-grained annotations for various meme types. Additionally, we propose a baseline detector, Multimodal Knowledge Enhancement (MKE), incorporating contextual information of meme content generated by the LLM to enhance the understanding of Chinese memes. During the evaluation phase, we conduct extensive quantitative experiments and qualitative analyses on multiple baselines, including LLMs and our MKE. The experimental results indicate that detecting Chinese harmful memes is challenging for existing models while demonstrating the effectiveness of MKE. The resources for this paper are available at https://github.com/DUT-lujunyu/ToxiCN_MM.
Abstract:Logs play a critical role in providing essential information for system monitoring and troubleshooting. Recently, with the success of pre-trained language models (PLMs) and large language models (LLMs) in natural language processing (NLP), smaller PLMs (such as BERT) and LLMs (like ChatGPT) have become the current mainstream approaches for log analysis. While LLMs possess rich knowledge, their high computational costs and unstable performance make LLMs impractical for analyzing logs directly. In contrast, smaller PLMs can be fine-tuned for specific tasks even with limited computational resources, making them more practical. However, these smaller PLMs face challenges in understanding logs comprehensively due to their limited expert knowledge. To better utilize the knowledge embedded within LLMs for log understanding, this paper introduces a novel knowledge enhancement framework, called LUK, which acquires expert knowledge from LLMs to empower log understanding on a smaller PLM. Specifically, we design a multi-expert collaboration framework based on LLMs consisting of different roles to acquire expert knowledge. In addition, we propose two novel pre-training tasks to enhance the log pre-training with expert knowledge. LUK achieves state-of-the-art results on different log analysis tasks and extensive experiments demonstrate expert knowledge from LLMs can be utilized more effectively to understand logs.
Abstract:News recommendation emerges as a primary means for users to access content of interest from the vast amount of news. The title clickbait extensively exists in news domain and increases the difficulty for news recommendation to offer satisfactory services for users. Fortunately, we find that news abstract, as a critical field of news, aligns cohesively with the news authenticity. To this end, we propose a Title Debiasing News Recommendation with Cross-field Contrastive learning (TDNR-C2) to overcome the title bias by incorporating news abstract. Specifically, a multi-field knowledge extraction module is devised to extract multi-view knowledge about news from various fields. Afterwards, we present a cross-field contrastive learning module to conduct bias removal via contrasting learned knowledge from title and abstract fileds. Experimental results on a real-world dataset demonstrate the superiority of the proposed TDNR-C2 over existing state-of-the-art methods. Further analysis also indicates the significance of news abstract for title debiasing.
Abstract:The semiconductor industry has prioritized automating repetitive tasks by closed-loop, autonomous experimentation which enables accelerated optimization of complex multi-step processes. The emergence of machine learning (ML) has ushered in automated process with minimal human intervention. In this work, we develop SemiEpi, a self-driving automation platform capable of executing molecular beam epitaxy (MBE) growth with multi-steps, continuous in-situ monitoring, and on-the-fly feedback control. By integrating standard hardware, homemade software, curve fitting, and multiple ML models, SemiEpi operates autonomously, eliminating the need for extensive expertise in MBE processes to achieve optimal outcomes. The platform actively learns from previous experimental results, identifying favorable conditions and proposing new experiments to achieve the desired results. We standardize and optimize growth for InAs/GaAs quantum dots (QDs) heterostructures to showcase the power of ML-guided multi-step growth. A temperature calibration was implemented to get the initial growth condition, and fine control of the process was executed using ML. Leveraging RHEED movies acquired during the growth, SemiEpi successfully identified and optimized a novel route for multi-step heterostructure growth. This work demonstrates the capabilities of closed-loop, ML-guided systems in addressing challenges in multi-step growth for any device. Our method is critical to achieve repeatable materials growth using commercially scalable tools. Our strategy facilitates the development of a hardware-independent process and enhancing process repeatability and stability, even without exhaustive knowledge of growth parameters.
Abstract:The semiconductor industry has prioritized automating repetitive tasks by closed-loop, autonomous experimentation which enables accelerated optimization of complex multi-step processes. The emergence of machine learning (ML) has ushered in automated process with minimal human intervention. In this work, we develop SemiEpi, a self-driving automation platform capable of executing molecular beam epitaxy (MBE) growth with multi-steps, continuous in-situ monitoring, and on-the-fly feedback control. By integrating standard hardware, homemade software, curve fitting, and multiple ML models, SemiEpi operates autonomously, eliminating the need for extensive expertise in MBE processes to achieve optimal outcomes. The platform actively learns from previous experimental results, identifying favorable conditions and proposing new experiments to achieve the desired results. We standardize and optimize growth for InAs/GaAs quantum dots (QDs) heterostructures to showcase the power of ML-guided multi-step growth. A temperature calibration was implemented to get the initial growth condition, and fine control of the process was executed using ML. Leveraging RHEED movies acquired during the growth, SemiEpi successfully identified and optimized a novel route for multi-step heterostructure growth. This work demonstrates the capabilities of closed-loop, ML-guided systems in addressing challenges in multi-step growth for any device. Our method is critical to achieve repeatable materials growth using commercially scalable tools. Our strategy facilitates the development of a hardware-independent process and enhancing process repeatability and stability, even without exhaustive knowledge of growth parameters.
Abstract:Empathetic response generation, aiming at understanding the user's situation and feelings and respond empathically, is crucial in building human-like dialogue systems. Previous methods mainly focus on using maximum likelihood estimation as the optimization objective for training response generation models, without taking into account the empathy level alignment between generated responses and target responses. To this end, we propose an empathetic response generation using reinforcement learning (EmpRL) framework. The framework designs an effective empathy reward function and generates empathetic responses by maximizing the expected reward through reinforcement learning. Given the powerful text generation capability of pre-trained language models, EmpRL utilizes the pre-trained T5 model as the generator and conducts further training to initialize the policy. To align the empathy level between generated responses and target responses in the context, an empathy reward function containing three empathy communication mechanisms, i.e., emotional reaction, interpretation, and exploration, is constructed using pre-designed and pre-trained empathy identifiers. Finally, the proximal policy optimization algorithm is used to further train the policy to produce empathetic responses. Both automatic and manual evaluations demonstrate that the proposed EmpRL framework can improve the quality of generated responses, enhance the empathy level similarity between generated and target responses, and produce empathetic responses covering both affective and cognitive aspects.
Abstract:Brain-inspired Spiking Neural Networks (SNNs) have bio-plausibility and low-power advantages over Artificial Neural Networks (ANNs). Applications of SNNs are currently limited to simple classification tasks because of their poor performance. In this work, we focus on bridging the performance gap between ANNs and SNNs on object detection. Our design revolves around network architecture and spiking neuron. First, the overly complex module design causes spike degradation when the YOLO series is converted to the corresponding spiking version. We design a SpikeYOLO architecture to solve this problem by simplifying the vanilla YOLO and incorporating meta SNN blocks. Second, object detection is more sensitive to quantization errors in the conversion of membrane potentials into binary spikes by spiking neurons. To address this challenge, we design a new spiking neuron that activates Integer values during training while maintaining spike-driven by extending virtual timesteps during inference. The proposed method is validated on both static and neuromorphic object detection datasets. On the static COCO dataset, we obtain 66.2% mAP@50 and 48.9% mAP@50:95, which is +15.0% and +18.7% higher than the prior state-of-the-art SNN, respectively. On the neuromorphic Gen1 dataset, we achieve 67.2% mAP@50, which is +2.5% greater than the ANN with equivalent architecture, and the energy efficiency is improved by 5.7*. Code: https://github.com/BICLab/SpikeYOLO
Abstract:Spiking Neural Networks (SNNs) have received widespread attention due to their unique neuronal dynamics and low-power nature. Previous research empirically shows that SNNs with Poisson coding are more robust than Artificial Neural Networks (ANNs) on small-scale datasets. However, it is still unclear in theory how the adversarial robustness of SNNs is derived, and whether SNNs can still maintain its adversarial robustness advantage on large-scale dataset tasks. This work theoretically demonstrates that SNN's inherent adversarial robustness stems from its Poisson coding. We reveal the conceptual equivalence of Poisson coding and randomized smoothing in defense strategies, and analyze in depth the trade-off between accuracy and adversarial robustness in SNNs via the proposed Randomized Smoothing Coding (RSC) method. Experiments demonstrate that the proposed RSC-SNNs show remarkable adversarial robustness, surpassing ANNs and achieving state-of-the-art robustness results on large-scale dataset ImageNet. Our open-source implementation code is available at this https URL: https://github.com/KemingWu/RSC-SNN.