Abstract:In the rapidly evolving field of Large Language Models (LLMs), selecting high-quality data for fine-tuning is essential. This paper focuses on task-specific data pruning and selection to enhance fine-tuning. We introduce an innovative framework, termed P3, which improves LLM performance through a dynamic, adaptive training strategy. Specifically, P3 comprises the following components: (1) Policy-driven Difficulty Measurement: we begin by measuring the difficulty of data based on the model's real-time performance, transitioning from static, predefined metrics to more dynamic and adaptable ones. (2) Pace-adaptive Selection: we employ self-paced learning (SPL) to gradually select increasingly challenging data, thereby progressively enhancing the model's performance. (3) Diversity Promotion: we integrate Determinantal Point Process (DPP) into the selection process to promote the diversity within and between samples, enriching the learning process. We have validated our method on two well-known LLM datasets, APPS and MATH, designed for logical reasoning scenarios. The results show that our P3 framework significantly improves training outcomes compared to traditional methods. By fundamentally refining data selection and utilization strategies, P3 not only advances theoretical understanding of dynamic training approaches but also provides a versatile framework that can revolutionize model training in natural language processing.
Abstract:In this vision paper, we propose a shift in perspective for improving the effectiveness of similarity search. Rather than focusing solely on enhancing the data quality, particularly machine learning-generated embeddings, we advocate for a more comprehensive approach that also enhances the underpinning search mechanisms. We highlight three novel avenues that call for a redefinition of the similarity search problem: exploiting implicit data structures and distributions, engaging users in an iterative feedback loop, and moving beyond a single query vector. These novel pathways have gained relevance in emerging applications such as large-scale language models, video clip retrieval, and data labeling. We discuss the corresponding research challenges posed by these new problem areas and share insights from our preliminary discoveries.
Abstract:The recently developed image-free sensing technique maintains the advantages of both the light hardware and software, which has been applied in simple target classification and motion tracking. In practical applications, however, there usually exist multiple targets in the field of view, where existing trials fail to produce multi-semantic information. In this letter, we report a novel image-free sensing technique to tackle the multi-target recognition challenge for the first time. Different from the convolutional layer stack of image-free single-pixel networks, the reported CRNN network utilities the bidirectional LSTM architecture to predict the distribution of multiple characters simultaneously. The framework enables to capture the long-range dependencies, providing a high recognition accuracy of multiple characters. We demonstrated the technique's effectiveness in license plate detection, which achieved 87.60% recognition accuracy at a 5% sampling rate with a higher than 100 FPS refresh rate.