Biao Sun

Llumnix: Dynamic Scheduling for Large Language Model Serving

Jun 05, 2024

Biao Sun, Ziming Huang, Hanyu Zhao, Wencong Xiao, Xinyi Zhang, Yong Li, Wei Lin

Abstract: Inference serving for large language models (LLMs) is the key to unleashing their potential in people's daily lives. However, efficient LLM serving remains challenging today because the requests are inherently heterogeneous and unpredictable in terms of resource and latency requirements, as a result of the diverse applications and the dynamic execution nature of LLMs. Existing systems are fundamentally limited in handling these characteristics and cause problems such as severe queuing delays, poor tail latencies, and SLO violations. We introduce Llumnix, an LLM serving system that reacts to such heterogeneous and unpredictable requests by runtime rescheduling across multiple model instances. Similar to context switching across CPU cores in modern operating systems, Llumnix reschedules requests to improve load balancing and isolation, mitigate resource fragmentation, and differentiate request priorities and SLOs. Llumnix implements the rescheduling with an efficient and scalable live migration mechanism for requests and their in-memory states, and exploits it in a dynamic scheduling policy that unifies the multiple rescheduling scenarios elegantly. Our evaluations show that Llumnix improves tail latencies by an order of magnitude, accelerates high-priority requests by up to 1.5x, and delivers up to 36% cost savings while achieving similar tail latencies, compared against state-of-the-art LLM serving systems. Llumnix is publicly available at https://github.com/AlibabaPAI/llumnix.

* To appear at OSDI '24; open-source repo will be available in June 2024
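
The abstract describes rescheduling requests across model instances to improve load balancing. As a rough illustration of that idea only, the sketch below picks one request to move from the most loaded instance to the least loaded one. It is a toy greedy heuristic, not Llumnix's actual scheduling policy or migration mechanism; the function name `plan_migration`, the instance names, and the use of a single number per request as its "load" are all assumptions made for this example.

```python
def plan_migration(instances):
    """Pick one request to move from the most to the least loaded instance.

    `instances` maps a (hypothetical) instance name to a list of per-request
    loads, e.g. memory units. Returns (source, destination, request_load),
    or None when no single move would shrink the load gap.
    """
    totals = {name: sum(reqs) for name, reqs in instances.items()}
    src = max(totals, key=totals.get)
    dst = min(totals, key=totals.get)
    gap = totals[src] - totals[dst]

    # Moving a request of load r changes the gap to |gap - 2r|, which is
    # smaller than the current gap exactly when 0 < r < gap.
    candidates = [r for r in instances[src] if 0 < r < gap]
    if not candidates:
        return None

    # Prefer the request that leaves the two instances closest to balanced.
    best = min(candidates, key=lambda r: abs(gap - 2 * r))
    return src, dst, best


if __name__ == "__main__":
    cluster = {"instance-a": [8, 4, 2], "instance-b": [1]}
    print(plan_migration(cluster))
```

A real system would also have to account for the cost of moving a request's in-memory state and for priorities and SLOs, which this sketch ignores.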

An Olfactory EEG Signal Classification Network Based on Frequency Band Feature Extraction

Feb 05, 2022

Biao Sun, Zhigang Wei, Pei Liang, Huirang Hou

Abstract: Classification of olfactory-induced electroencephalogram (EEG) signals has shown great potential in many fields. Since different frequency bands within the EEG signals contain different information, extracting specific frequency bands is important for classification performance. Moreover, due to the large inter-subject variability of the EEG signals, extracting frequency bands with subject-specific information rather than general information is crucial. Considering these, the focus of this letter is to classify the olfactory EEG signals by exploiting the spectral-domain information of specific frequency bands. In this letter, we present an olfactory EEG signal classification network based on frequency band feature extraction. A frequency band generator is first designed to extract frequency bands via the sliding window technique. Then, a frequency band attention mechanism is proposed to adaptively optimize frequency bands for a specific subject. Finally, a convolutional neural network (CNN) is constructed to extract the spatio-spectral information and predict the EEG category. Comparison experiment results reveal that the proposed method outperforms a series of baseline methods in terms of both classification quality and inter-subject robustness. Ablation experiment results demonstrate the effectiveness of each component of the proposed method.
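
The first two steps in the abstract (generating candidate frequency bands with a sliding window, then weighting them per subject) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the 4-16 Hz range, the window width and step, and the use of a plain softmax over hand-picked scores as a stand-in for the learned attention mechanism are all hypothetical.

```python
import math


def sliding_bands(f_min, f_max, width, step):
    """Return (low, high) frequency bands in Hz produced by a sliding window."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    bands = []
    low = f_min
    # Slide the window until it would extend past the upper frequency limit.
    while low + width <= f_max:
        bands.append((low, low + width))
        low += step
    return bands


def softmax(scores):
    """Turn per-band scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    bands = sliding_bands(4, 16, width=4, step=2)
    # Placeholder scores; in a trained model these would be learned per subject.
    weights = softmax([0.2, 1.0, 0.5, -0.3, 0.1])
    for band, weight in zip(bands, weights):
        print(band, round(weight, 3))
```

In the described network the weighted bands would then feed a CNN; that stage is omitted here because the abstract gives no architectural details.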

A Detection and Segmentation Architecture for Skin Lesion Segmentation on Dermoscopy Images

Sep 30, 2018

Chengyao Qian, Ting Liu, Hao Jiang, Zhe Wang, Pengfei Wang, Mingxin Guan, Biao Sun

Abstract: This report summarises our method and validation results for the ISIC Challenge 2018 - Skin Lesion Analysis Towards Melanoma Detection - Task 1: Lesion Segmentation. We present a two-stage method for lesion segmentation with an optimised training method and ensemble post-processing. Our method achieves state-of-the-art performance on lesion segmentation, and we won first place in ISIC 2018 Task 1.

* 5 pages, 9 figures, Ranked 1st place in ISIC 2018 task1, title updated and results added
