Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Liang Gao

Automatic MILP Model Construction for Multi-Robot Task Allocation and Scheduling Based on Large Language Models

Mar 18, 2025

Mingming Peng, Zhendong Chen, Jie Yang, Jin Huang, Zhengqi Shi, Qihao Liu, Xinyu Li, Liang Gao

Abstract:With the accelerated development of Industry 4.0, intelligent manufacturing systems increasingly require efficient task allocation and scheduling in multi-robot systems. However, existing methods rely on domain expertise and face challenges in adapting to dynamic production constraints. Additionally, enterprises have high privacy requirements for production scheduling data, which prevents the use of cloud-based large language models (LLMs) for solution development. To address these challenges, there is an urgent need for an automated modeling solution that meets data privacy requirements. This study proposes a knowledge-augmented mixed integer linear programming (MILP) automated formulation framework, integrating local LLMs with domain-specific knowledge bases to generate executable code from natural language descriptions automatically. The framework employs a knowledge-guided DeepSeek-R1-Distill-Qwen-32B model to extract complex spatiotemporal constraints (82% average accuracy) and leverages a supervised fine-tuned Qwen2.5-Coder-7B-Instruct model for efficient MILP code generation (90% average accuracy). Experimental results demonstrate that the framework successfully achieves automatic modeling in the aircraft skin manufacturing case while ensuring data privacy and computational efficiency. This research provides a low-barrier and highly reliable technical path for modeling in complex industrial scenarios.

Via

Access Paper or Ask Questions

Automatic programming via large language models with population self-evolution for dynamic job shop scheduling problem

Oct 30, 2024

Jin Huang, Xinyu Li, Liang Gao, Qihao Liu, Yue Teng

Abstract:Heuristic dispatching rules (HDRs) are widely regarded as effective methods for solving dynamic job shop scheduling problems (DJSSP) in real-world production environments. However, their performance is highly scenario-dependent, often requiring expert customization. To address this, genetic programming (GP) and gene expression programming (GEP) have been extensively used for automatic algorithm design. Nevertheless, these approaches often face challenges due to high randomness in the search process and limited generalization ability, hindering the application of trained dispatching rules to new scenarios or dynamic environments. Recently, the integration of large language models (LLMs) with evolutionary algorithms has opened new avenues for prompt engineering and automatic algorithm design. To enhance the capabilities of LLMs in automatic HDRs design, this paper proposes a novel population self-evolutionary (SeEvo) method, a general search framework inspired by the self-reflective design strategies of human experts. The SeEvo method accelerates the search process and enhances exploration capabilities. Experimental results show that the proposed SeEvo method outperforms GP, GEP, end-to-end deep reinforcement learning methods, and more than 10 common HDRs from the literature, particularly in unseen and dynamic scenarios.

Via

Access Paper or Ask Questions

Simple parameter-free self-attention approximation

Jul 22, 2023

Yuwen Zhai, Jing Hao, Liang Gao, Xinyu Li, Yiping Gao, Shumin Han

Abstract:The hybrid model of self-attention and convolution is one of the methods to lighten ViT. The quadratic computational complexity of self-attention with respect to token length limits the efficiency of ViT on edge devices. We propose a self-attention approximation without training parameters, called SPSA, which captures global spatial features with linear complexity. To verify the effectiveness of SPSA combined with convolution, we conduct extensive experiments on image classification and object detection tasks.

Via

Access Paper or Ask Questions

Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection

Jun 15, 2023

Yunkang Cao, Xiaohao Xu, Chen Sun, Yuqi Cheng, Liang Gao, Weiming Shen

Abstract:This technical report introduces the winning solution of the team \textit{Segment Any Anomaly} for the CVPR2023 Visual Anomaly and Novelty Detection (VAND) challenge. Going beyond uni-modal prompt, \textit{e.g.}, language prompt, we present a novel framework, \textit{i.e.}, Segment Any Anomaly + (SAA$+$), for zero-shot anomaly segmentation with multi-modal prompts for the regularization of cascaded modern foundation models. Inspired by the great zero-shot generalization ability of foundation models like Segment Anything, we first explore their assembly (SAA) to leverage diverse multi-modal prior knowledge for anomaly localization. Subsequently, we further introduce multimodal prompts (SAA$+$) derived from domain expert knowledge and target image context to enable the non-parameter adaptation of foundation models to anomaly segmentation. The proposed SAA$+$ model achieves state-of-the-art performance on several anomaly segmentation benchmarks, including VisA and MVTec-AD, in the zero-shot setting. We will release the code of our winning solution for the CVPR2023 VAND challenge at \href{Segment-Any-Anomaly}{https://github.com/caoyunkang/Segment-Any-Anomaly} \footnote{The extended-version paper with more details is available at ~\cite{cao2023segment}.}

* The first two author contribute equally. CVPR workshop challenge report. arXiv admin note: substantial text overlap with arXiv:2305.10724

Via

Access Paper or Ask Questions

Segment Any Anomaly without Training via Hybrid Prompt Regularization

May 18, 2023

Yunkang Cao, Xiaohao Xu, Chen Sun, Yuqi Cheng, Zongwei Du, Liang Gao, Weiming Shen

Figure 1 for Segment Any Anomaly without Training via Hybrid Prompt Regularization

Figure 2 for Segment Any Anomaly without Training via Hybrid Prompt Regularization

Figure 3 for Segment Any Anomaly without Training via Hybrid Prompt Regularization

Figure 4 for Segment Any Anomaly without Training via Hybrid Prompt Regularization

Abstract:We present a novel framework, i.e., Segment Any Anomaly + (SAA+), for zero-shot anomaly segmentation with hybrid prompt regularization to improve the adaptability of modern foundation models. Existing anomaly segmentation models typically rely on domain-specific fine-tuning, limiting their generalization across countless anomaly patterns. In this work, inspired by the great zero-shot generalization ability of foundation models like Segment Anything, we first explore their assembly to leverage diverse multi-modal prior knowledge for anomaly localization. For non-parameter foundation model adaptation to anomaly segmentation, we further introduce hybrid prompts derived from domain expert knowledge and target image context as regularization. Our proposed SAA+ model achieves state-of-the-art performance on several anomaly segmentation benchmarks, including VisA, MVTec-AD, MTD, and KSDD2, in the zero-shot setting. We will release the code at \href{https://github.com/caoyunkang/Segment-Any-Anomaly}{https://github.com/caoyunkang/Segment-Any-Anomaly}.

* The first two authors contribute equally. The code will be available on https://github.com/caoyunkang/Segment-Any-Anomaly

Via

Access Paper or Ask Questions

FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction

Mar 22, 2022

Liang Gao, Huazhu Fu, Li Li, Yingwen Chen, Ming Xu, Cheng-Zhong Xu

Figure 1 for FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction

Figure 2 for FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction

Figure 3 for FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction

Figure 4 for FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction

Abstract:Federated learning (FL) allows multiple clients to collectively train a high-performance global model without sharing their private data. However, the key challenge in federated learning is that the clients have significant statistical heterogeneity among their local data distributions, which would cause inconsistent optimized local models on the client-side. To address this fundamental dilemma, we propose a novel federated learning algorithm with local drift decoupling and correction (FedDC). Our FedDC only introduces lightweight modifications in the local training phase, in which each client utilizes an auxiliary local drift variable to track the gap between the local model parameter and the global model parameters. The key idea of FedDC is to utilize this learned local drift variable to bridge the gap, i.e., conducting consistency in parameter-level. The experiment results and analysis demonstrate that FedDC yields expediting convergence and better performance on various image classification tasks, robust in partial participation settings, non-iid data, and heterogeneous clients.

* 28 pages, 13 figures, to be published in CVPR2022

Via

Access Paper or Ask Questions

A new neighborhood structure for job shop scheduling problems

Sep 07, 2021

Jin Xie, Xinyu Li, Liang Gao, Lin Gui

Figure 1 for A new neighborhood structure for job shop scheduling problems

Figure 2 for A new neighborhood structure for job shop scheduling problems

Figure 3 for A new neighborhood structure for job shop scheduling problems

Figure 4 for A new neighborhood structure for job shop scheduling problems

Abstract:Job shop scheduling problem (JSP) is a widely studied NP-complete combinatorial optimization problem. Neighborhood structures play a critical role in solving JSP. At present, there are three state-of-the-art neighborhood structures, i.e., N5, N6, and N7. Improving the upper bounds of some famous benchmarks is inseparable from the role of these neighborhood structures. However, these existing neighborhood structures only consider the movement of critical operations within a critical block. According to our experiments, it is also possible to improve the makespan of a scheduling scheme by moving a critical operation outside its critical block. According to the above finding, this paper proposes a new N8 neighborhood structure considering the movement of critical operations within a critical block and the movement of critical operations outside the critical block. Besides, a neighborhood clipping method is designed to avoid invalid movement, reducing the computational time. Tabu search (TS) is a commonly used algorithm framework combined with neighborhood structures. This paper uses this framework to compare the N8 neighborhood structure with N5, N6, and N7 neighborhood structures on four famous benchmarks. The experimental results verify that the N8 neighborhood structure is more effective and efficient in solving JSP than the other state-of-the-art neighborhood structures.

Via

Access Paper or Ask Questions

SDNet: mutil-branch for single image deraining using swin

May 31, 2021

Fuxiang Tan, YuTing Kong, Yingying Fan, Feng Liu, Daxin Zhou, Hao zhang, Long Chen, Liang Gao, Yurong Qian

Figure 1 for SDNet: mutil-branch for single image deraining using swin

Figure 2 for SDNet: mutil-branch for single image deraining using swin

Figure 3 for SDNet: mutil-branch for single image deraining using swin

Figure 4 for SDNet: mutil-branch for single image deraining using swin

Abstract:Rain streaks degrade the image quality and seriously affect the performance of subsequent computer vision tasks, such as autonomous driving, social security, etc. Therefore, removing rain streaks from a given rainy images is of great significance. Convolutional neural networks(CNN) have been widely used in image deraining tasks, however, the local computational characteristics of convolutional operations limit the development of image deraining tasks. Recently, the popular transformer has global computational features that can further facilitate the development of image deraining tasks. In this paper, we introduce Swin-transformer into the field of image deraining for the first time to study the performance and potential of Swin-transformer in the field of image deraining. Specifically, we improve the basic module of Swin-transformer and design a three-branch model to implement single-image rain removal. The former implements the basic rain pattern feature extraction, while the latter fuses different features to further extract and process the image features. In addition, we employ a jump connection to fuse deep features and shallow features. In terms of experiments, the existing public dataset suffers from image duplication and relatively homogeneous background. So we propose a new dataset Rain3000 to validate our model. Therefore, we propose a new dataset Rain3000 for validating our model. Experimental results on the publicly available datasets Rain100L, Rain100H and our dataset Rain3000 show that our proposed method has performance and inference speed advantages over the current mainstream single-image rain streaks removal models.The source code will be available at https://github.com/H-tfx/SDNet.

Via

Access Paper or Ask Questions

Whale swarm algorithm with the mechanism of identifying and escaping from extreme point for multimodal function optimization

Jul 10, 2018

Bing Zeng, Xinyu Li, Liang Gao, Yuyan Zhang, Haozhen Dong

Figure 1 for Whale swarm algorithm with the mechanism of identifying and escaping from extreme point for multimodal function optimization

Figure 2 for Whale swarm algorithm with the mechanism of identifying and escaping from extreme point for multimodal function optimization

Figure 3 for Whale swarm algorithm with the mechanism of identifying and escaping from extreme point for multimodal function optimization

Figure 4 for Whale swarm algorithm with the mechanism of identifying and escaping from extreme point for multimodal function optimization

Abstract:Most real-world optimization problems often come with multiple global optima or local optima. Therefore, increasing niching metaheuristic algorithms, which devote to finding multiple optima in a single run, are developed to solve these multimodal optimization problems. However, there are two difficulties urgently to be solved for most existing niching metaheuristic algorithms: how to set the optimal values of niching parameters for different optimization problems, and how to jump out of the local optima efficiently. These two difficulties limited their practicality largely. Based on Whale Swarm Algorithm (WSA) we proposed previously, this paper presents a new multimodal optimizer named WSA with Iterative Counter (WSA-IC) to address these two difficulties. In the one hand, WSA-IC improves the iteration rule of the original WSA for multimodal optimization, which removes the need of specifying different values of attenuation coefficient for different problems to form multiple subpopulations, without introducing any niching parameter. In the other hand, WSA-IC enables the identification of extreme point during iterations relying on two new parameters (i.e., stability threshold Ts and fitness threshold Tf), to jump out of the located extreme point. Moreover, the convergence of WSA-IC is proved. Finally, the proposed WSA-IC is compared with several niching metaheuristic algorithms on CEC2015 niching benchmark test functions and five additional classical multimodal functions with high dimensions. The experimental results demonstrate that WSA-IC statistically outperforms other niching metaheuristic algorithms on most test functions.

* 28 pages, 11 figures, 9 tables, 36 references

Via

Access Paper or Ask Questions

Optical Mapping Near-eye Three-dimensional Display with Correct Focus Cues

May 24, 2017

Wei Cui, Liang Gao

Figure 1 for Optical Mapping Near-eye Three-dimensional Display with Correct Focus Cues

Figure 2 for Optical Mapping Near-eye Three-dimensional Display with Correct Focus Cues

Figure 3 for Optical Mapping Near-eye Three-dimensional Display with Correct Focus Cues

Figure 4 for Optical Mapping Near-eye Three-dimensional Display with Correct Focus Cues

Abstract:We present an optical mapping near-eye (OMNI) three-dimensional display method for wearable devices. By dividing a display screen into different sub-panels and optically mapping them to various depths, we create a multiplane volumetric image with correct focus cues for depth perception. The resultant system can drive the eye's accommodation to the distance that is consistent with binocular stereopsis, thereby alleviating the vergence-accommodation conflict, the primary cause for eye fatigue and discomfort. Compared with the previous methods, the OMNI display offers prominent advantages in adaptability, image dynamic range, and refresh rate.

* 5 pages, 6 figures, 2 tables, short article for Optics Letters

Via

Access Paper or Ask Questions