Abstract: The job shop scheduling problem (JSSP) and its solution algorithms have been of enduring interest in both academia and industry for decades. In recent years, machine learning (ML) has played an increasingly important role in advancing existing heuristic solutions for the JSSP and building new ones, aiming to find better solutions in shorter computation times. In this paper, we build on a state-of-the-art deep reinforcement learning (DRL) agent, called Neural Local Search (NLS), which can efficiently and effectively control a large local neighborhood search on the JSSP. In particular, we develop a method for training the decision transformer (DT) algorithm on search trajectories taken by a trained NLS agent to further improve upon the learned decision-making sequences. Our experiments show that the DT successfully learns local search strategies that are different from, and in many cases more effective than, those of the NLS agent itself. In terms of the trade-off between solution quality and computation time, the DT is particularly superior in application scenarios where longer computation times are acceptable. There, it compensates for the longer inference time required per search step, caused by its larger neural network architecture, through better-quality decisions per step. Thereby, the DT achieves state-of-the-art results for solving the JSSP with ML-enhanced search.
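As a hedged illustration of how search trajectories can be prepared for decision transformer training, the following minimal Python sketch converts a recorded NLS search run into the standard (return-to-go, state, action) sequence format used by decision transformers. The trajectory representation and field names are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch: turn one recorded NLS search run into a decision-transformer
# training sequence. The (state, action, reward) tuple layout is an assumed,
# illustrative representation of a search trajectory.
def to_dt_sequence(trajectory):
    """trajectory: list of (state, action, reward) tuples from one search run."""
    rewards = [r for _, _, r in trajectory]
    sequence = []
    for t, (state, action, _) in enumerate(trajectory):
        return_to_go = sum(rewards[t:])  # improvement the remaining search still achieves
        sequence.append((return_to_go, state, action))
    return sequence
```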
Abstract: The field of emergent language represents a novel area of research within the domain of artificial intelligence, particularly within the context of multi-agent reinforcement learning. Although the concept of studying language emergence is not new, early approaches were primarily concerned with explaining human language formation, with little consideration given to its potential utility for artificial agents. In contrast, studies based on reinforcement learning aim to develop communicative capabilities in agents that are comparable to or even superior to human language. Thus, they extend beyond the learned statistical representations that are common in natural language processing research. This gives rise to a number of fundamental questions, from the prerequisites for language emergence to the criteria for measuring its success. This paper addresses these questions by providing a comprehensive review of 181 scientific publications on emergent language in artificial intelligence. Its objective is to serve as a reference for researchers who are interested in the field as well as for those already proficient in it. Consequently, the main contributions are the definition and overview of the prevailing terminology, the analysis of existing evaluation methods and metrics, and the description of the identified research gaps.
Abstract: In multitask learning, conflicts between task gradients are a frequent issue that degrades a model's training performance. This is commonly addressed with the Gradient Projection algorithm PCGrad, which often leads to faster convergence and improved performance metrics. In this work, we present a method that adapts this algorithm to simultaneously perform task prioritization. Our approach differs from traditional task weighting by loss scaling in that our weighting scheme applies only in cases where tasks are in conflict and lets training proceed unhindered otherwise. We replace task weighting factors with a probability distribution that determines which task gradients get projected in conflict cases. Our experiments on the nuScenes, CIFAR-100, and CelebA datasets confirm that our approach is a practical method for task weighting. Paired with several different task weighting schemes, it yields a significant improvement in the performance metrics of most tasks compared to Gradient Projection with uniform projection probabilities.
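To make the conflict-only weighting scheme concrete, the sketch below shows a PCGrad-style projection in which a per-task probability decides whether a task's gradient is projected when it conflicts with another task's gradient; non-conflicting gradients pass through unchanged. The probability vector and the pairwise loop are illustrative assumptions, not the exact formulation from the paper.

```python
import numpy as np

def prioritized_pcgrad(grads, proj_prob, rng=None):
    """grads: list of per-task gradient vectors (1-D numpy arrays).
    proj_prob: proj_prob[i] is the (assumed) probability that task i's gradient
    is the one being projected when it conflicts with another task's gradient."""
    rng = np.random.default_rng() if rng is None else rng
    adjusted = [g.copy() for g in grads]
    for i in range(len(grads)):
        for j in range(len(grads)):
            if i == j:
                continue
            dot = adjusted[i] @ grads[j]
            # A negative dot product indicates conflicting task gradients.
            if dot < 0 and rng.random() < proj_prob[i]:
                # Project task i's gradient onto the normal plane of task j's gradient.
                adjusted[i] -= dot / (np.linalg.norm(grads[j]) ** 2 + 1e-12) * grads[j]
    # When tasks do not conflict, training proceeds with the unmodified gradients.
    return np.sum(adjusted, axis=0)
```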
Abstract: Learned construction heuristics for scheduling problems have become increasingly competitive with established solvers and heuristics in recent years. In particular, significant improvements have been observed in solution approaches using deep reinforcement learning (DRL). While much attention has been paid to the design of network architectures and training algorithms to achieve state-of-the-art results, little research has investigated the optimal use of trained DRL agents during inference. Our work is based on the hypothesis that, similar to search algorithms, the utilization of trained DRL agents should depend on the acceptable computational budget. We propose a simple yet effective parameterization, called $\delta$-sampling, that manipulates the trained action vector to bias agent behavior towards exploration or exploitation during solution construction. Following this approach, we achieve a more comprehensive coverage of the search space while still generating an acceptable number of solutions. In addition, we propose an algorithm for obtaining the optimal parameterization for a given number of solutions and any given trained agent. Experiments extending existing training protocols for job shop scheduling problems with our inference method validate our hypothesis and yield the expected improvements in the generated solutions.
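Since the abstract does not spell out the $\delta$-sampling transformation itself, the following minimal sketch only assumes one plausible reading: interpolating between the greedy action and the trained action vector, so that a small $\delta$ biases solution construction towards exploitation and a large $\delta$ towards exploration.

```python
import numpy as np

def delta_sample(action_probs, delta, rng=None):
    """action_probs: 1-D array of action probabilities from the trained agent.
    delta in [0, 1]: assumed exploration knob; 0 is greedy, 1 samples the
    unmodified action vector."""
    rng = np.random.default_rng() if rng is None else rng
    greedy = np.zeros_like(action_probs)
    greedy[np.argmax(action_probs)] = 1.0
    # Blend the greedy one-hot vector with the trained action vector.
    mixed = (1.0 - delta) * greedy + delta * action_probs
    mixed /= mixed.sum()
    return rng.choice(len(mixed), p=mixed)
```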
Abstract: The digitization of manufacturing processes enables promising applications for machine learning-assisted quality assurance. A widely used manufacturing process that can strongly benefit from data-driven solutions is gas metal arc welding (GMAW). The welding process is characterized by complex cause-effect relationships between material properties, process conditions, and weld quality. In non-laboratory environments with frequently changing process parameters, accurate determination of weld quality by destructive testing is economically unfeasible. Deep learning offers the potential to identify these relationships in the available process data and to predict weld quality from process observations. In this paper, we present a concept for a deep learning-based predictive quality system in GMAW. At its core, the concept involves a pipeline consisting of four major phases: collection and management of multi-sensor data (e.g. current and voltage), real-time processing and feature engineering of the time series data by means of autoencoders, training and deployment of suitable recurrent deep learning models for quality predictions, and model evolution under changing process conditions using continual learning. The concept provides the foundation for future research activities in which we will realize an online predictive quality system in running production.
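As a hedged sketch of the feature-engineering phase, the PyTorch example below compresses flattened multi-sensor windows (e.g. current and voltage) into a compact latent vector with a generic autoencoder; the window length, layer sizes, and latent dimension are illustrative assumptions rather than design choices from the concept.

```python
import torch.nn as nn

class WeldAutoencoder(nn.Module):
    """Generic autoencoder over flattened sensor windows; all sizes are illustrative."""
    def __init__(self, window_len=256, n_channels=2, latent_dim=16):
        super().__init__()
        in_dim = window_len * n_channels
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, in_dim))

    def forward(self, x):
        # x: (batch, window_len * n_channels) flattened multi-sensor window
        z = self.encoder(x)  # compact features for the downstream quality model
        return self.decoder(z), z
```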
Abstract: Solving job shop scheduling problems (JSSPs) with a fixed strategy, such as a priority dispatching rule, may yield satisfactory results for some problem instances but insufficient results for others. From this single-strategy perspective, finding a near-optimal solution to a specific JSSP varies in difficulty even if the machine setup remains the same. A recent, intensively researched, and promising method for dealing with this difficulty variability is deep reinforcement learning (DRL), which dynamically adjusts an agent's planning strategy in response to difficult instances, not only during training but also when applied to new situations. In this paper, we further improve DRL as the underlying method by actively incorporating the variability of difficulty within the same problem size into the design of the learning process. We base our approach on a state-of-the-art methodology that solves the JSSP by means of DRL and graph neural network embeddings. Our work supplements the agent's training routine with a curriculum learning strategy that ranks the problem instances shown during training by a new metric of problem instance difficulty. Our results show that certain curricula lead to significantly better performance of the DRL solutions. Agents trained on these curricula beat the top performance of those trained on randomly distributed training data, achieving 3.2% shorter average makespans.
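A minimal sketch of the curriculum construction follows, assuming a generic per-instance difficulty score; the actual difficulty metric introduced in the paper is not reproduced here.

```python
def build_curriculum(instances, difficulty, easy_to_hard=True):
    """Order JSSP training instances by an estimated difficulty score.
    `difficulty` is a placeholder callable mapping an instance to a number."""
    return sorted(instances, key=difficulty, reverse=not easy_to_hard)

# Usage: feed the ordered instances to the DRL training loop, e.g. epoch by
# epoch, so the agent sees progressively harder problems.
```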