Abstract: We propose a transformer architecture for time series forecasting with a focus on time series tokenisation and apply it to a real-world prediction problem from the pricing domain. Our architecture aims to learn effective representations at many scales across all available data simultaneously. The model contains a number of novel modules: a differentiated form of time series patching which employs multiple resolutions, a multiple-resolution module for time-varying known variables, a mixer-based module for capturing cross-series information, and a novel output head with favourable scaling to account for the increased number of tokens. We present an application of this model to a real-world prediction problem faced by the markdown team at a very large retailer. In the experiments conducted, our model outperforms the in-house models and the selected existing deep learning architectures.
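To make the idea of multiple-resolution patching concrete, here is a minimal sketch. The abstract does not give the patching scheme, so everything here is an assumption for illustration: the function names `patch_series` and `multi_resolution_patches`, the use of non-overlapping patches, and the choice to drop trailing values that do not fill a patch.

```python
from typing import Dict, List, Sequence


def patch_series(series: Sequence[float], patch_len: int) -> List[List[float]]:
    """Split a series into non-overlapping patches of length patch_len.

    Trailing values that do not fill a whole patch are dropped. This is an
    assumption of the sketch, not a detail taken from the paper.
    """
    if patch_len <= 0:
        raise ValueError("patch_len must be positive")
    n_patches = len(series) // patch_len
    return [list(series[i * patch_len:(i + 1) * patch_len]) for i in range(n_patches)]


def multi_resolution_patches(
    series: Sequence[float], patch_lens: Sequence[int]
) -> Dict[int, List[List[float]]]:
    """Patch the same series at several resolutions (hypothetical helper)."""
    return {p: patch_series(series, p) for p in patch_lens}


# Toy series of 12 time steps, patched at three assumed resolutions.
series = [float(t) for t in range(12)]
tokens = multi_resolution_patches(series, patch_lens=(2, 4, 6))

# Each patch would become one token, so the token count is the sum over resolutions.
token_count = sum(len(patches) for patches in tokens.values())
```

Patching one series at several resolutions yields more tokens than a single resolution would, which is consistent with the abstract's remark that the output head must scale favourably with the number of tokens.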
Abstract: Decision-focused learning has emerged as a promising approach for decision making under uncertainty by training the upstream predictive aspect of the pipeline with respect to the quality of the downstream decisions. Most existing work has focused on single-stage problems. Many real-world decision problems are more appropriately modelled using multistage optimisation, as contextual information such as prices or demand is revealed over time and decisions made now have a bearing on future decisions. We propose decision-focused forecasting, a multiple-implicit-layer model which in its training accounts for the intertemporal decision effects of forecasts using differentiable optimisation. The recursive model reflects a fully differentiable multistage optimisation approach. We present an analysis of the gradients produced by this model showing the adjustments made to account for the state-path caused by forecasting. We demonstrate an application of the model to an energy storage arbitrage task and report that our model outperforms existing approaches.
Abstract: Decision-focused learning is a promising development for contextual optimisation. It enables us to train prediction models that reflect the contextual sensitivity structure of the problem. However, there have been limited attempts to extend this paradigm to robust optimisation. We propose a double implicit layer model for training prediction models with respect to robust decision loss in uncertain convex quadratically constrained quadratic programs (QCQP). The first layer solves a deterministic version of the problem; the second layer evaluates the worst-case realisation for an uncertainty set centred on the observation, given the decisions obtained from the first layer. This enables us to learn model parameterisations that lead to robust decisions while only solving a simpler deterministic problem at test time. Additionally, instead of having to solve a robust counterpart, we solve two smaller and potentially easier problems in training. The second layer (the worst-case problem) can be seen as a regularisation approach for predict-and-optimise by fitting to a neighbourhood of problems instead of just a point observation. We motivate relaxations of the worst-case problem in cases of uncertainty sets that would otherwise lead to trust region problems, and leverage various relaxations to deal with uncertain constraints. Both layers are typically strictly convex in this problem setting and thus have meaningful gradients almost everywhere. We demonstrate an application of this model on simulated experiments. The method is an effective regularisation tool for decision-focused learning for uncertain convex QCQPs.
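One way to read the two layers is sketched below. The notation is assumed for illustration and is not the paper's own formulation: $m_w$ is the prediction model with parameters $w$, $z$ the context, $\theta$ the observed problem parameters, $f$ the convex quadratic objective, $\mathcal{X}$ the feasible set, and $\mathcal{U}(\theta)$ an uncertainty set centred on the observation. For simplicity the uncertainty is shown only in the objective, whereas the abstract also covers uncertain constraints.

```latex
\begin{align}
  \hat{\theta} &= m_w(z), \\
  x^\star(\hat{\theta}) &= \arg\min_{x \in \mathcal{X}} f(x, \hat{\theta}), \\
  \ell_{\mathrm{robust}}(w) &= \max_{\theta' \in \mathcal{U}(\theta)} f\bigl(x^\star(\hat{\theta}), \theta'\bigr).
\end{align}
```

Under this reading, training adjusts $w$ to reduce $\ell_{\mathrm{robust}}$ by differentiating through both problems, while at test time only the deterministic problem in the second line is solved.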
Abstract: With the increasing effects of climate change, the urgency to step away from fossil fuels is greater than ever before. Electric vehicles (EVs) are one way to diminish these effects, but their widespread adoption is often limited by the insufficient availability of charging stations. In this work, our goal is to expand the infrastructure of EV charging stations, in order to provide a better quality of service in terms of user satisfaction (and availability of charging stations). Specifically, our focus is directed towards urban areas. We first propose a model for the assignment of EV charging demand to stations, framing it as a maximum flow problem. This model is the basis for the evaluation of user satisfaction with a given charging infrastructure. Secondly, we incorporate the maximum flow model into a mixed-integer linear program, where decisions on the opening of new stations and on the expansion of their capacity through additional outlets are accounted for. We showcase our methodology for the city of Montreal, demonstrating the scalability of our approach to handle real-world scenarios. We conclude that considering both spatial and temporal variations in charging demand is meaningful when solving realistic instances.
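The maximum flow framing of demand assignment can be illustrated with a small sketch. The network layout (source to demand zones, zones to reachable stations, stations to sink), the zone and station names, and all numbers are hypothetical and chosen only for illustration; the solver is a plain Edmonds-Karp implementation, not necessarily the method used in the paper.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    # Residual capacities, with reverse edges initialised to zero.
    residual = defaultdict(lambda: defaultdict(int))
    for u, neighbours in capacity.items():
        for v, cap in neighbours.items():
            residual[u][v] += cap
            residual[v][u] += 0

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical instance: two demand zones and two stations.
capacity = {
    "source": {"zone_a": 5, "zone_b": 4},          # charging demand per zone
    "zone_a": {"station_1": 5, "station_2": 5},    # stations reachable from zone_a
    "zone_b": {"station_2": 4},                    # zone_b can only reach station_2
    "station_1": {"sink": 3},                      # station capacity
    "station_2": {"sink": 4},
}
served = max_flow(capacity, "source", "sink")
total_demand = sum(capacity["source"].values())
satisfied_fraction = served / total_demand
```

In this toy instance the served demand is limited by total station capacity, so the fraction of demand served gives a simple proxy for user satisfaction with a given infrastructure.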