Abstract: The multiple-choice knapsack problem (MCKP) is a classic NP-hard combinatorial optimization problem. Motivated by several practical applications, this work investigates a novel variant of MCKP, the data-driven chance-constrained multiple-choice knapsack problem (DDCCMCKP), in which item weights are random variables with unknown probability distributions. We first present the problem formulation of DDCCMCKP and then establish two benchmark sets: the first contains synthetic instances, and the second simulates a real-world application scenario of a telecommunication company. To solve DDCCMCKP, we propose a data-driven adaptive local search (DDALS) algorithm. Its main merit is that it evaluates solutions against the chance constraint using data-driven methods when the distributions are unknown and only historical sample data are available. Experimental results demonstrate the effectiveness of the proposed algorithm and show that it outperforms the compared baselines. Ablation experiments further confirm the necessity of each component of the algorithm. The proposed algorithm can serve as a baseline for future research, and the code and benchmark sets will be open-sourced to promote further research on this challenging problem.
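To make the idea of evaluating a chance constraint from historical samples concrete, the following is a minimal sketch and not the evaluation procedure used by DDALS, which the abstract does not specify. It assumes the usual MCKP structure (exactly one item chosen per class) and a chance constraint of the form "total weight stays within capacity with probability at least a given confidence level", and it estimates that probability by the plain empirical frequency over samples. The function names, the data layout (one row per historical observation, shared across classes), and the parameters `capacity` and `confidence` are all hypothetical.

```python
import numpy as np


def empirical_feasibility(samples, choice, capacity):
    """Estimate P(total weight <= capacity) from historical samples.

    samples : list of 2-D arrays (hypothetical layout); samples[i] has shape
              (n_observations, n_items_in_class_i), and row k of every class
              is assumed to belong to the same historical observation.
    choice  : list of ints; choice[i] is the item picked from class i.
    """
    # Total weight of the chosen items in each historical observation.
    total = sum(s[:, j] for s, j in zip(samples, choice))
    # Fraction of observations in which the capacity is respected.
    return float(np.mean(total <= capacity))


def satisfies_chance_constraint(samples, choice, capacity, confidence):
    """Check the sample-based estimate against the required confidence level."""
    return empirical_feasibility(samples, choice, capacity) >= confidence


# Toy data: two classes, two items each, four historical observations.
samples = [
    np.array([[1, 5], [2, 5], [3, 5], [4, 5]]),
    np.array([[1, 2], [1, 2], [1, 2], [5, 2]]),
]
print(empirical_feasibility(samples, [0, 0], capacity=4))  # 0.75
print(satisfies_chance_constraint(samples, [0, 0], capacity=4, confidence=0.9))  # False
```

With few samples this frequency is a noisy estimate of the true probability, which is one reason a dedicated data-driven evaluation method may be preferable to the plain estimate sketched here.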