Saaduddin Mahmud

Distributed Multi-Agent Coordination Using Multi-Modal Foundation Models

Jan 24, 2025

Saaduddin Mahmud, Dorian Benhamou Goldfajn, Shlomo Zilberstein

Abstract:Distributed Constraint Optimization Problems (DCOPs) offer a powerful framework for multi-agent coordination but often rely on labor-intensive, manual problem construction. To address this, we introduce VL-DCOPs, a framework that takes advantage of large multimodal foundation models (LFMs) to automatically generate constraints from both visual and linguistic instructions. We then introduce a spectrum of agent archetypes for solving VL-DCOPs: from a neuro-symbolic agent that delegates some of the algorithmic decisions to an LFM, to a fully neural agent that depends entirely on an LFM for coordination. We evaluate these agent archetypes using state-of-the-art LLMs (large language models) and VLMs (vision language models) on three novel VL-DCOP tasks and compare their respective advantages and drawbacks. Lastly, we discuss how this work extends to broader frontier challenges in the DCOP literature.

MAPLE: A Framework for Active Preference Learning Guided by Large Language Models

Dec 10, 2024

Saaduddin Mahmud, Mason Nakamura, Shlomo Zilberstein

Abstract:The advent of large language models (LLMs) has sparked significant interest in using natural language for preference learning. However, existing methods often suffer from high computational burdens, taxing human supervision, and lack of interpretability. To address these issues, we introduce MAPLE, a framework for large language model-guided Bayesian active preference learning. MAPLE leverages LLMs to model the distribution over preference functions, conditioning it on both natural language feedback and conventional preference learning feedback, such as pairwise trajectory rankings. MAPLE also employs active learning to systematically reduce uncertainty in this distribution and incorporates a language-conditioned active query selection mechanism to identify informative and easy-to-answer queries, thus reducing human burden. We evaluate MAPLE's sample efficiency and preference inference quality across two benchmarks, including a real-world vehicle route planning benchmark using OpenStreetMap data. Our results demonstrate that MAPLE accelerates the learning process and effectively improves humans' ability to answer queries.
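The idea of maintaining a distribution over preference functions and conditioning it on pairwise rankings can be illustrated with a minimal sketch. This is not MAPLE itself: the discrete set of candidate weight vectors, the linear reward, the Bradley-Terry likelihood, and the two-feature routes are all assumptions made for illustration, and no language model is involved.

```python
import math

def bradley_terry_likelihood(weights, preferred, other):
    """P(preferred is ranked above other) under a linear reward (an assumed model)."""
    r_pref = sum(w * f for w, f in zip(weights, preferred))
    r_other = sum(w * f for w, f in zip(weights, other))
    return 1.0 / (1.0 + math.exp(r_other - r_pref))

def update_posterior(prior, candidates, preferred, other):
    """One Bayesian update of the belief over candidate weight vectors."""
    unnormalized = [
        p * bradley_terry_likelihood(w, preferred, other)
        for p, w in zip(prior, candidates)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical features: (speed, scenery). Two candidate preference functions.
candidates = [(1.0, 0.0), (0.0, 1.0)]
prior = [0.5, 0.5]
fast_route = (1.0, 0.0)
scenic_route = (0.0, 1.0)

# The human ranks the scenic route above the fast one.
posterior = update_posterior(prior, candidates, preferred=scenic_route, other=fast_route)
print(posterior)
```

After a single ranking, probability mass shifts toward the candidate that values scenery. An active learner would then choose the next query to reduce the remaining uncertainty in this distribution.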

A Unifying Framework for Causal Explanation of Sequential Decision Making

May 30, 2022

Samer B. Nashed, Saaduddin Mahmud, Claudia V. Goldman, Shlomo Zilberstein

Abstract:We present a novel framework for causal explanations of stochastic, sequential decision-making systems. Building on the well-studied structural causal model paradigm for causal reasoning, we show how to identify semantically distinct types of explanations for agent actions using a single unified approach. We provide results on the generality of this framework, run time bounds, and offer several approximate techniques. Finally, we discuss several qualitative scenarios that illustrate the framework's flexibility and efficacy.

* 9 pages, 4 figures, conference

Learning Cooperation and Online Planning Through Simulation and Graph Convolutional Network

Oct 16, 2021

Rafid Ameer Mahmud, Fahim Faisal, Saaduddin Mahmud, Md. Mosaddek Khan

Abstract:The Multi-agent Markov Decision Process (MMDP) has been an effective way of modelling sequential decision making algorithms for multi-agent cooperative environments. A number of algorithms based on centralized and decentralized planning have been developed in this domain. However, a dynamically changing environment, coupled with the exponential size of the state and joint action space, makes it difficult for these algorithms to provide both efficiency and scalability. Recently, the centralized planning algorithm FV-MCTS-MP and the decentralized planning algorithm Alternate maximization with Behavioural Cloning (ABC) have achieved notable performance in solving MMDPs. However, they are not capable of adapting to dynamically changing environments and accounting for the lack of communication among agents, respectively. Against this background, we introduce a simulation-based online planning algorithm, which we call SiCLOP, for multi-agent cooperative environments. Specifically, SiCLOP tailors Monte Carlo Tree Search (MCTS) and uses a Coordination Graph (CG) and a Graph Convolutional Network (GCN) to learn cooperation, and provides a real-time solution to an MMDP problem. It also improves scalability through effective pruning of the action space. Additionally, unlike FV-MCTS-MP and ABC, SiCLOP supports transfer learning, which enables learned agents to operate in different environments. We also provide a theoretical discussion of the convergence property of our algorithm within the context of multi-agent settings. Finally, our extensive empirical results show that SiCLOP significantly outperforms the state-of-the-art online planning algorithms.

On Population-Based Algorithms for Distributed Constraint Optimization Problems

Sep 02, 2020

Saaduddin Mahmud, Md. Mosaddek Khan, Nicholas R. Jennings

Abstract:Distributed Constraint Optimization Problems (DCOPs) are a widely studied class of optimization problems in which interactions between a set of cooperative agents are modeled as a set of constraints. DCOPs are NP-hard and significant effort has been devoted to developing methods for finding incomplete solutions. In this paper, we study an emerging class of such incomplete algorithms that are broadly termed population-based algorithms. The main characteristic of these algorithms is that they maintain a population of candidate solutions of a given problem and use this population to cover a large area of the search space and to avoid local optima. In recent years, this class of algorithms has gained significant attention due to their ability to produce high-quality incomplete solutions. With the primary goal of further improving the quality of solutions compared to the state-of-the-art incomplete DCOP algorithms, we present two new population-based algorithms in this paper. Our first approach, Anytime Evolutionary DCOP or AED, exploits evolutionary optimization meta-heuristics to solve DCOPs. We also present a novel anytime update mechanism that gives AED its anytime property. In our second contribution, we show that population-based approaches can be combined with local search approaches. Specifically, we develop an algorithm called DPSA based on the Simulated Annealing meta-heuristic. We empirically evaluate these two algorithms to illustrate their respective effectiveness in different settings against the state-of-the-art incomplete DCOP algorithms, including all existing population-based algorithms, in a wide variety of benchmarks. Our evaluation shows AED and DPSA markedly outperform the state-of-the-art and produce up to 75% improved solutions.

* 7 Figures. arXiv admin note: text overlap with arXiv:1909.06254, arXiv:2002.12001
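As a rough illustration of the two ideas in this abstract (constraints as cost functions over agents' variables, and a population of candidate solutions with a best-so-far record), here is a minimal sketch. It is a centralized toy, not AED or DPSA: the three-variable problem, the mutation step, and every name in it are assumptions made for the example.

```python
import random

# Hypothetical toy problem: three agents, each owning one binary variable.
DOMAIN = [0, 1]
VARIABLES = ["x1", "x2", "x3"]

# Binary constraints as cost functions: cost 1 when neighbours pick the same value.
CONSTRAINTS = {
    ("x1", "x2"): lambda a, b: 1 if a == b else 0,
    ("x2", "x3"): lambda a, b: 1 if a == b else 0,
}

def total_cost(assignment):
    """Sum of all constraint costs for a complete assignment."""
    return sum(f(assignment[u], assignment[v]) for (u, v), f in CONSTRAINTS.items())

def population_search(pop_size=8, rounds=20, seed=0):
    """Keep a population of assignments, mutate each, and track the best seen so far."""
    rng = random.Random(seed)
    population = [{v: rng.choice(DOMAIN) for v in VARIABLES} for _ in range(pop_size)]
    best = min(population, key=total_cost)
    for _ in range(rounds):
        new_population = []
        for candidate in population:
            mutated = dict(candidate)
            mutated[rng.choice(VARIABLES)] = rng.choice(DOMAIN)
            # Keep whichever of the original and the mutated copy costs less.
            new_population.append(min(candidate, mutated, key=total_cost))
        population = new_population
        round_best = min(population, key=total_cost)
        # The best-so-far record never gets worse, which is the "anytime" idea.
        if total_cost(round_best) < total_cost(best):
            best = round_best
    return best

best = population_search()
print(best, total_cost(best))
```

In a real DCOP each agent only knows the constraints it takes part in, so both the cost evaluation and the best-so-far bookkeeping would have to be carried out through message passing rather than a shared function.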

Learning Optimal Temperature Region for Solving Mixed Integer Functional DCOPs

Feb 27, 2020

Saaduddin Mahmud, Md. Mosaddek Khan, Moumita Choudhury, Long Tran-Thanh, Nicholas R. Jennings

Abstract:Distributed Constraint Optimization Problems (DCOPs) are an important framework that models coordinated decision-making problems in multi-agent systems with a set of discrete variables. Later work has extended this to model problems with a set of continuous variables (F-DCOPs). In this paper, we combine both of these models into the Mixed Integer Functional DCOP (MIF-DCOP) model that can deal with problems regardless of their variables' types. We then propose a novel algorithm, called Distributed Parallel Simulated Annealing (DPSA), where agents cooperatively learn the optimal parameter configuration for the algorithm while also solving the given problem using the learned knowledge. Finally, we empirically benchmark our approach in DCOP, F-DCOP and MIF-DCOP settings and show that DPSA produces solutions of significantly better quality than the state-of-the-art non-exact algorithms in their corresponding setting.

* 8 pages, 6 figures, 1 Table
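To see why the temperature parameter matters in simulated annealing, the standard Metropolis acceptance rule can be sketched as follows. This is the textbook rule only, not DPSA's learning procedure; the function name and the sample values are assumptions for illustration.

```python
import math

def acceptance_probability(delta_cost, temperature):
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta_cost <= 0:
        return 1.0
    return math.exp(-delta_cost / temperature)

# The same cost increase is accepted rarely when cold and often when hot.
for temperature in (0.1, 1.0, 10.0):
    print(temperature, acceptance_probability(1.0, temperature))
```

A temperature that is too low makes the search greedy, while one that is too high makes it close to random, which is why a suitable temperature range is worth tuning.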

AED: An Anytime Evolutionary DCOP Algorithm

Sep 13, 2019

Saaduddin Mahmud, Moumita Choudhury, Md. Mosaddek Khan, Long Tran-Thanh, Nicholas R. Jennings

Abstract:Evolutionary optimization is a generic population-based metaheuristic that can be adapted to solve a wide variety of optimization problems and has proven very effective for combinatorial optimization problems. However, the potential of this metaheuristic has not been utilized in Distributed Constraint Optimization Problems (DCOPs), a well-known class of combinatorial optimization problems. In this paper, we present a new population-based algorithm, namely Anytime Evolutionary DCOP (AED), that adapts evolutionary optimization to solve DCOPs. In AED, the agents cooperatively construct an initial set of random solutions and gradually improve them through a new mechanism that considers the optimistic approximation of local benefits. Moreover, we propose a new anytime update mechanism for AED that identifies the best among a distributed set of candidate solutions and notifies all the agents when a new best is found. In our theoretical analysis, we prove that AED is anytime. Finally, we present empirical results indicating AED outperforms the state-of-the-art DCOP algorithms in terms of solution quality.

* 8 pages, 5 figures
