Ceyhun Eksin

Simulation-Based Optimistic Policy Iteration For Multi-Agent MDPs with Kullback-Leibler Control Cost

Oct 19, 2024

Khaled Nakhleh, Ceyhun Eksin, Sabit Ekin

Abstract: This paper proposes an agent-based optimistic policy iteration (OPI) scheme for learning stationary optimal stochastic policies in multi-agent Markov Decision Processes (MDPs), in which agents incur a Kullback-Leibler (KL) divergence cost for their control efforts and an additional cost for the joint state. The proposed scheme consists of a greedy policy improvement step followed by an m-step temporal difference (TD) policy evaluation step. We use the separable structure of the instantaneous cost to show that the policy improvement step follows a Boltzmann distribution that depends on the current value function estimate and the uncontrolled transition probabilities. This allows agents to compute the improved joint policy independently. We show that both the synchronous (entire state space evaluation) and asynchronous (a uniformly sampled set of substates) versions of the OPI scheme with finite policy evaluation rollout converge asymptotically to the optimal value function and an optimal joint policy. Simulation results on a variant of the Stag-Hare game, formulated as a multi-agent MDP with KL control cost, validate our scheme's performance in terms of minimizing the cost return.
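The Boltzmann-form improvement step mentioned in the abstract can be illustrated with a minimal single-agent sketch. It assumes the standard KL-control form in which the improved policy at state s is proportional to p0(s'|s) * exp(-gamma * V(s')), where p0 is the uncontrolled transition matrix; the function name `boltzmann_improvement`, the discount factor `gamma`, and the toy numbers are illustrative assumptions, and the paper's exact multi-agent update may differ.

```python
import math

def boltzmann_improvement(P0, V, gamma=0.95):
    """Greedy improvement under a KL control cost (illustrative, single agent).

    P0[s][t] is the uncontrolled probability of moving from s to t,
    V[t] is the current value (cost-to-go) estimate of state t.
    """
    v_min = min(V)  # shift for numerical stability; cancels after normalization
    policy = []
    for row in P0:
        weights = [p * math.exp(-gamma * (v - v_min)) for p, v in zip(row, V)]
        total = sum(weights)
        policy.append([w / total for w in weights])
    return policy

# Toy 3-state chain (assumed numbers, not taken from the paper).
P0 = [[0.5, 0.5, 0.0],
      [0.25, 0.5, 0.25],
      [0.0, 0.5, 0.5]]
V = [0.0, 1.0, 2.0]
pi = boltzmann_improvement(P0, V)
```

In this sketch, transitions that are impossible under the uncontrolled dynamics stay impossible, and probability mass shifts toward successor states with lower estimated cost.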

Learning graph-Fourier spectra of textured surface images for defect localization

Dec 02, 2023

Tapan Ganatma Nakkina, Adithyaa Karthikeyan, Yuhao Zhong, Ceyhun Eksin, Satish T. S. Bukkapatnam

Abstract: In the realm of industrial manufacturing, product inspection remains a significant bottleneck, with only a small fraction of manufactured items undergoing inspection for surface defects. Advances in imaging systems and AI can allow automated full inspection of manufactured surfaces. However, even the most contemporary imaging and machine learning methods perform poorly at detecting defects in images with highly textured backgrounds that stem from diverse manufacturing processes. This paper introduces an approach based on graph Fourier analysis to automatically identify defective images, as well as the crucial graph Fourier coefficients that indicate defects in images amidst highly textured backgrounds. The approach capitalizes on the ability of graph representations to capture the complex dynamics inherent in high-dimensional data, preserving crucial locality properties in a lower dimensional space. A convolutional neural network model (1D-CNN) was trained with the coefficients of the graph Fourier transform of the images as the input to identify whether an image contains a defect, with a classification accuracy of 99.4%. An explainable AI method using SHAP (SHapley Additive exPlanations) was used to further analyze the trained 1D-CNN model to discern important spectral coefficients for each image. This approach sheds light on the crucial contribution of low-frequency graph eigen waveforms to precisely localizing surface defects in images, thereby advancing the realization of zero-defect manufacturing.
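As a rough illustration of the graph Fourier transform referred to in the abstract, the sketch below treats each pixel as a node of a 4-neighbor grid graph, builds the combinatorial Laplacian L = D - A, and projects the flattened image onto the Laplacian eigenvectors. The grid graph, the helper names `grid_laplacian` and `graph_fourier_transform`, and the toy image are assumptions made for illustration; the abstract does not specify how the paper constructs its graph.

```python
import numpy as np

def grid_laplacian(h, w):
    """Combinatorial Laplacian of an h-by-w 4-neighbor pixel grid (assumed graph)."""
    n = h * w
    A = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w:  # edge to the right neighbor
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < h:  # edge to the neighbor below
                A[i, i + w] = A[i + w, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def graph_fourier_transform(image):
    """Return graph frequencies, spectral coefficients, and the eigenbasis."""
    h, w = image.shape
    eigvals, U = np.linalg.eigh(grid_laplacian(h, w))  # ascending eigenvalues
    coeffs = U.T @ image.reshape(-1)  # projection onto the eigenvectors
    return eigvals, coeffs, U

# Toy 4x4 "surface" with one bright pixel standing in for a defect.
image = np.ones((4, 4))
image[1, 2] = 5.0
eigvals, coeffs, U = graph_fourier_transform(image)
reconstructed = (U @ coeffs).reshape(image.shape)  # inverse transform
```

The coefficient vector `coeffs` is the kind of one-dimensional spectral input a 1D-CNN could consume; coefficients paired with small eigenvalues correspond to the low-frequency components.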

Distributed Estimation via Network Regularization

Oct 28, 2019

Lingzhou Hong, Alfredo Garcia, Ceyhun Eksin

Abstract: We propose a new method for distributed estimation of a linear model by a network of local learners with heterogeneously distributed datasets. Unlike other ensemble learning methods, the proposed method performs model averaging continuously over time in a distributed and asynchronous manner. To ensure robust estimation, a network regularization term that penalizes models with high local variability is used. We provide a finite-time characterization of the convergence of the weighted ensemble average and compare this result to centralized estimation. We illustrate the general applicability of the method in two examples: estimation of a Markov random field using wireless sensor networks and modeling prey escape behavior of birds based on a real-world dataset.
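The idea of a network regularization term can be sketched with a simplified example. Each node fits a scalar slope to its own data, and a penalty (lam / 2) * (theta_i - theta_j)^2 on every edge discourages neighboring models from disagreeing. This sketch uses synchronous, discrete-time gradient descent, which is a simplification of the continuous, asynchronous scheme the abstract describes; the function name, the penalty form, and the data are illustrative assumptions.

```python
def fit_with_network_regularization(data, edges, lam, step=0.02, iters=2000):
    """Gradient descent on local least squares plus an edge disagreement penalty.

    data[i] = (xs, ys) holds node i's local samples for the model y = theta_i * x.
    edges lists undirected links (i, j) between nodes.
    """
    theta = [0.0] * len(data)
    for _ in range(iters):
        # Gradient of each node's local squared-error loss.
        grad = [sum((t * x - y) * x for x, y in zip(xs, ys))
                for t, (xs, ys) in zip(theta, data)]
        # Gradient of the network penalty, pulling neighbors together.
        for i, j in edges:
            grad[i] += lam * (theta[i] - theta[j])
            grad[j] += lam * (theta[j] - theta[i])
        theta = [t - step * g for t, g in zip(theta, grad)]
    return theta

# Three nodes on a line graph with slightly different local slopes (assumed data).
data = [([1.0, 2.0], [2.2, 4.4]),
        ([1.0], [2.0]),
        ([1.0, 3.0], [1.8, 5.4])]
edges = [(0, 1), (1, 2)]
independent = fit_with_network_regularization(data, edges, lam=0.0)
regularized = fit_with_network_regularization(data, edges, lam=5.0)
```

With `lam=0.0` each node recovers its own local fit; raising `lam` shrinks the spread between neighboring estimates.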

* 27 pages
