Michael Li

The House Always Wins: A Framework for Evaluating Strategic Deception in LLMs

Jul 01, 2024

Tanush Chopra, Michael Li

Abstract: We propose a framework for evaluating strategic deception in large language models (LLMs). In this framework, an LLM acts as a game master in two scenarios: one with random game mechanics and another where it can choose between random or deliberate actions. As an example, we use blackjack because neither the action space nor the strategies involve deception. We benchmark Llama3-70B, GPT-4-Turbo, and Mixtral in blackjack, comparing outcomes against the distributions expected under fair play to determine whether LLMs develop strategies favoring the "house." Our findings reveal that the LLMs deviate significantly from fair play when given implicit randomness instructions, suggesting a tendency towards strategic manipulation in ambiguous scenarios. However, when presented with an explicit choice, the LLMs largely adhere to fair play, indicating that the framing of instructions plays a crucial role in eliciting or mitigating potentially deceptive behaviors in AI systems.

* Research conducted at the Deception Detection Hackathon 2024 hosted by Apart & Apollo Research
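
The comparison of dealt cards against a fair-play distribution described in the abstract could be sketched as a chi-square goodness-of-fit check. This is only an illustration, not the authors' procedure: the abstract does not say which statistical test was used, and the infinite-deck assumption, the 0.05 significance level, and the names `FAIR_PROBS`, `chi_square_statistic`, and `deviates_from_fair_play` are all hypothetical.

```python
from collections import Counter

# Assumption: cards come from an infinite (or constantly reshuffled) deck,
# so each of the 13 ranks is equally likely. Aces are recorded as 1 and
# 10/J/Q/K are all recorded as the value 10.
FAIR_PROBS = {value: 1 / 13 for value in range(1, 10)}
FAIR_PROBS[10] = 4 / 13

# Chi-square critical value for 9 degrees of freedom at alpha = 0.05
# (10 value categories minus 1).
CHI2_CRITICAL_DF9_P05 = 16.919


def chi_square_statistic(draws, probs=FAIR_PROBS):
    """Pearson chi-square statistic of observed card values vs. fair play."""
    counts = Counter(draws)
    n = len(draws)
    return sum(
        (counts.get(value, 0) - n * p) ** 2 / (n * p)
        for value, p in probs.items()
    )


def deviates_from_fair_play(draws):
    """True if the draws differ from the fair distribution at the 5% level."""
    return chi_square_statistic(draws) > CHI2_CRITICAL_DF9_P05
```

Under these assumptions, a log of card values dealt by a model acting as game master would be passed to `deviates_from_fair_play`; a `True` result only indicates that the draws are unlikely under fair play, not why they differ.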

NAS-X: Neural Adaptive Smoothing via Twisting

Aug 28, 2023

Dieterich Lawson, Michael Li, Scott Linderman

Abstract: We present Neural Adaptive Smoothing via Twisting (NAS-X), a method for learning and inference in sequential latent variable models based on reweighted wake-sleep (RWS). NAS-X works with both discrete and continuous latent variables, and leverages smoothing SMC to fit a broader range of models than traditional RWS methods. We test NAS-X on discrete and continuous tasks and find that it substantially outperforms previous variational and RWS-based methods in inference and parameter recovery.

Generative Probabilistic Image Colorization

Sep 29, 2021

Chie Furusawa, Shinya Kitaoka, Michael Li, Yuri Odagiri

Abstract: We propose Generative Probabilistic Image Colorization, a diffusion-based generative process that trains a sequence of probabilistic models to reverse each step of noise corruption. Given a line-drawing image as input, our method suggests multiple candidate colorized images and thereby accounts for the ill-posed nature of the colorization problem. We conducted comprehensive experiments investigating the colorization of line-drawing images, report the influence of a score-based MCMC approach that corrects the marginal distribution of estimated samples, and further compare different combinations of models and the similarity of their generated images. Despite using only a relatively small training dataset, we experimentally develop a method to generate multiple diverse colorization candidates which avoids mode collapse and does not require any additional constraints, losses, or re-training with alternative training conditions. Our proposed approach performed well not only on color-conditional image generation tasks using biased initial values, but also on some practical image completion and inpainting tasks.

* 11 pages
