Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tomoyuki Kaneko

The Role of Background Information in Reducing Object Hallucination in Vision-Language Models: Insights from Cutoff API Prompting

Feb 21, 2025

Masayo Tomita, Katsuhiko Hayashi, Tomoyuki Kaneko

Abstract:Vision-Language Models (VLMs) occasionally generate outputs that contradict input images, constraining their reliability in real-world applications. While visual prompting is reported to suppress hallucinations by augmenting prompts with relevant area inside an image, the effectiveness in terms of the area remains uncertain. This study analyzes success and failure cases of Attention-driven visual prompting in object hallucination, revealing that preserving background context is crucial for mitigating object hallucination.

* Under review

Via

Access Paper or Ask Questions

DEIR: Efficient and Robust Exploration through Discriminative-Model-Based Episodic Intrinsic Rewards

Apr 21, 2023

Shanchuan Wan, Yujin Tang, Yingtao Tian, Tomoyuki Kaneko

Figure 1 for DEIR: Efficient and Robust Exploration through Discriminative-Model-Based Episodic Intrinsic Rewards

Figure 2 for DEIR: Efficient and Robust Exploration through Discriminative-Model-Based Episodic Intrinsic Rewards

Figure 3 for DEIR: Efficient and Robust Exploration through Discriminative-Model-Based Episodic Intrinsic Rewards

Figure 4 for DEIR: Efficient and Robust Exploration through Discriminative-Model-Based Episodic Intrinsic Rewards

Abstract:Exploration is a fundamental aspect of reinforcement learning (RL), and its effectiveness crucially decides the performance of RL algorithms, especially when facing sparse extrinsic rewards. Recent studies showed the effectiveness of encouraging exploration with intrinsic rewards estimated from novelty in observations. However, there is a gap between the novelty of an observation and an exploration in general, because the stochasticity in the environment as well as the behavior of an agent may affect the observation. To estimate exploratory behaviors accurately, we propose DEIR, a novel method where we theoretically derive an intrinsic reward from a conditional mutual information term that principally scales with the novelty contributed by agent explorations, and materialize the reward with a discriminative forward model. We conduct extensive experiments in both standard and hardened exploration games in MiniGrid to show that DEIR quickly learns a better policy than baselines. Our evaluations in ProcGen demonstrate both generalization capabilities and the general applicability of our intrinsic reward.

* Accepted as a conference paper to the 32nd International Joint Conference on Artificial Intelligence (IJCAI-23)

Via

Access Paper or Ask Questions

Diverse Exploration via InfoMax Options

Oct 06, 2020

Yuji Kanagawa, Tomoyuki Kaneko

Figure 1 for Diverse Exploration via InfoMax Options

Figure 2 for Diverse Exploration via InfoMax Options

Figure 3 for Diverse Exploration via InfoMax Options

Figure 4 for Diverse Exploration via InfoMax Options

Abstract:In this paper, we study the problem of autonomously discovering temporally abstracted actions, or options, for exploration in reinforcement learning. For learning diverse options suitable for exploration, we introduce the infomax termination objective defined as the mutual information between options and their corresponding state transitions. We derive a scalable optimization scheme for maximizing this objective via the termination condition of options, yielding the InfoMax Option Critic (IMOC) algorithm. Through illustrative experiments, we empirically show that IMOC learns diverse options and utilizes them for exploration. Moreover, we show that IMOC scales well to continuous control tasks.

* Preprint. Under review

Via

Access Paper or Ask Questions

Playing Catan with Cross-dimensional Neural Network

Aug 17, 2020

Quentin Gendre, Tomoyuki Kaneko

Figure 1 for Playing Catan with Cross-dimensional Neural Network

Figure 2 for Playing Catan with Cross-dimensional Neural Network

Figure 3 for Playing Catan with Cross-dimensional Neural Network

Figure 4 for Playing Catan with Cross-dimensional Neural Network

Abstract:Catan is a strategic board game having interesting properties, including multi-player, imperfect information, stochastic, complex state space structure (hexagonal board where each vertex, edge and face has its own features, cards for each player, etc), and a large action space (including negotiation). Therefore, it is challenging to build AI agents by Reinforcement Learning (RL for short), without domain knowledge nor heuristics. In this paper, we introduce cross-dimensional neural networks to handle a mixture of information sources and a wide variety of outputs, and empirically demonstrate that the network dramatically improves RL in Catan. We also show that, for the first time, a RL agent can outperform jsettler, the best heuristic agent available.

* 12 pages, 5 tables and 10 figures; submitted to the ICONIP 2020

Via

Access Paper or Ask Questions

Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning

Apr 17, 2019

Yuji Kanagawa, Tomoyuki Kaneko

Figure 1 for Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning

Figure 2 for Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning

Figure 3 for Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning

Figure 4 for Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning

Abstract:This paper presents Rogue-Gym, that enables agents to learn and play a subset of the original Rogue game with the OpenAI Gym interface. In roguelike games, a player explores a dungeon where each floor is two dimensional grid maze with enemies, golds, and downstairs. Because the map of a dungeon is different each time an agent starts a new game, learning in Rogue-Gym inevitably involves generalization of experiences, in a highly abstract manner. We argue that this generalization in reinforcement learning is a big challenge for AI agents. Recently, deep reinforcement learning (DRL) has succeeded in many games. However, it has been pointed out that agents trained by DRL methods often overfit to the training environment. To investigate this problem, some research environments with procedural content generation have been proposed. Following these studies, we show that our Rogue-Gym imposes a new generalization problem of their policies. In our experiments, we evaluate a standard reinforcement learning method, PPO, with and without enhancements for generalization. The results show that some enhancements work effective, but that there is still a large room for improvement. Therefore, Rogue-Gym a is a new challenging domain for further studies.

* 8 pages, 14 figures, 4 tables, submitted to IEEE COG 2019

Via

Access Paper or Ask Questions