Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jianing Zhao

How to Understand "Support"? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding

Mar 04, 2024

Jiamin Luo, Jianing Zhao, Jingjing Wang, Guodong Zhou

Figure 1 for How to Understand "Support"? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding

Figure 2 for How to Understand "Support"? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding

Figure 3 for How to Understand "Support"? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding

Figure 4 for How to Understand "Support"? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding

Abstract:Weakly-supervised Phrase Grounding (WPG) is an emerging task of inferring the fine-grained phrase-region matching, while merely leveraging the coarse-grained sentence-image pairs for training. However, existing studies on WPG largely ignore the implicit phrase-region matching relations, which are crucial for evaluating the capability of models in understanding the deep multimodal semantics. To this end, this paper proposes an Implicit-Enhanced Causal Inference (IECI) approach to address the challenges of modeling the implicit relations and highlighting them beyond the explicit. Specifically, this approach leverages both the intervention and counterfactual techniques to tackle the above two challenges respectively. Furthermore, a high-quality implicit-enhanced dataset is annotated to evaluate IECI and detailed evaluations show the great advantages of IECI over the state-of-the-art baselines. Particularly, we observe an interesting finding that IECI outperforms the advanced multimodal LLMs by a large margin on this implicit-enhanced dataset, which may facilitate more research to evaluate the multimodal LLMs in this direction.

Via

Access Paper or Ask Questions

To Explore or Not to Explore: Regret-Based LTL Planning in Partially-Known Environments

Apr 01, 2022

Jianing Zhao, Keyi Zhu, Xiang Yin, Shaoyuan Li

Figure 1 for To Explore or Not to Explore: Regret-Based LTL Planning in Partially-Known Environments

Figure 2 for To Explore or Not to Explore: Regret-Based LTL Planning in Partially-Known Environments

Figure 3 for To Explore or Not to Explore: Regret-Based LTL Planning in Partially-Known Environments

Abstract:In this paper, we investigate the optimal robot path planning problem for high-level specifications described by co-safe linear temporal logic (LTL) formulae. We consider the scenario where the map geometry of the workspace is partially-known. Specifically, we assume that there are some unknown regions, for which the robot does not know their successor regions a priori unless it reaches these regions physically. In contrast to the standard game-based approach that optimizes the worst-case cost, in the paper, we propose to use regret as a new metric for planning in such a partially-known environment. The regret of a plan under a fixed but unknown environment is the difference between the actual cost incurred and the best-response cost the robot could have achieved if it realizes the actual environment with hindsight. We provide an effective algorithm for finding an optimal plan that satisfies the LTL specification while minimizing its regret. A case study on firefighting robots is provided to illustrate the proposed framework. We argue that the new metric is more suitable for the scenario of partially-known environment since it captures the trade-off between the actual cost spent and the potential benefit one may obtain for exploring an unknown region.

Via

Access Paper or Ask Questions