
Title: Rethinking the Evaluation of Unbiased Scene Graph Generation

Aug 03, 2022

Xingchen Li, Long Chen, Jian Shao, Shaoning Xiao, Songyang Zhang, Jun Xiao



Abstract: Owing to the severely imbalanced predicate distributions in common subject-object relations, current Scene Graph Generation (SGG) methods tend to predict frequent predicate categories and fail to recognize rare ones. To improve the robustness of SGG models across predicate categories, recent research has focused on unbiased SGG and adopted mean Recall@K (mR@K) as the main evaluation metric. However, we identify two overlooked issues with this de facto standard metric, which make current unbiased SGG evaluation vulnerable and unfair: 1) mR@K neglects the correlations among predicates and unintentionally breaks category independence by ranking all triplet predictions together regardless of predicate category, so the performance of some predicates is underestimated. 2) mR@K neglects the compositional diversity of different predicates and assigns excessively high weights to samples of overly simple categories that have few composable relation triplet types. This conflicts with the goal of the SGG task, which encourages models to detect more types of visual relationship triplets. In addition, we investigate the under-explored correlation between objects and predicates, which can serve as a simple but strong baseline for unbiased SGG. In this paper, we refine mR@K and propose two complementary evaluation metrics for unbiased SGG: Independent Mean Recall (IMR) and weighted IMR (wIMR). These two metrics are designed around category independence and the diversity of composable relation triplets, respectively. We compare the proposed metrics with the de facto standard metrics through extensive experiments and discuss how to evaluate unbiased SGG in a more trustworthy way.
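
To make the first issue concrete, the following is a minimal Python sketch on a single toy image. It assumes the usual form of mR@K: keep the K highest-scoring triplets, compute recall separately for each predicate category, then average over categories. The second function ranks predictions within each predicate category instead; it only illustrates the idea of category independence and is not the paper's definition of IMR, which the abstract does not give. All function names and the toy data are hypothetical.

```python
from collections import defaultdict

def recall_per_predicate(selected, ground_truth):
    """Recall of ground-truth triplets, computed separately per predicate."""
    hits = set(selected)
    totals = defaultdict(int)
    matched = defaultdict(int)
    for triplet in ground_truth:
        predicate = triplet[1]
        totals[predicate] += 1
        if triplet in hits:
            matched[predicate] += 1
    return {p: matched[p] / totals[p] for p in totals}

def joint_mean_recall_at_k(predictions, ground_truth, k):
    """mR@K-style score: all predictions compete for the same top-K slots."""
    ranked = sorted(predictions, key=lambda p: p[3], reverse=True)[:k]
    selected = [(s, p, o) for s, p, o, _ in ranked]
    recalls = recall_per_predicate(selected, ground_truth)
    return sum(recalls.values()) / len(recalls), recalls

def per_category_mean_recall_at_k(predictions, ground_truth, k):
    """Illustrative variant (an assumption, not the paper's IMR):
    each predicate category keeps its own top-K predictions."""
    by_predicate = defaultdict(list)
    for pred in predictions:
        by_predicate[pred[1]].append(pred)
    selected = []
    for group in by_predicate.values():
        ranked = sorted(group, key=lambda p: p[3], reverse=True)[:k]
        selected.extend((s, p, o) for s, p, o, _ in ranked)
    recalls = recall_per_predicate(selected, ground_truth)
    return sum(recalls.values()) / len(recalls), recalls

# Toy data: "on" and "riding" are correlated predicates for the same pair.
ground_truth = [
    ("man", "on", "horse"),
    ("man", "riding", "horse"),
    ("hat", "on", "man"),
]
# Each prediction is (subject, predicate, object, confidence score).
predictions = [
    ("man", "on", "horse", 0.9),
    ("hat", "on", "man", 0.8),
    ("horse", "on", "grass", 0.7),
    ("man", "riding", "horse", 0.3),
]

joint_score, joint_recalls = joint_mean_recall_at_k(predictions, ground_truth, k=2)
indep_score, indep_recalls = per_category_mean_recall_at_k(predictions, ground_truth, k=2)
print(joint_score, joint_recalls)
print(indep_score, indep_recalls)
```

In this toy case the correct "riding" triplet is predicted but pushed out of the joint top-2 by higher-scoring "on" triplets, so its recall under joint ranking is 0. Under per-category ranking it is counted. Real evaluations also aggregate over many images, which this sketch omits.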
