Abstract:Existing cross-domain few-shot learning (CDFSL) methods, which develop source-domain training strategies to enhance model transferability, face challenges with large-scale pre-trained models (LMs) due to inaccessible source data and training strategies. Moreover, fine-tuning LMs for CDFSL demands substantial computational resources, limiting practicality. This paper addresses the source-free CDFSL (SF-CDFSL) problem, tackling few-shot learning (FSL) in the target domain using only pre-trained models and a few target samples without source data or strategies. To overcome the challenge of inaccessible source data, this paper introduces Step-wise Distribution Alignment Guided Style Prompt Tuning (StepSPT), which implicitly narrows domain gaps through prediction distribution optimization. StepSPT proposes a style prompt to align target samples with the desired distribution and adopts a dual-phase optimization process. In the external process, a step-wise distribution alignment strategy factorizes prediction distribution optimization into a multi-step alignment problem to tune the style prompt. In the internal process, the classifier is updated using standard cross-entropy loss. Evaluations on five datasets demonstrate that StepSPT outperforms existing prompt tuning-based methods and SOTAs. Ablation studies further verify its effectiveness. Code will be made publicly available at \url{https://github.com/xuhuali-mxj/StepSPT}.
Abstract:Despite the impressive performance of recent unbiased Scene Graph Generation (SGG) methods, the current debiasing literature mainly focuses on the long-tailed distribution problem, whereas it overlooks another source of bias, i.e., semantic confusion, which makes the SGG model prone to yield false predictions for similar relationships. In this paper, we explore a debiasing procedure for the SGG task leveraging causal inference. Our central insight is that the Sparse Mechanism Shift (SMS) in causality allows independent intervention on multiple biases, thereby potentially preserving head category performance while pursuing the prediction of high-informative tail relationships. However, the noisy datasets lead to unobserved confounders for the SGG task, and thus the constructed causal models are always causal-insufficient to benefit from SMS. To remedy this, we propose Two-stage Causal Modeling (TsCM) for the SGG task, which takes the long-tailed distribution and semantic confusion as confounders to the Structural Causal Model (SCM) and then decouples the causal intervention into two stages. The first stage is causal representation learning, where we use a novel Population Loss (P-Loss) to intervene in the semantic confusion confounder. The second stage introduces the Adaptive Logit Adjustment (AL-Adjustment) to eliminate the long-tailed distribution confounder to complete causal calibration learning. These two stages are model agnostic and thus can be used in any SGG model that seeks unbiased predictions. Comprehensive experiments conducted on the popular SGG backbones and benchmarks show that our TsCM can achieve state-of-the-art performance in terms of mean recall rate. Furthermore, TsCM can maintain a higher recall rate than other debiasing methods, which indicates that our method can achieve a better tradeoff between head and tail relationships.
Abstract:Deep learning has been highly successful in computer vision with large amounts of labeled data, but struggles with limited labeled training data. To address this, Few-shot learning (FSL) is proposed, but it assumes that all samples (including source and target task data, where target tasks are performed with prior knowledge from source ones) are from the same domain, which is a stringent assumption in the real world. To alleviate this limitation, Cross-domain few-shot learning (CDFSL) has gained attention as it allows source and target data from different domains and label spaces. This paper provides a comprehensive review of CDFSL at the first time, which has received far less attention than FSL due to its unique setup and difficulties. We expect this paper to serve as both a position paper and a tutorial for those doing research in CDFSL. This review first introduces the definition of CDFSL and the issues involved, followed by the core scientific question and challenge. A comprehensive review of validated CDFSL approaches from the existing literature is then presented, along with their detailed descriptions based on a rigorous taxonomy. Furthermore, this paper outlines and discusses several promising directions of CDFSL that deserve further scientific investigation, covering aspects of problem setups, applications and theories.