Abstract:Predicting potential outcomes of interventions from observational data is crucial for decision-making in medicine, but the task is challenging due to the fundamental problem of causal inference. Existing methods are largely limited to point estimates of potential outcomes with no uncertainty quantification; thus, the full information about the distributions of potential outcomes is typically ignored. In this paper, we propose a novel causal diffusion model called DiffPO, which is carefully designed for reliable inferences in medicine by learning the distribution of potential outcomes. In our DiffPO, we leverage a tailored conditional denoising diffusion model to learn complex distributions, where we address selection bias through a novel orthogonal diffusion loss. Another strength of our DiffPO method is that it is highly flexible (e.g., it can also be used to estimate different causal quantities such as the CATE). Across a wide range of experiments, we show that our method achieves state-of-the-art performance.
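A minimal sketch of the core training step such a causal diffusion model could use, assuming a simple MLP noise predictor conditioned on covariates, treatment, and timestep. The paper's orthogonal diffusion loss is not reproduced here; inverse-propensity weighting stands in as one common way to address selection bias, and all names (CondNoisePredictor, diffusion_loss) are illustrative.

```python
# Sketch: conditional denoising diffusion over outcomes y, conditioned on
# covariates x and treatment a. Illustrative only, not the authors' code.
import torch
import torch.nn as nn

T = 100                                    # diffusion steps
betas = torch.linspace(1e-4, 0.02, T)      # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

class CondNoisePredictor(nn.Module):
    """Predicts the noise added to outcome y, conditioned on (x, a, t)."""
    def __init__(self, x_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + 3, hidden), nn.SiLU(),  # inputs: y_t, x, a, t/T
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )
    def forward(self, y_t, x, a, t):
        t_emb = (t.float() / T).unsqueeze(-1)
        return self.net(torch.cat([y_t, x, a.unsqueeze(-1), t_emb], dim=-1))

def diffusion_loss(model, x, a, y, propensity):
    """One training step: noise y, predict the noise, reweight for selection bias."""
    t = torch.randint(0, T, (y.shape[0],))
    eps = torch.randn_like(y)
    a_bar = alphas_bar[t].unsqueeze(-1)
    y_t = a_bar.sqrt() * y + (1 - a_bar).sqrt() * eps    # forward noising
    eps_hat = model(y_t, x, a, t)
    # inverse-propensity weighting as a stand-in for the orthogonal loss
    w = a / propensity + (1 - a) / (1 - propensity)
    return (w.unsqueeze(-1) * (eps_hat - eps) ** 2).mean()
```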
Abstract:Robot motion planning has made vast advances over the past decades, but a challenge remains: mobile manipulators struggle to plan long-range whole-body motion in common household environments in real time, because of the high-dimensional robot configuration space and the complex environment geometry. To tackle this challenge, this paper proposes the Neural Randomized Planner (NRP), which combines a global sampling-based motion planning (SBMP) algorithm with a local neural sampler. Intuitively, NRP uses the search structure inside the global planner to stitch together learned local sampling distributions, adaptively forming a global sampling distribution. It benefits from both learning and planning. Locally, it tackles high dimensionality by learning to sample in promising regions from data, with a rich neural network representation. Globally, it composes the local sampling distributions through planning and exploits local geometric similarity to scale up to complex environments. Experiments both in simulation and on a real robot show that NRP yields superior performance compared to some of the best classical and learning-enhanced SBMP algorithms. Further, despite being trained in simulation, NRP demonstrates zero-shot transfer to a real robot operating in novel household environments, without any fine-tuning or manual adaptation.
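A minimal sketch of how a global sampling-based planner could stitch in a learned local sampler, assuming a plain RRT loop and a small MLP proposal network. The LocalSampler interface, the mixing probability p_learned, and the collision_free callback are illustrative assumptions, not the paper's architecture.

```python
# Sketch: RRT whose sampler is replaced, with some probability, by a learned
# local proposal around a random tree node. Illustrative only.
import numpy as np
import torch
import torch.nn as nn

class LocalSampler(nn.Module):
    """Proposes a nearby configuration given current node and goal (assumed I/O)."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
    def forward(self, q, goal):
        return q + 0.1 * torch.tanh(self.net(torch.cat([q, goal], dim=-1)))

def rrt_with_neural_sampler(start, goal, sampler, collision_free,
                            n_iters=2000, step=0.2, p_learned=0.5):
    dim = len(start)
    tree = [np.asarray(start, dtype=np.float64)]
    parents = {0: None}
    goal = np.asarray(goal, dtype=np.float64)
    for _ in range(n_iters):
        if np.random.rand() < p_learned:
            # learned local proposal around a random tree node
            q = tree[np.random.randint(len(tree))]
            with torch.no_grad():
                target = sampler(torch.tensor(q, dtype=torch.float32),
                                 torch.tensor(goal, dtype=torch.float32)).numpy()
        else:
            target = np.random.uniform(-1, 1, size=dim)  # uniform fallback
        near_i = int(np.argmin([np.linalg.norm(q - target) for q in tree]))
        direction = target - tree[near_i]
        new = tree[near_i] + step * direction / (np.linalg.norm(direction) + 1e-9)
        if collision_free(new):
            parents[len(tree)] = near_i
            tree.append(new)
            if np.linalg.norm(new - goal) < step:
                return tree, parents  # goal reached
    return tree, parents
```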
Abstract:Fairness in predictions is of direct practical importance due to legal, ethical, and societal reasons. It is often achieved through counterfactual fairness, which ensures that the prediction for an individual is the same as that in a counterfactual world under a different sensitive attribute. However, achieving counterfactual fairness is challenging, as counterfactuals are unobservable. In this paper, we develop a novel deep neural network called Generative Counterfactual Fairness Network (GCFN) for making predictions under counterfactual fairness. Specifically, we leverage a tailored generative adversarial network to directly learn the counterfactual distribution of the descendants of the sensitive attribute, which we then use to enforce fair predictions through a novel counterfactual mediator regularization. If the counterfactual distribution is learned sufficiently well, our method is mathematically guaranteed to satisfy the notion of counterfactual fairness. Our GCFN thereby addresses key shortcomings of existing baselines based on inferring latent variables, whose latents (a) are potentially correlated with the sensitive attributes and thus lead to bias, and (b) have limited capacity for constructing latent representations and thus low prediction performance. Across various experiments, our method achieves state-of-the-art performance. Using a real-world case study on recidivism prediction, we further demonstrate that our method makes meaningful predictions in practice.
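A minimal sketch of how the counterfactual mediator regularization could look, assuming a pre-trained generator that maps (covariates, factual mediator, flipped sensitive attribute) to a counterfactual mediator. The function names and the MSE form of the regularizer are illustrative; the paper's exact loss may differ.

```python
# Sketch: task loss + a penalty on the gap between predictions made from the
# factual mediator and from a generated counterfactual mediator.
import torch
import torch.nn.functional as F

def gcfn_predictor_loss(predictor, generator, x, m, a, y, lam=1.0):
    """x: non-descendant covariates, m: factual mediator,
    a: sensitive attribute in {0,1}, y: binary label (float)."""
    logits = predictor(x, m)
    task_loss = F.binary_cross_entropy_with_logits(logits, y)

    with torch.no_grad():
        m_cf = generator(x, m, 1 - a)   # counterfactual mediator under flipped a
    logits_cf = predictor(x, m_cf)

    # regularizer: predictions should match across factual/counterfactual worlds
    cf_reg = F.mse_loss(torch.sigmoid(logits), torch.sigmoid(logits_cf))
    return task_loss + lam * cf_reg
```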
Abstract:In many dense vision networks, all image tokens follow the same static data flow. However, challenges arise from the variance among objects in images, such as large variations in spatial scale and the varying recognition difficulty of visual entities. In this paper, we propose a data-dependent token routing strategy that elaborates the routing paths of image tokens for a Dynamic Vision Transformer, dubbed DiT. The proposed framework generates a data-dependent path per token, adapting to the object scales and visual discrimination of tokens. In the feed-forward pass, differentiable routing gates are designed to select the scaling paths and feature transformation paths for image tokens, leading to multi-path feature propagation. In this way, the impact of object scales and visual discrimination on the image representation can be carefully tuned. Moreover, the computational cost can be further reduced by imposing budget constraints on the routing gate and early-stopping of feature extraction. In experiments, our DiT achieves superior performance and more favorable complexity/accuracy trade-offs than many SoTA methods on ImageNet classification, object detection, instance segmentation, and semantic segmentation. In particular, DiT-B5 obtains 84.8\% top-1 accuracy on ImageNet with 10.3 GFLOPs, which is 1.0\% higher than that of the SoTA method with similar computational complexity. These extensive results demonstrate that DiT can serve as a versatile backbone for various vision tasks.
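A minimal sketch of per-token data-dependent routing with a differentiable gate. Two candidate paths (identity vs. an MLP) stand in for the paper's scaling and feature-transformation paths, and the Gumbel-softmax gate is an illustrative choice rather than the paper's exact mechanism.

```python
# Sketch: each token picks a path via a differentiable one-hot gate.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TokenRouter(nn.Module):
    def __init__(self, dim, tau=1.0):
        super().__init__()
        self.gate = nn.Linear(dim, 2)             # per-token routing logits
        self.paths = nn.ModuleList([
            nn.Identity(),                         # cheap path (skip)
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                          nn.Linear(4 * dim, dim)),  # expensive path
        ])
        self.tau = tau

    def forward(self, tokens):                     # tokens: (B, N, D)
        logits = self.gate(tokens)                 # (B, N, 2)
        # differentiable hard routing decision per token
        route = F.gumbel_softmax(logits, tau=self.tau, hard=True)
        outs = torch.stack([p(tokens) for p in self.paths], dim=-1)  # (B,N,D,2)
        # dense form for clarity; real FLOP savings require dispatching only
        # the tokens whose hard route selects the expensive path
        return (outs * route.unsqueeze(2)).sum(-1)
```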
Abstract:Recent advances have indicated the strengths of self-supervised pre-training for improving representation learning on downstream tasks. Existing works often utilize self-supervised pre-trained models by fine-tuning them on downstream tasks. However, fine-tuning does not generalize to the case where one needs to build a customized model architecture different from the self-supervised model. In this work, we formulate a new knowledge distillation framework that transfers the knowledge from self-supervised pre-trained models to any other student network via a novel approach named Embedding Graph Alignment. Specifically, inspired by the spirit of instance discrimination in self-supervised learning, we model the instance-instance relations by a graph formulation in the feature embedding space and distill the self-supervised teacher knowledge to a student network by aligning the teacher graph and the student graph. Our distillation scheme can be flexibly applied to transfer the self-supervised knowledge to enhance representation learning on various student networks. We demonstrate that our model outperforms multiple representative knowledge distillation methods on three benchmark datasets, including CIFAR100, STL10, and TinyImageNet. Code is available at: https://github.com/yccm/EGA.
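A minimal sketch of the graph-alignment idea: build instance-instance similarity graphs for teacher and student embeddings within a batch and align them. Cosine-similarity adjacency and an L2 edge-matching loss are illustrative stand-ins for the paper's exact graph construction and alignment objective.

```python
# Sketch: align the student's instance graph with the frozen teacher's graph.
import torch
import torch.nn.functional as F

def similarity_graph(emb):
    """Cosine-similarity adjacency over batch instances: (B, D) -> (B, B)."""
    z = F.normalize(emb, dim=-1)
    return z @ z.t()

def ega_loss(teacher_emb, student_emb):
    """Distillation loss that matches edges of the two instance graphs."""
    with torch.no_grad():
        g_t = similarity_graph(teacher_emb)   # teacher graph, no gradients
    g_s = similarity_graph(student_emb)       # student graph, trainable
    return F.mse_loss(g_s, g_t)
```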
Abstract:Recently, few-shot object detection has been widely adopted to deal with data-limited situations. While most previous works merely focus on performance on the few-shot categories, we argue that detecting all classes is crucial, as test samples may contain any instances in realistic applications, which requires the few-shot detector to learn new concepts without forgetting. Through analysis of transfer-learning-based methods, some neglected but beneficial properties are utilized to design a simple yet effective few-shot detector, Retentive R-CNN. It consists of a Bias-Balanced RPN to debias the pretrained RPN and a Re-detector to find few-shot class objects without forgetting previous knowledge. Extensive experiments on few-shot detection benchmarks show that Retentive R-CNN significantly outperforms state-of-the-art methods in overall performance across all settings, as it achieves competitive results on few-shot classes and does not degrade base-class performance at all. Our approach demonstrates that the long-desired never-forgetting learner is attainable in object detection.
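One illustrative reading of the Re-detector's "without forgetting" property, sketched below: keep a frozen copy of the base-class head alongside the fine-tuned head and merge their detections, so base-class outputs cannot degrade. The head interface (returning scores and boxes) is an assumption, not the paper's exact design.

```python
# Sketch: frozen base head + fine-tuned head, detections concatenated.
import copy
import torch
import torch.nn as nn

class ReDetector(nn.Module):
    def __init__(self, base_head: nn.Module):
        super().__init__()
        self.base_head = copy.deepcopy(base_head)   # frozen: preserves base classes
        for p in self.base_head.parameters():
            p.requires_grad = False
        self.novel_head = copy.deepcopy(base_head)  # fine-tuned on all classes

    def forward(self, roi_features):
        base_scores, base_boxes = self.base_head(roi_features)
        novel_scores, novel_boxes = self.novel_head(roi_features)
        # merge detections from both heads; downstream NMS resolves overlaps
        return (torch.cat([base_scores, novel_scores], dim=0),
                torch.cat([base_boxes, novel_boxes], dim=0))
```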
Abstract:We propose a dense object detector with an instance-wise sampling strategy, named IQDet. Instead of using hand-crafted prior sampling strategies, we first extract the regional feature of each ground truth to estimate its instance-wise quality distribution. Modeled by a mixture model over the spatial dimensions, the distribution is more robust to noise and better adapted to the semantic pattern of each instance. Based on this distribution, we propose a quality sampling strategy that automatically selects training samples in a probabilistic manner and trains with more high-quality samples. Extensive experiments on MS COCO show that our method consistently improves the baseline by nearly 2.4 AP without bells and whistles. Moreover, our best model achieves 51.6 AP, outperforming all existing state-of-the-art one-stage detectors while incurring no additional cost at inference time.
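A minimal sketch of quality-based probabilistic sample selection: rather than a hard IoU threshold, draw positive training samples with probability proportional to a per-location quality score. The quality scores are assumed given (e.g., from a regional feature head); the paper's spatial mixture model is not reproduced.

```python
# Sketch: sample positive locations for one ground truth in proportion to
# their quality scores (assumed non-negative, with at least one > 0).
import torch

def sample_positives(quality, n_samples=16):
    """quality: (N,) per-candidate scores for one ground truth.
    Returns indices sampled without replacement, favoring high quality."""
    probs = quality.clamp(min=0)
    probs = probs / probs.sum().clamp(min=1e-9)
    n = min(n_samples, int((probs > 0).sum()))
    return torch.multinomial(probs, n, replacement=False)

# usage: idx = sample_positives(torch.tensor([0.1, 0.8, 0.05, 0.6]), n_samples=2)
```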
Abstract:In this report, we present our object detection/instance segmentation system, MegDetV2, which works in a two-pass fashion: first detecting instances, then obtaining segmentation. Our baseline detector is mainly built on a newly designed RPN, called RPN++. On the COCO-2019 detection/instance-segmentation test-dev dataset, our system achieves 61.0/53.1 mAP, surpassing our 2018 winning results by 5.0/4.2 mAP, respectively. We achieved the best results in the COCO Challenge 2019 and 2020.
Abstract:Dense object detectors rely on the sliding-window paradigm, which predicts objects over a regular grid of the image. The feature maps at the grid points are then used to generate the bounding-box predictions. Such point features are convenient to use but may lack the explicit border information needed for accurate localization. In this paper, we propose a simple and efficient operator called BorderAlign to extract "border features" from the extreme points of the border to enhance the point feature. Based on BorderAlign, we design a novel detection architecture called BorderDet, which explicitly exploits the border information for stronger classification and more accurate localization. With a ResNet-50 backbone, our method improves the single-stage detector FCOS by 2.8 AP (38.6 vs. 41.4). With a ResNeXt-101-DCN backbone, our BorderDet obtains 50.3 AP, outperforming existing state-of-the-art approaches. The code is available at https://github.com/Megvii-BaseDetection/BorderDet.
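A minimal sketch of border feature extraction in the spirit of BorderAlign: sample points along each box edge, bilinearly interpolate the feature map there, and max-pool along the edge. The real operator is an efficient pooling kernel; this dense grid_sample version is illustrative only.

```python
# Sketch: max-pooled features along the four borders of one box.
import torch
import torch.nn.functional as F

def border_align(feat, box, n_pts=10):
    """feat: (C, H, W); box: (x0, y0, x1, y1) in pixel coords.
    Returns (4, C): border features for left/top/right/bottom."""
    C, H, W = feat.shape
    x0, y0, x1, y1 = box
    t = torch.linspace(0, 1, n_pts)
    borders = [
        torch.stack([torch.full_like(t, x0), y0 + t * (y1 - y0)], dim=-1),  # left
        torch.stack([x0 + t * (x1 - x0), torch.full_like(t, y0)], dim=-1),  # top
        torch.stack([torch.full_like(t, x1), y0 + t * (y1 - y0)], dim=-1),  # right
        torch.stack([x0 + t * (x1 - x0), torch.full_like(t, y1)], dim=-1),  # bottom
    ]
    out = []
    for pts in borders:
        # normalize pixel coords to [-1, 1] for grid_sample
        grid = torch.stack([pts[:, 0] / (W - 1) * 2 - 1,
                            pts[:, 1] / (H - 1) * 2 - 1], dim=-1)
        sampled = F.grid_sample(feat[None], grid[None, None],
                                align_corners=True)       # (1, C, 1, n_pts)
        out.append(sampled.amax(dim=-1)[0, :, 0])          # (C,)
    return torch.stack(out)                                # (4, C)
```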
Abstract:Image pattern recognition is an important area of digital image processing. An efficient pattern recognition algorithm should provide correct recognition at a reduced computational time. Among machine learning pattern recognition algorithms, the Artificial Fish Swarm Algorithm (AFSA) is a swarm intelligence optimization algorithm based on population-based stochastic search. To achieve acceptable results, many parameters in AFSA need to be adjusted. Among these parameters, visual and step are especially significant, since the movement of the artificial fish is governed by them. In standard AFSA, these two parameters remain constant until the algorithm terminates. Large values of these parameters increase the algorithm's global search capability, while small values improve its local search ability. In this paper, we empirically study the performance of AFSA and test different approaches to balancing local and global exploration, based on the adaptive modification of visual and step during algorithm execution. The proposed approaches are evaluated on four well-known benchmark functions. Experimental results show a considerable positive impact on the performance of AFSA. Convex optimization is integrated into the proposed work to achieve an ideal segmentation of the input image, an MR brain image.
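A minimal sketch of one adaptive schedule for visual and step: start large for global search and shrink over iterations to strengthen local search. The linear decay below is one plausible rule among the approaches the abstract alludes to, not necessarily the exact update used.

```python
# Sketch: linearly anneal AFSA's visual range and step size per iteration.
def adaptive_visual_step(iteration, max_iter,
                         visual_max=10.0, visual_min=0.5,
                         step_max=2.0, step_min=0.1):
    """Returns (visual, step) for the current iteration."""
    frac = iteration / max_iter          # 0 -> exploration, 1 -> exploitation
    visual = visual_max - frac * (visual_max - visual_min)
    step = step_max - frac * (step_max - step_min)
    return visual, step

# usage: at each AFSA iteration, the fish perform their prey/swarm/follow
# behaviors using the current (visual, step) instead of fixed constants.
```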