Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sihao Ding

AugMapNet: Improving Spatial Latent Structure via BEV Grid Augmentation for Enhanced Vectorized Online HD Map Construction

Mar 17, 2025

Thomas Monninger, Md Zafar Anwar, Stanislaw Antol, Steffen Staab, Sihao Ding

Abstract:Autonomous driving requires an understanding of the infrastructure elements, such as lanes and crosswalks. To navigate safely, this understanding must be derived from sensor data in real-time and needs to be represented in vectorized form. Learned Bird's-Eye View (BEV) encoders are commonly used to combine a set of camera images from multiple views into one joint latent BEV grid. Traditionally, from this latent space, an intermediate raster map is predicted, providing dense spatial supervision but requiring post-processing into the desired vectorized form. More recent models directly derive infrastructure elements as polylines using vectorized map decoders, providing instance-level information. Our approach, Augmentation Map Network (AugMapNet), proposes latent BEV grid augmentation, a novel technique that significantly enhances the latent BEV representation. AugMapNet combines vector decoding and dense spatial supervision more effectively than existing architectures while remaining as straightforward to integrate and as generic as auxiliary supervision. Experiments on nuScenes and Argoverse2 datasets demonstrate significant improvements in vectorized map prediction performance up to 13.3% over the StreamMapNet baseline on 60m range and greater improvements on larger ranges. We confirm transferability by applying our method to another baseline and find similar improvements. A detailed analysis of the latent BEV grid confirms a more structured latent space of AugMapNet and shows the value of our novel concept beyond pure performance improvement. The code will be released soon.

Via

Access Paper or Ask Questions

Be Aware of the Neighborhood Effect: Modeling Selection Bias under Interference

Apr 30, 2024

Haoxuan Li, Chunyuan Zheng, Sihao Ding, Peng Wu, Zhi Geng, Fuli Feng, Xiangnan He

Abstract:Selection bias in recommender system arises from the recommendation process of system filtering and the interactive process of user selection. Many previous studies have focused on addressing selection bias to achieve unbiased learning of the prediction model, but ignore the fact that potential outcomes for a given user-item pair may vary with the treatments assigned to other user-item pairs, named neighborhood effect. To fill the gap, this paper formally formulates the neighborhood effect as an interference problem from the perspective of causal inference and introduces a treatment representation to capture the neighborhood effect. On this basis, we propose a novel ideal loss that can be used to deal with selection bias in the presence of neighborhood effect. We further develop two new estimators for estimating the proposed ideal loss. We theoretically establish the connection between the proposed and previous debiasing methods ignoring the neighborhood effect, showing that the proposed methods can achieve unbiased learning when both selection bias and neighborhood effect are present, while the existing methods are biased. Extensive semi-synthetic and real-world experiments are conducted to demonstrate the effectiveness of the proposed methods.

* ICLR 24

Via

Access Paper or Ask Questions

Understanding and Counteracting Feature-Level Bias in Click-Through Rate Prediction

Feb 06, 2024

Jinqiu Jin, Sihao Ding, Wenjie Wang, Fuli Feng

Figure 1 for Understanding and Counteracting Feature-Level Bias in Click-Through Rate Prediction

Figure 2 for Understanding and Counteracting Feature-Level Bias in Click-Through Rate Prediction

Figure 3 for Understanding and Counteracting Feature-Level Bias in Click-Through Rate Prediction

Figure 4 for Understanding and Counteracting Feature-Level Bias in Click-Through Rate Prediction

Abstract:Common click-through rate (CTR) prediction recommender models tend to exhibit feature-level bias, which leads to unfair recommendations among item groups and inaccurate recommendations for users. While existing methods address this issue by adjusting the learning of CTR models, such as through additional optimization objectives, they fail to consider how the bias is caused within these models. To address this research gap, our study performs a top-down analysis on representative CTR models. Through blocking different components of a trained CTR model one by one, we identify the key contribution of the linear component to feature-level bias. We conduct a theoretical analysis of the learning process for the weights in the linear component, revealing how group-wise properties of training data influence them. Our experimental and statistical analyses demonstrate a strong correlation between imbalanced positive sample ratios across item groups and feature-level bias. Based on this understanding, we propose a minimally invasive yet effective strategy to counteract feature-level bias in CTR models by removing the biased linear weights from trained models. Additionally, we present a linear weight adjusting strategy that requires fewer random exposure records than relevant debiasing methods. The superiority of our proposed strategies are validated through extensive experiments on three real-world datasets.

Via

Access Paper or Ask Questions

F2BEV: Bird's Eye View Generation from Surround-View Fisheye Camera Images for Automated Driving

Mar 07, 2023

Ekta U. Samani, Feng Tao, Harshavardhan R. Dasari, Sihao Ding, Ashis G. Banerjee

Figure 1 for F2BEV: Bird's Eye View Generation from Surround-View Fisheye Camera Images for Automated Driving

Figure 2 for F2BEV: Bird's Eye View Generation from Surround-View Fisheye Camera Images for Automated Driving

Figure 3 for F2BEV: Bird's Eye View Generation from Surround-View Fisheye Camera Images for Automated Driving

Figure 4 for F2BEV: Bird's Eye View Generation from Surround-View Fisheye Camera Images for Automated Driving

Abstract:Bird's Eye View (BEV) representations are tremendously useful for perception-related automated driving tasks. However, generating BEVs from surround-view fisheye camera images is challenging due to the strong distortions introduced by such wide-angle lenses. We take the first step in addressing this challenge and introduce a baseline, F2BEV, to generate BEV height maps and semantic segmentation maps from fisheye images. F2BEV consists of a distortion-aware spatial cross attention module for querying and consolidating spatial information from fisheye image features in a transformer-style architecture followed by a task-specific head. We evaluate single-task and multi-task variants of F2BEV on our synthetic FB-SSEM dataset, all of which generate better BEV height and segmentation maps (in terms of the IoU) than a state-of-the-art BEV generation method operating on undistorted fisheye images. We also demonstrate height map generation from real-world fisheye images using F2BEV. An initial sample of our dataset is publicly available at https://tinyurl.com/58jvnscy

Via

Access Paper or Ask Questions

Causal Incremental Graph Convolution for Recommender System Retraining

Aug 16, 2021

Sihao Ding, Fuli Feng, Xiangnan He, Yong Liao, Jun Shi, Yongdong Zhang

Figure 1 for Causal Incremental Graph Convolution for Recommender System Retraining

Figure 2 for Causal Incremental Graph Convolution for Recommender System Retraining

Figure 3 for Causal Incremental Graph Convolution for Recommender System Retraining

Figure 4 for Causal Incremental Graph Convolution for Recommender System Retraining

Abstract:Real-world recommender system needs to be regularly retrained to keep with the new data. In this work, we consider how to efficiently retrain graph convolution network (GCN) based recommender models, which are state-of-the-art techniques for collaborative recommendation. To pursue high efficiency, we set the target as using only new data for model updating, meanwhile not sacrificing the recommendation accuracy compared with full model retraining. This is non-trivial to achieve, since the interaction data participates in both the graph structure for model construction and the loss function for model learning, whereas the old graph structure is not allowed to use in model updating. Towards the goal, we propose a \textit{Causal Incremental Graph Convolution} approach, which consists of two new operators named \textit{Incremental Graph Convolution} (IGC) and \textit{Colliding Effect Distillation} (CED) to estimate the output of full graph convolution. In particular, we devise simple and effective modules for IGC to ingeniously combine the old representations and the incremental graph and effectively fuse the long-term and short-term preference signals. CED aims to avoid the out-of-date issue of inactive nodes that are not in the incremental graph, which connects the new data with inactive nodes through causal inference. In particular, CED estimates the causal effect of new data on the representation of inactive nodes through the control of their collider. Extensive experiments on three real-world datasets demonstrate both accuracy gains and significant speed-ups over the existing retraining mechanism.

* submitted to TNNLS

Via

Access Paper or Ask Questions

Monte Carlo Filtering Objectives: A New Family of Variational Objectives to Learn Generative Model and Neural Adaptive Proposal for Time Series

May 20, 2021

Shuangshuang Chen, Sihao Ding, Yiannis Karayiannidis, Mårten Björkman

Figure 1 for Monte Carlo Filtering Objectives: A New Family of Variational Objectives to Learn Generative Model and Neural Adaptive Proposal for Time Series

Figure 2 for Monte Carlo Filtering Objectives: A New Family of Variational Objectives to Learn Generative Model and Neural Adaptive Proposal for Time Series

Figure 3 for Monte Carlo Filtering Objectives: A New Family of Variational Objectives to Learn Generative Model and Neural Adaptive Proposal for Time Series

Figure 4 for Monte Carlo Filtering Objectives: A New Family of Variational Objectives to Learn Generative Model and Neural Adaptive Proposal for Time Series

Abstract:Learning generative models and inferring latent trajectories have shown to be challenging for time series due to the intractable marginal likelihoods of flexible generative models. It can be addressed by surrogate objectives for optimization. We propose Monte Carlo filtering objectives (MCFOs), a family of variational objectives for jointly learning parametric generative models and amortized adaptive importance proposals of time series. MCFOs extend the choices of likelihood estimators beyond Sequential Monte Carlo in state-of-the-art objectives, possess important properties revealing the factors for the tightness of objectives, and allow for less biased and variant gradient estimates. We demonstrate that the proposed MCFOs and gradient estimations lead to efficient and stable model learning, and learned generative models well explain data and importance proposals are more sample efficient on various kinds of time series data.

* A complete version of manuscript accepted by IJCAI-21

Via

Access Paper or Ask Questions

Cirrus: A Long-range Bi-pattern LiDAR Dataset

Dec 05, 2020

Ze Wang, Sihao Ding, Ying Li, Jonas Fenn, Sohini Roychowdhury, Andreas Wallin, Lane Martin, Scott Ryvola, Guillermo Sapiro, Qiang Qiu

Figure 1 for Cirrus: A Long-range Bi-pattern LiDAR Dataset

Figure 2 for Cirrus: A Long-range Bi-pattern LiDAR Dataset

Figure 3 for Cirrus: A Long-range Bi-pattern LiDAR Dataset

Figure 4 for Cirrus: A Long-range Bi-pattern LiDAR Dataset

Abstract:In this paper, we introduce Cirrus, a new long-range bi-pattern LiDAR public dataset for autonomous driving tasks such as 3D object detection, critical to highway driving and timely decision making. Our platform is equipped with a high-resolution video camera and a pair of LiDAR sensors with a 250-meter effective range, which is significantly longer than existing public datasets. We record paired point clouds simultaneously using both Gaussian and uniform scanning patterns. Point density varies significantly across such a long range, and different scanning patterns further diversify object representation in LiDAR. In Cirrus, eight categories of objects are exhaustively annotated in the LiDAR point clouds for the entire effective range. To illustrate the kind of studies supported by this new dataset, we introduce LiDAR model adaptation across different ranges, scanning patterns, and sensor devices. Promising results show the great potential of this new dataset to the robotics and computer vision communities.

Via

Access Paper or Ask Questions

Range Adaptation for 3D Object Detection in LiDAR

Sep 26, 2019

Ze Wang, Sihao Ding, Ying Li, Minming Zhao, Sohini Roychowdhury, Andreas Wallin, Guillermo Sapiro, Qiang Qiu

Figure 1 for Range Adaptation for 3D Object Detection in LiDAR

Figure 2 for Range Adaptation for 3D Object Detection in LiDAR

Figure 3 for Range Adaptation for 3D Object Detection in LiDAR

Figure 4 for Range Adaptation for 3D Object Detection in LiDAR

Abstract:LiDAR-based 3D object detection plays a crucial role in modern autonomous driving systems. LiDAR data often exhibit severe changes in properties across different observation ranges. In this paper, we explore cross-range adaptation for 3D object detection using LiDAR, i.e., far-range observations are adapted to near-range. This way, far-range detection is optimized for similar performance to near-range one. We adopt a bird-eyes view (BEV) detection framework to perform the proposed model adaptation. Our model adaptation consists of an adversarial global adaptation, and a fine-grained local adaptation. The proposed cross range adaptation framework is validated on three state-of-the-art LiDAR based object detection networks, and we consistently observe performance improvement on the far-range objects, without adding any auxiliary parameters to the model. To the best of our knowledge, this paper is the first attempt to study cross-range LiDAR adaptation for object detection in point clouds. To demonstrate the generality of the proposed adaptation framework, experiments on more challenging cross-device adaptation are further conducted, and a new LiDAR dataset with high-quality annotated point clouds is released to promote future research.

Via

Access Paper or Ask Questions

Towards Recovery of Conditional Vectors from Conditional Generative Adversarial Networks

Dec 06, 2017

Sihao Ding, Andreas Wallin

Figure 1 for Towards Recovery of Conditional Vectors from Conditional Generative Adversarial Networks

Figure 2 for Towards Recovery of Conditional Vectors from Conditional Generative Adversarial Networks

Figure 3 for Towards Recovery of Conditional Vectors from Conditional Generative Adversarial Networks

Figure 4 for Towards Recovery of Conditional Vectors from Conditional Generative Adversarial Networks

Abstract:A conditional Generative Adversarial Network allows for generating samples conditioned on certain external information. Being able to recover latent and conditional vectors from a condi- tional GAN can be potentially valuable in various applications, ranging from image manipulation for entertaining purposes to diagnosis of the neural networks for security purposes. In this work, we show that it is possible to recover both latent and conditional vectors from generated images given the generator of a conditional generative adversarial network. Such a recovery is not trivial due to the often multi-layered non-linearity of deep neural networks. Furthermore, the effect of such recovery applied on real natural images are investigated. We discovered that there exists a gap between the recovery performance on generated and real images, which we believe comes from the difference between generated data distribution and real data distribution. Experiments are conducted to evaluate the recovered conditional vectors and the reconstructed images from these recovered vectors quantitatively and qualitatively, showing promising results.

* Under consideration for Pattern Recognition Letters, 11 pages

Via

Access Paper or Ask Questions