Yu Tang

Tandon School of Engineering, New York University, New York, USA

Geo-ConvGRU: Geographically Masked Convolutional Gated Recurrent Unit for Bird-Eye View Segmentation

Dec 28, 2024

Guanglei Yang, Yongqiang Zhang, Wanlong Li, Yu Tang, Weize Shang, Feng Wen, Hongbo Zhang, Mingli Ding

Abstract: Convolutional Neural Networks (CNNs) have significantly impacted various computer vision tasks; however, they inherently struggle to model long-range dependencies explicitly because convolution operations are local. Although Transformers have addressed limitations in long-range dependencies for the spatial dimension, the temporal dimension remains underexplored. In this paper, we first highlight that 3D CNNs exhibit limitations in capturing long-range temporal dependencies. Though Transformers mitigate spatial dimension issues, they considerably increase the parameter count and reduce processing speed. To overcome these challenges, we introduce a simple yet effective module, Geographically Masked Convolutional Gated Recurrent Unit (Geo-ConvGRU), tailored for Bird's-Eye View segmentation. Specifically, we substitute the 3D CNN layers with ConvGRU in the temporal module to bolster the capacity of networks for handling temporal dependencies. Additionally, we integrate a geographical mask into the Convolutional Gated Recurrent Unit to suppress noise introduced by the temporal module. Comprehensive experiments conducted on the nuScenes dataset substantiate the merits of the proposed Geo-ConvGRU, revealing that our approach attains state-of-the-art performance in Bird's-Eye View segmentation.

Semi-Supervised Learning for Visual Bird's Eye View Semantic Segmentation

Aug 28, 2023

Junyu Zhu, Lina Liu, Yu Tang, Feng Wen, Wanlong Li, Yong Liu

Abstract: Visual bird's eye view (BEV) semantic segmentation helps autonomous vehicles understand the surrounding environment from images alone, including static elements (e.g., roads) and dynamic elements (e.g., vehicles, pedestrians). However, fully supervised methods require costly annotation, usually involving HD maps, 3D object bounding boxes, and camera extrinsic matrices, which limits the capability of visual BEV semantic segmentation. In this paper, we present a novel semi-supervised framework for visual BEV semantic segmentation to boost performance by exploiting unlabeled images during training. A consistency loss that makes full use of unlabeled data is then proposed to constrain the model not only on the semantic prediction but also on the BEV feature. Furthermore, we propose a novel and effective data augmentation method named conjoint rotation, which reasonably augments the dataset while maintaining the geometric relationship between the front-view images and the BEV semantic segmentation. Extensive experiments on the nuScenes and Argoverse datasets show that our semi-supervised framework can effectively improve prediction accuracy. To the best of our knowledge, this is the first work that explores improving visual BEV semantic segmentation performance using unlabeled data. The code will be publicly available.

Physics-informed Machine Learning for Calibrating Macroscopic Traffic Flow Models

Jul 12, 2023

Yu Tang, Li Jin, Kaan Ozbay

Abstract: Well-calibrated traffic flow models are fundamental to understanding traffic phenomena and designing control strategies. Traditional calibration has been developed based on optimization methods. In this paper, we propose a novel physics-informed, learning-based calibration approach that achieves performance comparable to or even better than that of optimization-based methods. To this end, we combine the classical deep autoencoder, an unsupervised machine learning model consisting of one encoder and one decoder, with traffic flow models. Our approach informs the decoder of the physical traffic flow models and thus induces the encoder to yield reasonable traffic parameters given flow and speed measurements. We also introduce the denoising autoencoder into our method so that it can handle not only normal data but also corrupted data with missing values. We verified our approach with a case study of I-210 E in California.
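
The idea of a physics-informed decoder can be illustrated with a minimal sketch. It assumes the Greenshields fundamental diagram as the traffic flow model, which is chosen here only for brevity and is not taken from the paper; the function names, parameter values, and synthetic data are likewise hypothetical. The encoder (a neural network that would output the parameters) is not shown. The sketch only shows how a physical model can reconstruct flow and speed from candidate parameters, so that the reconstruction error signals how plausible those parameters are.

```python
def greenshields_decoder(density, free_flow_speed, jam_density):
    """Physics 'decoder': map density and model parameters to (flow, speed).

    Assumes the Greenshields fundamental diagram (an illustrative choice).
    """
    speed = free_flow_speed * max(0.0, 1.0 - density / jam_density)
    flow = density * speed
    return flow, speed


def reconstruction_loss(observations, densities, free_flow_speed, jam_density):
    """Mean squared error between measured and reconstructed (flow, speed)."""
    total = 0.0
    for (flow_obs, speed_obs), density in zip(observations, densities):
        flow_hat, speed_hat = greenshields_decoder(
            density, free_flow_speed, jam_density
        )
        total += (flow_hat - flow_obs) ** 2 + (speed_hat - speed_obs) ** 2
    return total / len(observations)


# Synthetic measurements generated with known parameters (illustrative values).
densities = [10.0, 30.0, 60.0, 90.0]  # vehicles per km
observations = [greenshields_decoder(d, 100.0, 120.0) for d in densities]

# Parameters that match the data reconstruct it exactly; others do not.
loss_true = reconstruction_loss(observations, densities, 100.0, 120.0)
loss_wrong = reconstruction_loss(observations, densities, 80.0, 150.0)
```

In a learning-based setup, a loss of this kind would be backpropagated through the decoder to train the encoder; that training loop is outside the scope of this sketch.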

Geometric sliding mode control of mechanical systems on Lie groups

May 31, 2023

Eduardo Espindola, Yu Tang

Abstract: This paper presents a generalization of conventional sliding mode control designs for systems in Euclidean spaces to fully actuated simple mechanical systems whose configuration space is a Lie group for the trajectory-tracking problem. A generic kinematic control is first devised in the underlying Lie algebra, which enables the construction of a Lie group on the tangent bundle where the system state evolves. A sliding subgroup is then proposed on the tangent bundle with the desired sliding properties, and a control law is designed for the error dynamics trajectories to reach the sliding subgroup globally exponentially. Tracking control is then composed of the reaching law and sliding mode, and is applied for attitude tracking on the special orthogonal group SO(3) and the unit sphere $S^3$. Numerical simulations show the performance of the proposed geometric sliding-mode controller (GSMC) in contrast with two control schemes from the literature.

* 13 pages, 1 figure
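
For readers unfamiliar with attitude tracking on SO(3), the following minimal sketch shows the standard group-valued tracking error $R_e = R_d^\top R$ and its rotation angle. It is not the sliding subgroup or control law of the paper; the helper names (`rot_z`, `attitude_error`, `rotation_angle`) are hypothetical and the example only rotates about the z-axis.

```python
import math


def mat_mul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Transpose of a 3x3 matrix (the inverse, for a rotation matrix)."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rot_z(theta):
    """Rotation by theta radians about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def attitude_error(r_desired, r_actual):
    """Group error R_e = R_d^T R; equals the identity when tracking is exact."""
    return mat_mul(transpose(r_desired), r_actual)


def rotation_angle(r):
    """Angle of a rotation matrix, from trace(R) = 1 + 2*cos(angle)."""
    cos_angle = (r[0][0] + r[1][1] + r[2][2] - 1.0) / 2.0
    return math.acos(max(-1.0, min(1.0, cos_angle)))  # clamp rounding error


error_angle = rotation_angle(attitude_error(rot_z(0.2), rot_z(0.5)))
```

Defining the error on the group itself, rather than subtracting Euler angles, avoids the singularities of local coordinates, which is the usual motivation for geometric controllers.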

SAD: A Large-scale Dataset towards Airport Detection in Synthetic Aperture Radar Images

Apr 07, 2022

Daochang Wang, Fan Zhang, Fei Ma, Wei Hu, Yu Tang, Yongsheng Zhou

Abstract: Airports play an important role in both military and civilian domains. Synthetic aperture radar (SAR) based airport detection has received increasing attention in recent years. However, due to the high cost of SAR imaging and annotation, there is no publicly available SAR dataset for airport detection. As a result, deep learning methods have not been fully used in airport detection tasks. To provide a benchmark for airport detection research in SAR images, this paper introduces a large-scale SAR Airport Dataset (SAD). In order to adequately reflect the demands of real-world applications, it contains 624 SAR images from Sentinel-1B and covers 104 airfield instances with different scales, orientations, and shapes. Experiments with multiple deep learning approaches on this dataset demonstrate its effectiveness. It can support the development of state-of-the-art airport area detection algorithms and other relevant tasks.

DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation

Mar 30, 2022

Yu Tang, Chenyu Wang, Yufan Zhang, Yuliang Liu, Xingcheng Zhang, Linbo Qiao, Zhiquan Lai, Dongsheng Li

Abstract: The further development of deep neural networks is hampered by limited GPU memory. Therefore, optimizing GPU memory usage is in high demand. Swapping and recomputation are commonly applied to make better use of GPU memory in deep learning. However, as this is an emerging domain, several challenges remain: 1) The efficiency of recomputation is limited for both static and dynamic methods. 2) Swapping requires offloading parameters manually, which incurs a great time cost. 3) No existing dynamic, fine-grained method combines tensor swapping with tensor recomputation. To remedy the above issues, we propose a novel scheduler manager named DELTA (Dynamic tEnsor offLoad and recompuTAtion). To the best of our knowledge, we are the first to make a reasonable dynamic runtime scheduler on the combination of tensor swapping and tensor recomputation without user oversight. In DELTA, we propose a filter algorithm to select the optimal tensors to be released from GPU memory and present a director algorithm to select a proper action for each of these tensors. Furthermore, prefetching and overlapping are deliberately considered to overcome the time cost caused by swapping and recomputing tensors. Experimental results show that DELTA not only saves 40%-70% of GPU memory, substantially surpassing the state-of-the-art method, but also achieves convergence comparable to the baseline with acceptable time delay. Also, DELTA achieves a 2.04$\times$ larger maximum batch size when training ResNet-50 and 2.25$\times$ when training ResNet-101 compared with the baseline. Besides, comparisons between the swapping cost and recomputation cost in our experiments demonstrate the importance of making a reasonable dynamic scheduler on tensor swapping and tensor recomputation, which refutes the arguments in some related work that swapping should be the first and best choice.

* 12 pages, 6 figures
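
The two-stage structure described in the abstract (a filter that picks tensors to release, then a director that picks an action per tensor) can be sketched as follows. The heuristics here are assumptions made for illustration, not DELTA's actual algorithms: the filter releases the largest tensors first, and the director compares an estimated round-trip swap time with an estimated recomputation time. All names (`TensorInfo`, `filter_tensors`, `direct_action`) and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TensorInfo:
    name: str
    size_mb: float        # memory occupied on the GPU
    recompute_ms: float   # estimated time to recompute the tensor


def filter_tensors(tensors, mb_to_free):
    """Hypothetical filter: release the largest tensors until enough is freed."""
    chosen, freed = [], 0.0
    for tensor in sorted(tensors, key=lambda t: t.size_mb, reverse=True):
        if freed >= mb_to_free:
            break
        chosen.append(tensor)
        freed += tensor.size_mb
    return chosen


def direct_action(tensor, bandwidth_mb_per_ms):
    """Hypothetical director: pick whichever action is estimated to be cheaper."""
    swap_ms = 2.0 * tensor.size_mb / bandwidth_mb_per_ms  # offload + prefetch
    return "swap" if swap_ms <= tensor.recompute_ms else "recompute"


tensors = [
    TensorInfo("conv1_out", 100.0, 5.0),
    TensorInfo("conv2_out", 400.0, 20.0),
    TensorInfo("attn_out", 50.0, 30.0),
]
released = filter_tensors(tensors, mb_to_free=450.0)
plan = {t.name: direct_action(t, bandwidth_mb_per_ms=10.0) for t in tensors}
```

Even in this toy cost model, neither action wins for every tensor, which is consistent with the abstract's point that swapping is not always the best first choice.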

A Deep Reinforcement Learning Approach for Ramp Metering Based on Traffic Video Data

Dec 09, 2020

Bing Liu, Yu Tang, Yuxiong Ji, Yu Shen, Yuchuan Du

Abstract: Ramp metering, which uses traffic signals to regulate vehicle flows from on-ramps, has been widely implemented to improve vehicle mobility on freeways. Previous studies generally update signal timings in real time based on predefined traffic measures collected by point detectors, such as traffic volumes and occupancies. Compared with point detectors, traffic cameras, which have been increasingly deployed on road networks, can cover larger areas and provide more detailed traffic information. In this work, we propose a deep reinforcement learning (DRL) method to explore the potential of traffic video data in improving the efficiency of ramp metering. The proposed method uses traffic video frames as inputs and learns the optimal control strategies directly from the high-dimensional visual inputs. A real-world case study demonstrates that, in comparison with a state-of-the-practice method, the proposed DRL method results in 1) lower travel times in the mainline, 2) shorter vehicle queues at the on-ramp, and 3) higher traffic flows downstream of the merging area. The results suggest that the proposed method is able to extract useful information from the video data for better ramp metering controls.

ADMMiRNN: Training RNN with Stable Convergence via An Efficient ADMM Approach

Jun 17, 2020

Yu Tang, Zhigang Kan, Dequan Sun, Linbo Qiao, Jingjing Xiao, Zhiquan Lai, Dongsheng Li

Abstract: It is hard to train a Recurrent Neural Network (RNN) with stable convergence while avoiding vanishing and exploding gradients, as the weights in the recurrent unit are repeated from iteration to iteration. Moreover, an RNN is sensitive to the initialization of weights and biases, which makes the training phase difficult. With its gradient-free nature and immunity to poor conditioning, the Alternating Direction Method of Multipliers (ADMM) has become a promising algorithm to train neural networks beyond traditional stochastic gradient algorithms. However, ADMM cannot be applied to train an RNN directly, since the state in the recurrent unit is repetitively updated over timesteps. Therefore, this work builds a new framework named ADMMiRNN upon the unfolded form of the RNN to address the above challenges simultaneously and provides novel update rules and theoretical convergence analysis. We explicitly specify key update rules in the iterations of ADMMiRNN with deliberately constructed approximation techniques and solutions to each subproblem instead of vanilla ADMM. Numerical experiments are conducted on MNIST and text classification tasks, where ADMMiRNN achieves convergent results and outperforms the baselines. Furthermore, ADMMiRNN trains the RNN in a more stable way, without vanishing or exploding gradients, compared to stochastic gradient algorithms. Source code is available at https://github.com/TonyTangYu/ADMMiRNN.

* 17 pages, 11 figures
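
As background on the alternating structure of ADMM, here is a minimal sketch of generic scaled-form ADMM on a toy scalar problem: minimize 0.5*(x - a)^2 + lam*|z| subject to x = z. It is not the ADMMiRNN update rules, which the abstract describes as specific to the unfolded RNN. The function names and parameter values are illustrative assumptions.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |z|."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def admm_scalar_lasso(a, lam, rho=1.0, iterations=100):
    """Scaled-form ADMM for min 0.5*(x - a)^2 + lam*|z| subject to x = z."""
    x = z = u = 0.0
    for _ in range(iterations):
        # x-update: closed-form minimizer of the smooth quadratic subproblem.
        x = (a + rho * (z - u)) / (1.0 + rho)
        # z-update: closed-form minimizer of the non-smooth subproblem.
        z = soft_threshold(x + u, lam / rho)
        # Dual update: accumulate the constraint violation x - z.
        u = u + x - z
    return z


shrunk = admm_scalar_lasso(a=3.0, lam=1.0)
zeroed = admm_scalar_lasso(a=0.5, lam=1.0)
```

Each subproblem is solved in closed form rather than by following a gradient through the whole objective, which is the sense in which ADMM-style training is described as gradient-free.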

Median regression with differential privacy

Jun 04, 2020

E Chen, Ying Miao, Yu Tang

Abstract: Median regression analysis has robustness properties that make it attractive compared with regression based on the mean, while differential privacy can protect individual privacy during statistical analysis of certain datasets. In this paper, three privacy-preserving methods are proposed for median regression. The first algorithm is based on a finite smoothing method, the second provides an iterative approach, and the last one further employs the greedy coordinate descent approach. The privacy-preserving properties of all three methods are proved. Accuracy bounds or convergence properties of these algorithms are also provided. Numerical calculations show that the first method has better accuracy than the others when the sample size is small. When the sample size becomes larger, the first method needs more time, while the second method needs less time with well-matched accuracy. The third method takes less time in both cases, but it depends heavily on the step size.
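
To illustrate the robustness of median-type estimation and the idea of smoothing the absolute loss, here is a minimal sketch under stated assumptions. It uses a Huber-type smoothing of |r| and plain gradient descent on an intercept-only model; it adds no privacy noise and is not any of the three algorithms in the paper. The function names, the smoothing width `gamma`, the step size, and the data are all hypothetical.

```python
def huber_loss(residual, gamma):
    """Smooth approximation of |residual|: quadratic within gamma, linear outside."""
    magnitude = abs(residual)
    if magnitude <= gamma:
        return residual * residual / (2.0 * gamma)
    return magnitude - gamma / 2.0


def huber_gradient(residual, gamma):
    """Derivative of huber_loss w.r.t. the residual; always within [-1, 1]."""
    return max(-1.0, min(1.0, residual / gamma))


def smoothed_median(values, gamma=0.1, step=0.05, iterations=2000):
    """Gradient descent on the mean smoothed absolute loss (intercept only)."""
    estimate = 0.0
    for _ in range(iterations):
        grad = -sum(huber_gradient(v - estimate, gamma) for v in values)
        estimate -= step * grad / len(values)
    return estimate


data = [1.0, 2.0, 3.0, 4.0, 100.0]   # one gross outlier
mean_fit = sum(data) / len(data)      # pulled toward the outlier
median_fit = smoothed_median(data)    # stays near the sample median, 3.0
```

The per-observation gradient is bounded in magnitude by 1 regardless of how extreme an observation is, which is one reason absolute-loss objectives are convenient to analyze when each individual's influence must be limited.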
