Fuqiang Gu

Efficient Event-based Semantic Segmentation with Spike-driven Lightweight Transformer-based Networks

Dec 17, 2024

Xiaxin Zhu, Fangming Guo, Xianlei Long, Qingyi Gu, Chao Chen, Fuqiang Gu

Abstract:Event-based semantic segmentation has great potential in autonomous driving and robotics due to the advantages of event cameras, such as high dynamic range, low latency, and low power cost. Unfortunately, current artificial neural network (ANN)-based segmentation methods suffer from high computational demands, a reliance on image frames, and massive energy consumption, limiting their efficiency and application on resource-constrained edge/mobile platforms. To address these problems, we introduce SLTNet, a spike-driven lightweight transformer-based network designed for event-based semantic segmentation. Specifically, SLTNet is built on efficient spike-driven convolution blocks (SCBs) to extract rich semantic features while reducing the model's parameters. Then, to enhance long-range contextual feature interaction, we propose novel spike-driven transformer blocks (STBs) with binary mask operations. Based on these basic blocks, SLTNet employs a high-efficiency single-branch architecture while maintaining the low energy consumption of the Spiking Neural Network (SNN). Finally, extensive experiments on the DDD17 and DSEC-Semantic datasets demonstrate that SLTNet outperforms state-of-the-art (SOTA) SNN-based methods by at least 7.30% and 3.30% mIoU, respectively, with 5.48x lower energy consumption and 1.14x faster inference speed.

* Submitted to IEEE ICRA 2025
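
The SLTNet abstract above reports its gains in mIoU (mean intersection over union). As a reading aid, here is a minimal sketch of how that metric is commonly computed from flattened per-pixel class labels. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes (illustrative sketch).

    pred and target are equal-length sequences of integer class labels,
    one per pixel. Classes that appear in neither sequence are skipped
    (an assumption; evaluation protocols differ on this point).
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For example, predictions `[0, 0, 1, 1]` against labels `[0, 1, 1, 1]` give per-class IoUs of 1/2 and 2/3, so the mean is 7/12.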

A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching

May 07, 2024

Xianlei Long, Hui Zhao, Chao Chen, Fuqiang Gu, Qingyi Gu

Abstract:In recent years, wide-area visual surveillance systems have been widely applied in various industrial and transportation scenarios. These systems, however, face significant challenges when implementing multi-object detection due to conflicts arising from the need for high-resolution imaging, efficient object searching, and accurate localization. To address these challenges, this paper presents a hybrid system that incorporates a wide-angle camera, a high-speed search camera, and a galvano-mirror. In this system, the wide-angle camera offers panoramic images as prior information, which helps the search camera capture detailed images of the targeted objects. This integrated approach enhances the overall efficiency and effectiveness of wide-area visual detection systems. Specifically, in this study, we introduce a wide-angle camera-based method to generate a panoramic probability map (PPM) for estimating high-probability regions of target object presence. Then, we propose a probability searching module that uses the PPM-generated prior information to dynamically adjust the sampling range and refine target coordinates based on uncertainty variance computed by the object detector. Finally, the integration of PPM and the probability searching module yields an efficient hybrid vision system capable of achieving 120 fps multi-object search and detection. Extensive experiments are conducted to verify the system's effectiveness and robustness.

* Accepted by the 2024 IEEE International Conference on Robotics and Automation (ICRA)

MoMa-Pos: Where Should Mobile Manipulators Stand in Cluttered Environment Before Task Execution?

Mar 29, 2024

Beichen Shao, Yan Ding, Xingchen Wang, Xuefeng Xie, Fuqiang Gu, Jun Luo, Chao Chen

Abstract:Mobile manipulators always need to determine feasible base positions prior to carrying out navigation-manipulation tasks. Real-world environments are often cluttered with various furniture, obstacles, and dozens of other objects. Efficiently computing base positions poses a challenge. In this work, we introduce a framework named MoMa-Pos to address this issue. MoMa-Pos first learns to predict a small set of objects that, taken together, would be sufficient for finding base positions using a graph embedding architecture. MoMa-Pos then calculates standing positions by considering furniture structures, robot models, and obstacles comprehensively. We have extensively evaluated the proposed MoMa-Pos across different settings (e.g., environment and algorithm parameters) and with various mobile manipulators. Our empirical results show that MoMa-Pos demonstrates remarkable effectiveness and efficiency in its performance, surpassing the methods in the literature. Supplementary material can be found at https://yding25.com/MoMa-Pos.

* Submitted to IROS 2024

EdgeVO: An Efficient and Accurate Edge-based Visual Odometry

Feb 19, 2023

Hui Zhao, Jianga Shang, Kai Liu, Chao Chen, Fuqiang Gu

Abstract:Visual odometry is important for plenty of applications such as autonomous vehicles and robot navigation. It is challenging to conduct visual odometry in textureless scenes or environments with sudden illumination changes where popular feature-based methods or direct methods cannot work well. To address this challenge, some edge-based methods have been proposed, but they usually struggle to balance efficiency and accuracy. In this work, we propose a novel visual odometry approach called EdgeVO, which is accurate, efficient, and robust. By efficiently selecting a small set of edges with certain strategies, we significantly improve the computational efficiency without sacrificing the accuracy. Compared to existing edge-based methods, our method can significantly reduce the computational complexity while maintaining similar accuracy or even achieving better accuracy. This is because our method removes useless or noisy edges. Experimental results on the TUM datasets indicate that EdgeVO significantly outperforms other methods in terms of efficiency, accuracy and robustness.

* Accepted by 2023 IEEE International Conference on Robotics and Automation (ICRA)

Location reference recognition from texts: A survey and comparison

Jul 04, 2022

Xuke Hu, Zhiyong Zhou, Hao Li, Yingjie Hu, Fuqiang Gu, Jens Kersten, Hongchao Fan, Friederike Klan

Abstract:A vast amount of location information exists in unstructured texts, such as social media posts, news stories, scientific articles, web pages, travel blogs, and historical archives. Geoparsing refers to the process of recognizing location references from texts and identifying their geospatial representations. While geoparsing can benefit many domains, a summary of the specific applications is still missing. Further, a comprehensive review and comparison of existing approaches for location reference recognition, which is the first and a core step of geoparsing, is lacking. To fill these research gaps, this review first summarizes seven typical application domains of geoparsing: geographic information retrieval, disaster management, disease surveillance, traffic management, spatial humanities, tourism management, and crime management. We then review existing approaches for location reference recognition by categorizing these approaches into four groups based on their underlying functional principle: rule-based, gazetteer matching-based, statistical learning-based, and hybrid approaches. Next, we thoroughly evaluate the correctness and computational efficiency of the 27 most widely used approaches for location reference recognition based on 26 public datasets with different types of texts (e.g., social media posts and news stories) containing 39,736 location references across the world. Results from this thorough evaluation can help inform future methodological developments for location reference recognition, and can help guide the selection of proper approaches based on application needs.

* 35 pages, 11 figures

Surrogate-based cross-correlation for particle image velocimetry

Dec 10, 2021

Yong Lee, Fuqiang Gu, Zeyu Gong

Abstract:This paper presents a novel surrogate-based cross-correlation (SBCC) framework to improve the correlation performance between two image signals. The basic idea behind the SBCC is that an optimized surrogate filter/image, supplanting one original image, will produce a more robust and more accurate correlation signal. The cross-correlation estimation of the SBCC is formulated with an objective function composed of surrogate loss and correlation consistency loss. The closed-form solution provides an efficient estimation. To our surprise, the SBCC framework could provide an alternative view to explain a set of generalized cross-correlation (GCC) methods and comprehend the meaning of parameters. With the help of our SBCC framework, we further propose four new specific cross-correlation methods, and provide some suggestions for improving existing GCC methods. A noticeable fact is that the SBCC could enhance the correlation robustness by incorporating other negative context images. Considering the sub-pixel accuracy and robustness requirement of particle image velocimetry (PIV), the contribution of each term in the objective function is investigated with particles' images. Compared with the state-of-the-art baseline methods, the SBCC methods exhibit improved performance (accuracy and robustness) on the synthetic dataset and several challenging real experimental PIV cases.

* 13 pages, 11 figures
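
For context on the SBCC abstract above, the following is a minimal sketch of plain cross-correlation between two small image windows, the kind of baseline signal that correlation-based PIV starts from. It is not the SBCC method: there is no surrogate filter, no loss terms, and no sub-pixel refinement. The function names, the circular (wrap-around) boundary handling, and the `max_shift` parameter are assumptions chosen to keep the example short.

```python
def cross_correlate(a, b, max_shift):
    """Score every integer shift (dy, dx) of window b against window a.

    a and b are equal-sized 2D lists of intensities. Shifts wrap around
    the window edges (circular correlation), an illustrative simplification.
    """
    h, w = len(a), len(a[0])
    scores = {}
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            s = 0.0
            for y in range(h):
                for x in range(w):
                    s += a[y][x] * b[(y + dy) % h][(x + dx) % w]
            scores[(dy, dx)] = s
    return scores


def estimate_displacement(a, b, max_shift=3):
    """Return the (dy, dx) shift with the highest correlation score."""
    scores = cross_correlate(a, b, max_shift)
    return max(scores, key=scores.get)
```

With a single bright particle at row 2, column 2 in the first window and at row 3, column 5 in the second, `estimate_displacement` returns `(1, 3)`. Real PIV images contain noise and many particles, which is where the robustness concerns raised in the abstract come in.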

EventDrop: data augmentation for event-based learning

Jun 07, 2021

Fuqiang Gu, Weicong Sng, Xuke Hu, Fangwen Yu

Abstract:The advantages of event-sensing over conventional sensors (e.g., higher dynamic range, lower time latency, and lower power consumption) have spurred research into machine learning for event data. Unsurprisingly, deep learning has emerged as a competitive methodology for learning with event sensors; in typical setups, discrete and asynchronous events are first converted into frame-like tensors on which standard deep networks can be applied. However, over-fitting remains a challenge, particularly since event datasets remain small relative to conventional datasets (e.g., ImageNet). In this paper, we introduce EventDrop, a new method for augmenting asynchronous event data to improve the generalization of deep models. By dropping events selected with various strategies, we are able to increase the diversity of training data (e.g., to simulate various levels of occlusion). From a practical perspective, EventDrop is simple to implement and computationally low-cost. Experiments on two event datasets (N-Caltech101 and N-Cars) demonstrate that EventDrop can significantly improve the generalization performance across a variety of deep networks.

* IJCAI 2021
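
The EventDrop abstract above describes augmenting event data by dropping events selected with various strategies, for instance to simulate occlusion. A minimal sketch of that idea follows. The abstract does not list the strategies, so the three functions here (random, time-window, and area-based dropping), their names, their parameters, and the `(x, y, t, p)` tuple layout are assumptions for illustration rather than the authors' implementation.

```python
import random

# Assumed event layout: (x, y, t, p) = pixel column, pixel row,
# timestamp, polarity.


def random_drop(events, ratio, rng):
    """Drop each event independently with probability `ratio`."""
    return [e for e in events if rng.random() >= ratio]


def drop_by_time(events, start_frac, length_frac):
    """Drop events whose timestamp falls inside a window of the sequence.

    The window starts at `start_frac` of the sequence duration and
    covers `length_frac` of it.
    """
    t_min = min(e[2] for e in events)
    t_max = max(e[2] for e in events)
    span = t_max - t_min
    lo = t_min + start_frac * span
    hi = lo + length_frac * span
    return [e for e in events if not (lo <= e[2] < hi)]


def drop_by_area(events, x0, y0, width, height):
    """Drop events inside a rectangle, roughly simulating an occluder."""
    return [
        e for e in events
        if not (x0 <= e[0] < x0 + width and y0 <= e[1] < y0 + height)
    ]
```

In training, one such function would typically be picked at random per sample, for example `random_drop(events, 0.2, random.Random(0))`, before the remaining events are converted into the frame-like tensors mentioned in the abstract.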

Fast and Reliable WiFi Fingerprint Collection for Indoor Localization

Aug 01, 2020

Fuqiang Gu, Milad Ramezani, Kourosh Khoshelham, Xiaoping Zheng, Ruiqin Zhou, Jianga Shang

Abstract:Fingerprinting is a popular indoor localization technique since it can utilize existing infrastructures (e.g., access points). However, its site survey process is a labor-intensive and time-consuming task, which limits the application of such systems in practice. In this paper, motivated by the availability of advanced sensing capabilities in smartphones, we propose a fast and reliable fingerprint collection method to reduce the time and labor required for site survey. The proposed method uses a landmark graph-based method to automatically associate the collected fingerprints, which does not require active user participation. We will show that besides fast fingerprint data collection, the proposed method results in accurate location estimates compared to state-of-the-art methods. Experimental results show that the proposed method is an order of magnitude faster than the manual fingerprint collection method, and using the radio map generated by our method achieves much better accuracy than existing methods.

TactileSGNet: A Spiking Graph Neural Network for Event-based Tactile Object Recognition

Aug 01, 2020

Fuqiang Gu, Weicong Sng, Tasbolat Taunyazov, Harold Soh

Abstract:Tactile perception is crucial for a variety of robot tasks including grasping and in-hand manipulation. New advances in flexible, event-driven, electronic skins may soon endow robots with touch perception capabilities similar to humans. These electronic skins respond asynchronously to changes (e.g., in pressure, temperature), and can be laid out irregularly on the robot's body or end-effector. However, these unique features may render current deep learning approaches such as convolutional feature extractors unsuitable for tactile learning. In this paper, we propose a novel spiking graph neural network for event-based tactile object recognition. To make use of local connectivity of taxels, we present several methods for organizing the tactile data in a graph structure. Based on the constructed graphs, we develop a spiking graph convolutional network. The event-driven nature of spiking neural networks makes them arguably more suitable for processing event-based data. Experimental results on two tactile datasets show that the proposed method outperforms other state-of-the-art spiking methods, achieving high accuracies of approximately 90% when classifying a variety of different household objects.

* IROS 2020
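
The TactileSGNet abstract above mentions organizing irregularly placed taxels into a graph based on local connectivity. One common way to build such a graph is to link each sensor to its k nearest neighbours, sketched below. The abstract does not say which construction methods the paper uses, so the k-nearest-neighbour choice, the function name `knn_graph`, and the 2D coordinate layout are all assumptions for illustration.

```python
import math


def knn_graph(positions, k):
    """Build an undirected k-nearest-neighbour graph over sensor positions.

    positions is a list of (x, y) taxel coordinates. Returns a sorted
    list of edges (i, j) with i < j. Distance ties are broken by index.
    """
    edges = set()
    for i, p in enumerate(positions):
        # Distances from taxel i to every other taxel, nearest first.
        dists = sorted(
            (math.dist(p, q), j)
            for j, q in enumerate(positions)
            if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)
```

The resulting edge list is the kind of structure a graph convolution layer would consume, with per-taxel event or spike features attached to the nodes.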
