Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ye Tao

Light-YOLOv8-Flame: A Lightweight High-Performance Flame Detection Algorithm

Apr 15, 2025

Jiawei Lan, Ye Tao, Zhibiao Wang, Haoyang Yu, Wenhua Cui

Abstract:Fire detection algorithms, particularly those based on computer vision, encounter significant challenges such as high computational costs and delayed response times, which hinder their application in real-time systems. To address these limitations, this paper introduces Light-YOLOv8-Flame, a lightweight flame detection algorithm specifically designed for fast and efficient real-time deployment. The proposed model enhances the YOLOv8 architecture through the substitution of the original C2f module with the FasterNet Block module. This new block combines Partial Convolution (PConv) and Convolution (Conv) layers, reducing both computational complexity and model size. A dataset comprising 7,431 images, representing both flame and non-flame scenarios, was collected and augmented for training purposes. Experimental findings indicate that the modified YOLOv8 model achieves a 0.78% gain in mean average precision (mAP) and a 2.05% boost in recall, while reducing the parameter count by 25.34%, with only a marginal decrease in precision by 0.82%. These findings highlight that Light-YOLOv8-Flame offers enhanced detection performance and speed, making it well-suited for real-time fire detection on resource-constrained devices.

* 12 pages, 19 figures, 6 tables. Submitted to Engineering Letters

Via

Access Paper or Ask Questions

GSV3D: Gaussian Splatting-based Geometric Distillation with Stable Video Diffusion for Single-Image 3D Object Generation

Mar 08, 2025

Ye Tao, Jiawei Zhang, Yahao Shi, Dongqing Zou, Bin Zhou

Abstract:Image-based 3D generation has vast applications in robotics and gaming, where high-quality, diverse outputs and consistent 3D representations are crucial. However, existing methods have limitations: 3D diffusion models are limited by dataset scarcity and the absence of strong pre-trained priors, while 2D diffusion-based approaches struggle with geometric consistency. We propose a method that leverages 2D diffusion models' implicit 3D reasoning ability while ensuring 3D consistency via Gaussian-splatting-based geometric distillation. Specifically, the proposed Gaussian Splatting Decoder enforces 3D consistency by transforming SV3D latent outputs into an explicit 3D representation. Unlike SV3D, which only relies on implicit 2D representations for video generation, Gaussian Splatting explicitly encodes spatial and appearance attributes, enabling multi-view consistency through geometric constraints. These constraints correct view inconsistencies, ensuring robust geometric consistency. As a result, our approach simultaneously generates high-quality, multi-view-consistent images and accurate 3D models, providing a scalable solution for single-image-based 3D generation and bridging the gap between 2D Diffusion diversity and 3D structural coherence. Experimental results demonstrate state-of-the-art multi-view consistency and strong generalization across diverse datasets. The code will be made publicly available upon acceptance.

Via

Access Paper or Ask Questions

Differentially Private Distribution Estimation Using Functional Approximation

Jan 11, 2025

Ye Tao, Anand D. Sarwate

Abstract:The cumulative distribution function (CDF) is fundamental due to its ability to reveal information about random variables, making it essential in studies that require privacy-preserving methods to protect sensitive data. This paper introduces a novel privacy-preserving CDF method inspired by the functional analysis and functional mechanism. Our approach projects the empirical CDF into a predefined space, approximating it using specific functions, and protects the coefficients to achieve a differentially private empirical CDF. Compared to existing methods like histogram queries and adaptive quantiles, our method is preferable in decentralized settings and scenarios where CDFs must be updated with newly collected data.

* 11 pages, 8 figures

Via

Access Paper or Ask Questions

Push Quantization-Aware Training Toward Full Precision Performances via Consistency Regularization

Feb 21, 2024

Junbiao Pang, Tianyang Cai, Baochang Zhang, Jiaqi Wu, Ye Tao

Abstract:Existing Quantization-Aware Training (QAT) methods intensively depend on the complete labeled dataset or knowledge distillation to guarantee the performances toward Full Precision (FP) accuracies. However, empirical results show that QAT still has inferior results compared to its FP counterpart. One question is how to push QAT toward or even surpass FP performances. In this paper, we address this issue from a new perspective by injecting the vicinal data distribution information to improve the generalization performances of QAT effectively. We present a simple, novel, yet powerful method introducing an Consistency Regularization (CR) for QAT. Concretely, CR assumes that augmented samples should be consistent in the latent feature space. Our method generalizes well to different network architectures and various QAT methods. Extensive experiments demonstrate that our approach significantly outperforms the current state-of-the-art QAT methods and even FP counterparts.

* 11 pages, 5 figures

Via

Access Paper or Ask Questions

Knowledge Graphs and Pre-trained Language Models enhanced Representation Learning for Conversational Recommender Systems

Dec 27, 2023

Zhangchi Qiu, Ye Tao, Shirui Pan, Alan Wee-Chung Liew

Abstract:Conversational recommender systems (CRS) utilize natural language interactions and dialogue history to infer user preferences and provide accurate recommendations. Due to the limited conversation context and background knowledge, existing CRSs rely on external sources such as knowledge graphs to enrich the context and model entities based on their inter-relations. However, these methods ignore the rich intrinsic information within entities. To address this, we introduce the Knowledge-Enhanced Entity Representation Learning (KERL) framework, which leverages both the knowledge graph and a pre-trained language model to improve the semantic understanding of entities for CRS. In our KERL framework, entity textual descriptions are encoded via a pre-trained language model, while a knowledge graph helps reinforce the representation of these entities. We also employ positional encoding to effectively capture the temporal information of entities in a conversation. The enhanced entity representation is then used to develop a recommender component that fuses both entity and contextual representations for more informed recommendations, as well as a dialogue component that generates informative entity-related information in the response text. A high-quality knowledge graph with aligned entity descriptions is constructed to facilitate our study, namely the Wiki Movie Knowledge Graph (WikiMKG). The experimental results show that KERL achieves state-of-the-art results in both recommendation and response generation tasks.

Via

Access Paper or Ask Questions

An Automata-Theoretic Approach to Synthesizing Binarized Neural Networks

Jul 29, 2023

Ye Tao, Wanwei Liu, Fu Song, Zhen Liang, Ji Wang, Hongxu Zhu

Abstract:Deep neural networks, (DNNs, a.k.a. NNs), have been widely used in various tasks and have been proven to be successful. However, the accompanied expensive computing and storage costs make the deployments in resource-constrained devices a significant concern. To solve this issue, quantization has emerged as an effective way to reduce the costs of DNNs with little accuracy degradation by quantizing floating-point numbers to low-width fixed-point representations. Quantized neural networks (QNNs) have been developed, with binarized neural networks (BNNs) restricted to binary values as a special case. Another concern about neural networks is their vulnerability and lack of interpretability. Despite the active research on trustworthy of DNNs, few approaches have been proposed to QNNs. To this end, this paper presents an automata-theoretic approach to synthesizing BNNs that meet designated properties. More specifically, we define a temporal logic, called BLTL, as the specification language. We show that each BLTL formula can be transformed into an automaton on finite words. To deal with the state-explosion problem, we provide a tableau-based approach in real implementation. For the synthesis procedure, we utilize SMT solvers to detect the existence of a model (i.e., a BNN) in the construction process. Notably, synthesis provides a way to determine the hyper-parameters of the network before training.Moreover, we experimentally evaluate our approach and demonstrate its effectiveness in improving the individual fairness and local robustness of BNNs while maintaining accuracy to a great extent.

Via

Access Paper or Ask Questions

Time-to-Collision-Aware Lane-Change Strategy Based on Potential Field and Cubic Polynomial for Autonomous Vehicles

Jun 12, 2023

Pengfei Lin, Ehsan Javanmardi, Ye Tao, Vishal Chauhan, Jin Nakazato, Manabu Tsukada

Abstract:Making safe and successful lane changes (LCs) is one of the many vitally important functions of autonomous vehicles (AVs) that are needed to ensure safe driving on expressways. Recently, the simplicity and real-time performance of the potential field (PF) method have been leveraged to design decision and planning modules for AVs. However, the LC trajectory planned by the PF method is usually lengthy and takes the ego vehicle laterally parallel and close to the obstacle vehicle, which creates a dangerous situation if the obstacle vehicle suddenly steers. To mitigate this risk, we propose a time-to-collision-aware LC (TTCA-LC) strategy based on the PF and cubic polynomial in which the TTC constraint is imposed in the optimized curve fitting. The proposed approach is evaluated using MATLAB/Simulink under high-speed conditions in a comparative driving scenario. The simulation results indicate that the TTCA-LC method performs better than the conventional PF-based LC (CPF-LC) method in generating shorter, safer, and smoother trajectories. The length of the LC trajectory is shortened by over 27.1\%, and the curvature is reduced by approximately 56.1\% compared with the CPF-LC method.

* Accepted in IEEE Intelligent Vehicles Symposium (IV) 2023

Via

Access Paper or Ask Questions