Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yongzhe Zhang

ChallengeMe: An Adversarial Learning-enabled Text Summarization Framework

Feb 07, 2025

Xiaoyu Deng, Ye Zhang, Tianmin Guo, Yongzhe Zhang, Zhengjian Kang, Hang Yang

Abstract:The astonishing performance of large language models (LLMs) and their remarkable achievements in production and daily life have led to their widespread application in collaborative tasks. However, current large models face challenges such as hallucination and lack of specificity in content generation in vertical domain tasks. Inspired by the contrast and classification mechanisms in human cognitive processes, this paper constructs an adversarial learning-based prompt framework named ChallengeMe, which includes three cascaded solutions: generation prompts, evaluation prompts, and feedback optimization. In this process, we designed seven core optimization dimensions and set the threshold for adversarial learning. The results of mixed case studies on the text summarization task show that the proposed framework can generate more accurate and fluent text summaries compared to the current advanced mainstream LLMs.

Via

Access Paper or Ask Questions

LP-DETR: Layer-wise Progressive Relations for Object Detection

Feb 07, 2025

Zhengjian Kang, Ye Zhang, Xiaoyu Deng, Xintao Li, Yongzhe Zhang

Figure 1 for LP-DETR: Layer-wise Progressive Relations for Object Detection

Figure 2 for LP-DETR: Layer-wise Progressive Relations for Object Detection

Figure 3 for LP-DETR: Layer-wise Progressive Relations for Object Detection

Figure 4 for LP-DETR: Layer-wise Progressive Relations for Object Detection

Abstract:This paper presents LP-DETR (Layer-wise Progressive DETR), a novel approach that enhances DETR-based object detection through multi-scale relation modeling. Our method introduces learnable spatial relationships between object queries through a relation-aware self-attention mechanism, which adaptively learns to balance different scales of relations (local, medium and global) across decoder layers. This progressive design enables the model to effectively capture evolving spatial dependencies throughout the detection pipeline. Extensive experiments on COCO 2017 dataset demonstrate that our method improves both convergence speed and detection accuracy compared to standard self-attention module. The proposed method achieves competitive results, reaching 52.3\% AP with 12 epochs and 52.5\% AP with 24 epochs using ResNet-50 backbone, and further improving to 58.0\% AP with Swin-L backbone. Furthermore, our analysis reveals an interesting pattern: the model naturally learns to prioritize local spatial relations in early decoder layers while gradually shifting attention to broader contexts in deeper layers, providing valuable insights for future research in object detection.

* 7 pages, 4 figures

Via

Access Paper or Ask Questions

CoVis: A Collaborative Framework for Fine-grained Graphic Visual Understanding

Nov 27, 2024

Xiaoyu Deng, Zhengjian Kang, Xintao Li, Yongzhe Zhang, Tianmin Guo

Figure 1 for CoVis: A Collaborative Framework for Fine-grained Graphic Visual Understanding

Figure 2 for CoVis: A Collaborative Framework for Fine-grained Graphic Visual Understanding

Figure 3 for CoVis: A Collaborative Framework for Fine-grained Graphic Visual Understanding

Figure 4 for CoVis: A Collaborative Framework for Fine-grained Graphic Visual Understanding

Abstract:Graphic visual content helps in promoting information communication and inspiration divergence. However, the interpretation of visual content currently relies mainly on humans' personal knowledge background, thereby affecting the quality and efficiency of information acquisition and understanding. To improve the quality and efficiency of visual information transmission and avoid the limitation of the observer due to the information cocoon, we propose CoVis, a collaborative framework for fine-grained visual understanding. By designing and implementing a cascaded dual-layer segmentation network coupled with a large-language-model (LLM) based content generator, the framework extracts as much knowledge as possible from an image. Then, it generates visual analytics for images, assisting observers in comprehending imagery from a more holistic perspective. Quantitative experiments and qualitative experiments based on 32 human participants indicate that the CoVis has better performance than current methods in feature extraction and can generate more comprehensive and detailed visual descriptions than current general-purpose large models.

Via

Access Paper or Ask Questions

MT-SNN: Enhance Spiking Neural Network with Multiple Thresholds

Mar 20, 2023

Xiaoting Wang, Yanxiang Zhang, Yongzhe Zhang

Figure 1 for MT-SNN: Enhance Spiking Neural Network with Multiple Thresholds

Figure 2 for MT-SNN: Enhance Spiking Neural Network with Multiple Thresholds

Figure 3 for MT-SNN: Enhance Spiking Neural Network with Multiple Thresholds

Figure 4 for MT-SNN: Enhance Spiking Neural Network with Multiple Thresholds

Abstract:Spiking neural networks (SNNs), as a biology-inspired method mimicking the spiking nature of brain neurons, is a promising energy-efficient alternative to the traditional artificial neural networks (ANNs). The energy saving of SNNs is mainly from multiplication free property brought by binarized intermediate activations. In this paper, we proposed a Multiple Threshold (MT) approach to alleviate the precision loss brought by the binarized activations, such that SNNs can reach higher accuracy at fewer steps. We evaluate the approach on CIFAR10, CIFAR100 and DVS-CIFAR10, and demonstrate that MT can promote SNNs extensively, especially at early steps. For example, With MT, Parametric-Leaky-Integrate-Fire(PLIF) based VGG net can even outperform the ANN counterpart with 1 step.

* 10 pages, 5 figures, 3 tables

Via

Access Paper or Ask Questions