Patricio Loncomilla

YotoR-You Only Transform One Representation

May 30, 2024

José Ignacio Díaz Villa, Patricio Loncomilla, Javier Ruiz-del-Solar

Abstract: This paper introduces YotoR (You Only Transform One Representation), a novel deep learning model for object detection that combines Swin Transformers and YoloR architectures. Transformers, a revolutionary technology in natural language processing, have also significantly impacted computer vision, offering the potential to enhance accuracy and computational efficiency. YotoR combines the robust Swin Transformer backbone with the YoloR neck and head. In our experiments, YotoR models TP5 and BP4 consistently outperform YoloR P6 and Swin Transformers in various evaluations, delivering improved object detection performance and faster inference speeds than Swin Transformer models. These results highlight the potential for further model combinations and improvements in real-time object detection with Transformers. The paper concludes by emphasizing the broader implications of YotoR, including its potential to enhance transformer-based models for image-related tasks.

* 16 pages, 5 figures

A Survey on Deep Learning Methods for Robot Vision

Mar 28, 2018

Javier Ruiz-del-Solar, Patricio Loncomilla, Naiomi Soto

Abstract: Deep learning has allowed a paradigm shift in pattern recognition, from using hand-crafted features together with statistical classifiers to using general-purpose learning procedures for learning data-driven representations, features, and classifiers together. The application of this new paradigm has been particularly successful in computer vision, in which the development of deep learning methods for vision applications has become a hot research topic. Given that deep learning has already attracted the attention of the robot vision community, the main purpose of this survey is to address the use of deep learning in robot vision. To achieve this, a comprehensive overview of deep learning and its usage in computer vision is given, which includes a description of the most frequently used neural models and their main application areas. Then, the standard methodology and tools used for designing deep-learning-based vision systems are presented. Afterwards, a review of the principal work using deep learning in robot vision is presented, as well as current and future trends related to the use of deep learning in robotics. This survey is intended to be a guide for the developers of robot vision systems.
