Title: Not just Birds and Cars: Generic, Scalable and Explainable Models for Professional Visual Recognition

Mar 08, 2024

Junde Wu, Jiayuan Zhu, Min Xu, Yueming Jin

Abstract: Some visual recognition tasks are more challenging than general ones because they involve professional categories of images. Previous efforts, such as fine-grained vision classification, primarily introduced models tailored to specific tasks, such as identifying bird species or car brands, with limited scalability and generalizability. This paper aims to design a scalable and explainable model to solve Professional Visual Recognition tasks from a generic standpoint. We introduce a biologically-inspired structure named Pro-NeXt and reveal that Pro-NeXt exhibits substantial generalizability across diverse professional fields such as fashion, medicine, and art, areas previously considered disparate. Our basic-sized Pro-NeXt-B surpasses all preceding task-specific models across 12 distinct datasets within 5 diverse domains. Furthermore, we find that it scales well: scaling up Pro-NeXt in depth and width with increasing GFLOPs consistently enhances its accuracy. Beyond scalability and adaptability, the intermediate features of Pro-NeXt achieve reliable object detection and segmentation performance without extra training, highlighting its solid explainability. We will release the code to foster further research in this area.

* 20 pages including references. arXiv admin note: text overlap with arXiv:2211.15672
