Mateusz Pach

LucidPPN: Unambiguous Prototypical Parts Network for User-centric Interpretable Computer Vision

May 23, 2024

Mateusz Pach, Dawid Rymarczyk, Koryna Lewandowska, Jacek Tabor, Bartosz Zieliński

Abstract: Prototypical parts networks combine the power of deep learning with the explainability of case-based reasoning to make accurate, interpretable decisions. They follow "this looks like that" reasoning, representing each prototypical part with patches from training images. However, a single image patch comprises multiple visual features, such as color, shape, and texture, making it difficult for users to identify which feature is important to the model. To reduce this ambiguity, we introduce the Lucid Prototypical Parts Network (LucidPPN), a novel prototypical parts network that separates color prototypes from other visual features. Our method employs two reasoning branches: one for non-color visual features, processing grayscale images, and another focusing solely on color information. This separation allows us to clarify whether the model's decisions are based on color, shape, or texture. Additionally, LucidPPN identifies prototypical parts corresponding to semantic parts of classified objects, making comparisons between data classes more intuitive, e.g., when two bird species differ primarily in belly color. Our experiments demonstrate that the two branches are complementary and together achieve results comparable to baseline methods. More importantly, LucidPPN generates less ambiguous prototypical parts, enhancing user understanding.

* Work in the review process. The code will be available upon acceptance
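
The two-branch input separation described in the LucidPPN abstract can be illustrated with a small sketch. This is not the authors' code (which is unreleased): the function names `to_grayscale`, `coarse_color`, and `split_inputs` are hypothetical, and averaging color over small blocks is only an assumed way of giving a color branch input that carries little shape or texture detail. The grayscale conversion uses the standard ITU-R BT.601 luma weights.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]   # (R, G, B), each in [0, 1]
Image = List[List[Pixel]]            # rows of pixels


def to_grayscale(img: Image) -> List[List[float]]:
    """Input for the non-color branch: luma only (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in img]


def coarse_color(img: Image, block: int) -> Image:
    """Input for the color branch (assumed design, not from the paper):
    average color over non-overlapping block x block patches, which
    discards most shape and texture detail."""
    height, width = len(img), len(img[0])
    out: Image = []
    for top in range(0, height, block):
        out_row = []
        for left in range(0, width, block):
            patch = [img[y][x]
                     for y in range(top, min(top + block, height))
                     for x in range(left, min(left + block, width))]
            n = len(patch)
            out_row.append(tuple(sum(p[c] for p in patch) / n
                                 for c in range(3)))
        out.append(out_row)
    return out


def split_inputs(img: Image, block: int = 2):
    """Return (grayscale input, coarse color input) for the two branches."""
    return to_grayscale(img), coarse_color(img, block)


# Toy 2x2 image: red, green, blue, white.
toy = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
       [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]]
gray, color = split_inputs(toy, block=2)
```

With inputs separated like this, a prototype matched in the grayscale branch cannot owe its activation to color, which is the kind of disambiguation the abstract describes.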

Token Recycling for Efficient Sequential Inference with Vision Transformers

Nov 26, 2023

Jan Olszewski, Dawid Rymarczyk, Piotr Wójcik, Mateusz Pach, Bartosz Zieliński

Abstract: Vision Transformers (ViTs) outperform Convolutional Neural Networks in processing incomplete inputs because they do not require the imputation of missing values. Therefore, ViTs are well suited for sequential decision-making, e.g., in the Active Visual Exploration problem. However, they are computationally inefficient because they perform a full forward pass each time a new piece of sequential information arrives. To reduce this computational inefficiency, we introduce the TOken REcycling (TORE) modification for ViT inference, which can be used with any architecture. TORE divides a ViT into two parts: an iterator and an aggregator. The iterator processes sequential information separately into midway tokens, which are cached. The aggregator processes the midway tokens jointly to obtain the prediction. This way, we can reuse the results of computations made by the iterator. In addition to efficient sequential inference, we propose a complementary training policy, which significantly reduces the computational burden associated with sequential decision-making while achieving state-of-the-art accuracy.

* The code will be released upon acceptance
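
The iterator/aggregator split described in the TORE abstract can be sketched as a small caching wrapper. This is an illustrative sketch only, not the authors' implementation (which is unreleased): the class `RecyclingInference` and the toy `toy_iterator` and `toy_aggregator` functions are hypothetical stand-ins for the early and late parts of a ViT.

```python
from typing import Callable, List

Tokens = List[float]


class RecyclingInference:
    """Cache per-input 'midway tokens' so each input is encoded only once."""

    def __init__(self,
                 iterator: Callable[[Tokens], Tokens],
                 aggregator: Callable[[List[Tokens]], float]) -> None:
        self.iterator = iterator          # processes one new input on its own
        self.aggregator = aggregator      # processes all cached tokens jointly
        self.cache: List[Tokens] = []     # midway tokens from earlier steps
        self.iterator_calls = 0           # counts how often the iterator runs

    def step(self, new_input: Tokens) -> float:
        """Encode only the new input, then predict from the whole cache."""
        self.cache.append(self.iterator(new_input))
        self.iterator_calls += 1
        return self.aggregator(self.cache)


# Hypothetical stand-ins for the two halves of the network.
def toy_iterator(patch: Tokens) -> Tokens:
    return [2.0 * x for x in patch]


def toy_aggregator(cached: List[Tokens]) -> float:
    return sum(sum(tokens) for tokens in cached)


model = RecyclingInference(toy_iterator, toy_aggregator)
first = model.step([1.0, 2.0])   # iterator runs on the first glimpse only
second = model.step([3.0])       # earlier midway tokens are reused from cache
```

After `k` steps the iterator has run `k` times in total, whereas recomputing a full forward pass on all inputs at every step would re-encode earlier inputs again each time.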
