Konstantinos P. Panousis

Clarity: The Flexibility-Interpretability Trade-Off in Sparsity-aware Concept Bottleneck Models

Jan 29, 2026

Konstantinos P. Panousis, Diego Marcos

Abstract: The widespread adoption of Vision-Language Models (VLMs) across fields has amplified concerns about model interpretability. Distressingly, these models are often treated as black boxes, with limited or non-existent investigation of their decision-making process. Despite numerous post- and ante-hoc interpretability methods, systematic and objective evaluation of the learned representations remains limited, particularly for sparsity-aware methods that are increasingly considered to "induce interpretability". In this work, we focus on Concept Bottleneck Models and investigate how different modeling decisions affect the emerging representations. We introduce the notion of clarity, a measure capturing the interplay between the downstream performance and the sparsity and precision of the concept representation, while proposing an interpretability assessment framework using datasets with ground truth concept annotations. We consider both VLM- and attribute predictor-based CBMs, and three different sparsity-inducing strategies: per-example $\ell_1$, $\ell_0$, and Bernoulli-based formulations. Our experiments reveal a critical trade-off between flexibility and interpretability, under which a given method can exhibit markedly different behaviors even at comparable performance levels. The code will be made publicly available upon publication.

DISCOVER: Making Vision Networks Interpretable via Competition and Dissection

Oct 07, 2023

Konstantinos P. Panousis, Sotirios Chatzis

Abstract: Modern deep networks are highly complex and their inferential outcome is very hard to interpret. This is a serious obstacle to their transparent deployment in safety-critical or bias-aware applications. This work contributes to post-hoc interpretability, and specifically Network Dissection. Our goal is to present a framework that makes it easier to discover the individual functionality of each neuron in a network trained on a vision task; discovery is performed in terms of textual description generation. To achieve this objective, we leverage: (i) recent advances in multimodal vision-text models and (ii) network layers founded upon the novel concept of stochastic local competition between linear units. In this setting, only a small subset of layer neurons is activated for a given input, leading to extremely high activation sparsity (as few as $\approx 4\%$ of neurons active). Crucially, our proposed method infers (sparse) neuron activation patterns that enable the neurons to activate/specialize to inputs with specific characteristics, diversifying their individual functionality. This capacity of our method supercharges the potential of dissection processes: human-understandable descriptions are generated only for the very few active neurons, thus facilitating the direct investigation of the network's decision process. As we experimentally show, our approach: (i) yields Vision Networks that retain or improve classification performance, and (ii) realizes a principled framework for text-based description and examination of the generated neuronal representations.

* Accepted @ NeurIPS 2023

A New Dataset for End-to-End Sign Language Translation: The Greek Elementary School Dataset

Oct 07, 2023

Andreas Voskou, Konstantinos P. Panousis, Harris Partaourides, Kyriakos Tolias, Sotirios Chatzis

Abstract: Automatic Sign Language Translation (SLT) is a research avenue of great societal impact. End-to-End SLT facilitates the interaction of Hard-of-Hearing (HoH) people with hearing people, thus improving their opportunities for participation in social life. However, research within this frame of reference is still in its infancy, and current resources are particularly limited. Existing SLT methods are either of low translation ability or are trained and evaluated on datasets of restricted vocabulary and questionable real-world value. A characteristic example is the Phoenix2014T benchmark dataset, which only covers weather forecasts in German Sign Language. To address this shortage of resources, we introduce a newly constructed collection of 29653 Greek Sign Language video-translation pairs, based on the official syllabus of Greek Elementary School. Our dataset covers a wide range of subjects. We use this novel dataset to train recent state-of-the-art Transformer-based methods widely used in SLT research. Our results demonstrate the potential of our introduced dataset to advance SLT research by offering a favourable balance between usability and real-world value.

* Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023. p. 1966-1975
* ICCVW2023 - ACVR

Hierarchical Concept Discovery Models: A Concept Pyramid Scheme

Oct 03, 2023

Konstantinos P. Panousis, Dino Ienco, Diego Marcos

Abstract: Deep Learning algorithms have recently gained significant attention due to their impressive performance. However, their high complexity and un-interpretable mode of operation hinder their confident deployment in real-world safety-critical tasks. This work targets ante hoc interpretability, and specifically Concept Bottleneck Models (CBMs). Our goal is to design a framework that admits a highly interpretable decision-making process with respect to human-understandable concepts, on multiple levels of granularity. To this end, we propose a novel hierarchical concept discovery formulation leveraging: (i) recent advances in image-text models, and (ii) an innovative formulation for multi-level concept selection via data-driven and sparsity-inducing Bayesian arguments. Within this framework, concept information does not solely rely on the similarity between the whole image and general unstructured concepts; instead, we introduce the notion of concept hierarchy to uncover and exploit more granular concept information residing in patch-specific regions of the image scene. As we experimentally show, the proposed construction not only outperforms recent CBM approaches, but also yields a principled framework towards interpretability.

Sparse Linear Concept Discovery Models

Aug 21, 2023

Konstantinos P. Panousis, Dino Ienco, Diego Marcos

Abstract: The recent mass adoption of DNNs, even in safety-critical scenarios, has shifted the focus of the research community towards the creation of inherently interpretable models. Concept Bottleneck Models (CBMs) constitute a popular approach where hidden layers are tied to human-understandable concepts, allowing for investigation and correction of the network's decisions. However, CBMs usually suffer from: (i) performance degradation and (ii) lower interpretability than intended due to the sheer number of concepts contributing to each decision. In this work, we propose a simple yet highly intuitive interpretable framework based on Contrastive Language Image models and a single sparse linear layer. In stark contrast to related approaches, the sparsity in our framework is achieved via principled Bayesian arguments by inferring concept presence via a data-driven Bernoulli distribution. As we experimentally show, our framework not only outperforms recent CBM approaches accuracy-wise, but it also yields high per-example concept sparsity, facilitating the individual investigation of the emerging concepts.

* Accepted @ ICCVW CLVL 2023
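
The Bernoulli-based concept selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sparse_concept_logits`, the idea that per-concept gate scores are supplied directly, and all numeric values are assumptions made for illustration. Each concept's image-text similarity is kept or dropped by a sampled binary gate, and the surviving similarities feed a single linear layer.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sparse_concept_logits(similarities, gate_logits, class_weights, rng):
    """Mask concept similarities with sampled Bernoulli gates, then classify linearly."""
    # z_c ~ Bernoulli(sigmoid(gate_logit_c)); z_c = 1 keeps concept c for this example.
    gates = [1 if rng.random() < sigmoid(g) else 0 for g in gate_logits]
    masked = [z * s for z, s in zip(gates, similarities)]
    # Single linear layer: one row of weights per class.
    logits = [sum(w * m for w, m in zip(row, masked)) for row in class_weights]
    return gates, logits


rng = random.Random(0)
similarities = [0.31, 0.05, 0.27, 0.02]   # hypothetical image-concept similarity scores
gate_logits = [50.0, -50.0, 50.0, -50.0]  # hypothetical gate scores, extreme so the draw is effectively fixed
class_weights = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]  # hypothetical 2-class weights
gates, logits = sparse_concept_logits(similarities, gate_logits, class_weights, rng)
print(gates, logits)
```

Only the concepts whose gate is 1 contribute to the class scores, so the set of active gates can be read off directly as the per-example explanation.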

Competing Mutual Information Constraints with Stochastic Competition-based Activations for Learning Diversified Representations

Jan 10, 2022

Konstantinos P. Panousis, Anastasios Antoniadis, Sotirios Chatzis

Abstract: This work aims to address the long-established problem of learning diversified representations. To this end, we combine information-theoretic arguments with stochastic competition-based activations, namely Stochastic Local Winner-Takes-All (LWTA) units. In this context, we abandon the conventional deep architectures commonly used in Representation Learning that rely on non-linear activations; instead, we replace them with sets of locally and stochastically competing linear units. In this setting, each network layer yields sparse outputs, determined by the outcome of the competition between units that are organized into blocks of competitors. We adopt stochastic arguments for the competition mechanism, which perform posterior sampling to determine the winner of each block. We further endow the considered networks with the ability to infer the sub-part of the network that is essential for modeling the data at hand; we impose appropriate stick-breaking priors to this end. To further enrich the information of the emerging representations, we resort to information-theoretic principles, namely the Information Competing Process (ICP). Then, all the components are tied together under the stochastic Variational Bayes framework for inference. We perform a thorough experimental investigation for our approach using benchmark datasets on image classification. As we experimentally show, the resulting networks yield significant discriminative representation learning abilities. In addition, the introduced paradigm allows for a principled investigation mechanism of the emerging intermediate network representations.

* Accepted AAAI-22

Stochastic Local Winner-Takes-All Networks Enable Profound Adversarial Robustness

Dec 05, 2021

Konstantinos P. Panousis, Sotirios Chatzis, Sergios Theodoridis

Abstract: This work explores the potency of stochastic competition-based activations, namely Stochastic Local Winner-Takes-All (LWTA), against powerful (gradient-based) white-box and black-box adversarial attacks; we especially focus on Adversarial Training settings. In our work, we replace the conventional ReLU-based nonlinearities with blocks comprising locally and stochastically competing linear units. Each network layer now yields a sparse output, depending on the outcome of winner sampling in each block. We rely on the Variational Bayesian framework for training and inference; we incorporate conventional PGD-based adversarial training arguments to increase the overall adversarial robustness. As we experimentally show, the arising networks yield state-of-the-art robustness against powerful adversarial attacks while retaining a very high classification rate in the benign case.

* Bayesian Deep Learning Workshop, NeurIPS 2021

Stochastic Transformer Networks with Linear Competing Units: Application to end-to-end SL Translation

Oct 01, 2021

Andreas Voskou, Konstantinos P. Panousis, Dimitrios Kosmopoulos, Dimitris N. Metaxas, Sotirios Chatzis

Abstract: Automating sign language translation (SLT) is a challenging real-world application. Despite its societal importance, though, research progress in the field remains rather poor. Crucially, existing methods that yield viable performance necessitate the availability of laborious-to-obtain gloss sequence ground truth. In this paper, we attenuate this need by introducing an end-to-end SLT model that does not entail explicit use of glosses; the model only needs text ground truth. This is in stark contrast to existing end-to-end models that use gloss sequence ground truth, either in the form of a modality that is recognized at an intermediate model stage, or in the form of a parallel output process, jointly trained with the SLT model. Our approach constitutes a Transformer network with a novel type of layer that combines: (i) local winner-takes-all (LWTA) layers with stochastic winner sampling, instead of conventional ReLU layers, (ii) stochastic weights with posterior distributions estimated via variational inference, and (iii) a weight compression technique at inference time that exploits estimated posterior variance to perform massive, almost lossless compression. We demonstrate that our approach can reach the currently best reported BLEU-4 score on the PHOENIX 2014T benchmark, but without making use of glosses for model training, and with a memory footprint reduced by more than 70%.

* In Proceedings of ICCV 2021

Local Competition and Stochasticity for Adversarial Robustness in Deep Learning

Jan 04, 2021

Konstantinos P. Panousis, Sotirios Chatzis, Antonios Alexos, Sergios Theodoridis

Abstract: This work addresses adversarial robustness in deep learning by considering deep networks with stochastic local winner-takes-all (LWTA) nonlinearities. This type of network unit results in sparse representations from each model layer, as the units are organized in blocks where only one unit generates non-zero output. The main operating principle of the introduced units lies in stochastic arguments, as the network performs posterior sampling over competing units to select the winner. We combine these LWTA arguments with tools from the field of Bayesian non-parametrics, specifically the stick-breaking construction of the Indian Buffet Process, to allow for inferring the sub-part of each layer that is essential for modeling the data at hand. Inference for the proposed network is performed by means of stochastic variational Bayes. We perform a thorough experimental evaluation of our model using benchmark datasets, assuming gradient-based adversarial attacks. As we show, our method achieves high robustness to adversarial perturbations, with state-of-the-art performance in powerful white-box attacks.

* Under Review AISTATS 2021. arXiv admin note: substantial text overlap with arXiv:2006.10620
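
The block-wise competition described in this abstract can be sketched as follows. This is an illustrative reading under stated assumptions, not the paper's training procedure: the function name `stochastic_lwta`, the softmax-over-activations winner distribution, and the `temperature` parameter are all hypothetical choices, and the sketch draws a hard winner rather than using any differentiable relaxation.

```python
import math
import random


def stochastic_lwta(activations, block_size, rng, temperature=1.0):
    """Sample one winner per block of linear units; all other units output zero."""
    if len(activations) % block_size != 0:
        raise ValueError("layer width must be a multiple of block_size")
    output = [0.0] * len(activations)
    for start in range(0, len(activations), block_size):
        block = activations[start:start + block_size]
        # Assumed winner distribution: softmax over the block's linear responses.
        peak = max(block)  # subtract the max for numerical stability
        weights = [math.exp((a - peak) / temperature) for a in block]
        winner = rng.choices(range(block_size), weights=weights, k=1)[0]
        # The winner passes its linear response through unchanged.
        output[start + winner] = block[winner]
    return output


rng = random.Random(0)
h = [0.3, -1.2, 2.0, 0.1, 0.5, 0.4]  # hypothetical pre-activations of a 6-unit layer
y = stochastic_lwta(h, block_size=2, rng=rng)
print(y)
```

With a block size of 2, exactly half of the layer's units are silent for any input, and repeated calls on the same input can select different winners, which is the source of the stochasticity the abstract refers to.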

Local Competition and Uncertainty for Adversarial Robustness in Deep Learning

Jun 18, 2020

Antonios Alexos, Konstantinos P. Panousis, Sotirios Chatzis

Abstract: This work attempts to address adversarial robustness of deep networks by means of novel learning arguments. Specifically, inspired by results in neuroscience, we propose a local competition principle as a means of adversarially-robust deep learning. We argue that novel local winner-takes-all (LWTA) nonlinearities, combined with posterior sampling schemes, can greatly improve the adversarial robustness of traditional deep networks against difficult adversarial attack schemes. We combine these LWTA arguments with tools from the field of Bayesian non-parametrics, specifically the stick-breaking construction of the Indian Buffet Process, to flexibly account for the inherent uncertainty in data-driven modeling. As we experimentally show, the newly proposed model achieves high robustness to adversarial perturbations on MNIST and CIFAR10 datasets. Our model achieves state-of-the-art results in powerful white-box attacks, while at the same time retaining its benign accuracy to a high degree. Equally importantly, our approach achieves this result while requiring far fewer trainable model parameters than the existing state-of-the-art.

* under review
