Divya Jyoti Bajpai

FREE: Fast and Robust Vision Language Models with Early Exits

Jun 07, 2025

Divya Jyoti Bajpai, Manjesh Kumar Hanawal

Abstract:In recent years, Vision-Language Models (VLMs) have shown remarkable performance improvements in Vision-Language tasks. However, their large size poses challenges for real-world applications where inference latency is a concern. To tackle this issue, we propose employing Early Exit (EE) strategies in VLMs. However, training exit classifiers in VLMs is challenging, particularly with limited labeled training data. To address this, we introduce FREE, an adversarial training approach within a GAN-based framework. Here, each exit consists of a transformer layer and a classifier. The transformer layer is adversarially trained to produce feature representations similar to the final layer, while a feature classifier serves as the discriminator. Our method focuses on performing input-adaptive inference that increases inference speed with minimal drop in performance. Experimental results demonstrate the effectiveness of our approach in enhancing accuracy and model robustness by mitigating overthinking and the phenomenon of mid-crisis that we highlight. We experimentally validate that our method speeds up the inference process by more than 1.51x while retaining comparable performance. The source code is available at https://github.com/Div290/FREE.

* To appear at the Association for Computational Linguistics (ACL) 2025 Conference

BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts

Feb 02, 2025

Divya Jyoti Bajpai, Manjesh Kumar Hanawal

Abstract:Early Exit (EE) techniques have emerged as a means to reduce inference latency in Deep Neural Networks (DNNs). The latency improvement and accuracy in these techniques crucially depend on the criteria used to make exit decisions. We propose a new decision criterion, BEEM, in which exit classifiers are treated as experts and their confidence scores are aggregated. The confidence scores are aggregated only if neighbouring experts are consistent in prediction as the samples pass through them, thus capturing their ensemble effect. A sample exits when the aggregated confidence value exceeds a threshold. The threshold is set using the error rates of the intermediate exits, aiming to surpass the performance of conventional DNN inference. Experimental results on the COCO dataset for image captioning and GLUE datasets for various language tasks demonstrate that our method enhances the performance of state-of-the-art EE methods, achieving speed-ups by a factor of 1.5x to 2.1x. When compared to the final layer, its accuracy is comparable on the harder image captioning task and improves on the easier language tasks. The source code for this work is publicly available at https://github.com/Div290/BEEM1/tree/main

* Published at International Conference on Learning Representations (ICLR) 2025
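
The exit rule described in the BEEM abstract (aggregate confidence only while neighbouring exits agree, and exit once the aggregate exceeds a threshold) can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the function name `beem_style_exit`, the unweighted sum, and the reset on disagreement are all assumptions, and the threshold is passed in rather than derived from exit error rates.

```python
from typing import Sequence, Tuple


def beem_style_exit(exit_probs: Sequence[Sequence[float]],
                    threshold: float) -> Tuple[int, int]:
    """Return (exit_index, predicted_label) for one sample.

    exit_probs[i] is the class-probability vector produced by exit i.
    Assumption: confidences are summed while consecutive exits predict
    the same label, and the running sum restarts when they disagree.
    """
    if not exit_probs:
        raise ValueError("at least one exit is required")

    aggregated = 0.0
    prev_label = None
    label = -1
    for i, probs in enumerate(exit_probs):
        label = max(range(len(probs)), key=probs.__getitem__)
        conf = probs[label]
        # Accumulate only when this exit agrees with its neighbour.
        aggregated = aggregated + conf if label == prev_label else conf
        if aggregated > threshold:
            return i, label
        prev_label = label
    # No exit was confident enough: fall back to the final exit.
    return len(exit_probs) - 1, label
```

With a threshold above 1.0, no single exit can trigger on its own, so a sample leaves early only when at least two neighbouring exits agree.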

A Survey of Early Exit Deep Neural Networks in NLP

Jan 13, 2025

Divya Jyoti Bajpai, Manjesh Kumar Hanawal

Abstract:Deep Neural Networks (DNNs) have grown increasingly large in size to achieve state-of-the-art performance across a wide range of tasks. However, their high computational requirements make them less suitable for resource-constrained applications. Also, real-world datasets often consist of a mixture of easy and complex samples, necessitating adaptive inference mechanisms that account for sample difficulty. Early exit strategies offer a promising solution by enabling adaptive inference, where simpler samples are classified using the initial layers of the DNN, thereby accelerating the overall inference process. By attaching classifiers at different layers, early exit methods not only reduce inference latency but also improve model robustness against adversarial attacks. This paper presents a comprehensive survey of early exit methods and their applications in NLP.
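
As a rough illustration of the adaptive inference idea in this abstract, the sketch below exits at the first attached classifier whose softmax confidence exceeds a threshold. It is a minimal, framework-free example under stated assumptions: `early_exit_predict` is a hypothetical helper, the per-layer logits are supplied directly instead of being computed by a real network, and maximum softmax probability is only one of several confidence measures used in the early-exit literature.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(per_layer_logits: Sequence[Sequence[float]],
                       threshold: float) -> Tuple[int, int, float]:
    """Return (exit_layer, label, confidence) for one sample.

    per_layer_logits[i] holds the logits of the classifier attached
    to layer i (assumed to be precomputed for this sketch).
    """
    if not per_layer_logits:
        raise ValueError("at least one exit classifier is required")

    layer, label, conf = 0, 0, 0.0
    for layer, logits in enumerate(per_layer_logits):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        conf = probs[label]
        if conf > threshold:
            break  # confident enough: skip the remaining layers
    return layer, label, conf
```

An easy sample whose first classifier is already confident exits at layer 0, while a harder sample continues until some later classifier exceeds the threshold or the final layer is reached.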

Distributed Inference on Mobile Edge and Cloud: An Early Exit based Clustering Approach

Oct 06, 2024

Divya Jyoti Bajpai, Manjesh Kumar Hanawal

Abstract:Recent advances in Deep Neural Networks (DNNs) have demonstrated outstanding performance across various domains. However, their large size is a challenge for deployment on resource-constrained devices such as mobile, edge, and IoT platforms. To overcome this, a distributed inference setup can be used where a small-sized DNN (initial few layers) is deployed on mobile, a bigger version on the edge, and the full-fledged model on the cloud. A low-complexity (easy) sample can then be inferred on mobile, a moderate-complexity (medium) sample on the edge, and a high-complexity (hard) sample on the cloud. As the complexity of each sample is not known beforehand, the following question arises in distributed inference: how to determine a sample's complexity so that it is processed by enough layers of the DNN. We develop a novel approach named DIMEE that utilizes Early Exit (EE) strategies developed to minimize inference latency in DNNs. DIMEE aims to improve the accuracy, taking into account the offloading cost from mobile to edge/cloud. Experimental validation on GLUE datasets, encompassing various NLP tasks, shows that our method significantly reduces the inference cost (> 43%) while maintaining a minimal drop in accuracy (< 0.3%) compared to the case where all inference is performed in the cloud.

* 8 pages, 3 figures

CAPEEN: Image Captioning with Early Exits and Knowledge Distillation

Oct 06, 2024

Divya Jyoti Bajpai, Manjesh Kumar Hanawal

Abstract:Deep neural networks (DNNs) have made significant progress in recognizing visual elements and generating descriptive text in image-captioning tasks. However, their improved performance comes at the cost of increased computational burden and inference latency. Early Exit (EE) strategies can be used to enhance their efficiency, but their adaptation presents challenges in image captioning as it requires varying levels of semantic information for accurate predictions. To overcome this, we introduce CAPEEN to improve the performance of EE strategies using knowledge distillation. Inference in CAPEEN is completed at intermediary layers if prediction confidence exceeds a predefined value learned from the training data. To account for real-world deployments, where target distributions could drift from that of training samples, we introduce a variant, A-CAPEEN, to adapt the thresholds on the fly using a multi-armed bandit framework. Experiments on the MS COCO and Flickr30k datasets show that CAPEEN achieves a speedup of 1.77x while maintaining competitive performance compared to the final layer, and A-CAPEEN additionally offers robustness against distortions. The source code is available at https://github.com/Div290/CapEEN

* To appear in EMNLP (findings) 2024
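
The idea of adapting exit thresholds on the fly with a multi-armed bandit, as mentioned for A-CAPEEN, can be sketched in a generic form. Everything below is an assumption made for illustration: the class `ThresholdBandit`, the use of UCB1, the candidate threshold grid, and the label-free reward (confidence at the exit minus a hypothetical per-layer cost) are not taken from the paper.

```python
import math
from typing import List, Sequence


class ThresholdBandit:
    """UCB1 over a small grid of candidate exit thresholds (illustrative)."""

    def __init__(self, thresholds: Sequence[float], depth_cost: float) -> None:
        self.thresholds: List[float] = list(thresholds)
        self.depth_cost = depth_cost  # hypothetical cost per processed layer
        self.counts = [0] * len(self.thresholds)
        self.means = [0.0] * len(self.thresholds)
        self.t = 0

    def _select_arm(self) -> int:
        self.t += 1
        # Play every arm once before applying the UCB rule.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        scores = [m + math.sqrt(2.0 * math.log(self.t) / n)
                  for m, n in zip(self.means, self.counts)]
        return max(range(len(scores)), key=scores.__getitem__)

    def infer(self, confidences: Sequence[float]) -> int:
        """Pick a threshold, exit one sample, update the arm; return exit layer.

        confidences[i] is the prediction confidence observed at exit i.
        """
        arm = self._select_arm()
        threshold = self.thresholds[arm]
        exit_layer = len(confidences) - 1
        for layer, conf in enumerate(confidences):
            if conf > threshold:
                exit_layer = layer
                break
        # Assumed reward: confidence at the exit minus a depth penalty.
        reward = confidences[exit_layer] - self.depth_cost * (exit_layer + 1)
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
        return exit_layer
```

Because the reward uses only observed confidences and depth, no ground-truth labels are needed at deployment time, which is what makes this style of update usable when the target distribution drifts.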

DAdEE: Unsupervised Domain Adaptation in Early Exit PLMs

Oct 06, 2024

Divya Jyoti Bajpai, Manjesh Kumar Hanawal

Abstract:Pre-trained Language Models (PLMs) exhibit good accuracy and generalization ability across various tasks using self-supervision, but their large size results in high inference latency. Early Exit (EE) strategies handle the issue by allowing the samples to exit from classifiers attached to the intermediary layers, but they do not generalize well, as exit classifiers can be sensitive to domain changes. To address this, we propose the Unsupervised Domain Adaptation in EE framework (DAdEE), which employs multi-level adaptation using knowledge distillation. DAdEE utilizes GAN-based adversarial adaptation at each layer to achieve domain-invariant representations, reducing the domain gap between the source and target domains across all layers. The attached exits not only speed up inference but also enhance domain adaptation by reducing catastrophic forgetting and mode collapse, making it more suitable for real-world scenarios. Experiments on tasks such as sentiment analysis, entailment classification, and natural language inference demonstrate that DAdEE consistently outperforms not only early exit methods but also various domain adaptation methods under domain shift scenarios. The anonymized source code is available at https://github.com/Div290/DAdEE.

* To appear in EMNLP (findings) 2024

CEEBERT: Cross-Domain Inference in Early Exit BERT

May 23, 2024

Divya Jyoti Bajpai, Manjesh Kumar Hanawal

Abstract:Pre-trained Language Models (PLMs), like BERT, with self-supervision objectives exhibit remarkable performance and generalization across various tasks. However, they suffer from high inference latency due to their large size. To address this issue, side branches are attached at intermediate layers, enabling early inference of samples without requiring them to pass through all layers. However, the challenge is to decide at which layer each sample should be inferred and exit so that accuracy and latency are balanced. Moreover, the distribution of the samples to be inferred may differ from that used for training, necessitating cross-domain adaptation. We propose an online learning algorithm named Cross-Domain Inference in Early Exit BERT (CeeBERT) that dynamically determines early exits of samples based on the level of confidence at each exit point. CeeBERT learns optimal thresholds from domain-specific confidence observed at intermediate layers on the fly, eliminating the need for labeled data. Experimental results on five distinct datasets with BERT and ALBERT models demonstrate CeeBERT's ability to improve latency by reducing unnecessary computations with minimal drop in performance. By adapting the threshold values, CeeBERT can speed up the BERT/ALBERT models by 2x to 3.5x with minimal drop in accuracy.

* Accepted at ACL 2024

FAIR: Filtering of Automatically Induced Rules

Feb 23, 2024

Divya Jyoti Bajpai, Ayush Maheshwari, Manjesh Kumar Hanawal, Ganesh Ramakrishnan

Abstract:The availability of large annotated data can be a critical bottleneck in training machine learning algorithms successfully, especially when applied to diverse domains. Weak supervision offers a promising alternative by accelerating the creation of labeled training data using domain-specific rules. However, it requires users to write a diverse set of high-quality rules to assign labels to the unlabeled data. Automatic Rule Induction (ARI) approaches circumvent this problem by automatically creating rules from features on a small labeled set and filtering a final set of rules from them. In the ARI approach, the crucial step is to select a high-quality, useful subset of rules from the large set of automatically created rules. In this paper, we propose an algorithm, FAIR (Filtering of Automatically Induced Rules), to filter rules from a large number of automatically induced rules using submodular objective functions that account for the collective precision, coverage, and conflicts of the rule set. We experiment with three ARI approaches and five text classification datasets to validate the superior performance of our algorithm with respect to several semi-supervised label aggregation approaches. Further, we show that FAIR achieves statistically significant results in comparison to existing rule-filtering approaches.

* Published at EACL 2024
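
A greedy selection loop of the kind commonly used with set-function objectives gives a feel for how rules might be filtered by collective precision, coverage, and conflicts. The sketch below is an assumption-laden illustration rather than the FAIR algorithm: the `Rule` type, the weights `alpha` and `beta`, and the specific objective are hypothetical, and this toy objective is not claimed to be submodular.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Sequence, Set


@dataclass(frozen=True)
class Rule:
    name: str
    label: int                # label the rule assigns when it fires
    covers: FrozenSet[int]    # indices of labeled examples it fires on


def precision(rule: Rule, gold: Sequence[int]) -> float:
    """Fraction of covered labeled examples the rule labels correctly."""
    if not rule.covers:
        return 0.0
    correct = sum(1 for i in rule.covers if gold[i] == rule.label)
    return correct / len(rule.covers)


def objective(selected: Sequence[Rule], gold: Sequence[int],
              alpha: float, beta: float) -> float:
    """Illustrative score: coverage + alpha * precision - beta * conflicts."""
    votes: Dict[int, Set[int]] = {}
    for rule in selected:
        for i in rule.covers:
            votes.setdefault(i, set()).add(rule.label)
    coverage = len(votes)
    conflicts = sum(1 for labels in votes.values() if len(labels) > 1)
    total_precision = sum(precision(r, gold) for r in selected)
    return coverage + alpha * total_precision - beta * conflicts


def greedy_filter(rules: Sequence[Rule], gold: Sequence[int], k: int,
                  alpha: float = 2.0, beta: float = 1.0) -> List[Rule]:
    """Add the rule with the largest positive marginal gain, up to k rules."""
    selected: List[Rule] = []
    remaining = list(rules)
    while remaining and len(selected) < k:
        base = objective(selected, gold, alpha, beta)
        best, best_gain = None, 0.0
        for rule in remaining:
            gain = objective(selected + [rule], gold, alpha, beta) - base
            if gain > best_gain:
                best, best_gain = rule, gain
        if best is None:
            break  # no remaining rule improves the objective
        selected.append(best)
        remaining.remove(best)
    return selected
```

In this toy setup, a broad but imprecise rule that contradicts an already selected rule is rejected because its conflict penalty outweighs its added precision.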

I-SplitEE: Image classification in Split Computing DNNs with Early Exits

Jan 19, 2024

Divya Jyoti Bajpai, Aastha Jaiswal, Manjesh Kumar Hanawal

Abstract:The recent advances in Deep Neural Networks (DNNs) stem from their exceptional performance across various domains. However, their inherent large size hinders deploying these networks on resource-constrained devices like edge, mobile, and IoT platforms. Strategies have emerged, from partial cloud computation offloading (split computing) to integrating early exits within DNN layers. Our work presents an innovative unified approach merging early exits and split computing. We determine the 'splitting layer', the optimal depth in the DNN for edge device computations, and whether a sample is inferred on the edge device or offloaded to the cloud for inference, considering accuracy, computational efficiency, and communication costs. Also, image classification faces diverse environmental distortions, influenced by factors like time of day, lighting, and weather. To adapt to these distortions, we introduce I-SplitEE, an online unsupervised algorithm ideal for scenarios lacking ground truths and with sequential data. Experimental validation using Caltech-256 and CIFAR-10 datasets subjected to varied distortions showcases I-SplitEE's ability to reduce costs by a minimum of 55% with marginal performance degradation of at most 5%.

* To appear in proceedings of IEEE International Conference on Communications 2024
