Ramasubramanian Balasubramanian

ChunkWise LoRA: Adaptive Sequence Partitioning for Memory-Efficient Low-Rank Adaptation and Accelerated LLM Inference

Jan 28, 2026

Ketan Thakkar, Maitreyi Chatterjee, Ramasubramanian Balasubramanian, Achyuthan Jootoo, Rajendra Ugrani

Abstract: Recent advances in low-rank adaptation (LoRA) have enabled efficient fine-tuning of large language models (LLMs) with minimal additional parameters. However, existing LoRA methods apply static rank configurations uniformly across all input tokens, ignoring variation in token complexity and computational requirements. In this work, we propose ChunkWise LoRA, a dynamic and adaptive approach that partitions sequences into variable-length chunks based on token complexity and assigns each chunk a tailored low-rank configuration. Our system introduces a runtime scheduler that estimates token difficulty, performs adaptive chunking, and selects per-chunk LoRA rank and scaling using a rank-ladder mechanism. To preserve output consistency, we further introduce a boundary-safe composition module and integrate policy-driven KV-cache strategies. Experiments on benchmark datasets such as Wikitext-103 and SQuAD demonstrate that ChunkWise LoRA achieves up to 34% lower latency and 38% lower memory usage than baseline LoRA, while maintaining or improving task performance metrics such as BLEU, EM, and perplexity. The proposed framework remains fully compatible with existing transformer architectures and inference frameworks, providing a practical solution for real-world deployment of parameter-efficient LLMs.

* Presented at 13th IEEE International Conference on Intelligent Systems and Embedded Design
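
The chunking and rank-ladder idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the per-token difficulty scores are assumed to be supplied by some estimator, and the names `RANK_LADDER`, `rank_for`, `chunk_by_difficulty`, the threshold values, and the `max_len` cap are all hypothetical. Boundary-safe composition and KV-cache policies are omitted.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical rank ladder: (difficulty upper bound, LoRA rank).
# The thresholds and ranks are illustrative, not taken from the paper.
RANK_LADDER = [(0.33, 4), (0.66, 8), (1.01, 16)]


@dataclass
class Chunk:
    start: int  # inclusive token index
    end: int    # exclusive token index
    rank: int   # LoRA rank chosen for this chunk


def rank_for(difficulty: float) -> int:
    """Return the rank of the first ladder rung whose bound exceeds the score."""
    for bound, rank in RANK_LADDER:
        if difficulty < bound:
            return rank
    return RANK_LADDER[-1][1]  # scores above every bound get the top rank


def chunk_by_difficulty(difficulty: List[float], max_len: int = 4) -> List[Chunk]:
    """Group consecutive tokens that map to the same rank, capping chunk length."""
    chunks: List[Chunk] = []
    for i, score in enumerate(difficulty):
        rank = rank_for(score)
        last = chunks[-1] if chunks else None
        if last is not None and last.rank == rank and last.end - last.start < max_len:
            last.end = i + 1  # extend the current chunk by one token
        else:
            chunks.append(Chunk(i, i + 1, rank))  # start a new chunk
    return chunks


# Assumed difficulty scores in [0, 1] for a six-token sequence.
scores = [0.1, 0.2, 0.9, 0.8, 0.5, 0.1]
plan = chunk_by_difficulty(scores)
for chunk in plan:
    print(chunk)
```

Under these assumptions, easy tokens share a low-rank chunk and harder tokens get a higher rank, so the resulting chunks have variable length. A real scheduler would also have to choose the scaling factor per chunk and reconcile outputs at chunk boundaries, which this sketch does not attempt.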

Variational Inference for Category Recommendation in E-Commerce platforms

Apr 19, 2021

Ramasubramanian Balasubramanian, Venugopal Mani, Abhinav Mathur, Sushant Kumar, Kannan Achan

Abstract: Category recommendation for users on an e-Commerce platform is an important task, as it dictates the flow of traffic through the website. It is therefore important to surface precise and diverse category recommendations to aid the users' journey through the platform and to help them discover new groups of items. An often understated aspect of category recommendation is users' proclivity to repeat purchases. The structure of this temporal behavior can be exploited for better category recommendations, and in this work we attempt to do so through variational inference. Further, to enhance the variational-inference-based optimization, we initialize the optimizer at better starting points through the well-known Metapath2Vec algorithm. We demonstrate our results on two real-world datasets and show that our model outperforms standard baseline methods.

* 8 pages, 3 figures, 2 tables

On Variational Inference for User Modeling in Attribute-Driven Collaborative Filtering

Dec 02, 2020

Venugopal Mani, Ramasubramanian Balasubramanian, Sushant Kumar, Abhinav Mathur, Kannan Achan

Abstract: Recommender systems have become an integral part of online e-Commerce platforms, driving customer engagement and revenue. Most popular recommender systems attempt to learn users' behavioral traits from their past engagement data and use those traits to predict future behavior. In this work, we present an approach that uses causal inference to learn user-attribute affinities through temporal contexts. We formulate this objective as a probabilistic machine learning problem and apply a variational-inference-based method to estimate the model parameters. We demonstrate the performance of the proposed method on the next-attribute prediction task on two real-world datasets and show that it outperforms standard baseline methods.

* 9 pages, 2 figures, 1 algorithm
