Kshiteej Sheth

Indian Institute of Technology Gandhinagar, India

BalanceKV: KV Cache Compression through Discrepancy Theory

Feb 11, 2025

Insu Han, Michael Kapralov, Ekaterina Kochetkova, Kshiteej Sheth, Amir Zandieh

Abstract: Large language models (LLMs) have achieved impressive success, but their high memory requirements present challenges for long-context token generation. The memory complexity of long-context LLMs is primarily due to the need to store Key-Value (KV) embeddings in their KV cache. We present BalanceKV, a KV cache compression method based on a geometric sampling process stemming from Banaszczyk's vector balancing theory, which introduces dependencies informed by the geometry of key and value tokens and improves precision. BalanceKV offers both theoretically proven and empirically validated performance improvements over existing methods.
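To make the memory claim concrete, here is a rough back-of-the-envelope sketch of how an uncompressed KV cache grows with context length. The function name and the model configuration (32 layers, 32 KV heads, head dimension 128, 16-bit values) are illustrative assumptions and do not come from the paper.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, batch_size=1):
    """Estimate the size of an uncompressed KV cache in bytes.

    Each layer stores one key vector and one value vector per KV head
    for every cached token, hence the leading factor of 2.
    """
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * seq_len * batch_size


# Hypothetical configuration, chosen only for illustration.
total = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                       seq_len=100_000)
print(f"{total / 2**30:.2f} GiB")  # 48.83 GiB
```

Under these assumptions the cache costs 0.5 MiB per token, so it grows linearly with context length; this is the cost that KV cache compression methods aim to reduce.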

Improved Linear Embeddings via Lagrange Duality

Dec 14, 2017

Kshiteej Sheth, Dinesh Garg, Anirban Dasgupta

Abstract: Near-isometric orthogonal embeddings to lower dimensions are a fundamental tool in data science and machine learning. In this paper, we present the construction of such embeddings that minimizes the maximum distortion for a given set of points. We formulate the problem as a nonconvex constrained optimization problem. We first construct a primal relaxation and then use the theory of Lagrange duality to create a dual relaxation. We also suggest a polynomial-time algorithm based on the theory of convex optimization to solve the dual relaxation provably. We provide a theoretical upper bound on the approximation guarantees for our algorithm, which depends only on the spectral properties of the dataset. We experimentally demonstrate the superiority of our algorithm compared to baselines in terms of scalability and the ability to achieve lower distortion.
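As a minimal sketch of the quantity being minimized, the snippet below measures the maximum distortion of a linear map over all pairwise differences of a point set. The abstract does not give the exact definition, so the squared-norm ratio used here is an assumption (one common convention), and the names `max_distortion` and `keep_xy` are hypothetical. It evaluates a fixed embedding only; it does not implement the paper's optimization algorithm.

```python
from itertools import combinations


def max_distortion(points, projection):
    """Largest | ||P(x - y)||^2 / ||x - y||^2 - 1 | over all point pairs.

    `points` is a list of equal-length vectors; `projection` is a list
    of rows, each as long as a point.
    """
    worst = 0.0
    for x, y in combinations(points, 2):
        diff = [a - b for a, b in zip(x, y)]
        norm_sq = sum(d * d for d in diff)
        if norm_sq == 0.0:
            continue  # identical points carry no distance information
        image = [sum(p * d for p, d in zip(row, diff)) for row in projection]
        ratio = sum(v * v for v in image) / norm_sq
        worst = max(worst, abs(ratio - 1.0))
    return worst


points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
# Orthogonal projection onto the first two coordinates (orthonormal rows).
keep_xy = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(max_distortion(points, keep_xy))  # approximately 0.8
```

Dropping the third coordinate preserves the distance between the first two points exactly but shrinks the squared distances to the third point by a factor of five, so the worst-case distortion is about 0.8. An embedding chosen with the point set in mind could do better, which is the motivation for optimizing the projection.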

* 20 pages

Deep-Learnt Classification of Light Curves

Sep 19, 2017

Ashish Mahabal, Kshiteej Sheth, Fabian Gieseke, Akshay Pai, S. George Djorgovski, Andrew Drake, Matthew Graham, the CSS/CRTS/PTF Collaboration

Abstract: Astronomy light curves are sparse, gappy, and heteroscedastic. As a result, standard time series methods regularly used for financial and similar datasets are of little help, and astronomers are usually left to their own instruments and techniques to classify light curves. A common approach is to derive statistical features from the time series and to use machine learning methods, generally supervised, to separate objects into a few of the standard classes. In this work, we transform the time series to two-dimensional light curve representations in order to classify them using modern deep learning techniques. In particular, we show that classifiers based on convolutional neural networks work well for broad characterization and classification. We use labeled datasets of periodic variables from the CRTS survey and show how this opens doors for a quick classification of diverse classes with several possible exciting extensions.
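One way to turn an irregularly sampled light curve into a fixed-size two-dimensional input is to histogram the time and magnitude differences of every pair of observations. The abstract does not specify the mapping, so this is only a sketch under that assumption; the function name `dmdt_histogram`, the bin edges, and the sample data are all hypothetical.

```python
from bisect import bisect_right
from itertools import combinations


def dmdt_histogram(times, mags, dt_edges, dm_edges):
    """Count observation pairs per (magnitude difference, time difference) bin.

    Bins are half-open [edge_k, edge_{k+1}); pairs that fall outside the
    edges are dropped. Rows index dm bins, columns index dt bins.
    """
    n_dt = len(dt_edges) - 1
    n_dm = len(dm_edges) - 1
    grid = [[0] * n_dt for _ in range(n_dm)]
    # Sort by time so every time difference is non-negative.
    order = sorted(range(len(times)), key=lambda k: times[k])
    for a, b in combinations(order, 2):
        dt = times[b] - times[a]
        dm = mags[b] - mags[a]
        col = bisect_right(dt_edges, dt) - 1
        row = bisect_right(dm_edges, dm) - 1
        if 0 <= col < n_dt and 0 <= row < n_dm:
            grid[row][col] += 1
    return grid


# Made-up observations: times in days, brightness in magnitudes.
times = [0.0, 1.5, 7.0, 30.0]
mags = [15.2, 15.6, 15.1, 15.3]
grid = dmdt_histogram(times, mags, dt_edges=[0, 2, 10, 40], dm_edges=[-1, 0, 1])
print(grid)  # [[0, 2, 1], [1, 0, 2]]
```

The output grid has the same shape regardless of how many observations a light curve contains or how they are spaced, which is what makes it usable as an image-like input to a convolutional network.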

* 2017 IEEE Symposium Series on Computational Intelligence (SSCI), Honolulu, HI, USA, 2017, p2757
* 8 pages, 9 figures, 6 tables, 2 listings. Accepted to 2017 IEEE Symposium Series on Computational Intelligence (SSCI)

Deep Neural Networks for HDR imaging

Sep 04, 2016

Kshiteej Sheth

Abstract: We propose novel methods for solving two tasks using Convolutional Neural Networks. The first task is generating an HDR map of a static scene from differently exposed LDR images of the scene captured using conventional cameras. The second is finding an optimal tone mapping operator that achieves a better score on the TMQI metric than existing methods. We quantitatively evaluate the performance of our networks and illustrate the cases where they perform well and where they perform poorly.

* 9 pages
