Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Nikhil Verma

The Hidden Space of Safety: Understanding Preference-Tuned LLMs in Multilingual context

Apr 03, 2025

Nikhil Verma, Manasa Bharadwaj

Abstract:Alignment tuning has enabled large language models to excel in reasoning, instruction-following, and minimizing harmful generations. However, despite their widespread deployment, these models exhibit a monolingual bias, raising concerns about the effectiveness of alignment across languages. Current alignment methods predominantly focus on English, leaving it unclear how alignment mechanism generalize to multilingual settings. To address this, we conduct a systematic analysis of distributional shifts in the embedding space of LLMs before and after alignment, uncovering its impact on model behavior across diverse languages. We leverage the alignment-induced separation in safety space as a quantitative tool to measure how alignment enforces safety constraints. Our study evaluates seven LLMs using balanced toxicity datasets and parallel text-detoxification benchmarks, revealing substantial disparities in the latent representation space between high-resource and low-resource languages. These findings underscore the need for language-specific fine-tuning to ensure fair, reliable and robust multilingual alignment. Our insights provide a foundation for developing truly safe multilingual LLMs, emphasizing the urgency of addressing alignment gaps in underrepresented languages.

* 14 pages, 11 Figures, 2 Tables, currently under review at ACL 2025

Via

Access Paper or Ask Questions

Large Language Models for Page Stream Segmentation

Aug 21, 2024

Hunter Heidenreich, Ratish Dalvi, Rohith Mukku, Nikhil Verma, Neven Pičuljan

Abstract:Page Stream Segmentation (PSS) is an essential prerequisite for automated document processing at scale. However, research progress has been limited by the absence of realistic public benchmarks. This paper works towards addressing this gap by introducing TABME++, an enhanced benchmark featuring commercial Optical Character Recognition (OCR) annotations. We evaluate the performance of large language models (LLMs) on PSS, focusing on decoder-based models fine-tuned with parameter-efficient methods. Our results show that decoder-based LLMs outperform smaller multimodal encoders. Through a review of existing PSS research and datasets, we identify key challenges and advancements in the field. Our findings highlight the key importance of robust OCR, providing valuable insights for the development of more effective document processing systems.

Via

Access Paper or Ask Questions

GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification

Apr 03, 2024

Ali Pesaranghader, Nikhil Verma, Manasa Bharadwaj

Abstract:Harmful and offensive communication or content is detrimental to social bonding and the mental state of users on social media platforms. Text detoxification is a crucial task in natural language processing (NLP), where the goal is removing profanity and toxicity from text while preserving its content. Supervised and unsupervised learning are common approaches for designing text detoxification solutions. However, these methods necessitate fine-tuning, leading to computational overhead. In this paper, we propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo. We utilize zero-shot and few-shot prompting techniques for detoxifying input sentences. To generate few-shot prompts, we propose two methods: word-matching example selection (WMES) and context-matching example selection (CMES). We additionally take into account ensemble in-context learning (EICL) where the ensemble is shaped by base prompts from zero-shot and all few-shot settings. We use ParaDetox and APPDIA as benchmark detoxification datasets. Our experimental results show that the zero-shot solution achieves promising performance, while our best few-shot setting outperforms the state-of-the-art models on ParaDetox and shows comparable results on APPDIA. Our EICL solutions obtain the greatest performance, adding at least 10% improvement, against both datasets.

* 7 pages, 8 tables. Published in: 2023 International Conference on Machine Learning and Applications (ICMLA)

Via

Access Paper or Ask Questions

Image Reconstruction using Enhanced Vision Transformer

Jul 11, 2023

Nikhil Verma, Deepkamal Kaur, Lydia Chau

Abstract:Removing noise from images is a challenging and fundamental problem in the field of computer vision. Images captured by modern cameras are inevitably degraded by noise which limits the accuracy of any quantitative measurements on those images. In this project, we propose a novel image reconstruction framework which can be used for tasks such as image denoising, deblurring or inpainting. The model proposed in this project is based on Vision Transformer (ViT) that takes 2D images as input and outputs embeddings which can be used for reconstructing denoised images. We incorporate four additional optimization techniques in the framework to improve the model reconstruction capability, namely Locality Sensitive Attention (LSA), Shifted Patch Tokenization (SPT), Rotary Position Embeddings (RoPE) and adversarial loss function inspired from Generative Adversarial Networks (GANs). LSA, SPT and RoPE enable the transformer to learn from the dataset more efficiently, while the adversarial loss function enhances the resolution of the reconstructed images. Based on our experiments, the proposed architecture outperforms the benchmark U-Net model by more than 3.5\% structural similarity (SSIM) for the reconstruction tasks of image denoising and inpainting. The proposed enhancements further show an improvement of \textasciitilde5\% SSIM over the benchmark for both tasks.

Via

Access Paper or Ask Questions

Diffusion idea exploration for art generation

Jul 11, 2023

Nikhil Verma

Figure 1 for Diffusion idea exploration for art generation

Figure 2 for Diffusion idea exploration for art generation

Figure 3 for Diffusion idea exploration for art generation

Figure 4 for Diffusion idea exploration for art generation

Abstract:Cross-Modal learning tasks have picked up pace in recent times. With plethora of applications in diverse areas, generation of novel content using multiple modalities of data has remained a challenging problem. To address the same, various generative modelling techniques have been proposed for specific tasks. Novel and creative image generation is one important aspect for industrial application which could help as an arm for novel content generation. Techniques proposed previously used Generative Adversarial Network(GAN), autoregressive models and Variational Autoencoders (VAE) for accomplishing similar tasks. These approaches are limited in their capability to produce images guided by either text instructions or rough sketch images decreasing the overall performance of image generator. We used state of the art diffusion models to generate creative art by primarily leveraging text with additional support of rough sketches. Diffusion starts with a pattern of random dots and slowly converts that pattern into a design image using the guiding information fed into the model. Diffusion models have recently outperformed other generative models in image generation tasks using cross modal data as guiding information. The initial experiments for this task of novel image generation demonstrated promising qualitative results.

* Report Submitted for degree completion of Master of Science in Applied Computing at University of Toronto

Via

Access Paper or Ask Questions

Responsive parallelized architecture for deploying deep learning models in production environments

Dec 15, 2021

Nikhil Verma, Krishna Prasad

Figure 1 for Responsive parallelized architecture for deploying deep learning models in production environments

Figure 2 for Responsive parallelized architecture for deploying deep learning models in production environments

Figure 3 for Responsive parallelized architecture for deploying deep learning models in production environments

Figure 4 for Responsive parallelized architecture for deploying deep learning models in production environments

Abstract:Recruiters can easily shortlist candidates for jobs via viewing their curriculum vitae document. Unstructured document CV beholds candidates portfolio and named entities listing details. The main aim of this study is to design and propose a web oriented, highly responsive, computational pipeline that systematically predicts CV entities using hierarchically refined label attention networks.

* 20 Pages

Via

Access Paper or Ask Questions

Stability Based Filter Pruning for Accelerating Deep CNNs

Nov 20, 2018

Pravendra Singh, Vinay Sameer Raja Kadi, Nikhil Verma, Vinay P. Namboodiri

Figure 1 for Stability Based Filter Pruning for Accelerating Deep CNNs

Figure 2 for Stability Based Filter Pruning for Accelerating Deep CNNs

Figure 3 for Stability Based Filter Pruning for Accelerating Deep CNNs

Figure 4 for Stability Based Filter Pruning for Accelerating Deep CNNs

Abstract:Convolutional neural networks (CNN) have achieved impressive performance on the wide variety of tasks (classification, detection, etc.) across multiple domains at the cost of high computational and memory requirements. Thus, leveraging CNNs for real-time applications necessitates model compression approaches that not only reduce the total number of parameters but reduce the overall computation as well. In this work, we present a stability-based approach for filter-level pruning of CNNs. We evaluate our proposed approach on different architectures (LeNet, VGG-16, ResNet, and Faster RCNN) and datasets and demonstrate its generalizability through extensive experiments. Moreover, our compressed models can be used at run-time without requiring any special libraries or hardware. Our model compression method reduces the number of FLOPS by an impressive factor of 6.03X and GPU memory footprint by more than 17X, significantly outperforming other state-of-the-art filter pruning methods.

* IEEE Winter Conference on Applications of Computer Vision (WACV), 2019

Via

Access Paper or Ask Questions