Ankush Chopra

Enhanced Transformer-Based Tracking for Skiing Events: Overcoming Multi-Camera Challenges, Scale Variations and Rapid Motion -- SkiTB Visual Tracking Challenge 2025

Feb 26, 2025

Akhil Penta, Vaibhav Adwani, Ankush Chopra

Abstract: Accurate skier tracking is essential for performance analysis, injury prevention, and optimizing training strategies in alpine sports. Traditional tracking methods often struggle with occlusions, dynamic movements, and varying environmental conditions, limiting their effectiveness. In this work, we use STARK (Spatio-Temporal Transformer Network for Visual Tracking), a transformer-based model, to track skiers. We adapt STARK to domain-specific challenges such as camera movement, camera changes, and occlusions by optimizing the model's architecture and hyperparameters to better suit the dataset.

Learning Semantic Text Similarity to rank Hypernyms of Financial Terms

Mar 20, 2023

Sohom Ghosh, Ankush Chopra, Sudip Kumar Naskar

Abstract: Over the years, there has been a paradigm shift in how users access financial services. With the advancement of digitalization, more users prefer to perform financial activities online. This has led to the generation of a huge volume of financial content, which most investors prefer to go through before making decisions. Every industry has terms that are specific to the domain it operates in, and Banking and Financial Services are no exception. To fully comprehend this content, one needs a thorough understanding of the financial terms. Getting a basic idea about a term becomes easy when it is explained with the help of the broad category to which it belongs. This broad category is referred to as a hypernym. For example, "bond" is a hypernym of the financial term "alternative debenture". In this paper, we propose a system capable of extracting and ranking hypernyms for a given financial term. The system has been trained with financial text corpora obtained from various sources such as DBpedia [4], Investopedia, the Financial Industry Business Ontology (FIBO), prospectuses, and so on. Embeddings of these terms have been extracted using FinBERT [3] and FinISH [1], and fine-tuned using SentenceBERT [54]. A novel approach that uses the hierarchy present in FIBO has been used to augment the training set with negative samples. Finally, we benchmark the system's performance against that of existing systems. We establish that it performs better than the existing ones and is also scalable.

* Our code base: https://github.com/sohomghosh/FinSim_Financial_Hypernym_detection

Delivery Issues Identification from Customer Feedback Data

Dec 26, 2021

Ankush Chopra, Mahima Arora, Shubham Pandey

Abstract: Millions of packages are delivered successfully by online and local retail stores across the world every day. Proper delivery of packages is needed to ensure high customer satisfaction and repeat purchases. Despite the stores' best efforts, these deliveries suffer from various problems. These issues arise not only from the large volume and the high demand for low turnaround time but also from mechanical operations and natural factors. They range from receiving wrong items in the package to delayed shipment to packages damaged by mishandling during transportation. Finding solutions to the delivery issues faced by both sending and receiving parties plays a vital role in increasing the efficiency of the entire process. This paper shows how to find these issues using customer feedback from text comments and uploaded images. We used transfer learning for both the text and image models to minimize the demand for thousands of labeled examples. The results show that the model can find different issues. Furthermore, it can also be used for tasks like bottleneck identification, process improvement, and automating refunds. Compared with the existing process, the ensemble of text and image models proposed in this paper identifies several types of delivery issues, which makes it more suitable for real-life delivery scenarios in retail businesses. This method can supply a new idea for issue detection in package delivery in similar industries.

* Accepted at MLDS 2022; to be published in the Lattice journal

Using Natural Language Processing to Understand Reasons and Motivators Behind Customer Calls in Financial Domain

Oct 18, 2021

Ankit Patil, Ankush Chopra, Sohom Ghosh, Vamshi Vadla

Abstract: In this era of abundant digital information, customer satisfaction has become one of the prominent factors in the success of any business. Customers want a one-click solution for almost everything. They tend to become dissatisfied if they have to call about something that they could have done online. Moreover, incoming calls are a high-cost component for any business. Thus, it is essential to develop a framework capable of mining the reasons and motivators behind customer calls. This paper proposes two models. The first is an attention-based stacked bidirectional Long Short-Term Memory network followed by hierarchical clustering, which extracts these reasons from transcripts of inbound calls. The second is a set of ensemble models based on probabilities from Support Vector Machines and Logistic Regression, which detects the factors that led to these calls. Extensive evaluation shows the effectiveness of these models.

* Accepted at ICCMDE-2021. To be published in Springer - Lecture Notes on Data Engineering and Communications Technologies

Data Driven Content Creation using Statistical and Natural Language Processing Techniques for Financial Domain

Sep 07, 2021

Ankush Chopra, Prateek Nagwanshi, Sohom Ghosh

Abstract: Over the years, customers' expectation of getting information instantaneously has given rise to increased usage of channels like virtual assistants. Typically, customers first try to get their questions answered through low-touch channels like search and virtual assistants before getting in touch with a live chat agent or a phone representative. Higher usage of these low-touch systems is a win-win for both customers and the organization, since it enables organizations to attain a low cost of service while customers get served without delay. In this paper, we propose a two-part framework. The first part describes methods to combine the information from different interaction channels like call, search, and chat. We do this by summarizing (using a stacked Bi-LSTM network) the high-touch interaction channel data, such as call and chat, into short search-query-like customer intents and then creating an organically grown intent taxonomy from interaction data (using Hierarchical Agglomerative Clustering). The second part of the framework focuses on extracting customer questions by analyzing interaction data sources. It calculates similarity scores using TF-IDF and BERT (Devlin et al., 2019). It also maps these identified questions to the output of the first part of the framework using syntactic and semantic similarity.

* In Proceedings of The 3rd Financial Narrative Processing Workshop (FNP 2021) [To be published in ACL Anthology]
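
The TF-IDF similarity scoring mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, and the function names `tfidf_vectors` and `cosine` are assumptions made only for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of strings.

    Assumptions: lowercase whitespace tokenization and a smoothed IDF,
    idf(t) = ln((1 + N) / (1 + df(t))) + 1.
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (cnt / len(toks)) * idf[t] for t, cnt in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical customer questions, invented for this example.
docs = ["how do i reset my password", "reset my password", "wire transfer fee"]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Questions that share no tokens score exactly zero under this scheme, which is one reason a syntactic score like TF-IDF is typically paired with a semantic one such as BERT embeddings.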

Term Expansion and FinBERT fine-tuning for Hypernym and Synonym Ranking of Financial Terms

Jul 29, 2021

Ankush Chopra, Sohom Ghosh

Abstract: Hypernym and synonym matching is one of the mainstream Natural Language Processing (NLP) tasks. In this paper, we present systems that attempt to solve this problem. We designed these systems to participate in FinSim-3, a shared task of the FinNLP workshop at IJCAI-2021. The shared task is focused on solving this problem for the financial domain. We experimented with various transformer-based pre-trained embeddings by fine-tuning them for either classification or phrase similarity tasks. We also augmented the provided dataset with abbreviations derived from the prospectuses provided by the organizers and with definitions of the financial terms from DBpedia [Auer et al., 2007], Investopedia, and the Financial Industry Business Ontology (FIBO). Our best performing system uses both FinBERT [Araci, 2019] and data augmentation from the aforementioned sources. We observed that term expansion using data augmentation in conjunction with semantic similarity is beneficial for this task and could be useful for other tasks that deal with short phrases. Our best performing model (Accuracy: 0.917, Rank: 1.156) was developed by fine-tuning SentenceBERT [Reimers et al., 2019] (with FinBERT at the backend) over an extended labelled set created using the hierarchy of labels present in FIBO.

* Accepted at 3rd Workshop on Financial Technology and Natural Language Processing (FinNLP) in conjunction with 30th International Joint Conference on Artificial Intelligence (IJCAI) 2021
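
The abstract above reports an accuracy and a rank score for a system that ranks candidate hypernyms. As a rough sketch of how such metrics are commonly computed, assuming accuracy means the gold label is ranked first and rank means the average 1-based position of the gold label, one might write the following. The function name `evaluate`, the penalty for a missing label, and the example labels are hypothetical and not taken from the paper or the shared task's scorer.

```python
def evaluate(ranked_predictions, gold_labels):
    """Return (accuracy, mean_rank) for ranked label lists.

    ranked_predictions: one list of candidate labels per term, best first.
    gold_labels: the correct label for each term, in the same order.
    Assumption: a gold label absent from the list gets rank len(list) + 1.
    """
    hits = 0
    rank_sum = 0
    for ranked, gold in zip(ranked_predictions, gold_labels):
        if ranked and ranked[0] == gold:
            hits += 1
        rank = ranked.index(gold) + 1 if gold in ranked else len(ranked) + 1
        rank_sum += rank
    n = len(gold_labels)
    return hits / n, rank_sum / n

# Hypothetical rankings for two financial terms.
ranked = [["Bonds", "Funds", "Swap"], ["Funds", "Bonds", "Swap"]]
gold = ["Bonds", "Bonds"]
accuracy, mean_rank = evaluate(ranked, gold)
print(accuracy, mean_rank)  # 0.5 1.5
```

Under this reading, a mean rank close to 1 indicates that the correct hypernym is usually at or near the top of the list.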

Using Transformer based Ensemble Learning to classify Scientific Articles

Feb 19, 2021

Sohom Ghosh, Ankush Chopra

Abstract: Reviewers often fail to appreciate a researcher's novel ideas and provide generic feedback. Thus, proper assignment of reviewers based on their area of expertise is necessary. Moreover, reading every paper from end to end in order to assign it to a reviewer is a tedious task. In this paper, we describe the system that our team, FideLIPI, submitted to the shared task of SDPRA-2021 [14]. It comprises four independent sub-systems capable of classifying abstracts of scientific literature into one of seven given classes. The first is a RoBERTa [10] based model built over these abstracts. Adding topic model / Latent Dirichlet Allocation (LDA) [2] based features to the first model results in the second sub-system. The third is a sentence-level RoBERTa [10] model. The fourth is a Logistic Regression model built using Term Frequency Inverse Document Frequency (TF-IDF) features. We ensemble the predictions of these four sub-systems using majority voting to develop the final system, which gives an F1 score of 0.93 on the test and validation sets. This outperforms the existing State Of The Art (SOTA) model, SciBERT [1], in terms of F1 score on the validation set. Our codebase is available at https://github.com/SDPRA-2021/shared-task/tree/main/FideLIPI

* 8 pages, 3 tables, 1 figure, Accepted at SDPRA-2021 (Collocated with PAKDD 2021)
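
The majority-voting step described in the abstract above can be sketched as follows. The abstract does not say how ties among the four sub-systems are resolved, so the tie-breaking rule here (prefer the label from the earliest sub-system in the list) is an assumption, as are the function names and the class labels in the example.

```python
from collections import Counter

def majority_vote(votes):
    """Return the most frequent label among the votes.

    Assumption: ties are broken in favour of the label that appears
    first in `votes`, i.e. the earliest-listed sub-system wins.
    """
    counts = Counter(votes)
    top = max(counts.values())
    for label in votes:
        if counts[label] == top:
            return label

def ensemble(per_model_predictions):
    """Combine predictions from several models, one list per model,
    aligned so that position i in each list refers to abstract i."""
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]

# Hypothetical predictions from four sub-systems for three abstracts.
predictions = [
    ["cs.AI", "math", "physics"],
    ["cs.AI", "cs.LG", "physics"],
    ["cs.LG", "cs.LG", "q-bio"],
    ["cs.AI", "math", "q-bio"],
]
print(ensemble(predictions))  # ['cs.AI', 'math', 'physics']
```

With an even number of voters, two-against-two splits are possible, so whatever tie rule is chosen effectively gives one sub-system priority.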

Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity

Jan 07, 2021

Ankush Chopra, Shruti Agrawal, Sohom Ghosh

Abstract: Search is one of the most common platforms used to seek information. However, users are often overloaded with results whenever they use such a platform to resolve their queries. Nowadays, direct answers to queries are being provided as a part of the search experience. The question-answer (QA) retrieval process plays a significant role in enriching the search experience. Most off-the-shelf Semantic Textual Similarity models work fine for well-formed search queries, but their performance degrades in a domain-specific setting where incomplete or grammatically ill-formed search queries are prevalent. In this paper, we discuss a framework for calculating similarities between a given input query and a set of predefined questions to retrieve the question that matches it best. We have used it for the financial domain, but the framework is generalized for any domain-specific search engine and can be used in other domains as well. We use a Siamese network [6] over Long Short-Term Memory (LSTM) [3] models to train a classifier which generates unnormalized and normalized similarity scores for a given pair of questions. Moreover, for each of these question pairs, we calculate three other similarity scores: the cosine similarity between their average word2vec embeddings [15], the cosine similarity between their sentence embeddings [7] generated using RoBERTa [17], and their customized fuzzy-match score. Finally, we develop a metaclassifier using Support Vector Machines [19] that combines these five scores to detect whether a given pair of questions is similar. We benchmark our model's performance against existing State Of The Art (SOTA) models on the Quora Question Pairs (QQP) dataset as well as a dataset specific to the financial domain.

* 8 pages, accepted in the Proceedings of the 3rd International Conference on Algorithms, Computing and Artificial Intelligence (ACAI), 2020
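
One of the five scores named in the abstract above, the cosine similarity between averaged word embeddings, can be sketched in a few lines. The two-dimensional toy vectors below stand in for real word2vec embeddings and are invented for illustration; skipping out-of-vocabulary tokens is likewise an assumption, not a detail confirmed by the abstract.

```python
import math

def average_embedding(tokens, vectors):
    """Average the vectors of the tokens that have an embedding.

    Returns None when no token is in the vocabulary (assumption:
    out-of-vocabulary tokens are simply skipped).
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy embeddings, hypothetical values chosen only for this example.
toy_vectors = {
    "reset": [1.0, 0.0],
    "change": [1.0, 0.0],
    "password": [0.0, 1.0],
    "fee": [-1.0, 0.0],
}
query = average_embedding(["reset", "password"], toy_vectors)
question = average_embedding(["change", "my", "password"], toy_vectors)
print(cosine(query, question))
```

In the framework described above, a score like this would be one input feature among five to the metaclassifier rather than a decision rule by itself.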
