Abstract:Deep learning techniques have gained a lot of traction in the field of NLP research. The aim of this paper is to predict the age and gender of an individual by inspecting their written text. We propose a supervised BERT-based classification technique in order to predict the age and gender of bloggers. The dataset used contains 681284 rows of data, with the information of the blogger's age, gender, and text of the blog written by them. We compare our algorithm to previous works in the same domain and achieve a better accuracy and F1 score. The accuracy reported for the prediction of age group was 84.2%, while the accuracy for the prediction of gender was 86.32%. This study relies on the raw capabilities of BERT to predict the classes of textual data efficiently. This paper shows promising capability in predicting the demographics of the author with high accuracy and can have wide applicability across multiple domains.
Abstract:In today's modern digital world, we have a number of online Question and Answer platforms like Stack Exchange, Quora, and GFG that serve as a medium for people to communicate and help each other. In this paper, we analyzed the effectiveness of Stack Overflow in helping newbies to programming. Every user on this platform goes through a journey. For the first 12 months, we consider them to be a newbie. Post 12 months they come under one of the following categories: Experienced, Lurkers, or Inquisitive. Each question asked has tags assigned to it and we observe that questions with some specific tags have a faster response time indicating an active community in that field over others. The platform had a steady growth up to 2013 after which it started declining, but recently during the pandemic 2020, we can see rejuvenated activity on the platform.