Abstract:Social support, conveyed through a multitude of interactions and platforms such as social media, plays a pivotal role in fostering a sense of belonging, aiding resilience in the face of challenges, and enhancing overall well-being. This paper introduces Social Support Detection (SSD) as a natural language processing (NLP) task aimed at identifying supportive interactions within online communities. The study frames SSD as three subtasks: two binary classification tasks and one multiclass task, with labels detailed in the dataset section. We conducted experiments on a dataset comprising 10,000 YouTube comments. Traditional machine learning models were employed, utilizing various feature combinations that encompass linguistic, psycholinguistic, emotional, and sentiment information. Additionally, we experimented with neural network-based models using various word embeddings to enhance the performance of our models across these subtasks. The results reveal a prevalence of group-oriented support in online dialogues, reflecting broader societal patterns. The findings demonstrate the effectiveness of integrating psycholinguistic, emotional, and sentiment features with n-grams in detecting social support and distinguishing whether it is directed toward an individual or a group. The best results for different subtasks across all experiments range from 0.72 to 0.82.
Abstract:This study analyzes predictive statements, hope speech, and regret within cryptocurrency-related discussions, leveraging advanced natural language processing techniques. We introduce a novel classification scheme named "Prediction statements," categorizing comments into Predictive Incremental, Predictive Decremental, Predictive Neutral, or Non-Predictive categories. Employing GPT-4o, a cutting-edge large language model, we explore sentiment dynamics across five prominent cryptocurrencies: Cardano, Binance, Matic, Fantom, and Ripple. Our analysis reveals distinct patterns in predictive sentiments, with Matic demonstrating a notably higher propensity for optimistic predictions. Additionally, we investigate hope and regret sentiments, uncovering a nuanced interplay between these emotions and predictive behaviors. Despite limitations related to data volume and resource availability, our study reports valuable findings concerning investor behavior and sentiment trends within the cryptocurrency market, informing strategic decision-making and future research endeavors.
Abstract:In this paper, we introduce FC-KAN, a Kolmogorov-Arnold Network (KAN) that leverages combinations of popular mathematical functions such as B-splines, wavelets, and radial basis functions on low-dimensional data through element-wise operations. We explore several methods for combining the outputs of these functions, including sum, element-wise product, the addition of sum and element-wise product, quadratic function representation, and concatenation. In our experiments, we compare FC-KAN with a multi-layer perceptron (MLP) network and other existing KANs, such as BSRBF-KAN, EfficientKAN, FastKAN, and FasterKAN, on the MNIST and Fashion-MNIST datasets. A variant of FC-KAN, which uses a combination of outputs from B-splines and Difference of Gaussians (DoG) in the form of a quadratic function, outperformed all other models on the average of 5 independent training runs. We expect that FC-KAN can leverage function combinations to design future KANs. Our repository is publicly available at: https://github.com/hoangthangta/FC_KAN.
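The combination methods named in the FC-KAN abstract (sum, element-wise product, sum plus product, quadratic form, concatenation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, plain Python lists stand in for tensors, and the exact quadratic form used here (both outputs, their product, and their squares) is an assumption, since the abstract does not spell it out.

```python
from typing import List

Vector = List[float]


def _check_same_length(a: Vector, b: Vector) -> None:
    # Element-wise operations only make sense for equally sized outputs.
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")


def combine_sum(a: Vector, b: Vector) -> Vector:
    _check_same_length(a, b)
    return [x + y for x, y in zip(a, b)]


def combine_product(a: Vector, b: Vector) -> Vector:
    _check_same_length(a, b)
    return [x * y for x, y in zip(a, b)]


def combine_sum_product(a: Vector, b: Vector) -> Vector:
    # Addition of the sum and the element-wise product.
    _check_same_length(a, b)
    return [x + y + x * y for x, y in zip(a, b)]


def combine_quadratic(a: Vector, b: Vector) -> Vector:
    # ASSUMED quadratic form: x + y + x*y + x^2 + y^2 (not confirmed by the abstract).
    _check_same_length(a, b)
    return [x + y + x * y + x * x + y * y for x, y in zip(a, b)]


def combine_concat(a: Vector, b: Vector) -> Vector:
    # Concatenation doubles the output width instead of mixing element-wise.
    return list(a) + list(b)


# Hypothetical outputs of two basis functions (e.g. B-spline and DoG) for one input.
bspline_out = [1.0, 2.0]
dog_out = [3.0, 4.0]
print(combine_quadratic(bspline_out, dog_out))
```

Note that concatenation changes the width of the layer output, so a following layer would need a matching input size, whereas the element-wise variants keep the width unchanged.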
Abstract:Identifying misogyny using artificial intelligence is a form of combating online toxicity against women. However, the subjective nature of interpreting misogyny poses a significant challenge to modeling the phenomenon. In this paper, we propose a multitask learning approach that leverages the subjectivity of this task to enhance the performance of misogyny identification systems. We incorporated diverse perspectives from annotators in our model design, considering gender and age across six profile groups, and conducted extensive experiments and error analysis using two language models to validate our four alternative designs of the multitask learning technique to identify misogynistic content in English tweets. The results demonstrate that incorporating various viewpoints enhances the language models' ability to interpret different forms of misogyny. This research advances content moderation and highlights the importance of embracing diverse perspectives to build effective online moderation systems.
Abstract:In the Japanese anime industry, predicting whether an upcoming product will be popular is crucial. This paper presents a dataset and methods for predicting anime popularity using a multimodal text-image dataset constructed exclusively from freely available internet sources. The dataset was built following rigorous standards based on real-life investment experiences. A deep neural network architecture leveraging GPT-2 and ResNet-50 to embed the data was employed to investigate the correlation between the multimodal text-image input and a popularity score, discovering relevant strengths and weaknesses in the dataset. To measure the accuracy of the model, mean squared error (MSE) was used, obtaining a best result of 0.011 when considering all inputs and the full version of the deep neural network, compared to the benchmark MSE of 0.412 obtained with traditional TF-IDF and PILtotensor vectorizations. This is the first proposal to address such a task with multimodal datasets, revealing the substantial benefit of incorporating image information, even when a relatively small model (ResNet-50) was used to embed the images.
Abstract:This study delves into the relationship between emotional trends from X platform data and the market dynamics of five well-known cryptocurrencies (Cardano, Binance, Fantom, Matic, and Ripple) over the period from October 2022 to March 2023. Leveraging SenticNet, we identified emotions like Fear and Anxiety, Rage and Anger, Grief and Sadness, Delight and Pleasantness, Enthusiasm and Eagerness, and Delight and Joy. Following data extraction, we segmented each month into bi-weekly intervals, replicating this process for price data obtained from Yahoo Finance. We then conducted a comparative analysis connecting the emotional trends observed across bi-weekly intervals with cryptocurrency prices, uncovering significant correlations between emotional sentiments and coin valuations.
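The bi-weekly segmentation and comparison described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how a month is split or which correlation measure is used, so the split at day 15 and the Pearson coefficient are assumptions, and all function names and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date
from math import sqrt
from typing import Dict, Iterable, List, Tuple

Record = Tuple[date, float]          # (day, emotion score or closing price)
BucketKey = Tuple[int, int, int]     # (year, month, half of month)


def biweekly_key(d: date) -> BucketKey:
    # ASSUMPTION: days 1-15 form the first interval, the rest the second.
    return (d.year, d.month, 1 if d.day <= 15 else 2)


def biweekly_means(records: Iterable[Record]) -> Dict[BucketKey, float]:
    # Average all values that fall into the same bi-weekly interval.
    buckets: Dict[BucketKey, List[float]] = defaultdict(list)
    for day, value in records:
        buckets[biweekly_key(day)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


def pearson(xs: List[float], ys: List[float]) -> float:
    # Plain Pearson correlation coefficient (an assumed choice of measure).
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


def emotion_price_correlation(emotions: Iterable[Record],
                              prices: Iterable[Record]) -> float:
    # Align both series on the intervals they share, then correlate.
    e, p = biweekly_means(emotions), biweekly_means(prices)
    keys = sorted(e.keys() & p.keys())
    return pearson([e[k] for k in keys], [p[k] for k in keys])


# Hypothetical toy data, not taken from the study.
emotions = [(date(2022, 10, 3), 0.2), (date(2022, 10, 10), 0.4),
            (date(2022, 10, 20), 0.6), (date(2022, 11, 5), 0.9)]
prices = [(date(2022, 10, 7), 1.0), (date(2022, 10, 25), 2.0),
          (date(2022, 11, 9), 3.0)]
print(emotion_price_correlation(emotions, prices))
```

Aligning on shared intervals before correlating avoids comparing an emotion score from one period with a price from another when either source has gaps.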
Abstract:The intricate relationship between human decision-making and emotions, particularly guilt and regret, has significant implications for behavior and well-being. Yet, these emotions' subtle distinctions and interplay are often overlooked in computational models. This paper introduces a dataset tailored to dissect the relationship between guilt and regret and their unique textual markers, filling a notable gap in affective computing research. Our approach treats guilt and regret recognition as a binary classification task and employs three machine learning and six transformer-based deep learning techniques to benchmark the newly created dataset. The study further implements innovative reasoning methods like chain-of-thought and tree-of-thought to assess the models' interpretive logic. The results indicate a clear performance edge for transformer-based models, achieving a 90.4% macro F1 score compared to the 85.3% scored by the best machine learning classifier, demonstrating their superior capability in distinguishing complex emotional states.
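For readers unfamiliar with the macro F1 score reported in the abstract above, the following minimal sketch shows how it is computed for a guilt-versus-regret classifier: F1 is calculated per class and the per-class values are averaged with equal weight. The function name and the toy labels are hypothetical and are not drawn from the paper's dataset.

```python
from typing import List, Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    # Macro F1: unweighted mean of the per-class F1 scores.
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    scores: List[float] = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels for illustration only.
gold = ["guilt", "guilt", "regret", "regret"]
pred = ["guilt", "regret", "regret", "regret"]
print(round(macro_f1(gold, pred), 4))
```

Because each class contributes equally, macro F1 penalizes a model that does well on one emotion but poorly on the other, which matters when the two classes are not perfectly balanced.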
Abstract:In recent years, language models and deep learning techniques have revolutionized natural language processing tasks, including emotion detection. However, the specific emotion of guilt has received limited attention in this field. In this research, we explore the applicability of three transformer-based language models for detecting guilt in text and compare their performance for general emotion detection and guilt detection. Our proposed model outperformed the BERT and RoBERTa models by two points and one point, respectively. Additionally, we analyze the challenges in developing accurate guilt-detection models and evaluate our model's effectiveness in detecting related emotions like "shame" through qualitative analysis of results.
Abstract:In natural language processing (NLP), a lexical function is a concept, first introduced in the Meaning-Text Theory, for unambiguously representing semantic and syntactic features of words and phrases in text. Hierarchical classification of lexical functions involves organizing these features into a tree-like hierarchy of categories or labels. This is a challenging task as it requires a good understanding of the context and the relationships among words and phrases in text. It also needs large amounts of labeled data to train language models effectively. In this paper, we present a dataset of the most frequent Spanish verb-noun collocations and the sentences where they occur. Each collocation is assigned to one of 37 lexical functions defined as classes for a hierarchical classification task. Each class represents a relation between the noun and the verb in a collocation involving their semantic and syntactic features. We combine the classes in a tree-based structure and introduce classification objectives for each level of the structure. The dataset was created by dependency tree parsing and matching of the phrases in Spanish news. We provide baselines and data splits for each objective.
Abstract:In this paper, we investigate the issue of hate speech by presenting a novel task of translating hate speech into non-hate speech text while preserving its meaning. As a case study, we use Spanish texts. We provide a dataset and several baselines as a starting point for further research on this task. We evaluated our baseline results using multiple metrics, including BLEU scores. The aim of this study is to contribute to the development of more effective methods for reducing the spread of hate speech in online communities.