Hamid Beigy

AI Group, Computer Engineering Department, Sharif University of Technology

Zero-Shot Learning and Key Points Are All You Need for Automated Fact-Checking

Aug 15, 2024

Mohammad Ghiasvand Mohammadkhani, Ali Ghiasvand Mohammadkhani, Hamid Beigy

Abstract: Automated fact-checking is an important task because determining the accurate status of a proposed claim within the vast amount of information available online is a critical challenge. This challenge requires robust evaluation to prevent the spread of false information. Modern large language models (LLMs) have demonstrated high capability in performing a diverse range of Natural Language Processing (NLP) tasks. With proper prompting strategies, their versatility, which stems from their understanding of large context sizes and their zero-shot learning ability, enables them to simulate human problem-solving intuition and move towards being an alternative to humans for solving problems. In this work, we introduce a straightforward framework based on Zero-Shot Learning and Key Points (ZSL-KeP) for automated fact-checking, which, despite its simplicity, performed well on the AVeriTeC shared task dataset by robustly improving on the baseline and achieving 10th place.

Consistency Training by Synthetic Question Generation for Conversational Question Answering

Apr 17, 2024

Hamed Hematian Hemati, Hamid Beigy

Abstract: Efficiently modeling historical information is a critical component in addressing user queries within a conversational question-answering (QA) context, as historical context plays a vital role in clarifying the user's questions. However, irrelevant history induces noise in the reasoning process, especially for those questions with a considerable historical context. In our novel model-agnostic approach, referred to as CoTaH (Consistency-Trained augmented History), we augment the historical information with synthetic questions and subsequently employ consistency training to train a model that utilizes both real and augmented historical data to implicitly make the reasoning robust to irrelevant history. To the best of our knowledge, this is the first instance of research using question generation as a form of data augmentation to model conversational QA settings. By citing a common modeling error prevalent in previous research, we introduce a new baseline model and compare our model's performance against it, demonstrating an improvement in results, particularly when dealing with questions that include a substantial amount of historical context. The source code can be found on our GitHub page.
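The consistency-training idea in this abstract can be illustrated with a generic loss: a supervised term on the prediction made with the real history, plus a penalty when the prediction made with the augmented history diverges from it. This is only a minimal sketch of consistency training in general, not the CoTaH objective; the function names, the KL direction, and the weight `lam` are assumptions made for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def consistency_loss(p_real, p_augmented, gold_index, lam=1.0):
    """Cross-entropy on the real-history prediction plus a KL consistency term.

    p_real:      answer distribution predicted from the real history
    p_augmented: answer distribution predicted from the synthetically augmented history
    gold_index:  index of the correct answer
    lam:         hypothetical weight of the consistency term
    """
    supervised = -math.log(p_real[gold_index])
    consistency = kl_divergence(p_real, p_augmented)
    return supervised + lam * consistency
```

When both histories yield the same distribution, the consistency term vanishes and only the supervised term remains, so the penalty grows only as the augmented history changes the model's answer.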

Multi-BERT: Leveraging Adapters and Prompt Tuning for Low-Resource Multi-Domain Adaptation

Apr 02, 2024

Parham Abed Azad, Hamid Beigy

Abstract: The rapid expansion of the volume and diversity of texts presents formidable challenges in multi-domain settings. These challenges are also visible in Persian named entity recognition (NER) settings. Traditional approaches, either employing a unified model for multiple domains or individual models for each domain, frequently pose significant limitations. Single models often struggle to capture the nuances of diverse domains, while utilizing multiple large models can lead to resource constraints, rendering the training of a model for each domain virtually impractical. Therefore, this paper introduces a novel approach composed of one core model with multiple sets of domain-specific parameters. We utilize techniques such as prompt tuning and adapters, combined with the incorporation of additional layers, to add parameters that we can train for the specific domains. This enables the model to perform comparably to individual models for each domain. Experimental results on different formal and informal datasets show that by employing these added parameters, the proposed model significantly surpasses existing practical models in performance. Remarkably, the proposed model requires only one instance for training and storage, yet achieves outstanding results across all domains, even surpassing the state-of-the-art in some. Moreover, we analyze each adaptation strategy, delineating its strengths, weaknesses, and optimal hyper-parameters for the Persian NER settings. Finally, we introduce a document-based domain detection pipeline tailored for scenarios with unknown text domains, enhancing the adaptability and practicality of this paper in real-world applications.

FaBERT: Pre-training BERT on Persian Blogs

Feb 09, 2024

Mostafa Masumi, Seyed Soroush Majd, Mehrnoush Shamsfard, Hamid Beigy

Abstract: We introduce FaBERT, a Persian BERT-base model pre-trained on the HmBlogs corpus, encompassing both informal and formal Persian texts. FaBERT is designed to excel in traditional Natural Language Understanding (NLU) tasks, addressing the intricacies of diverse sentence structures and linguistic styles prevalent in the Persian language. In our comprehensive evaluation of FaBERT on 12 datasets in various downstream tasks, encompassing Sentiment Analysis (SA), Named Entity Recognition (NER), Natural Language Inference (NLI), Question Answering (QA), and Question Paraphrasing (QP), it consistently demonstrated improved performance, all achieved within a compact model size. The findings highlight the importance of utilizing diverse and cleaned corpora, such as HmBlogs, to enhance the performance of language models like BERT in Persian Natural Language Processing (NLP) applications. FaBERT is openly accessible at https://huggingface.co/sbunlp/fabert

PCoQA: Persian Conversational Question Answering Dataset

Dec 07, 2023

Hamed Hematian Hemati, Atousa Toghyani, Atena Souri, Sayed Hesam Alavian, Hossein Sameti, Hamid Beigy

Abstract: Humans seek information on a specific topic by holding a conversation consisting of a series of questions and answers. In the pursuit of conversational question answering research, we introduce PCoQA, the first Persian Conversational Question Answering dataset, a resource comprising information-seeking dialogs encompassing a total of 9,026 contextually-driven questions. Each dialog involves a questioner, a responder, and a document from Wikipedia; the questioner asks several inter-connected questions about the text, and the responder provides a span of the document as the answer to each question. PCoQA is designed to present novel challenges compared to previous question answering datasets, including more open-ended non-factual answers, longer answers, and fewer lexical overlaps. This paper not only presents the comprehensive PCoQA dataset but also reports the performance of various benchmark models. Our models include baseline models and pre-trained models, which are leveraged to boost the performance of the model. The dataset and benchmarks are available at our GitHub page.

KhabarChin: Automatic Detection of Important News in the Persian Language

Dec 06, 2023

Hamed Hematian Hemati, Arash Lagzian, Moein Salimi Sartakhti, Hamid Beigy, Ehsaneddin Asgari

Abstract: Being aware of important news is crucial for staying informed and making well-informed decisions efficiently. Natural Language Processing (NLP) approaches can significantly automate this process. This paper introduces the detection of important news, a previously unexplored area, and presents a new benchmarking dataset (KhabarChin) for detecting important news in the Persian language. We define important news articles as those deemed significant for a considerable portion of society, capable of influencing their mindset or decision-making. The news articles are obtained from seven different prominent Persian news agencies, resulting in the annotation of 7,869 samples and the creation of the dataset. Two challenges, high annotator disagreement and class imbalance, were faced, and solutions were provided for them. We also propose several learning-based models, ranging from conventional machine learning to state-of-the-art transformer models, to tackle this task. Furthermore, we introduce the second task of important sentence detection in news articles, as they often come with a significant contextual length that makes it challenging for readers to identify important information. We identify these sentences in a weakly supervised manner.

* 8 pages, 2 figures

Persian Natural Language Inference: A Meta-learning approach

May 18, 2022

Heydar Soudani, Mohammad Hassan Mojab, Hamid Beigy

Abstract: Incorporating information from other languages can improve the results of tasks in low-resource languages. A powerful method of building functional natural language processing systems for low-resource languages is to combine multilingual pre-trained representations with cross-lingual transfer learning. In general, however, shared representations are learned separately, either across tasks or across languages. This paper proposes a meta-learning approach for natural language inference in Persian. Alternately, meta-learning uses different task information (such as QA in Persian) or other language information (such as natural language inference in English). We also investigate the role of a task augmentation strategy for forming additional high-quality tasks. We evaluate the proposed method using four languages and an auxiliary task. The proposed model consistently outperforms the baseline approach, improving accuracy by roughly six percent. We also examine the effect of finding appropriate initial parameters using zero-shot evaluation and CCA similarity.

* 8 pages, 3 figures, code is available at https://github.com/HassanMojab/MetaNLI, to be published in COLING 2022

Gransformer: Transformer-based Graph Generation

Mar 25, 2022

Ahmad Khajenezhad, Seyed Ali Osia, Mahmood Karimian, Hamid Beigy

Abstract: Transformers have become widely used in modern models for various tasks such as natural language processing and machine vision. This paper proposes Gransformer, an algorithm for generating graphs that takes advantage of the transformer. We extend a simple autoregressive transformer encoder to exploit the structural information of the graph through efficient modifications. The attention mechanism is modified to consider the presence or absence of edges between each pair of nodes. We also introduce a graph-based familiarity measure that applies to both the attention and the positional encoding. This autoregressive criterion, inspired by message passing algorithms, contains structural information about the graph. In the output layer, we also use a masked autoencoder for density estimation to efficiently model the generation of dependent edges. We also propose a technique to prevent the model from generating isolated nodes. We evaluate this method on two real-world datasets and compare it with some state-of-the-art autoregressive graph generation methods. Experimental results show that the proposed method performs comparably to these methods, including recurrent models and graph convolutional networks.

Supervised Fuzzy Partitioning

Oct 22, 2018

Pooya Ashtari, Fateme Nateghi Haredasht, Hamid Beigy

Abstract: Centroid-based methods including k-means and fuzzy c-means are known as effective and easy-to-implement approaches to clustering in many applications. However, these algorithms cannot be directly applied to supervised tasks. This paper thus presents a generative model extending the centroid-based clustering approach to be applicable to classification and regression tasks. Given an arbitrary loss function, the proposed approach, termed Supervised Fuzzy Partitioning (SFP), incorporates label information into its objective function through a surrogate term penalizing the empirical risk. Entropy-based regularization is also employed to fuzzify the partition and to weight features, enabling the method to capture more complex patterns, identify significant features, and yield better performance on high-dimensional data. An iterative algorithm based on a block coordinate descent scheme is formulated to efficiently find a local optimum. Extensive classification experiments on synthetic, real-world, and high-dimensional datasets demonstrate that the predictive performance of SFP is competitive with state-of-the-art algorithms such as random forest and SVM. SFP has a major advantage over such methods, in that it not only leads to a flexible, nonlinear model but also can exploit any convex loss function in the training phase without compromising computational efficiency.

* 29 pages, 5 figures, 6 tables, 1 algorithm
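To make the "entropy-based regularization to fuzzify the partition" idea concrete, the sketch below computes soft memberships for a single point under a plain, unsupervised entropy-regularized objective, where minimizing the membership-weighted squared distance plus `gamma` times the negative entropy of the memberships gives a softmax over negative distances. This is not the SFP objective, which also includes a supervised risk term and feature weights; the function name and the parameter `gamma` are assumptions made for illustration.

```python
import math

def entropy_regularized_memberships(point, centroids, gamma=1.0):
    """Soft cluster memberships: softmax of negative squared distances, scaled by gamma.

    Larger gamma gives a fuzzier partition; gamma close to 0 approaches a hard
    k-means-style assignment.
    """
    dists = [sum((x - c) ** 2 for x, c in zip(point, centroid))
             for centroid in centroids]
    d_min = min(dists)  # subtract the minimum for numerical stability
    scores = [math.exp(-(d - d_min) / gamma) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]
```

For example, `entropy_regularized_memberships((0.0, 0.0), [(0.0, 0.0), (1.0, 0.0)])` assigns the larger membership to the first centroid, and the memberships always sum to one.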

Cascading Randomized Weighted Majority: A New Online Ensemble Learning Algorithm

Feb 02, 2015

Mohammadzaman Zamani, Hamid Beigy, Amirreza Shaban

Abstract: With the increasing volume of data in the world, the best approach for learning from this data is to exploit an online learning algorithm. Online ensemble methods are online algorithms which take advantage of an ensemble of classifiers to predict labels of data. Prediction with expert advice is a well-studied problem in the online ensemble learning literature. The Weighted Majority algorithm and the randomized weighted majority (RWM) are the most well-known solutions to this problem, aiming to converge to the best expert. Since, among a set of experts, the best one does not necessarily have the minimum error in all regions of the data space, defining specific regions and converging to the best expert in each of these regions will lead to a better result. In this paper, we aim to resolve this defect of RWM algorithms by proposing a novel online ensemble algorithm for the problem of prediction with expert advice. We propose a cascading version of RWM to achieve not only better experimental results but also a better error bound for sufficiently large datasets.

* 15 pages, 3 figures
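For readers unfamiliar with the baseline this abstract builds on, here is a minimal sketch of the standard randomized weighted majority (RWM) update. It is the textbook algorithm, not the cascading variant the paper proposes; the function name, the penalty factor `beta`, and the seeding are assumptions made for illustration.

```python
import random

def randomized_weighted_majority(expert_preds, labels, beta=0.5, seed=0):
    """Run standard RWM over a sequence of rounds.

    expert_preds: list of rounds, each a list with one prediction per expert
    labels:       true label for each round
    beta:         multiplicative penalty in (0, 1) applied to experts that err
    Returns the final expert weights and the number of mistakes made.
    """
    rng = random.Random(seed)
    n_experts = len(expert_preds[0])
    weights = [1.0] * n_experts
    mistakes = 0
    for preds, y in zip(expert_preds, labels):
        # Follow one expert, sampled with probability proportional to its weight.
        chosen = rng.choices(range(n_experts), weights=weights, k=1)[0]
        if preds[chosen] != y:
            mistakes += 1
        # Shrink the weight of every expert that predicted incorrectly.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return weights, mistakes
```

Because every expert shares a single weight across the whole input space, the procedure converges towards one globally best expert, which is the limitation the abstract describes.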
