Abstract:This study presents Weighted Sampled Split Learning (WSSL), an innovative framework tailored to bolster privacy, robustness, and fairness in distributed machine learning systems. Unlike traditional approaches, WSSL disperses the learning process among multiple clients, thereby safeguarding data confidentiality. Central to WSSL's efficacy is its utilization of weighted sampling. This approach ensures equitable learning by tactically selecting influential clients based on their contributions. Our evaluation of WSSL spanned various client configurations and employed two distinct datasets: Human Gait Sensor and CIFAR-10. We observed three primary benefits: heightened model accuracy, enhanced robustness, and maintained fairness across diverse client compositions. Notably, our distributed frameworks consistently surpassed centralized counterparts, registering accuracy peaks of 82.63% and 75.51% for the Human Gait Sensor and CIFAR-10 datasets, respectively. These figures contrast with the top accuracies of 81.12% and 58.60% achieved by centralized systems. Collectively, our findings champion WSSL as a potent and scalable successor to conventional centralized learning, marking it as a pivotal stride forward in privacy-focused, resilient, and impartial distributed machine learning.
Abstract:Digital cryptocurrencies such as Bitcoin have exploded in recent years in both popularity and value. By their novelty, cryptocurrencies tend to be both volatile and highly speculative. The capricious nature of these coins is helped facilitated by social media networks such as Twitter. However, not everyone's opinion matters equally, with most posts garnering little to no attention. Additionally, the majority of tweets are retweeted from popular posts. We must determine whose opinion matters and the difference between influential and non-influential users. This study separates these two groups and analyzes the differences between them. It uses Hypertext-induced Topic Selection (HITS) algorithm, which segregates the dataset based on influence. Topic modeling is then employed to uncover differences in each group's speech types and what group may best represent the entire community. We found differences in language and interest between these two groups regarding Bitcoin and that the opinion leaders of Twitter are not aligned with the majority of users. There were 2559 opinion leaders (0.72% of users) who accounted for 80% of the authority and the majority (99.28%) users for the remaining 20% out of a total of 355,139 users.
Abstract:Online news and information sources are convenient and accessible ways to learn about current issues. For instance, more than 300 million people engage with posts on Twitter globally, which provides the possibility to disseminate misleading information. There are numerous cases where violent crimes have been committed due to fake news. This research presents the CovidMis20 dataset (COVID-19 Misinformation 2020 dataset), which consists of 1,375,592 tweets collected from February to July 2020. CovidMis20 can be automatically updated to fetch the latest news and is publicly available at: https://github.com/everythingguy/CovidMis20. This research was conducted using Bi-LSTM deep learning and an ensemble CNN+Bi-GRU for fake news detection. The results showed that, with testing accuracy of 92.23% and 90.56%, respectively, the ensemble CNN+Bi-GRU model consistently provided higher accuracy than the Bi-LSTM model.