Abstract:Artificial Intelligence (AI) is revolutionizing various fields, including public health surveillance. In Africa, where health systems frequently encounter challenges such as limited resources, inadequate infrastructure, failed health information systems and a shortage of skilled health professionals, AI offers a transformative opportunity. This paper investigates the applications of AI in public health surveillance across the continent, presenting successful case studies and examining the benefits, opportunities, and challenges of implementing AI technologies in African healthcare settings. Our paper highlights AI's potential to enhance disease monitoring and health outcomes, and support effective public health interventions. The findings presented in the paper demonstrate that AI can significantly improve the accuracy and timeliness of disease detection and prediction, optimize resource allocation, and facilitate targeted public health strategies. Additionally, our paper identified key barriers to the widespread adoption of AI in African public health systems and proposed actionable recommendations to overcome these challenges.
Abstract:Deep learning (DL) has emerged as a crucial tool in network anomaly detection (NAD) for cybersecurity. While DL models for anomaly detection excel at extracting features and learning patterns from data, they are vulnerable to data contamination -- the inadvertent inclusion of attack-related data in training sets presumed benign. This study evaluates the robustness of six unsupervised DL algorithms against data contamination using our proposed evaluation protocol. Results demonstrate significant performance degradation in state-of-the-art anomaly detection algorithms when exposed to contaminated data, highlighting the critical need for self-protection mechanisms in DL-based NAD models. To mitigate this vulnerability, we propose an enhanced auto-encoder with a constrained latent representation, allowing normal data to cluster more densely around a learnable center in the latent space. Our evaluation reveals that this approach exhibits improved resistance to data contamination compared to existing methods, offering a promising direction for more robust NAD systems.
Abstract:The increasing sophistication of cyber threats necessitates innovative approaches to cybersecurity. In this paper, we explore the potential of psychological profiling techniques, particularly focusing on the utilization of Large Language Models (LLMs) and psycholinguistic features. We investigate the intersection of psychology and cybersecurity, discussing how LLMs can be employed to analyze textual data for identifying psychological traits of threat actors. We explore the incorporation of psycholinguistic features, such as linguistic patterns and emotional cues, into cybersecurity frameworks. \iffalse Through case studies and experiments, we discuss the effectiveness of these methods in enhancing threat detection and mitigation strategies.\fi Our research underscores the importance of integrating psychological perspectives into cybersecurity practices to bolster defense mechanisms against evolving threats.
Abstract:This paper provides a comprehensive survey of recent advancements in leveraging machine learning techniques, particularly Transformer models, for predicting human mobility patterns during epidemics. Understanding how people move during epidemics is essential for modeling the spread of diseases and devising effective response strategies. Forecasting population movement is crucial for informing epidemiological models and facilitating effective response planning in public health emergencies. Predicting mobility patterns can enable authorities to better anticipate the geographical and temporal spread of diseases, allocate resources more efficiently, and implement targeted interventions. We review a range of approaches utilizing both pretrained language models like BERT and Large Language Models (LLMs) tailored specifically for mobility prediction tasks. These models have demonstrated significant potential in capturing complex spatio-temporal dependencies and contextual patterns in textual data.
Abstract:This paper scrutinizes a database of over 4900 YouTube videos to characterize financial market coverage. Financial market coverage generates a large number of videos. Therefore, watching these videos to derive actionable insights could be challenging and complex. In this paper, we leverage Whisper, a speech-to-text model from OpenAI, to generate a text corpus of market coverage videos from Bloomberg and Yahoo Finance. We employ natural language processing to extract insights regarding language use from the market coverage. Moreover, we examine the prominent presence of trending topics and their evolution over time, and the impacts that some individuals and organizations have on the financial market. Our characterization highlights the dynamics of the financial market coverage and provides valuable insights reflecting broad discussions regarding recent financial events and the world economy.
Abstract:Electronic Health Record (EHR) has become an essential tool in the healthcare ecosystem, providing authorized clinicians with patients' health-related information for better treatment. While most developed countries are taking advantage of EHRs to improve their healthcare system, it remains challenging in developing countries to support clinical decision-making and public health using a computerized patient healthcare information system. This paper proposes a novel EHR architecture suitable for developing countries--an architecture that fosters inclusion and provides solutions tailored to all social classes and socioeconomic statuses. Our architecture foresees an internet-free (offline) solution to allow medical transactions between healthcare organizations, and the storage of EHRs in geographically underserved and rural areas. Moreover, we discuss how artificial intelligence can leverage anonymous health-related information to enable better public health policy and surveillance.
Abstract:Anomaly detection has many applications ranging from bank-fraud detection and cyber-threat detection to equipment maintenance and health monitoring. However, choosing a suitable algorithm for a given application remains a challenging design decision, often informed by the literature on anomaly detection algorithms. We extensively reviewed twelve of the most popular unsupervised anomaly detection methods. We observed that, so far, they have been compared using inconsistent protocols - the choice of the class of interest or the positive class, the split of training and test data, and the choice of hyperparameters - leading to ambiguous evaluations. This observation led us to define a coherent evaluation protocol which we then used to produce an updated and more precise picture of the relative performance of the twelve methods on five widely used tabular datasets. While our evaluation cannot pinpoint a method that outperforms all the others on all datasets, it identifies those that stand out and revise misconceived knowledge about their relative performances.