Abstract:Authorship has entangled style and content inside. Authors frequently write about the same topics in the same style, so when different authors write about the exact same topic the easiest way out to distinguish them is by understanding the nuances of their style. Modern neural models for authorship can pick up these features using contrastive learning, however, some amount of content leakage is always present. Our aim is to reduce the inevitable impact and correlation between content and authorship. We present a technique to use contrastive learning (InfoNCE) with additional hard negatives synthetically created using a semantic similarity model. This disentanglement technique aims to distance the content embedding space from the style embedding space, leading to embeddings more informed by style. We demonstrate the performance with ablations on two different datasets and compare them on out-of-domain challenges. Improvements are clearly shown on challenging evaluations on prolific authors with up to a 10% increase in accuracy when the settings are particularly hard. Trials on challenges also demonstrate the preservation of zero-shot capabilities of this method as fine tuning.
Abstract:Visual analytics is essential for studying large time series due to its ability to reveal trends, anomalies, and insights. DeepVATS is a tool that merges Deep Learning (Deep) with Visual Analytics (VA) for the analysis of large time series data (TS). It has three interconnected modules. The Deep Learning module, developed in R, manages the load of datasets and Deep Learning models from and to the Storage module. This module also supports models training and the acquisition of the embeddings from the latent space of the trained model. The Storage module operates using the Weights and Biases system. Subsequently, these embeddings can be analyzed in the Visual Analytics module. This module, based on an R Shiny application, allows the adjustment of the parameters related to the projection and clustering of the embeddings space. Once these parameters are set, interactive plots representing both the embeddings, and the time series are shown. This paper introduces the tool and examines its scalability through log analytics. The execution time evolution is examined while the length of the time series is varied. This is achieved by resampling a large data series into smaller subsets and logging the main execution and rendering times for later analysis of scalability.
Abstract:The Three-Body Problem has fascinated scientists for centuries and it has been crucial in the design of modern space missions. Recent developments in Generative Artificial Intelligence hold transformative promise for addressing this longstanding problem. This work investigates the use of Variational Autoencoder (VAE) and its internal representation to generate periodic orbits. We utilize a comprehensive dataset of periodic orbits in the Circular Restricted Three-Body Problem (CR3BP) to train deep-learning architectures that capture key orbital characteristics, and we set up physical evaluation metrics for the generated trajectories. Through this investigation, we seek to enhance the understanding of how Generative AI can improve space mission planning and astrodynamics research, leading to novel, data-driven approaches in the field.
Abstract:Introduction: This article introduces DisTrack, a methodology and a tool developed for tracking and analyzing misinformation within Online Social Networks (OSNs). DisTrack is designed to combat the spread of misinformation through a combination of Natural Language Processing (NLP) Social Network Analysis (SNA) and graph visualization. The primary goal is to detect misinformation, track its propagation, identify its sources, and assess the influence of various actors within the network. Methods: DisTrack's architecture incorporates a variety of methodologies including keyword search, semantic similarity assessments, and graph generation techniques. These methods collectively facilitate the monitoring of misinformation, the categorization of content based on alignment with known false claims, and the visualization of dissemination cascades through detailed graphs. The tool is tailored to capture and analyze the dynamic nature of misinformation spread in digital environments. Results: The effectiveness of DisTrack is demonstrated through three case studies focused on different themes: discredit/hate speech, anti-vaccine misinformation, and false narratives about the Russia-Ukraine conflict. These studies show DisTrack's capabilities in distinguishing posts that propagate falsehoods from those that counteract them, and tracing the evolution of misinformation from its inception. Conclusions: The research confirms that DisTrack is a valuable tool in the field of misinformation analysis. It effectively distinguishes between different types of misinformation and traces their development over time. By providing a comprehensive approach to understanding and combating misinformation in digital spaces, DisTrack proves to be an essential asset for researchers and practitioners working to mitigate the impact of false information in online social environments.
Abstract:Over the last decade, Unmanned Aerial Vehicles (UAVs) have been extensively used in many commercial applications due to their manageability and risk avoidance. One of the main problems considered is the Mission Planning for multiple UAVs, where a solution plan must be found satisfying the different constraints of the problem. This problem has multiple variables that must be optimized simultaneously, such as the makespan, the cost of the mission or the risk. Therefore, the problem has a lot of possible optimal solutions, and the operator must select the final solution to be executed among them. In order to reduce the workload of the operator in this decision process, a Decision Support System (DSS) becomes necessary. In this work, a DSS consisting of ranking and filtering systems, which order and reduce the optimal solutions, has been designed. With regard to the ranking system, a wide range of Multi-Criteria Decision Making (MCDM) methods, including some fuzzy MCDM, are compared on a multi-UAV mission planning scenario, in order to study which method could fit better in a multi-UAV decision support system. Expert operators have evaluated the solutions returned, and the results show, on the one hand, that fuzzy methods generally achieve better average scores, and on the other, that all of the tested methods perform better when the preferences of the operators are biased towards a specific variable, and worse when their preferences are balanced. For the filtering system, a similarity function based on the proximity of the solutions has been designed, and on top of that, a threshold is tuned empirically to decide how to filter solutions without losing much of the hypervolume of the space of solutions.
Abstract:Management and mission planning over a swarm of unmanned aerial vehicle (UAV) remains to date as a challenging research trend in what regards to this particular type of aircrafts. These vehicles are controlled by a number of ground control station (GCS), from which they are commanded to cooperatively perform different tasks in specific geographic areas of interest. Mathematically the problem of coordinating and assigning tasks to a swarm of UAV can be modeled as a constraint satisfaction problem, whose complexity and multiple conflicting criteria has hitherto motivated the adoption of multi-objective solvers such as multi-objective evolutionary algorithm (MOEA). The encoding approach consists of different alleles representing the decision variables, whereas the fitness function checks that all constraints are fulfilled, minimizing the optimization criteria of the problem. In problems of high complexity involving several tasks, UAV and GCS, where the space of search is huge compared to the space of valid solutions, the convergence rate of the algorithm increases significantly. To overcome this issue, this work proposes a weighted random generator for the creation and mutation of new individuals. The main objective of this work is to reduce the convergence rate of the MOEA solver for multi-UAV mission planning using weighted random strategies that focus the search on potentially better regions of the solution space. Extensive experimental results over a diverse range of scenarios evince the benefits of the proposed approach, which notably improves this convergence rate with respect to a na\"ive MOEA approach.
Abstract:Unmanned Aerial Vehicle (UAVs) have become very popular in the last decade due to some advantages such as strong terrain adaptation, low cost, zero casualties, and so on. One of the most interesting advances in this field is the automation of mission planning (task allocation) and real-time replanning, which are highly useful to increase the autonomy of the vehicle and reduce the operator workload. These automated mission planning and replanning systems require a Human Computer Interface (HCI) that facilitates the visualization and selection of plans that will be executed by the vehicles. In addition, most missions should be assessed before their real-life execution. This paper extends QGroundControl, an open-source simulation environment for flight control of multiple vehicles, by adding a mission designer that permits the operator to build complex missions with tasks and other scenario items; an interface for automated mission planning and replanning, which works as a test bed for different algorithms, and a Decision Support System (DSS) that helps the operator in the selection of the plan. In this work, a complete guide of these systems and some practical use cases are provided.
Abstract:This paper proposes an Intrusion Detection System (IDS) employing the Harris Hawks Optimization algorithm (HHO) to optimize Multilayer Perceptron learning by optimizing bias and weight parameters. HHO-MLP aims to select optimal parameters in its learning process to minimize intrusion detection errors in networks. HHO-MLP has been implemented using EvoloPy NN framework, an open-source Python tool specialized for training MLPs using evolutionary algorithms. For purposes of comparing the HHO model against other evolutionary methodologies currently available, specificity and sensitivity measures, accuracy measures, and mse and rmse measures have been calculated using KDD datasets. Experiments have demonstrated the HHO MLP method is effective at identifying malicious patterns. HHO-MLP has been tested against evolutionary algorithms like Butterfly Optimization Algorithm (BOA), Grasshopper Optimization Algorithms (GOA), and Black Widow Optimizations (BOW), with validation by Random Forest (RF), XG-Boost. HHO-MLP showed superior performance by attaining top scores with accuracy rate of 93.17%, sensitivity level of 89.25%, and specificity percentage of 95.41%.
Abstract:Adversarial attacks represent a substantial challenge in Natural Language Processing (NLP). This study undertakes a systematic exploration of this challenge in two distinct phases: vulnerability evaluation and resilience enhancement of Transformer-based models under adversarial attacks. In the evaluation phase, we assess the susceptibility of three Transformer configurations, encoder-decoder, encoder-only, and decoder-only setups, to adversarial attacks of escalating complexity across datasets containing offensive language and misinformation. Encoder-only models manifest a 14% and 21% performance drop in offensive language detection and misinformation detection tasks, respectively. Decoder-only models register a 16% decrease in both tasks, while encoder-decoder models exhibit a maximum performance drop of 14% and 26% in the respective tasks. The resilience-enhancement phase employs adversarial training, integrating pre-camouflaged and dynamically altered data. This approach effectively reduces the performance drop in encoder-only models to an average of 5% in offensive language detection and 2% in misinformation detection tasks. Decoder-only models, occasionally exceeding original performance, limit the performance drop to 7% and 2% in the respective tasks. Although not surpassing the original performance, Encoder-decoder models can reduce the drop to an average of 6% and 2% respectively. Results suggest a trade-off between performance and robustness, with some models maintaining similar performance while gaining robustness. Our study and adversarial training techniques have been incorporated into an open-source tool for generating camouflaged datasets. However, methodology effectiveness depends on the specific camouflage technique and data encountered, emphasizing the need for continued exploration.
Abstract:Over the last decade, developments in unmanned aerial vehicles (UAVs) has greatly increased, and they are being used in many fields including surveillance, crisis management or automated mission planning. This last field implies the search of plans for missions with multiple tasks, UAVs and ground control stations; and the optimization of several objectives, including makespan, fuel consumption or cost, among others. In this work, this problem has been solved using a multi-objective evolutionary algorithm combined with a constraint satisfaction problem model, which is used in the fitness function of the algorithm. The algorithm has been tested on several missions of increasing complexity, and the computational complexity of the different element considered in the missions has been studied.