Abstract:NL2SQL (Natural Language to Structured Query Language) transformation has seen wide adoption in Business Intelligence (BI) applications in recent years. However, existing NL2SQL benchmarks are not suitable for production BI scenarios, as they are not designed for common business intelligence questions. To address this gap, we have developed a new benchmark focused on typical NL questions in industrial BI scenarios. We discuss the challenges of constructing a BI-focused benchmark and the shortcomings of existing benchmarks. Additionally, we introduce question categories in our benchmark that reflect common BI inquiries. Lastly, we propose two novel semantic similarity evaluation metrics for assessing NL2SQL capabilities in BI applications and services.
Abstract:Measuring efficiency in neural network system development is an open research problem. This paper presents an experimental framework to measure the training efficiency of a neural architecture. To demonstrate our approach, we analyze the training efficiency of Convolutional Neural Networks (CNNs) and their Bayesian equivalents (BCNNs) on the MNIST and CIFAR-10 tasks. Our results show that training efficiency decays as training progresses and varies across different stopping criteria for a given neural model and learning task. We also find non-linear relationships between training stopping criteria and training efficiency, and between model size and training efficiency. Furthermore, we illustrate the potential confounding effects of overtraining on measuring the training efficiency of a neural architecture. Regarding relative training efficiency across different architectures, our results indicate that CNNs are more efficient than BCNNs on both datasets. More generally, as a learning task becomes more complex, the relative difference in training efficiency between different architectures becomes more pronounced.
Abstract:This study presents a novel methodology for incorporating safety constraints into a robotic simulation during the training of deep reinforcement learning (DRL). The framework integrates specific parts of the safety requirements specified by ISO 10218, such as velocity constraints, directly into the DRL model, so that they become part of the robot's learning algorithm. The study then evaluates the efficiency of these safety constraints by subjecting the DRL model to various scenarios, including grasping tasks with and without obstacle avoidance. The validation process involves comprehensive simulation-based testing of the DRL model's responses to potential hazards and of its compliance. The system's performance is also assessed against the functional safety standard IEC 61508 to determine the safety integrity level. The study indicates a significant improvement in the safety performance of the robotic system. The proposed DRL model anticipates and mitigates hazards while maintaining operational efficiency. The approach was validated in a testbed with a collaborative robotic arm equipped with safety sensors and assessed with metrics such as the average number of safety violations, obstacle avoidance, and the number of successful grasps. The proposed approach outperforms the conventional method by 16.5% in average success rate on the tested scenarios in simulation and by 2.5% in the testbed, without safety violations. The project repository is available at https://github.com/ammar-n-abbas/sim2real-ur-gym-gazebo.
Abstract:Mass customization and shorter manufacturing cycles are becoming more important among small and medium-sized companies. However, classical industrial robots struggle to cope with product variation and dynamic environments. In this paper, we present CoBT, a collaborative programming by demonstration framework for generating reactive and modular behavior trees. CoBT relies on a single demonstration and combines data-driven machine learning methods with logic-based declarative learning to learn a task, thus eliminating the need for programming expertise or long development times. The proposed framework is experimentally validated on 7 manipulation tasks, and we show that CoBT achieves an overall success rate of approximately 93% with an average programming time of 7.5 s. We also conduct a pilot study with non-expert users to gather feedback on the usability of CoBT.
Abstract:Transformer-based Neural Language Models achieve state-of-the-art performance on various natural language processing tasks. However, an open question is the extent to which these models rely on word-order/syntactic or word co-occurrence/topic-based information when processing natural language. This work contributes to this debate by addressing the question of whether these models primarily use topic as a signal, by exploring the relationship between Transformer-based models' (BERT and RoBERTa's) performance on a range of probing tasks in English, from simple lexical tasks such as sentence length prediction to complex semantic tasks such as idiom token identification, and the sensitivity of these tasks to the topic information. To this end, we propose a novel probing method which we call topic-aware probing. Our initial results indicate that Transformer-based models encode both topic and non-topic information in their intermediate layers, but also that the facility of these models to distinguish idiomatic usage is primarily based on their ability to identify and encode topic. Furthermore, our analysis of these models' performance on other standard probing tasks suggests that tasks that are relatively insensitive to the topic information are also tasks that are relatively difficult for these models.
Abstract:In complex industrial and chemical process control rooms, effective decision-making is crucial for safety and efficiency. The experiments in this paper evaluate the impact and applications of an AI-based decision support system integrated into an improved human-machine interface, using dynamic influence diagrams, a hidden Markov model, and deep reinforcement learning. The enhanced support system aims to reduce operator workload, improve situational awareness, and offer the operator different intervention strategies adapted to the current state of both the system and human performance. Such a system can be particularly useful in cases of information overload, when many alarms and inputs are presented within the same time window, or for junior operators during training. A comprehensive cross-data analysis was conducted, involving 47 participants and a diverse range of data sources such as smartwatch metrics, eye-tracking data, process logs, and questionnaire responses. The results provide insights into the effectiveness of the approach in aiding decision-making, decreasing perceived workload, and increasing situational awareness for the scenarios considered. Additionally, the results allow a comparison of the information-gathering styles of individual participants when using the system. These findings are particularly relevant for predicting, in real time, the overall performance of an individual participant and their capacity to successfully handle a plant upset and its associated alarms, using process and human-machine interaction logs. These predictions enable the development of more effective intervention strategies.
Abstract:In this paper we introduce TWIG (Topologically-Weighted Intelligence Generation), a novel, embedding-free paradigm for simulating the output of Knowledge Graph Embeddings (KGEs) that uses a tiny fraction of the parameters. TWIG learns weights from inputs that consist of topological features of the graph data, with no coding for latent representations of entities or edges. Our experiments on the UMLS dataset show that a single TWIG neural network can predict the results of the state-of-the-art ComplEx-N3 KGE model nearly exactly across all hyperparameter configurations. To do this it uses a total of 2590 learnable parameters, but accurately predicts the results of 1215 different hyperparameter combinations with a combined cost of 29,322,000 parameters. Based on these results, we make two claims: 1) that KGEs do not learn latent semantics, but only latent representations of structural patterns; 2) that hyperparameter choice in KGEs is a deterministic function of the KGE model and graph structure. We further hypothesise that, as TWIG can simulate KGEs without embeddings, node and edge embeddings are not needed to learn to accurately predict new facts in KGs. Finally, we formulate all of our findings under the umbrella of the "Structural Generalisation Hypothesis", which suggests that "twiggy" embedding-free / data-structure-based learning methods can allow a single neural network to simulate KGE performance, and perhaps solve the Link Prediction task, across many KGs from diverse domains and with different semantics.
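The parameter figures quoted in the TWIG abstract can be put in proportion with a few lines of arithmetic. The sketch below uses only the three numbers stated in the abstract; the function name `parameter_summary` and the per-configuration mean are illustrative additions, not quantities reported by the authors.

```python
# Figures quoted in the abstract above.
TWIG_PARAMETERS = 2_590
KGE_CONFIGURATIONS = 1_215
KGE_COMBINED_PARAMETERS = 29_322_000


def parameter_summary(twig_params, kge_params, n_configs):
    """Return (ratio, TWIG share in percent, mean KGE parameters per configuration)."""
    ratio = kge_params / twig_params                # how many times larger the KGE total is
    share_percent = 100.0 * twig_params / kge_params
    mean_per_config = kge_params / n_configs        # illustrative average, not a reported figure
    return ratio, share_percent, mean_per_config


ratio, share_percent, mean_per_config = parameter_summary(
    TWIG_PARAMETERS, KGE_COMBINED_PARAMETERS, KGE_CONFIGURATIONS
)
print(f"KGE total is about {ratio:,.0f} times the TWIG parameter count")
print(f"TWIG uses about {share_percent:.4f}% of the combined KGE parameters")
print(f"Mean KGE parameters per configuration: {mean_per_config:,.0f}")
```

Under these figures the single TWIG network is roughly four orders of magnitude smaller than the combined KGE models it simulates, which is the sense in which the abstract speaks of a tiny fraction of the parameters.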
Abstract:The difficulty of identifying the physical model of complex systems has led to the exploration of methods that do not rely on such complex modeling. Deep reinforcement learning has pioneered solving this problem by interacting with the system directly, without relying on its physical model. However, it uses a black-box learning approach, which makes it difficult to apply in real-world and safety-critical systems without explanations of the actions derived by the model. Furthermore, an open research question in deep reinforcement learning is how to focus policy learning on critical decisions within a sparse domain. This paper proposes a novel approach, BC-SRLA, for the use of deep reinforcement learning in safety-critical systems. It combines the advantages of probabilistic modeling and reinforcement learning with the added benefit of interpretability, and it works in collaboration and synchronization with conventional decision-making strategies. The BC-SRLA is activated in specific situations, such as abnormal conditions or when the system is near failure, which are identified autonomously through the fused information of the probabilistic model and reinforcement learning. Further, it is initialized with a baseline policy using policy cloning, which minimizes interactions with the environment and thereby addresses the challenges associated with using RL in safety-critical industries. The effectiveness of the BC-SRLA is demonstrated through a maintenance case study on turbofan engines, where it shows superior performance to the prior art and other baselines.
Abstract:This paper discusses the methods that we used for our submissions to the WMT 2023 Terminology Shared Task for German-to-English (DE-EN), English-to-Czech (EN-CS), and Chinese-to-English (ZH-EN) language pairs. The task aims to advance machine translation (MT) by challenging participants to develop systems that accurately translate technical terms, ultimately enhancing communication and understanding in specialised domains. To this end, we conduct experiments that utilise large language models (LLMs) for two purposes: generating synthetic bilingual terminology-based data, and post-editing translations generated by an MT model through incorporating pre-approved terms. Our system employs a four-step process: (i) using an LLM to generate bilingual synthetic data based on the provided terminology, (ii) fine-tuning a generic encoder-decoder MT model, with a mix of the terminology-based synthetic data generated in the first step and a randomly sampled portion of the original generic training data, (iii) generating translations with the fine-tuned MT model, and (iv) finally, leveraging an LLM for terminology-constrained automatic post-editing of the translations that do not include the required terms. The results demonstrate the effectiveness of our proposed approach in improving the integration of pre-approved terms into translations. The number of terms incorporated into the translations of the blind dataset increases from an average of 36.67% with the generic model to an average of 72.88% by the end of the process. In other words, successful utilisation of terms nearly doubles across the three language pairs.
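The term-incorporation figures and the selection in step (iv) can be illustrated with a small sketch. It assumes a case-insensitive substring match to decide whether a required term appears in a translation; the shared task's official scoring may differ, and the helper names (`missing_terms`, `term_success_rate`, `select_for_post_editing`) and the toy sentences are hypothetical.

```python
def missing_terms(translation, required_terms):
    """Return the required terms not found in the translation.

    Assumption: a case-insensitive substring match counts as a hit.
    """
    lowered = translation.lower()
    return [term for term in required_terms if term.lower() not in lowered]


def term_success_rate(translations, required_terms_per_sentence):
    """Percentage of required terms that appear in their translations."""
    total = 0
    found = 0
    for translation, terms in zip(translations, required_terms_per_sentence):
        total += len(terms)
        found += len(terms) - len(missing_terms(translation, terms))
    return 100.0 * found / total if total else 0.0


def select_for_post_editing(translations, required_terms_per_sentence):
    """Indices of translations missing at least one required term (input to step iv)."""
    return [
        index
        for index, (translation, terms) in enumerate(
            zip(translations, required_terms_per_sentence)
        )
        if missing_terms(translation, terms)
    ]


# Toy data, invented for illustration only.
translations = ["The valve regulates the coolant flow.", "The pump is broken."]
required = [["valve", "coolant"], ["centrifugal pump"]]

print(term_success_rate(translations, required))        # share of terms already present
print(select_for_post_editing(translations, required))  # sentences to send to the LLM

# Relative improvement reported in the abstract: 72.88 / 36.67 is just under 2.
print(round(72.88 / 36.67, 2))
```

Only the translations flagged by `select_for_post_editing` would be passed to the LLM, which keeps post-editing limited to outputs that actually lack a required term.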
Abstract:Traditional controllers have limitations: they rely on prior knowledge about the physics of the problem, require modeling of dynamics, and struggle to adapt to abnormal situations. Deep reinforcement learning has the potential to address these problems by learning optimal control policies through exploration in an environment. For safety-critical environments, however, it is impractical to explore randomly, and replacing conventional controllers with black-box models is also undesirable. Exploration is also expensive in continuous state and action spaces unless the search space is constrained. To address these challenges, we propose a specialized deep residual policy safe reinforcement learning approach with a cycle of learning, adapted for complex and continuous state-action spaces. Residual policy learning yields a hybrid control architecture in which the reinforcement learning agent acts in synchronous collaboration with the conventional controller. The cycle of learning initializes the policy from the expert trajectory and guides the exploration around it. Further, specialization through the input-output hidden Markov model helps to optimize the policy within the region of interest (such as abnormal conditions), where the reinforcement learning agent is required and activated. The proposed solution is validated on the Tennessee Eastman process control problem.
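Residual policy learning is commonly formulated as adding a learned correction to the output of an existing controller. The sketch below illustrates that idea together with the activation described in the abstract, where the agent contributes only inside the region of interest. The abstract gives no formula, so the additive form, the clipping to actuator bounds, and all names (`hybrid_action`, `agent_active`, `low`, `high`) are assumptions made for illustration.

```python
def hybrid_action(controller_action, residual_action, agent_active, low, high):
    """Combine a conventional controller's action with an RL residual.

    Assumptions (not taken from the abstract):
      * the residual is added element-wise to the controller output;
      * the result is clipped to per-dimension actuator bounds;
      * agent_active is a flag supplied by some external detector, for
        example a model that signals abnormal operating conditions.
    """
    combined = []
    for base, residual, lower, upper in zip(
        controller_action, residual_action, low, high
    ):
        # Outside the region of interest the controller acts alone.
        value = base + residual if agent_active else base
        combined.append(min(max(value, lower), upper))
    return combined


# Normal operation: the residual is ignored.
print(hybrid_action([0.5, 0.9], [0.1, 0.3], False, [0.0, 0.0], [1.0, 1.0]))

# Abnormal operation: the residual is added, then clipped to the bounds.
print(hybrid_action([0.5, 0.9], [0.1, 0.3], True, [0.0, 0.0], [1.0, 1.0]))
```

Keeping the conventional controller in the loop means the agent only has to learn a bounded correction, which constrains the search space that the abstract identifies as the main cost of exploring continuous state-action spaces.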