Abstract:Relation extraction is a Natural Language Processing task that aims to extract relationships from textual data, and it is a critical step in information extraction. Due to its wide applicability, research in relation extraction has rapidly moved to highly advanced neural networks. Despite their computational superiority, modern relation extractors fail to handle complicated extraction scenarios. A comprehensive performance analysis of state-of-the-art relation extractors that compiles these challenges has been missing from the literature, and this paper aims to bridge this gap. The goal is to investigate the data-centric characteristics that impede neural relation extraction. Based on extensive experiments conducted using 15 state-of-the-art relation extraction algorithms, ranging from recurrent architectures to large language models, and seven large-scale datasets, this research suggests that modern relation extractors are not robust to complex data and relation characteristics. It highlights pivotal issues such as contextual ambiguity, correlating relations, long-tail data, and fine-grained relation distributions. In addition, it outlines future directions for alleviating these issues, serving as a resource for both novice and advanced researchers. Efficient handling of the challenges described can have significant implications for the field of information extraction, which is a critical part of popular systems such as search engines and chatbots. Data and relevant code can be found at https://github.com/anushkasw/MaxRE.
Abstract:A Bill of Materials (BoM) is a list of all components on a printed circuit board (PCB). Since BoMs are useful for hardware assurance, automatic BoM extraction (AutoBoM) is of great interest to the government and the electronics industry. To achieve a high-accuracy AutoBoM process, domain knowledge of PCB text and logos must be utilized. In this study, we discuss the challenges associated with automatic PCB marking extraction and propose 1) a plan for collecting salient PCB marking data, and 2) a framework for incorporating this data into automatic PCB assurance. Given the proposed dataset plan and framework, we detail future work, implications, and open research possibilities.
Abstract:Hardware assurance of electronics is a challenging task and is of great interest to the government and the electronics industry. Physical inspection-based methods such as reverse engineering (RE) and Trojan scanning (TS) play an important role in hardware assurance, so there is a growing demand for automation in RE and TS. Many state-of-the-art physical inspection methods incorporate an iterative imaging and delayering workflow. In practice, uniform delayering can be challenging if the thickness of the initial layer of material is non-uniform. Moreover, this non-uniformity can recur at any stage during delayering and must be corrected. It is therefore critical to evaluate the thickness of the layers to be removed in real time. Our proposed method uses electron beam voltage imaging, image processing, and Monte Carlo simulation to measure the thickness of the remaining silicon and guide a uniform delayering process.
Abstract:Artificial intelligence (AI) and machine learning (ML) techniques have been increasingly used in several fields to improve performance and the level of automation. In recent years, this use has increased exponentially due to advances in high-performance computing and the ever-increasing size of data. One such field is hardware design, specifically the design of digital and analog integrated circuits (ICs), where AI/ML techniques have been extensively used to address ever-increasing design complexity, aggressive time-to-market, and the growing number of ubiquitous interconnected devices (IoT). However, the security concerns and issues related to IC design have been largely overlooked. In this paper, we summarize the state of the art in AI/ML for circuit design and optimization, security and engineering challenges, research in security-aware CAD/EDA, and future research directions and needs for using AI/ML for security-aware circuit design.
Abstract:The continued outsourcing of printed circuit board (PCB) fabrication to overseas venues necessitates increased hardware assurance capabilities. Toward this end, several automated optical inspection (AOI) techniques have been proposed that explore various aspects of PCB images acquired using digital cameras. In this work, we review state-of-the-art AOI techniques and observe a strong, rapid trend toward machine learning (ML) solutions. These require significant amounts of labeled ground truth data, which is lacking in the publicly available PCB data space. We propose the FICS PCB Image Collection (FPIC) dataset to address this shortage of large-volume, diverse, semantic annotations. Additionally, this work covers the potential increase in hardware security capabilities and the methodological distinctions observed during data collection.