Abstract: The growing threats of uncertainties, anomalies, and cyberattacks on power grids are driving a critical need to advance situational awareness, which allows system operators to form a complete and accurate picture of the present and future state of the system. Simulation and estimation are foundational tools in this process. However, existing tools lack the robustness and efficiency required to achieve the level of situational awareness needed for the ever-evolving threat landscape. Industry-standard (steady-state) simulators are not robust to blackouts, often producing non-converging or non-actionable results. Estimation tools lack robustness to anomalous data, returning erroneous system states. Efficiency is the other major concern, as nonlinearities and scalability issues make large systems slow to converge. This thesis addresses these robustness and efficiency gaps through a twofold contribution. We first address the inherent limitations of the existing physics-based and data-driven worlds, and then move beyond conventional algorithmic design toward a new paradigm -- Physics-ML Synergy -- which integrates the strengths of the two worlds. Our approaches are built on a circuit formulation that provides a unified framework applicable to both transmission and distribution. Sparse optimization acts as the key enabler that makes these tools intrinsically robust and immune to random threats, pinpointing the dominant sources of (random) blackouts and data errors. Further, we explore sparsity-exploiting optimizations to develop lightweight ML models whose prediction and detection capabilities complement physics-based tools, and whose lightweight designs advance generalization and scalability. Finally, Physics-ML Synergy further strengthens robustness and efficiency against targeted cyberthreats by interconnecting our physics-based tools with lightweight ML.
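To make the role of sparse optimization concrete, the following is a minimal sketch (our own illustration, not the thesis implementation) of how an L1 penalty on a residual vector isolates a few gross errors in an otherwise well-behaved linear measurement model; the model `z = Hx + n + s`, the matrix sizes, and the penalty weight `lam` are all hypothetical choices for illustration.

```python
# Minimal sketch: L1-regularized estimation isolating sparse anomalies.
# Hypothetical example, not the thesis code; sizes/values are illustrative.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n_meas, n_state = 40, 10
H = rng.standard_normal((n_meas, n_state))   # measurement model
x_true = rng.standard_normal(n_state)        # true system state
noise = 0.01 * rng.standard_normal(n_meas)   # small Gaussian noise
s_true = np.zeros(n_meas)
s_true[[3, 17]] = 5.0                        # two gross (bad-data) errors
z = H @ x_true + noise + s_true              # observed measurements

x = cp.Variable(n_state)
s = cp.Variable(n_meas)                      # sparse anomaly vector
lam = 0.1
cost = cp.sum_squares(z - H @ x - s) + lam * cp.norm1(s)
cp.Problem(cp.Minimize(cost)).solve()

print("suspected bad measurements:", np.where(np.abs(s.value) > 1.0)[0])
```

The L1 term drives most entries of `s` to exactly zero, so the surviving nonzeros point directly at the dominant error sources, which is the intuition behind the robustness claims above.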
Abstract: Recent years have seen a rich literature of data-driven approaches designed for power grid applications. However, insufficient consideration of domain knowledge can pose a high risk to the practicality of these methods. Specifically, ignoring grid-specific spatiotemporal patterns (in load, generation, topology, etc.) can lead to infeasible, unrealizable, or completely meaningless predictions on new inputs. To address this concern, this paper investigates real-world operational data to provide insights into power grid behavioral patterns, including the time-varying topology, load, and generation, as well as the spatial differences (in peak hours and diverse styles) between individual loads and generations. Based on these observations, we then evaluate the generalization risks in some existing ML works that are caused by ignoring these grid-specific patterns in model design and training.
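One simple way to surface the generalization risk described above (a hypothetical evaluation protocol, not the paper's experiments) is to compare a random train/test split against a chronological split on load data with slow temporal drift: a model that looks accurate under random splitting can degrade sharply on unseen future periods. The synthetic load signal and features below are illustrative only.

```python
# Sketch: random vs. chronological splits expose generalization gaps
# on time-varying load data. Synthetic data; illustrative only.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
t = np.arange(2000)
# Load with a daily cycle plus slow seasonal drift the model cannot capture.
load = 1.0 + 0.3 * np.sin(2 * np.pi * t / 24) + 0.0005 * t
X = np.column_stack([np.sin(2 * np.pi * t / 24), np.cos(2 * np.pi * t / 24)])
y = load + 0.01 * rng.standard_normal(t.size)

# Random split: past and future interleaved (optimistic estimate).
idx = rng.permutation(t.size)
tr, te = idx[:1500], idx[1500:]
mse_rand = mean_squared_error(y[te], Ridge().fit(X[tr], y[tr]).predict(X[te]))

# Chronological split: train on past, test on future (realistic estimate).
mse_time = mean_squared_error(
    y[1500:], Ridge().fit(X[:1500], y[:1500]).predict(X[1500:]))

print(f"random-split MSE: {mse_rand:.5f}")
print(f"time-split MSE:   {mse_time:.5f}")  # typically much larger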
Abstract: When applied to a real-world safety-critical system like the power grid, general machine learning methods suffer from expensive training, non-physical solutions, and limited interpretability. To address these challenges for power grids, many recent works have explored the inclusion of grid physics (i.e., domain expertise) into their method design, primarily by including system constraints and technical limits, reducing the search space, and defining meaningful features in the latent space. Yet there is no general methodology to evaluate the practicality of these approaches in power grid tasks, and limitations remain regarding scalability, generalization, interpretability, etc. This work formalizes a new concept of physical interpretability, which assesses how an ML model makes predictions in a physically meaningful way, and introduces an evaluation methodology that identifies a set of attributes that a practical method should satisfy. Inspired by these evaluation attributes, the paper further develops a novel contingency analysis warm starter for the MadIoT cyberattack, based on a conditional Gaussian random field. This method serves as an instance of an ML model that can incorporate diverse domain knowledge and improve on the identified attributes. Experiments validate that the warm starter significantly boosts the efficiency of contingency analysis for the MadIoT attack, even with shallow NN architectures.
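To make the warm-starter idea concrete, here is a minimal conditional-Gaussian sketch (our own illustrative stand-in, not the paper's conditional Gaussian random field): given historical pairs of pre- and post-contingency states, the conditional mean of the post-contingency state given a new pre-contingency state serves as the initial guess handed to the nonlinear solver. The linear data-generating map `A`, the dimensions, and the helper name `warm_start` are all assumptions for illustration.

```python
# Sketch: conditional-Gaussian warm start for contingency analysis.
# Illustrative stand-in for the paper's conditional Gaussian random field.
import numpy as np

rng = np.random.default_rng(2)
n_hist, n_pre, n_post = 500, 8, 8

# Hypothetical historical data: post-contingency states correlated with
# pre-contingency states through a fixed linear map plus noise.
A = rng.standard_normal((n_post, n_pre)) * 0.3
X_pre = rng.standard_normal((n_hist, n_pre))
X_post = X_pre @ A.T + 0.05 * rng.standard_normal((n_hist, n_post))

# Fit a joint Gaussian over (pre, post) and condition on the new pre state:
#   E[post | pre] = mu_post + S_post,pre @ inv(S_pre,pre) @ (pre - mu_pre)
Z = np.hstack([X_pre, X_post])
mu = Z.mean(axis=0)
S = np.cov(Z, rowvar=False)
S_pp = S[:n_pre, :n_pre]
S_qp = S[n_pre:, :n_pre]

def warm_start(pre_state):
    """Conditional mean of the post-contingency state (solver initial guess)."""
    return mu[n_pre:] + S_qp @ np.linalg.solve(S_pp, pre_state - mu[:n_pre])

x0 = warm_start(rng.standard_normal(n_pre))
print("warm-start guess:", np.round(x0, 3))
```

Starting the post-contingency power-flow iteration from this conditional mean, rather than from a flat start, is what shrinks the iteration count and yields the reported efficiency gains.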
Abstract: An accurate and up-to-date grid topology is critical for situational awareness. However, it is non-trivial to obtain due to inaccurate switch status data caused by physical damage, communication errors, or cyberattacks. This paper formulates a circuit-theoretic node-breaker (NB) model to create a generalized state estimation (GSE) method that is scalable and easily solvable for a practical grid with RTU and PMU measurements. We demonstrate that all switching devices (with discrete statuses) and meters (with continuous measurements) can be replaced with linear circuit models without relaxation, so that the entire grid is mapped to an expanded linear circuit. Using this grid model, state estimation is formulated as a Linear Programming (LP) problem whose solution includes a sparse vector of noise terms that separately localizes suspicious switch statuses and bad data. The proposed method provides the benefits of convexity and reliable state estimation with intrinsic robustness against wrong switch statuses and bad measurement data.
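The LP structure can be sketched as follows (a simplified, hypothetical instance: a generic linear measurement model with the L1 objective recast via the standard split `s = s_pos - s_neg`, not the paper's full node-breaker circuit model):

```python
# Sketch: state estimation as an LP whose solution contains a sparse
# error vector localizing bad data. Simplified linear model, not the
# paper's node-breaker circuit formulation.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
n_meas, n_state = 30, 6
H = rng.standard_normal((n_meas, n_state))
x_true = rng.standard_normal(n_state)
z = H @ x_true
z[11] += 4.0                                 # one gross measurement error

# min ||s||_1  s.t.  H x + s = z,  with s = s_pos - s_neg and s_pos, s_neg >= 0.
c = np.concatenate([np.zeros(n_state), np.ones(2 * n_meas)])
A_eq = np.hstack([H, np.eye(n_meas), -np.eye(n_meas)])
bounds = [(None, None)] * n_state + [(0, None)] * (2 * n_meas)
res = linprog(c, A_eq=A_eq, b_eq=z, bounds=bounds)

s = res.x[n_state:n_state + n_meas] - res.x[n_state + n_meas:]
print("suspected bad data at index:", np.where(np.abs(s) > 1e-6)[0])
```

Because the objective is linear and the constraints are linear circuit equations, the problem is convex with a global optimum, and the minimizing `s` is sparse: its few nonzero entries flag exactly the suspect measurements (and, in the full NB model, the suspect switch statuses).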
Abstract: The proliferation of grid resources on the distribution network, along with the inability to forecast them accurately, will render the existing methodology of grid operation and control untenable in the future. Instead, a more distributed yet coordinated approach for grid operation and control will emerge that models and analyzes the grid with a larger footprint and deeper hierarchy to unify control of disparate T&D grid resources under a common framework. Such an approach will require AC state estimation (ACSE) of joint T&D networks. Today, no practical method for realizing combined T&D ACSE exists. This paper addresses that gap from a circuit-theoretic perspective by realizing a combined T&D ACSE solution methodology that is fast, convex, and robust against bad data. To address the daunting challenges of problem size (over a million variables) and data privacy, the approach is distributed in both memory and computing resources. To ensure timely convergence, the approach constructs a distributed circuit model for combined T&D networks and utilizes node-tearing techniques for efficient parallelism. To demonstrate the efficacy of the approach, the combined T&D ACSE algorithm is run on large test networks comprising multiple T&D feeders. The results demonstrate the accuracy of the estimates in terms of root-mean-square error and the scalability of the algorithm in terms of wall-clock time.
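As a toy stand-in for the node-tearing decomposition (plain block Gauss-Seidel on a partitioned linear system, not the paper's algorithm), the sketch below lets each area solve its own subproblem and exchange only the coupling terms between iterations; the SPD system `K v = b` is a generic placeholder for, e.g., state-estimation normal equations.

```python
# Sketch: two-area decomposition via block Gauss-Seidel, a toy stand-in
# for node tearing. Each area solves its own block locally, and only the
# coupling (boundary-bus) terms are exchanged between iterations.
import numpy as np

rng = np.random.default_rng(4)
n1, n2 = 6, 6
# Symmetric positive definite system K v = b (placeholder for SE equations).
M = rng.standard_normal((n1 + n2, n1 + n2))
K = M @ M.T + (n1 + n2) * np.eye(n1 + n2)
b = rng.standard_normal(n1 + n2)

K11, K12 = K[:n1, :n1], K[:n1, n1:]
K21, K22 = K[n1:, :n1], K[n1:, n1:]
v1, v2 = np.zeros(n1), np.zeros(n2)

for it in range(100):
    v1_new = np.linalg.solve(K11, b[:n1] - K12 @ v2)      # area 1 local solve
    v2_new = np.linalg.solve(K22, b[n1:] - K21 @ v1_new)  # area 2 local solve
    converged = max(np.abs(v1_new - v1).max(),
                    np.abs(v2_new - v2).max()) < 1e-10
    v1, v2 = v1_new, v2_new
    if converged:
        break

print("iterations:", it + 1)
print("max error vs. direct solve:",
      np.abs(np.concatenate([v1, v2]) - np.linalg.solve(K, b)).max())
```

Each area never sees the other's internal data, only the boundary coupling, which is the same property that lets the distributed T&D formulation preserve data privacy while still converging to the joint solution.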
Abstract: Given sensor readings over time from a power grid, how can we accurately detect when an anomaly occurs? A key part of achieving this goal is to use the network of power grid sensors to quickly detect, in real time, when any unusual events, whether natural faults or malicious attacks, occur on the power grid. Existing bad-data detectors in the industry lack the sophistication to robustly detect broad types of anomalies, especially those due to emerging cyberattacks, since they operate on a single measurement snapshot of the grid at a time. New ML methods are more widely applicable but generally do not consider the impact of topology change on sensor measurements, and thus cannot accommodate the regular topology adjustments present in historical data. Hence, we propose DYNWATCH, a domain-knowledge-based, topology-aware algorithm for anomaly detection using sensors placed on a dynamic grid. Our approach is accurate, outperforming existing approaches by 20% or more (F-measure) in experiments; and fast, running in less than 1.7 ms on average per time tick per sensor on a 60K+ branch case using a laptop computer, and scaling linearly in the size of the graph.
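A minimal sketch of the topology-aware weighting idea follows (our illustration; DYNWATCH's actual graph distance and detector differ in detail): weight each historical snapshot by the similarity of its topology to the current one, then score the current reading against the topology-weighted historical statistics. The edge-set distance, the exponential weighting, and the z-score detector are all simplifying assumptions.

```python
# Sketch of topology-aware anomaly scoring: historical snapshots taken
# under topologies similar to the current one get higher weight.
# Illustration only; the actual DYNWATCH distance/detector differ.
import numpy as np

def topo_distance(edges_a, edges_b):
    """Toy graph distance: size of the symmetric difference of edge sets."""
    return len(edges_a ^ edges_b)

def anomaly_score(reading, history, current_edges, bandwidth=1.0):
    """Weighted z-score of `reading` against topology-weighted history.

    history: list of (edge_set, sensor_reading) pairs.
    """
    d = np.array([topo_distance(e, current_edges) for e, _ in history])
    w = np.exp(-d / bandwidth)               # closer topology -> more weight
    vals = np.array([r for _, r in history])
    mu = np.average(vals, weights=w)
    var = np.average((vals - mu) ** 2, weights=w)
    return abs(reading - mu) / np.sqrt(var + 1e-12)

base = {(1, 2), (2, 3), (3, 4), (4, 1)}
hist = [(base, 1.0 + 0.01 * k) for k in range(10)]              # normal ring
hist += [(base - {(3, 4)}, 1.3 + 0.01 * k) for k in range(10)]  # line out
print("normal reading score:   ", round(anomaly_score(1.05, hist, base), 2))
print("anomalous reading score:", round(anomaly_score(1.60, hist, base), 2))
```

Down-weighting snapshots recorded under different topologies keeps routine switching events from being mistaken for anomalies, which is what snapshot-at-a-time bad-data detectors and topology-agnostic ML baselines get wrong.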