Abstract: Image generation and editing have advanced considerably with the rise of large-scale diffusion models that allow user control through different modalities such as text, masks, and depth maps. However, controlled editing of videos still lags behind. Prior work in this area has focused on using 2D diffusion models to globally change the style of an existing video. Yet in many practical applications, editing localized parts of the video is critical. In this work, we propose a method to edit videos using a pre-trained inpainting image diffusion model. We systematically redesign the forward path of the model by replacing the self-attention modules with an extended version of attention modules that creates frame-level dependencies. In this way, we ensure that the edited information is consistent across all video frames regardless of the shape and position of the masked area. We qualitatively compare our results with the state of the art in terms of accuracy on several video editing tasks, such as object retargeting, object replacement, and object removal. Simulations demonstrate the superior performance of the proposed strategy.
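The abstract does not specify how the extended attention is built. One common reading is that each frame's queries attend to keys and values pooled from every frame rather than from that frame alone. The sketch below illustrates that reading only. The function names are hypothetical, and the learned query/key/value projections of a real diffusion model are replaced by the identity for brevity.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    # Scaled dot-product attention over lists of equal-length vectors.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def per_frame_attention(frames):
    # Standard self-attention: each frame only sees its own tokens.
    return [attend(f, f, f) for f in frames]

def extended_attention(frames):
    # Assumed extension: keys and values are pooled from every frame,
    # so each frame's output depends on all frames.
    all_tokens = [tok for f in frames for tok in f]
    return [attend(f, all_tokens, all_tokens) for f in frames]

# Two frames, two tokens each, feature dimension 2 (toy values).
frames = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, 1.0]]]
out = extended_attention(frames)
```

With a single frame the two functions coincide. With several frames, every output token is a weighted average over tokens from all frames, which is what couples the frames together.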
Abstract: Non-intrusive Load Monitoring (NILM) is an established technique for effective and cost-efficient electricity consumption management. The method is used to estimate appliance-level power consumption from aggregated power measurements. This paper presents a hybrid learning approach for disaggregating low-frequency power data, consisting of a convolutional neural network (CNN) and a bidirectional long short-term memory (BILSTM) network with an integrated attention mechanism. While prior research has mainly focused on high-frequency data disaggregation, our study takes a distinct direction by concentrating on low-frequency data. The proposed hybrid CNN-BILSTM model is adept at extracting both temporal (time-related) and spatial (location-related) features, allowing it to precisely identify energy consumption patterns at the appliance level. This accuracy is further enhanced by the attention mechanism, which helps the model pinpoint crucial parts of the data for more precise event detection and load disaggregation. We conduct simulations using the existing low-frequency REDD dataset to assess our model's performance. The results demonstrate that our proposed approach outperforms existing methods in terms of accuracy and computation time.
Abstract: A key question in the problem of 3D reconstruction is how to train a machine or a robot to model 3D objects. Many tasks, such as navigation in real-time systems like autonomous vehicles, depend directly on this problem. These systems usually have limited computational power. Despite considerable progress in 3D reconstruction systems in recent years, applying them to real-time systems such as navigation systems in autonomous vehicles is still challenging due to the high complexity and computational demand of the existing methods. This study addresses current problems in reconstructing objects displayed in a single-view image in a faster (real-time) fashion. To this end, a simple yet powerful deep neural framework is developed. The proposed framework consists of two components: the feature extractor module and the 3D generator module. We use a point cloud representation for the output of our reconstruction module. The ShapeNet dataset is used to compare the method with existing results in terms of computation time and accuracy. Simulations demonstrate the superior performance of the proposed method. Index Terms: Real-time 3D reconstruction, single-view reconstruction, supervised learning, deep neural network.
Abstract: A novel semi-analytical method is proposed to develop the pseudo-rigid-body (PRB) model of robots made of highly flexible members (HFMs), such as flexures and continuum robots, with no limit on the degrees of freedom of the PRB model. The proposed method has a simple formulation yet high precision. Furthermore, it can describe HFMs with variable curvature and stiffness along their length. The method offers a semi-analytical solution for the highly coupled nonlinear constrained optimization problem of PRB modeling and can be extended to variable-length robots composed of HFMs, such as catheter and concentric tube robots. We also show that this method can obtain a PRB model of uniformly stiff HFMs with only three parameters. The versatility of the method is investigated in various applications of HFMs in continuum robots. Simulations demonstrate a substantial improvement in the precision of the PRB model in general and a reduction in the complexity of the formulation.
Abstract: In this paper, a new numerical method to solve the forward kinematics (FK) of a parallel manipulator with a three-limb spherical-prismatic-revolute (3SPR) structure is presented. Unlike existing numerical approaches, which rely on computing the manipulator's Jacobian matrix and its inverse at each iteration, the proposed algorithm requires far fewer computations to estimate the FK parameters. A cost function is introduced that measures the difference between the estimates and the actual FK values. At each iteration, the problem is decomposed into two steps. First, the platform orientation estimates are obtained from the heave estimates. Then, the heave estimates are updated by moving in the gradient direction of the proposed cost function. To validate the performance of the proposed algorithm, it is compared against a Jacobian-based (JB) approach for a 3SPR parallel manipulator.
Abstract: In this paper, we study the global convergence of model-based and model-free policy gradient descent and natural policy gradient descent algorithms for linear quadratic deep structured teams. In such systems, agents are partitioned into a few sub-populations wherein the agents in each sub-population are coupled in the dynamics and cost function through a set of linear regressions of the states and actions of all agents. Every agent observes its local state and the linear regressions of states, called deep states. For a sufficiently small risk factor and/or sufficiently large population, we prove that model-based policy gradient methods globally converge to the optimal solution. Given an arbitrary number of agents, we develop model-free policy gradient and natural policy gradient algorithms for the special case of a risk-neutral cost function. The proposed algorithms are scalable with respect to the number of agents because the dimension of their policy space is independent of the number of agents in each sub-population. Simulations are provided to verify the theoretical results.
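To illustrate what global convergence of model-based policy gradient descent means, the sketch below uses a deliberately simplified setting that is not the deep structured team model of the abstract: a single-agent, scalar, risk-neutral, infinite-horizon LQR problem with dynamics x_{t+1} = a x_t + b u_t, policy u_t = -k x_t, and stage cost q x_t^2 + r u_t^2. All function names, parameter values, and the step size are assumptions chosen for illustration. Gradient descent on the gain k, using the exact (model-based) gradient of the closed-form cost, is compared with the gain obtained from the Riccati recursion.

```python
def lqr_cost(k, a, b, q, r, x0=1.0):
    # Closed-form infinite-horizon cost of the linear policy u = -k x.
    c = a - b * k  # closed-loop pole
    if abs(c) >= 1.0:
        raise ValueError("gain is not stabilizing")
    return (q + r * k * k) * x0 * x0 / (1.0 - c * c)

def lqr_cost_gradient(k, a, b, q, r, x0=1.0):
    # Exact derivative of lqr_cost with respect to k (model-based gradient).
    c = a - b * k
    s = 1.0 - c * c
    return x0 * x0 * (2.0 * r * k / s - 2.0 * b * c * (q + r * k * k) / (s * s))

def policy_gradient_descent(a, b, q, r, k0=0.0, step=0.01, iters=5000):
    # Plain gradient descent on the policy parameter k.
    # Assumes k0 is stabilizing and the step is small enough to stay stabilizing.
    k = k0
    for _ in range(iters):
        k -= step * lqr_cost_gradient(k, a, b, q, r)
    return k

def riccati_gain(a, b, q, r, iters=1000):
    # Reference solution: iterate the scalar discrete-time Riccati recursion.
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)

# Illustrative parameters (hypothetical, not from the paper).
k_pg = policy_gradient_descent(a=0.9, b=0.5, q=1.0, r=1.0)
k_star = riccati_gain(a=0.9, b=0.5, q=1.0, r=1.0)
```

In this toy problem the two gains agree to within numerical precision, even though the cost is not convex in k in general. This is the kind of behavior that global convergence results for policy gradient methods formalize.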
Abstract: In this paper, structural controllability of a leader-follower multi-agent system with multiple leaders is studied from a graph-theoretic point of view. The problem of preserving structural controllability under simultaneous failures in both the communication links and the agents is investigated. The effects of the loss of agents and communication links on the controllability of an information flow graph have been studied previously. In this work, the corresponding results are used to introduce some useful indices and importance measures that help characterize and quantify the role of individual links and agents in the controllability of the overall network. Existing results are then extended by considering the effects of losses in both links and agents at the same time. To this end, the concepts of joint (r,s)-controllability and joint t-controllability are introduced as quantitative measures of reliability for a multi-agent system, and their important properties are investigated. Lastly, the class of jointly critical digraphs is introduced, and it is stated that if a digraph is jointly critical, then joint t-controllability is a necessary and sufficient condition for remaining controllable following the failure of any set of links and agents with cardinality less than t. Various examples are used throughout the paper to elaborate on the analytical findings.
Abstract: In this work, the ability to distinguish digraphs from the output response of some observing agents in a multi-agent network under the agreement protocol is studied. Given a fixed observation point, it is desired to find sufficient graphical conditions under which the failure of a set of edges in the network information flow digraph is distinguishable from that of another set. When the latter is empty, this corresponds to the detectability of the former link set given the response of the observing agent. In developing the results, a powerful extension of the all-minors matrix tree theorem in algebraic graph theory is proved, which relates the minors of the transformed Laplacian of a directed graph to the number and length of the shortest paths between its vertices. The results reveal an intricate relationship between the ability to distinguish the responses of a healthy and a faulty multi-agent network and the inter-nodal paths in their information flow digraphs. The results have direct implications for the operation and design of multi-agent systems subject to multiple link losses. Simulations and examples are presented to illustrate the analytic findings.
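For background, the classical matrix tree theorem for digraphs (not the extension proved in the paper) states that the number of spanning arborescences rooted at a vertex equals the determinant of the Laplacian with that vertex's row and column deleted. The sketch below assumes the in-degree Laplacian convention L = D_in - A, with edges pointing away from the root. The function names and the example digraph are hypothetical. Exact rational arithmetic keeps the determinant free of rounding error.

```python
from fractions import Fraction

def determinant(matrix):
    # Exact determinant by Gaussian elimination over the rationals.
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return int(det)

def in_degree_laplacian(n, edges):
    # L = D_in - A, where A[u][v] counts edges u -> v.
    lap = [[0] * n for _ in range(n)]
    for u, v in edges:
        lap[v][v] += 1
        lap[u][v] -= 1
    return lap

def count_arborescences(n, edges, root):
    # Delete the root's row and column, then take the determinant.
    lap = in_degree_laplacian(n, edges)
    keep = [i for i in range(n) if i != root]
    minor = [[lap[i][j] for j in keep] for i in keep]
    return determinant(minor)

# Toy information flow digraph on vertices 0, 1, 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 1)]
healthy = count_arborescences(3, edges, root=0)
# Same digraph after the link 0 -> 2 fails.
faulty = count_arborescences(3, [e for e in edges if e != (0, 2)], root=0)
```

Here the healthy digraph has three spanning arborescences rooted at vertex 0 and the faulty one has a single arborescence. This gives a small example of how a Laplacian minor changes when a link is lost.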