IBM
Abstract: Realizing the vision of using AI agents to automate critical IT tasks depends on the ability to measure and understand the effectiveness of proposed solutions. We introduce ITBench, a framework that offers a systematic methodology for benchmarking AI agents on real-world IT automation tasks. Our initial release targets three key areas: Site Reliability Engineering (SRE), Compliance and Security Operations (CISO), and Financial Operations (FinOps). The design enables AI researchers to understand the challenges and opportunities of AI agents for IT automation through push-button workflows and interpretable metrics. ITBench includes an initial set of 94 real-world scenarios, which can be easily extended by community contributions. Our results show that agents powered by state-of-the-art models resolve only 13.8% of SRE scenarios, 25.2% of CISO scenarios, and 0% of FinOps scenarios. We expect ITBench to be a key enabler of AI-driven IT automation that is correct, safe, and fast.
Abstract: Collecting a sufficient amount of data to represent various acoustic environmental attributes is a critical problem for distributed acoustic machine learning. Several audio data augmentation techniques have been introduced to address this problem, but they tend to be limited to simple manipulation of existing data and are insufficient to cover the variability of the environments. We propose a method that extends a technique previously used for transferring acoustic style textures between audio data. The method transfers audio signatures between environments for distributed acoustic data augmentation. This paper devises metrics to evaluate the generated acoustic data, based on classification accuracy and content preservation. A series of experiments was conducted using the UrbanSound8K dataset, and the results show that the proposed method generates better audio data with transferred environmental features while preserving content features.
Abstract: Creativity, a process that generates novel and valuable ideas, involves increased association between task-positive (control) and task-negative (default) networks in the brain. Inspired by this seminal finding, in this study we propose a creative decoder that directly modulates the neuronal activation pattern while sampling from the learned latent space. The proposed approach is fully unsupervised and can be used off the shelf. Our experiments on three different image datasets (MNIST, FMNIST, CELEBA) reveal that the co-activation between task-positive and task-negative neurons during decoding in a deep neural net enables the generation of novel artifacts. We further identify sufficient conditions on several novelty metrics for measuring the creativity of generated samples.