Ravi Iyer

AI and the Future of Digital Public Squares

Dec 13, 2024

Beth Goldberg, Diana Acosta-Navas, Michiel Bakker, Ian Beacock, Matt Botvinick, Prateek Buch, Renée DiResta, Nandika Donthi, Nathanael Fast, Ravi Iyer (+17 more)

Abstract: Two substantial technological advances have reshaped the public square in recent decades: first with the advent of the internet and second with the recent introduction of large language models (LLMs). LLMs offer opportunities for a paradigm shift towards more decentralized, participatory online spaces that can be used to facilitate deliberative dialogues at scale, but also create risks of exacerbating societal schisms. Here, we explore four applications of LLMs to improve digital public squares: collective dialogue systems, bridging systems, community moderation, and proof-of-humanity systems. Building on the input from over 70 civil society experts and technologists, we argue that LLMs both afford promising opportunities to shift the paradigm for conversations at scale and pose distinct risks for digital public squares. We lay out an agenda for future research and investments in AI that will strengthen digital public squares and safeguard against potential misuses of AI.

* 40 pages, 5 figures

Mem-Rec: Memory Efficient Recommendation System using Alternative Representation

May 15, 2023

Gopi Krishna Jha, Anthony Thomas, Nilesh Jain, Sameh Gobriel, Tajana Rosing, Ravi Iyer

Abstract: Deep learning-based recommendation systems (e.g., DLRMs) are widely used AI models that provide high-quality personalized recommendations. Training data used for modern recommendation systems commonly includes categorical features taking on tens of millions of possible distinct values. These categorical tokens are typically assigned learned vector representations that are stored in large embedding tables, on the order of 100s of GB. Storing and accessing these tables represents a substantial burden in commercial deployments. Our work proposes MEM-REC, a novel alternative representation approach for embedding tables. MEM-REC leverages bloom filters and hashing methods to encode categorical features using two cache-friendly embedding tables. The first table (token embedding) contains raw embeddings (i.e., learned vector representations), and the second table (weight embedding), which is much smaller, contains weights that scale these raw embeddings to provide better discriminative capability to each data point. We provide a detailed architecture, design, and analysis of MEM-REC, addressing trade-offs in accuracy and computation requirements in comparison with state-of-the-art techniques. We show that MEM-REC can not only maintain the recommendation quality and significantly reduce the memory footprint for commercial-scale recommendation models but can also improve the embedding latency. In particular, based on our results, MEM-REC compresses the MLPerf CriteoTB benchmark DLRM model size by 2900x and performs up to 3.4x faster embeddings while achieving the same AUC as that of the full uncompressed model.

RAPID: Enabling Fast Online Policy Learning in Dynamic Public Cloud Environments

Apr 10, 2023

Drew Penney, Bin Li, Lizhong Chen, Jaroslaw J. Sydir, Anna Drewek-Ossowicka, Ramesh Illikkal, Charlie Tai, Ravi Iyer, Andrew Herdrich

Abstract: Resource sharing between multiple workloads has become a prominent practice among cloud service providers, motivated by demand for improved resource utilization and reduced cost of ownership. Effective resource sharing, however, remains an open challenge due to the adverse effects that resource contention can have on high-priority, user-facing workloads with strict Quality of Service (QoS) requirements. Although recent approaches have demonstrated promising results, those works remain largely impractical in public cloud environments since workloads are not known in advance and may only run for a brief period, thus prohibiting offline learning and significantly hindering online learning. In this paper, we propose RAPID, a novel framework for fast, fully-online resource allocation policy learning in highly dynamic operating environments. RAPID leverages lightweight QoS predictions, enabled by domain-knowledge-inspired techniques for sample efficiency and bias reduction, to decouple control from conventional feedback sources and guide policy learning at a rate orders of magnitude faster than prior work. Evaluation on a real-world server platform with representative cloud workloads confirms that RAPID can learn stable resource allocation policies in minutes, as compared with hours in prior state-of-the-art, while improving QoS by 9.0x and increasing best-effort workload performance by 19-43%.

Streaming Encoding Algorithms for Scalable Hyperdimensional Computing

Sep 28, 2022

Anthony Thomas, Behnam Khaleghi, Gopi Krishna Jha, Sanjoy Dasgupta, Nageen Himayat, Ravi Iyer, Nilesh Jain, Tajana Rosing

Abstract: Hyperdimensional computing (HDC) is a paradigm for data representation and learning originating in computational neuroscience. HDC represents data as high-dimensional, low-precision vectors which can be used for a variety of information processing tasks like learning or recall. The mapping to high-dimensional space is a fundamental problem in HDC, and existing methods encounter scalability issues when the input data itself is high-dimensional. In this work, we explore a family of streaming encoding techniques based on hashing. We show formally that these methods enjoy comparable guarantees on performance for learning applications while being substantially more efficient than existing alternatives. We validate these results experimentally on a popular high-dimensional classification problem and show that our approach easily scales to very large data sets.

* Fixes some typos and formatting issues

Evolving Zero Cost Proxies For Neural Architecture Scoring

Sep 15, 2022

Yash Akhauri, J. Pablo Munoz, Nilesh Jain, Ravi Iyer

Abstract: Neural Architecture Search (NAS) has significantly improved productivity in the design and deployment of neural networks (NN). As NAS typically evaluates multiple models by training them partially or completely, the improved productivity comes at the cost of a significant carbon footprint. To alleviate this expensive training routine, zero-shot/cost proxies analyze an NN at initialization to generate a score, which correlates highly with its true accuracy. Zero-cost proxies are currently designed by experts conducting multiple cycles of empirical testing on possible algorithms, data-sets, and neural architecture design spaces. This lowers productivity and is an unsustainable approach towards zero-cost proxy design as deep learning use-cases diversify in nature. Additionally, existing zero-cost proxies fail to generalize across neural architecture design spaces. In this paper, we propose a genetic programming framework to automate the discovery of zero-cost proxies for neural architecture scoring. Our methodology efficiently discovers an interpretable and generalizable zero-cost proxy that gives state-of-the-art score-accuracy correlation on all data-sets and search spaces of NASBench-201 and Network Design Spaces (NDS). We believe that this research indicates a promising direction towards automatically discovering zero-cost proxies that can work across network architecture design spaces, data-sets, and tasks.

Improving Robustness and Efficiency in Active Learning with Contrastive Loss

Sep 13, 2021

Ranganath Krishnan, Nilesh Ahuja, Alok Sinha, Mahesh Subedar, Omesh Tickoo, Ravi Iyer

Abstract: This paper introduces supervised contrastive active learning (SCAL) by leveraging the contrastive loss for active learning in a supervised setting. We propose efficient query strategies in active learning to select unbiased and informative data samples of diverse feature representations. We demonstrate that our proposed method reduces sampling bias and achieves state-of-the-art accuracy and model calibration in an active learning setup, with the query computation 11x faster than CoreSet and 26x faster than Bayesian active learning by disagreement. Our method yields well-calibrated models even with imbalanced datasets. We also evaluate robustness to dataset shift and out-of-distribution data in an active learning setup and demonstrate that our proposed SCAL method outperforms high-performing compute-intensive methods by a bigger margin (average 8.9% higher AUROC for out-of-distribution detection and average 7.2% lower ECE under dataset shift).

* arXiv admin note: substantial text overlap with arXiv:2109.06321

Mitigating Sampling Bias and Improving Robustness in Active Learning

Sep 13, 2021

Ranganath Krishnan, Alok Sinha, Nilesh Ahuja, Mahesh Subedar, Omesh Tickoo, Ravi Iyer

Abstract: This paper presents simple and efficient methods to mitigate sampling bias in active learning while achieving state-of-the-art accuracy and model robustness. We introduce supervised contrastive active learning by leveraging the contrastive loss for active learning under a supervised setting. We propose an unbiased query strategy that selects informative data samples of diverse feature representations with our methods: supervised contrastive active learning (SCAL) and deep feature modeling (DFM). We empirically demonstrate that our proposed methods reduce sampling bias and achieve state-of-the-art accuracy and model calibration in an active learning setup, with the query computation 26x faster than Bayesian active learning by disagreement and 11x faster than CoreSet. The proposed SCAL method outperforms the compared methods by a large margin in robustness to dataset shift and out-of-distribution data.

* Human in the Loop Learning workshop at International Conference on Machine Learning (ICML 2021)

RHNAS: Realizable Hardware and Neural Architecture Search

Jun 17, 2021

Yash Akhauri, Adithya Niranjan, J. Pablo Muñoz, Suvadeep Banerjee, Abhijit Davare, Pasquale Cocchini, Anton A. Sorokin, Ravi Iyer, Nilesh Jain

Abstract: The rapidly evolving field of Artificial Intelligence necessitates automated approaches to co-design neural network architecture and neural accelerators to maximize system efficiency and address productivity challenges. To enable joint optimization of this vast space, there has been growing interest in differentiable NN-HW co-design. Fully differentiable co-design has reduced the resource requirements for discovering optimized NN-HW configurations, but fails to adapt to general hardware accelerator search spaces. This is due to the existence of non-synthesizable (invalid) designs in the search space of many hardware accelerators. To enable efficient and realizable co-design of configurable hardware accelerators with arbitrary neural network search spaces, we introduce RHNAS. RHNAS is a method that combines reinforcement learning for hardware optimization with differentiable neural architecture search. RHNAS discovers realizable NN-HW designs with 1.84x lower latency and 1.86x lower energy-delay product (EDP) on ImageNet and 2.81x lower latency and 3.30x lower EDP on CIFAR-10 over the default hardware accelerator design.

* 15 pages

ML-driven Malware that Targets AV Safety

Apr 24, 2020

Saurabh Jha, Shengkun Cui, Subho S. Banerjee, Timothy Tsai, Zbigniew Kalbarczyk, Ravi Iyer

Abstract: Ensuring the safety of autonomous vehicles (AVs) is critical for their mass deployment and public adoption. However, security attacks that violate safety constraints and cause accidents are a significant deterrent to achieving public trust in AVs, and that hinders a vendor's ability to deploy AVs. Creating a security hazard that results in a severe safety compromise (for example, an accident) is compelling from an attacker's perspective. In this paper, we introduce an attack model, a method to deploy the attack in the form of smart malware, and an experimental evaluation of its impact on production-grade autonomous driving software. We find that determining the time interval during which to launch the attack is critically important for causing safety hazards (such as collisions) with a high degree of success. For example, the smart malware caused 33X more forced emergency braking than random attacks did, and accidents in 52.6% of the driving simulations.

* 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
* Accepted for DSN 2020

Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks

May 09, 2018

Charles Eckert, Xiaowei Wang, Jingcheng Wang, Arun Subramaniyan, Ravi Iyer, Dennis Sylvester, David Blaauw, Reetuparna Das

Abstract: This paper presents the Neural Cache architecture, which re-purposes cache structures to transform them into massively parallel compute units capable of running inferences for Deep Neural Networks. Techniques to do in-situ arithmetic in SRAM arrays, create efficient data mapping, and reduce data movement are proposed. The Neural Cache architecture is capable of fully executing convolutional, fully connected, and pooling layers in-cache. The proposed architecture also supports quantization in-cache. Our experimental results show that the proposed architecture can improve inference latency by 18.3x over a state-of-the-art multi-core CPU (Xeon E5) and 7.7x over a server-class GPU (Titan Xp) for the Inception v3 model. Neural Cache improves inference throughput by 12.4x over CPU (2.2x over GPU), while reducing power consumption by 50% over CPU (53% over GPU).

* To appear in the 45th ACM/IEEE International Symposium on Computer Architecture (ISCA 2018)
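
Bit-serial arithmetic, named in the title, processes one bit position per step across many operands at once. The sketch below models that idea in software only: each Python integer stands for one bit-slice (the same bit position of every lane), and addition walks from the least significant slice upward with a per-lane carry. The data layout and the function names `to_slices`, `from_slices`, and `bit_serial_add` are assumptions for illustration and say nothing about the paper's SRAM circuits.

```python
def to_slices(values, nbits):
    """Transpose a list of unsigned ints into nbits bit-slices.

    Bit j of slice i holds bit i of values[j].
    """
    slices = []
    for i in range(nbits):
        s = 0
        for lane, v in enumerate(values):
            s |= ((v >> i) & 1) << lane
        slices.append(s)
    return slices


def from_slices(slices, lanes):
    """Inverse of to_slices: rebuild one integer per lane."""
    return [
        sum(((s >> lane) & 1) << i for i, s in enumerate(slices))
        for lane in range(lanes)
    ]


def bit_serial_add(a, b):
    """Add two slice-encoded vectors one bit position per step.

    Each step is a few bitwise operations applied to every lane at once;
    the result has one extra slice holding the final carry.
    """
    carry = 0
    out = []
    for sa, sb in zip(a, b):
        out.append(sa ^ sb ^ carry)                 # per-lane sum bit
        carry = (sa & sb) | (carry & (sa ^ sb))     # per-lane carry-out
    out.append(carry)
    return out


x = [3, 5, 250, 17]
y = [4, 5, 10, 200]
total = from_slices(bit_serial_add(to_slices(x, 8), to_slices(y, 8)), lanes=4)
```

In this model the number of steps depends on the operand width (8 here) rather than on the number of lanes, which is why a bit-serial layout can pay off when many narrow operands are processed in parallel.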
