Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jeff Zhang

Harvard University

Repurposing the scientific literature with vision-language models

Feb 26, 2025

Anton Alyakin, Jaden Stryker, Daniel Alexander Alber, Karl L. Sangwon, Brandon Duderstadt, Akshay Save, David Kurland, Spencer Frome, Shrutika Singh, Jeff Zhang(+21 more)

Abstract:Research in AI for Science often focuses on using AI technologies to augment components of the scientific process, or in some cases, the entire scientific method; how about AI for scientific publications? Peer-reviewed journals are foundational repositories of specialized knowledge, written in discipline-specific language that differs from general Internet content used to train most large language models (LLMs) and vision-language models (VLMs). We hypothesized that by combining a family of scientific journals with generative AI models, we could invent novel tools for scientific communication, education, and clinical care. We converted 23,000 articles from Neurosurgery Publications into a multimodal database - NeuroPubs - of 134 million words and 78,000 image-caption pairs to develop six datasets for building AI models. We showed that the content of NeuroPubs uniquely represents neurosurgery-specific clinical contexts compared with broader datasets and PubMed. For publishing, we employed generalist VLMs to automatically generate graphical abstracts from articles. Editorial board members rated 70% of these as ready for publication without further edits. For education, we generated 89,587 test questions in the style of the ABNS written board exam, which trainee and faculty neurosurgeons found indistinguishable from genuine examples 54% of the time. We used these questions alongside a curriculum learning process to track knowledge acquisition while training our 34 billion-parameter VLM (CNS-Obsidian). In a blinded, randomized controlled trial, we demonstrated the non-inferiority of CNS-Obsidian to GPT-4o (p = 0.1154) as a diagnostic copilot for a neurosurgical service. Our findings lay a novel foundation for AI with Science and establish a framework to elevate scientific communication using state-of-the-art generative artificial intelligence while maintaining rigorous quality standards.

Via

Access Paper or Ask Questions

SimPhony: A Device-Circuit-Architecture Cross-Layer Modeling and Simulation Framework for Heterogeneous Electronic-Photonic AI System

Nov 20, 2024

Ziang Yin, Meng Zhang, Amir Begovic, Rena Huang, Jeff Zhang, Jiaqi Gu

Figure 1 for SimPhony: A Device-Circuit-Architecture Cross-Layer Modeling and Simulation Framework for Heterogeneous Electronic-Photonic AI System

Figure 2 for SimPhony: A Device-Circuit-Architecture Cross-Layer Modeling and Simulation Framework for Heterogeneous Electronic-Photonic AI System

Figure 3 for SimPhony: A Device-Circuit-Architecture Cross-Layer Modeling and Simulation Framework for Heterogeneous Electronic-Photonic AI System

Figure 4 for SimPhony: A Device-Circuit-Architecture Cross-Layer Modeling and Simulation Framework for Heterogeneous Electronic-Photonic AI System

Abstract:Electronic-photonic integrated circuits (EPICs) offer transformative potential for next-generation high-performance AI but require interdisciplinary advances across devices, circuits, architecture, and design automation. The complexity of hybrid systems makes it challenging even for domain experts to understand distinct behaviors and interactions across design stack. The lack of a flexible, accurate, fast, and easy-to-use EPIC AI system simulation framework significantly limits the exploration of hardware innovations and system evaluations on common benchmarks. To address this gap, we propose SimPhony, a cross-layer modeling and simulation framework for heterogeneous electronic-photonic AI systems. SimPhony offers a platform that enables (1) generic, extensible hardware topology representation that supports heterogeneous multi-core architectures with diverse photonic tensor core designs; (2) optics-specific dataflow modeling with unique multi-dimensional parallelism and reuse beyond spatial/temporal dimensions; (3) data-aware energy modeling with realistic device responses, layout-aware area estimation, link budget analysis, and bandwidth-adaptive memory modeling; and (4) seamless integration with model training framework for hardware/software co-simulation. By providing a unified, versatile, and high-fidelity simulation platform, SimPhony enables researchers to innovate and evaluate EPIC AI hardware across multiple domains, facilitating the next leap in emerging AI hardware. We open-source our codes at https://github.com/ScopeX-ASU/SimPhony

* 7-page

Via

Access Paper or Ask Questions

Multi-Dimensional Reconfigurable, Physically Composable Hybrid Diffractive Optical Neural Network

Nov 08, 2024

Ziang Yin, Yu Yao, Jeff Zhang, Jiaqi Gu

$Figure 1 for Multi-Dimensional Reconfigurable, Physically Composable Hybrid Diffractive Optical Neural Network$

$Figure 2 for Multi-Dimensional Reconfigurable, Physically Composable Hybrid Diffractive Optical Neural Network$

$Figure 3 for Multi-Dimensional Reconfigurable, Physically Composable Hybrid Diffractive Optical Neural Network$

$Figure 4 for Multi-Dimensional Reconfigurable, Physically Composable Hybrid Diffractive Optical Neural Network$

Abstract:Diffractive optical neural networks (DONNs), leveraging free-space light wave propagation for ultra-parallel, high-efficiency computing, have emerged as promising artificial intelligence (AI) accelerators. However, their inherent lack of reconfigurability due to fixed optical structures post-fabrication hinders practical deployment in the face of dynamic AI workloads and evolving applications. To overcome this challenge, we introduce, for the first time, a multi-dimensional reconfigurable hybrid diffractive ONN system (MDR-HDONN), a physically composable architecture that unlocks a new degree of freedom and unprecedented versatility in DONNs. By leveraging full-system learnability, MDR-HDONN repurposes fixed fabricated optical hardware, achieving exponentially expanded functionality and superior task adaptability through the differentiable learning of system variables. Furthermore, MDR-HDONN adopts a hybrid optical/photonic design, combining the reconfigurability of integrated photonics with the ultra-parallelism of free-space diffractive systems. Extensive evaluations demonstrate that MDR-HDONN has digital-comparable accuracy on various task adaptations with 74x faster speed and 194x lower energy. Compared to prior DONNs, MDR-HDONN shows exponentially larger functional space with 5x faster training speed, paving the way for a new paradigm of versatile, composable, hybrid optical/photonic AI computing. We will open-source our codes.

* 7 pages

Via

Access Paper or Ask Questions

SCATTER: Algorithm-Circuit Co-Sparse Photonic Accelerator with Thermal-Tolerant, Power-Efficient In-situ Light Redistribution

Jul 07, 2024

Ziang Yin, Nicholas Gangi, Meng Zhang, Jeff Zhang, Rena Huang, Jiaqi Gu

Figure 1 for SCATTER: Algorithm-Circuit Co-Sparse Photonic Accelerator with Thermal-Tolerant, Power-Efficient In-situ Light Redistribution

Figure 2 for SCATTER: Algorithm-Circuit Co-Sparse Photonic Accelerator with Thermal-Tolerant, Power-Efficient In-situ Light Redistribution

Figure 3 for SCATTER: Algorithm-Circuit Co-Sparse Photonic Accelerator with Thermal-Tolerant, Power-Efficient In-situ Light Redistribution

Figure 4 for SCATTER: Algorithm-Circuit Co-Sparse Photonic Accelerator with Thermal-Tolerant, Power-Efficient In-situ Light Redistribution

Abstract:Photonic computing has emerged as a promising solution for accelerating computation-intensive artificial intelligence (AI) workloads. However, limited reconfigurability, high electrical-optical conversion cost, and thermal sensitivity limit the deployment of current optical analog computing engines to support power-restricted, performance-sensitive AI workloads at scale. Sparsity provides a great opportunity for hardware-efficient AI accelerators. However, current dense photonic accelerators fail to fully exploit the power-saving potential of algorithmic sparsity. It requires sparsity-aware hardware specialization with a fundamental re-design of photonic tensor core topology and cross-layer device-circuit-architecture-algorithm co-optimization aware of hardware non-ideality and power bottleneck. To trim down the redundant power consumption while maximizing robustness to thermal variations, we propose SCATTER, a novel algorithm-circuit co-sparse photonic accelerator featuring dynamically reconfigurable signal path via thermal-tolerant, power-efficient in-situ light redistribution and power gating. A power-optimized, crosstalk-aware dynamic sparse training framework is introduced to explore row-column structured sparsity and ensure marginal accuracy loss and maximum power efficiency. The extensive evaluation shows that our cross-stacked optimized accelerator SCATTER achieves a 511X area reduction and 12.4X power saving with superior crosstalk tolerance that enables unprecedented circuit layout compactness and on-chip power efficiency.

Via

Access Paper or Ask Questions

Skip-SCAR: A Modular Approach to ObjectGoal Navigation with Sparsity and Adaptive Skips

May 23, 2024

Yaotian Liu, Jeff Zhang

Figure 1 for Skip-SCAR: A Modular Approach to ObjectGoal Navigation with Sparsity and Adaptive Skips

Figure 2 for Skip-SCAR: A Modular Approach to ObjectGoal Navigation with Sparsity and Adaptive Skips

Figure 3 for Skip-SCAR: A Modular Approach to ObjectGoal Navigation with Sparsity and Adaptive Skips

Figure 4 for Skip-SCAR: A Modular Approach to ObjectGoal Navigation with Sparsity and Adaptive Skips

Abstract:In ObjectGoal navigation (ObjectNav), agents must locate specific objects within unseen environments, requiring effective observation, prediction, and navigation capabilities. This study found that traditional methods looking only for prediction accuracy often compromise on computational efficiency. To address this, we introduce "Skip-SCAR," a modular framework that enhances efficiency by leveraging sparsity and adaptive skips. The SparseConv-Augmented ResNet (SCAR) at the core of our approach uses sparse and dense feature processing in parallel, optimizing both the computation and memory footprint. Our adaptive skip technique further reduces computational demands by selectively bypassing unnecessary semantic segmentation steps based on environmental constancy. Tested on the HM3D ObjectNav datasets, Skip-SCAR not only minimizes resource use but also sets new performance benchmarks, demonstrating a robust method for improving efficiency and accuracy in robotic navigation tasks.

* 20 pages, 6 figures

Via

Access Paper or Ask Questions

Advanced Language Model-Driven Verilog Development: Enhancing Power, Performance, and Area Optimization in Code Synthesis

Dec 02, 2023

Kiran Thorat, Jiahui Zhao, Yaotian Liu, Hongwu Peng, Xi Xie, Bin Lei, Jeff Zhang, Caiwen Ding

Abstract:The increasing use of Advanced Language Models (ALMs) in diverse sectors, particularly due to their impressive capability to generate top-tier content following linguistic instructions, forms the core of this investigation. This study probes into ALMs' deployment in electronic hardware design, with a specific emphasis on the synthesis and enhancement of Verilog programming. We introduce an innovative framework, crafted to assess and amplify ALMs' productivity in this niche. The methodology commences with the initial crafting of Verilog programming via ALMs, succeeded by a distinct dual-stage refinement protocol. The premier stage prioritizes augmenting the code's operational and linguistic precision, while the latter stage is dedicated to aligning the code with Power-Performance-Area (PPA) benchmarks, a pivotal component in proficient hardware design. This bifurcated strategy, merging error remediation with PPA enhancement, has yielded substantial upgrades in the caliber of ALM-created Verilog programming. Our framework achieves an 81.37% rate in linguistic accuracy and 62.0% in operational efficacy in programming synthesis, surpassing current leading-edge techniques, such as 73% in linguistic accuracy and 46% in operational efficacy. These findings illuminate ALMs' aptitude in tackling complex technical domains and signal a positive shift in the mechanization of hardware design operations.

Via

Access Paper or Ask Questions

Path Planning Under Uncertainty to Localize mmWave Sources

Mar 08, 2023

Kai Pfeiffer, Yuze Jia, Mingsheng Yin, Akshaj Kumar Veldanda, Yaqi Hu, Amee Trivedi, Jeff Zhang, Siddharth Garg, Elza Erkip, Sundeep Rangan(+1 more)

Figure 1 for Path Planning Under Uncertainty to Localize mmWave Sources

Figure 2 for Path Planning Under Uncertainty to Localize mmWave Sources

Figure 3 for Path Planning Under Uncertainty to Localize mmWave Sources

Figure 4 for Path Planning Under Uncertainty to Localize mmWave Sources

Abstract:In this paper, we study a navigation problem where a mobile robot needs to locate a mmWave wireless signal. Using the directionality properties of the signal, we propose an estimation and path planning algorithm that can efficiently navigate in cluttered indoor environments. We formulate Extended Kalman filters for emitter location estimation in cases where the signal is received in line-of-sight or after reflections. We then propose to plan motion trajectories based on belief-space dynamics in order to minimize the uncertainty of the position estimates. The associated non-linear optimization problem is solved by a state-of-the-art constrained iLQR solver. In particular, we propose a method that can handle a large number of obstacles (~300) with reasonable computation times. We validate the approach in an extensive set of simulations. We show that our estimators can help increase navigation success rate and that planning to reduce estimation uncertainty can improve the overall task completion speed.

Via

Access Paper or Ask Questions

Millimeter Wave Wireless Assisted Robot Navigation with Link State Classification

Nov 05, 2021

Mingsheng Yin, Akshaj Veldanda, Amee Trivedi, Jeff Zhang, Kai Pfeiffer, Yaqi Hu, Siddharth Garg, Elza Erkip, Ludovic Righetti, Sundeep Rangan

Figure 1 for Millimeter Wave Wireless Assisted Robot Navigation with Link State Classification

Figure 2 for Millimeter Wave Wireless Assisted Robot Navigation with Link State Classification

Figure 3 for Millimeter Wave Wireless Assisted Robot Navigation with Link State Classification

Figure 4 for Millimeter Wave Wireless Assisted Robot Navigation with Link State Classification

Abstract:The millimeter wave (mmWave) bands have attracted considerable attention for high precision localization applications due to the ability to capture high angular and temporal resolution measurements. This paper explores mmWave-based positioning for a target localization problem where a fixed target broadcasts mmWave signals and a mobile robotic agent attempts to listen to the signals to locate and navigate to the target. A three strage procedure is proposed: First, the mobile agent uses tensor decomposition methods to detect the wireless paths and their angles. Second, a machine-learning trained classifier is then used to predict the link state, meaning if the strongest path is line-of-sight (LOS) or non-LOS (NLOS). For the NLOS case, the link state predictor also determines if the strongest path arrived via one or more reflections. Third, based on the link state, the agent either follows the estimated angles or explores the environment. The method is demonstrated on a large dataset of indoor environments supplemented with ray tracing to simulate the wireless propagation. The path estimation and link state classification are also integrated into a state-ofthe-art neural simultaneous localization and mapping (SLAM) module to augment camera and LIDAR-based navigation. It is shown that the link state classifier can successfully generalize to completely new environments outside the training set. In addition, the neural-SLAM module with the wireless path estimation and link state classifier provides rapid navigation to the target, close to a baseline that knows the target location.

* 13 pages, 20 figures

Via

Access Paper or Ask Questions

RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

May 22, 2021

Udit Gupta, Samuel Hsia, Jeff Zhang, Mark Wilkening, Javin Pombra, Hsien-Hsin S. Lee, Gu-Yeon Wei, Carole-Jean Wu, David Brooks

Figure 1 for RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

Figure 2 for RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

Figure 3 for RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

Figure 4 for RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

Abstract:Deep learning recommendation systems must provide high quality, personalized content under strict tail-latency targets and high system loads. This paper presents RecPipe, a system to jointly optimize recommendation quality and inference performance. Central to RecPipe is decomposing recommendation models into multi-stage pipelines to maintain quality while reducing compute complexity and exposing distinct parallelism opportunities. RecPipe implements an inference scheduler to map multi-stage recommendation engines onto commodity, heterogeneous platforms (e.g., CPUs, GPUs).While the hardware-aware scheduling improves ranking efficiency, the commodity platforms suffer from many limitations requiring specialized hardware. Thus, we design RecPipeAccel (RPAccel), a custom accelerator that jointly optimizes quality, tail-latency, and system throughput. RPAc-cel is designed specifically to exploit the distinct design space opened via RecPipe. In particular, RPAccel processes queries in sub-batches to pipeline recommendation stages, implements dual static and dynamic embedding caches, a set of top-k filtering units, and a reconfigurable systolic array. Com-pared to prior-art and at iso-quality, we demonstrate that RPAccel improves latency and throughput by 3x and 6x.

Via

Access Paper or Ask Questions

FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design

Jul 02, 2018

Jeff Zhang, Siddharth Garg

Figure 1 for FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design

Figure 2 for FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design

Figure 3 for FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design

Figure 4 for FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design

Abstract:Deep neural networks (DNN) are increasingly being accelerated on application-specific hardware such as the Google TPU designed especially for deep learning. Timing speculation is a promising approach to further increase the energy efficiency of DNN accelerators. Architectural exploration for timing speculation requires detailed gate-level timing simulations that can be time-consuming for large DNNs that execute millions of multiply-and-accumulate (MAC) operations. In this paper we propose FATE, a new methodology for fast and accurate timing simulations of DNN accelerators like the Google TPU. FATE proposes two novel ideas: (i) DelayNet, a DNN based timing model for MAC units; and (ii) a statistical sampling methodology that reduces the number of MAC operations for which timing simulations are performed. We show that FATE results in between 8 times-58 times speed-up in timing simulations, while introducing less than 2% error in classification accuracy estimates. We demonstrate the use of FATE by comparing to conventional DNN accelerator that uses 2's complement (2C) arithmetic with an alternative implementation that uses signed magnitude representations (SMR). We show that that the SMR implementation provides 18% more energy savings for the same classification accuracy than 2C, a result that might be of independent interest.

* To appear at IEEE/ACM International Conference On Computer Aided Design 2018

Via

Access Paper or Ask Questions