Xuanyi Zhao

Benchmarking Egocentric Multimodal Goal Inference for Assistive Wearable Agents

Oct 25, 2025

Vijay Veerabadran, Fanyi Xiao, Nitin Kamra, Pedro Matias, Joy Chen, Caley Drooff, Brett D Roads, Riley Williams, Ethan Henderson, Xuanyi Zhao (+3 more)

Abstract: There has been a surge of interest in assistive wearable agents: agents embodied in wearable form factors (e.g., smart glasses) that take assistive actions toward a user's goal/query (e.g., "Where did I leave my keys?"). In this work, we consider the important complementary problem of inferring that goal from multimodal contextual observations. Solving this "goal inference" problem holds the promise of eliminating the effort needed to interact with such an agent. This work focuses on creating WAGIBench, a strong benchmark to measure progress in solving this problem using vision-language models (VLMs). Given the limited prior work in this area, we collected a novel dataset comprising 29 hours of multimodal data from 348 participants across 3,477 recordings, featuring ground-truth goals alongside accompanying visual, audio, digital, and longitudinal contextual observations. We validate that human performance exceeds model performance, achieving 93% multiple-choice accuracy compared with 84% for the best-performing VLM. Generative benchmark results that evaluate several families of modern vision-language models show that larger models perform significantly better on the task, yet remain far from practical usefulness, as they produce relevant goals only 55% of the time. Through a modality ablation, we show that models benefit from extra information in relevant modalities with minimal performance degradation from irrelevant modalities.

* Accepted as a spotlight paper at the 39th Conference on Neural Information Processing Systems (NeurIPS 2025)

A 5.3 GHz Al0.76Sc0.24N Two-Dimensional Resonant Rods Resonator with a kt2 of 23.9%

Apr 11, 2022

Xuanyi Zhao, Onurcan Kaya, Michele Pirro, Meruyert Assylbekova, Luca Colombo, Pietro Simeoni, Cristian Cassella

Abstract: This work reports on the measured performance of an Aluminum Scandium Nitride (AlScN) Two-Dimensional Resonant Rods resonator (2DRR), fabricated by using a Sc-doping concentration of 24%, characterized by a low off-resonance impedance (~25 Ohm) and exhibiting a record electromechanical coupling coefficient (kt2) of 23.9% for AlScN resonators. To achieve such performance, we identified and relied on optimized deposition and etching processes for highly doped AlScN films, aiming at achieving high crystalline quality, low density of abnormally oriented grains in the 2DRR's active region and sharp lateral sidewalls. Also, the 2DRR's unit-cell has been acoustically engineered to maximize the piezo-generated mechanical energy within each rod and to ensure a low transduction of spurious modes around resonance. Due to its unprecedented kt2, the reported 2DRR opens exciting scenarios towards the development of next-generation monolithic integrated radio-frequency (RF) filtering components. In fact, we show that 5th-order 2DRR-based ladder filters with fractional bandwidths (BW) of ~11%, insertion-loss (I.L.) values of ~2.5 dB and with >30 dB out-of-band rejections can now be envisioned, paving an unprecedented path towards the development of ultra-wideband (UWB) filters for next-generation Super-High-Frequency (SHF) radio front-ends.

* 10 pages, 11 figures, 5 tables, submitted to JMEMS
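
As a rough sanity check on the figures quoted in the abstract above, the sketch below applies a common rule of thumb for acoustic ladder filters: the achievable fractional bandwidth is about half the resonator's kt2. The 0.5 factor, the function name, and the use of 5.3 GHz as the filter center frequency are assumptions for illustration, not values or methods taken from the paper.

```python
def approx_ladder_fractional_bandwidth(kt2, factor=0.5):
    """Rule-of-thumb fractional bandwidth of a ladder filter.

    kt2 is the electromechanical coupling coefficient as a fraction
    (0.239 for 23.9%). The default factor of 0.5 is an assumed
    heuristic, not a value reported in the paper.
    """
    if not 0.0 < kt2 < 1.0:
        raise ValueError("kt2 must be a fraction between 0 and 1")
    return factor * kt2


kt2 = 0.239            # coupling coefficient quoted in the abstract
center_hz = 5.3e9      # assumption: filter centered at the resonator frequency

fbw = approx_ladder_fractional_bandwidth(kt2)
bandwidth_hz = fbw * center_hz

print(f"approximate fractional bandwidth: {fbw:.4f}")
print(f"approximate absolute bandwidth: {bandwidth_hz / 1e6:.1f} MHz")
```

Under these assumptions the estimate lands near 12%, the same order as the ~11% fractional bandwidth the abstract envisions for 5th-order ladder filters.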

Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings

Apr 18, 2021

Kan Xu, Xuanyi Zhao, Hamsa Bastani, Osbert Bastani

Abstract: Sparse regression has recently been applied to enable transfer learning from very limited data. We study an extension of this approach to unsupervised learning -- in particular, learning word embeddings from unstructured text corpora using low-rank matrix factorization. Intuitively, when transferring word embeddings to a new domain, we expect that the embeddings change for only a small number of words -- e.g., the ones with novel meanings in that domain. We propose a novel group-sparse penalty that exploits this sparsity to perform transfer learning when there is very little text data available in the target domain -- e.g., a single article of text. We prove generalization bounds for our algorithm. Furthermore, we empirically evaluate its effectiveness in terms of both prediction accuracy in downstream tasks and the interpretability of the results.
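
To make the idea of a group-sparse penalty concrete, here is a minimal sketch that assumes the penalty takes the common l2,1 form: the sum over words of the Euclidean norm of the difference between the target-domain and source-domain embedding rows. Whether this matches the paper's exact formulation is an assumption, and the function names (`group_sparse_penalty`, `prox_group_sparse`) are hypothetical. The proximal step shrinks each row's change toward zero as a group, so most words keep their source embedding and only a few rows move.

```python
import math


def row_norm(row):
    """Euclidean norm of one embedding row."""
    return math.sqrt(sum(x * x for x in row))


def group_sparse_penalty(U, U_src, lam):
    """Assumed l2,1 penalty: lam * sum_i ||U[i] - U_src[i]||_2."""
    return lam * sum(
        row_norm([u - s for u, s in zip(ru, rs)]) for ru, rs in zip(U, U_src)
    )


def prox_group_sparse(U, U_src, thresh):
    """Row-wise (group) soft-thresholding of the change U - U_src.

    Rows whose change has norm <= thresh snap back to the source
    embedding; larger changes are shrunk by thresh in norm.
    """
    out = []
    for ru, rs in zip(U, U_src):
        diff = [u - s for u, s in zip(ru, rs)]
        n = row_norm(diff)
        scale = max(0.0, 1.0 - thresh / n) if n > 0 else 0.0
        out.append([s + scale * d for s, d in zip(rs, diff)])
    return out


# Toy example: three "words" with 2-d embeddings; only the first changes much.
U_src = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
U = [[3.0, 4.0], [1.1, 1.0], [2.0, 0.0]]

penalty = group_sparse_penalty(U, U_src, lam=1.0)
U_new = prox_group_sparse(U, U_src, thresh=1.0)
changed = [i for i, (a, b) in enumerate(zip(U_new, U_src)) if a != b]
```

In this toy example only the first row stays different from its source embedding after the proximal step, which mirrors the abstract's intuition that embeddings should change for only a small number of words.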
