Abstract: While recent studies have focused on quantifying word usage to find the overall shapes of narrative emotional arcs, certain features of narratives within narratives remain to be explored. Here, we characterize the narrative time scale of sub-narratives by finding the length of text at which fluctuations in word usage begin to be relevant. We represent more than 30,000 Project Gutenberg books as time series using ousiometrics, a power-danger framework for essential meaning, itself a reinterpretation of the valence-arousal-dominance framework derived from semantic differentials. We decompose each book's power and danger time series using empirical mode decomposition into a sum of constituent oscillatory modes and a non-oscillatory trend. By comparing the decomposition of the original power and danger time series with those derived from shuffled text, we find that shorter books exhibit only a general trend, while longer books have fluctuations in addition to the general trend, similar to how subplots have arcs within an overall narrative arc. These fluctuations typically have a period of a few thousand words regardless of the book length or library classification code, but vary depending on the content and structure of the book. Our method provides a data-driven denoising approach that works for text of various lengths, in contrast to the more traditional approach of using large window sizes that may inadvertently smooth out relevant information, especially for shorter texts.
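To make the decomposition step concrete, here is a minimal sketch using the third-party PyEMD package (published on PyPI as EMD-signal); the input file name and the zero-crossing period estimate are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np
from PyEMD import EMD  # pip install EMD-signal

# Illustrative input: a book's power time series, one value per text window.
power = np.loadtxt("power_series.txt")  # hypothetical file name

emd = EMD()
emd.emd(power)                            # decompose the signal
imfs, trend = emd.get_imfs_and_residue()  # oscillatory modes + residual trend

# By construction, the modes and trend sum back to the original series.
assert np.allclose(imfs.sum(axis=0) + trend, power)

# Rough period of mode k (in windows) from its zero-crossing count.
k = 0
crossings = np.count_nonzero(np.diff(np.signbit(imfs[k])))
print(2 * len(power) / max(crossings, 1))
```

Comparing each mode against the same statistic computed from shuffled text then separates genuine sub-narrative fluctuations from noise.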
Abstract: Image-to-image regression is an important learning task, used frequently in biological imaging. Current algorithms, however, do not generally offer statistical guarantees that protect against a model's mistakes and hallucinations. To address this, we develop uncertainty quantification techniques with rigorous statistical guarantees for image-to-image regression problems. In particular, we show how to derive uncertainty intervals around each pixel that are guaranteed to contain the true value with a user-specified confidence probability. Our methods work in conjunction with any base machine learning model, such as a neural network, and endow it with formal mathematical guarantees, regardless of the true unknown data distribution or choice of model. Furthermore, they are simple to implement and computationally inexpensive. We evaluate our procedure on three image-to-image regression tasks: quantitative phase microscopy, accelerated magnetic resonance imaging, and super-resolution transmission electron microscopy of a Drosophila melanogaster brain.
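A pixel-wise interval of this kind can be prototyped with a split-conformal calibration step; this is a minimal sketch under simplifying assumptions (a held-out calibration set and a heuristic per-pixel uncertainty from the base model), not the authors' exact risk-controlling procedure:

```python
import numpy as np

def calibrate_lambda(pred, unc, target, alpha=0.1):
    """Smallest multiplier lam such that [pred - lam*unc, pred + lam*unc]
    covers every pixel's true value on at least a 1 - alpha fraction of
    calibration images. All arrays have shape (n, H, W)."""
    # Per-image score: the multiplier needed to cover that image's worst pixel.
    scores = (np.abs(pred - target) / unc).reshape(len(pred), -1).max(axis=1)
    n = len(scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, level, method="higher")  # numpy >= 1.22

# At test time, the interval for each pixel of a new image is
#   lower, upper = pred - lam * unc, pred + lam * unc
```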
Abstract: We explore the relationship between context and happiness scores in political tweets using word co-occurrence networks, where nodes in the network are the words, and the weight of an edge is the number of tweets in the corpus for which the two connected words co-occur. In particular, we consider tweets with hashtags #imwithher and #crookedhillary, both relating to Hillary Clinton's presidential bid in 2016. We then analyze the network properties in conjunction with the word scores by comparing with null models to separate the effects of the network structure and the score distribution. Neutral words are found to be dominant and most words, regardless of polarity, tend to co-occur with neutral words. We do not observe any score homophily among positive and negative words. However, when we perform network backboning, community detection results in word groupings with meaningful narratives, and the happiness scores of the words in each group correspond to its respective theme. Thus, although we observe no clear relationship between happiness scores and co-occurrence at the node or edge level, a community-centric approach can isolate themes of competing sentiments in a corpus.
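The network construction is simple to reproduce; a minimal networkx sketch with a toy corpus standing in for the hashtag collections (score comparison and backboning omitted):

```python
import itertools
from networkx import Graph
from networkx.algorithms.community import greedy_modularity_communities

# Toy tweets standing in for the #imwithher / #crookedhillary corpora.
tweets = [
    ["vote", "hillary", "election"],
    ["crooked", "hillary", "email"],
    ["vote", "election", "democracy"],
]

G = Graph()
for words in tweets:
    for u, v in itertools.combinations(sorted(set(words)), 2):
        if G.has_edge(u, v):
            G[u][v]["weight"] += 1   # pair co-occurs in one more tweet
        else:
            G.add_edge(u, v, weight=1)

# Community detection; happiness scores would then be averaged per group.
for group in greedy_modularity_communities(G, weight="weight"):
    print(sorted(group))
```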
Abstract: Sentiment-aware intelligent systems are essential to a wide array of applications including marketing, political campaigns, recommender systems, behavioral economics, social psychology, and national security. These sentiment-aware intelligent systems are driven by language models which broadly fall into two paradigms: (1) lexicon-based and (2) contextual. Although recent contextual models are increasingly dominant, we still see demand for lexicon-based models because of their interpretability and ease of use. For example, lexicon-based models allow researchers to readily determine which words and phrases contribute most to a change in measured sentiment. A challenge for any lexicon-based approach is that the lexicon needs to be routinely expanded with new words and expressions. Crowdsourcing annotations for semantic dictionaries may be an expensive and time-consuming task. Here, we propose two models for predicting sentiment scores to augment semantic lexicons at a relatively low cost using word embeddings and transfer learning. Our first model establishes a baseline employing a simple and shallow neural network initialized with pre-trained word embeddings using a non-contextual approach. Our second model improves upon our baseline, featuring a deep Transformer-based network that brings to bear word definitions to estimate their lexical polarity. Our evaluation shows that both models are able to score new words with a similar accuracy to reviewers from Amazon Mechanical Turk, but at a fraction of the cost.
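The baseline can be written in a few lines of PyTorch; the layer sizes, frozen embeddings, and training target below are illustrative assumptions rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn

class BaselineScorer(nn.Module):
    """Shallow regressor: frozen pre-trained word vectors -> sentiment score."""
    def __init__(self, embeddings):  # embeddings: (vocab_size, dim) tensor
        super().__init__()
        self.embed = nn.Embedding.from_pretrained(embeddings, freeze=True)
        self.head = nn.Sequential(
            nn.Linear(embeddings.shape[1], 64),  # hidden width is illustrative
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, word_ids):
        return self.head(self.embed(word_ids)).squeeze(-1)

# Train with, e.g., nn.MSELoss() against human-annotated lexicon scores,
# then score out-of-lexicon words from their embeddings alone.
```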
Abstract: Geo-localizing static objects from street images is challenging but also very important for road asset mapping and autonomous driving. In this paper we present a two-stage framework that detects and geolocalizes traffic signs from low frame rate street videos. Our proposed system uses a modified version of RetinaNet (GPS-RetinaNet), which predicts a positional offset for each sign relative to the camera, in addition to performing the standard classification and bounding box regression. Candidate sign detections from GPS-RetinaNet are condensed into geolocalized signs by our custom tracker, which consists of a learned metric network and a variant of the Hungarian Algorithm. Our metric network estimates the similarity between pairs of detections, then the Hungarian Algorithm matches detections across images using the similarity scores provided by the metric network. Our models were trained using an updated version of the ARTS dataset, which contains 25,544 images and 47,589 sign annotations~\cite{arts}. The proposed dataset covers a diverse set of environments gathered from a broad selection of roads. Each annotation contains a sign class label, its geospatial location, an assembly label, a side of road indicator, and unique identifiers that aid in the evaluation. This dataset will support future progress in the field, and the proposed system demonstrates how to take advantage of some of the unique characteristics of a realistic geolocalization dataset.
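The tracker's assignment step maps directly onto SciPy's Hungarian-algorithm routine; the similarity matrix and acceptance threshold here are illustrative stand-ins for the learned metric network's output:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_detections(sim, min_sim=0.5):
    """Match detections across two images given a similarity matrix
    (rows: detections in image A, columns: image B), keeping only
    sufficiently similar pairs."""
    rows, cols = linear_sum_assignment(sim, maximize=True)
    return [(r, c) for r, c in zip(rows, cols) if sim[r, c] >= min_sim]

sim = np.array([[0.9, 0.1, 0.3],
                [0.2, 0.8, 0.4]])
print(match_detections(sim))  # [(0, 0), (1, 1)]
```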
Abstract: Mental health challenges are thought to afflict around 10% of the global population each year, with many going untreated due to stigma and limited access to services. Here, we explore trends in words and phrases related to mental health through a collection of 1-, 2-, and 3-grams parsed from a data stream of roughly 10% of all English tweets since 2012. We examine temporal dynamics of mental health language, finding that the popularity of the phrase 'mental health' increased by nearly two orders of magnitude between 2012 and 2018. We observe that mentions of 'mental health' spike annually and reliably due to mental health awareness campaigns, as well as unpredictably in response to mass shootings, celebrities dying by suicide, and popular fictional stories portraying suicide. We find that the level of positivity of messages containing 'mental health', while stable through the growth period, has declined recently. Finally, we use the ratio of original tweets to retweets to quantify the fraction of appearances of mental health language due to social amplification. Since 2015, mentions of mental health have become increasingly due to retweets, suggesting that stigma associated with discussion of mental health on Twitter has diminished with time.
Abstract: In real time, Twitter strongly imprints world events, popular culture, and the day-to-day; Twitter records an ever-growing compendium of language use and change; and Twitter has been shown to enable certain kinds of prediction. Vitally, and absent from many standard corpora such as books and news archives, Twitter also encodes popularity and spreading through retweets. Here, we describe Storywrangler, an ongoing, day-scale curation of over 100 billion tweets containing around 1 trillion 1-grams from 2008 to 2020. For each day, we break tweets into 1-, 2-, and 3-grams across 150+ languages, record usage frequencies, and generate Zipf distributions. We make the data set available through an interactive time series viewer, and as downloadable time series and daily distributions. We showcase a few examples of the many possible avenues of study we aim to enable, including how social amplification can be visualized through 'contagiograms'.
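The parsing step is compact enough to sketch directly; a toy example of breaking tweets into 1-, 2-, and 3-grams and tallying a Zipf (rank-frequency) distribution:

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return zip(*(tokens[i:] for i in range(n)))

# Toy stand-ins for one day's tweets in one language.
tweets = ["storywrangler curates tweets daily", "tweets record language use"]

counts = Counter()
for tweet in tweets:
    tokens = tweet.split()
    for n in (1, 2, 3):
        counts.update(" ".join(gram) for gram in ngrams(tokens, n))

# Zipf distribution: usage frequency in rank order.
print([freq for _, freq in counts.most_common()][:5])
```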
Abstract: Working from a dataset of 118 billion messages running from the start of 2009 to the end of 2019, we identify and explore the relative daily use of over 150 languages on Twitter. We find that eight languages comprise 80% of all tweets, with English, Japanese, Spanish, and Portuguese being the most dominant. To quantify each language's level of being a Twitter 'echo chamber' over time, we compute the 'contagion ratio': the balance of retweets to organic messages. We find that for the most common languages on Twitter there is a growing tendency, though not universal, to retweet rather than share new content. By the end of 2019, the contagion ratios for half of the top 30 languages, including English and Spanish, had risen above 1, the naive contagion threshold. In 2019, the top 5 languages with the highest average daily ratios were, in order, Thai (7.3), Hindi, Tamil, Urdu, and Catalan, while the bottom 5 were Russian, Swedish, Esperanto, Cebuano, and Finnish (0.26). Further, we show that over time, the contagion ratios for most common languages are growing more strongly than those of rare languages.
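The contagion ratio itself is a one-line computation once daily counts are in hand; a minimal pandas sketch with invented numbers:

```python
import pandas as pd

# Invented daily counts for a single language.
daily = pd.DataFrame(
    {"retweets": [5200, 6100], "organic": [4100, 4000]},
    index=pd.to_datetime(["2019-06-01", "2019-06-02"]),
)

# Ratio of retweets to organic messages; values above 1 mean the
# language's stream is amplified more than it is newly composed.
daily["contagion_ratio"] = daily["retweets"] / daily["organic"]
print(daily["contagion_ratio"])
```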