Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Matthew Nazari

NoLoR: An ASR-Based Framework for Expedited Endangered Language Documentation with Neo-Aramaic as a Case Study

Dec 06, 2024

Matthew Nazari

Figure 1 for NoLoR: An ASR-Based Framework for Expedited Endangered Language Documentation with Neo-Aramaic as a Case Study

Figure 2 for NoLoR: An ASR-Based Framework for Expedited Endangered Language Documentation with Neo-Aramaic as a Case Study

Figure 3 for NoLoR: An ASR-Based Framework for Expedited Endangered Language Documentation with Neo-Aramaic as a Case Study

Figure 4 for NoLoR: An ASR-Based Framework for Expedited Endangered Language Documentation with Neo-Aramaic as a Case Study

Abstract:The documentation of the Neo-Aramaic dialects before their extinction has been described as the most urgent task in all of Semitology today. The death of this language will be an unfathomable loss to the descendents of the indigenous speakers of Aramaic, now predominantly diasporic after forced displacement due to violence. This paper develops an ASR model to expedite the documentation of this endangered language and generalizes the strategy in a new framework we call NoLoR.

Via

Access Paper or Ask Questions

PCTreeS: 3D Point Cloud Tree Species Classification Using Airborne LiDAR Images

Dec 06, 2024

Hongjin Lin, Matthew Nazari, Derek Zheng

Figure 1 for PCTreeS: 3D Point Cloud Tree Species Classification Using Airborne LiDAR Images

Figure 2 for PCTreeS: 3D Point Cloud Tree Species Classification Using Airborne LiDAR Images

Figure 3 for PCTreeS: 3D Point Cloud Tree Species Classification Using Airborne LiDAR Images

Figure 4 for PCTreeS: 3D Point Cloud Tree Species Classification Using Airborne LiDAR Images

Abstract:Reliable large-scale data on the state of forests is crucial for monitoring ecosystem health, carbon stock, and the impact of climate change. Current knowledge of tree species distribution relies heavily on manual data collection in the field, which often takes years to complete, resulting in limited datasets that cover only a small subset of the world's forests. Recent works show that state-of-the-art deep learning models using Light Detection and Ranging (LiDAR) images enable accurate and scalable classification of tree species in various ecosystems. While LiDAR images contain rich 3D information, most previous works flatten the 3D images into 2D projections to use Convolutional Neural Networks (CNNs). This paper offers three significant contributions: (1) we apply the deep learning framework for tree classification in tropical savannas; (2) we use Airborne LiDAR images, which have a lower resolution but greater scalability than Terrestrial LiDAR images used in most previous works; (3) we introduce the approach of directly feeding 3D point cloud images into a vision transformer model (PCTreeS). Our results show that the PCTreeS approach outperforms current CNN baselines with 2D projections in AUC (0.81), overall accuracy (0.72), and training time (~45 mins). This paper also motivates further LiDAR image collection and validation for accurate large-scale automatic classification of tree species.

Via

Access Paper or Ask Questions

Consistent Explanations in the Face of Model Indeterminacy via Ensembling

Jun 13, 2023

Dan Ley, Leonard Tang, Matthew Nazari, Hongjin Lin, Suraj Srinivas, Himabindu Lakkaraju

Figure 1 for Consistent Explanations in the Face of Model Indeterminacy via Ensembling

Figure 2 for Consistent Explanations in the Face of Model Indeterminacy via Ensembling

Figure 3 for Consistent Explanations in the Face of Model Indeterminacy via Ensembling

Figure 4 for Consistent Explanations in the Face of Model Indeterminacy via Ensembling

Abstract:This work addresses the challenge of providing consistent explanations for predictive models in the presence of model indeterminacy, which arises due to the existence of multiple (nearly) equally well-performing models for a given dataset and task. Despite their similar performance, such models often exhibit inconsistent or even contradictory explanations for their predictions, posing challenges to end users who rely on these models to make critical decisions. Recognizing this issue, we introduce ensemble methods as an approach to enhance the consistency of the explanations provided in these scenarios. Leveraging insights from recent work on neural network loss landscapes and mode connectivity, we devise ensemble strategies to efficiently explore the underspecification set -- the set of models with performance variations resulting solely from changes in the random seed during training. Experiments on five benchmark financial datasets reveal that ensembling can yield significant improvements when it comes to explanation similarity, and demonstrate the potential of existing ensemble methods to explore the underspecification set efficiently. Our findings highlight the importance of considering model indeterminacy when interpreting explanations and showcase the effectiveness of ensembles in enhancing the reliability of explanations in machine learning.

Via

Access Paper or Ask Questions