Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiangyu Xie

Long-Term Probabilistic Forecast of Vegetation Conditions Using Climate Attributes in the Four Corners Region

Jan 22, 2026

Erika McPhillips, Hyeongseong Lee, Xiangyu Xie, Kathy Baylis, Chris Funk, Mengyang Gu

Abstract:Weather conditions can drastically alter the state of crops and rangelands, and in turn, impact the incomes and food security of individuals worldwide. Satellite-based remote sensing offers an effective way to monitor vegetation and climate variables on regional and global scales. The annual peak Normalized Difference Vegetation Index (NDVI), derived from satellite observations, is closely associated with crop development, rangeland biomass, and vegetation growth. Although various machine learning methods have been developed to forecast NDVI over short time ranges, such as one-month-ahead predictions, long-term forecasting approaches, such as one-year-ahead predictions of vegetation conditions, are not yet available. To fill this gap, we develop a two-phase machine learning model to forecast the one-year-ahead peak NDVI over high-resolution grids, using the Four Corners region of the Southwestern United States as a testbed. In phase one, we identify informative climate attributes, including precipitation and maximum vapor pressure deficit, and develop the generalized parallel Gaussian process that captures the relationship between climate attributes and NDVI. In phase two, we forecast these climate attributes using historical data at least one year before the NDVI prediction month, which then serve as inputs to forecast the peak NDVI at each spatial grid. We developed open-source tools that outperform alternative methods for both gross NDVI and grid-based NDVI one-year forecasts, providing information that can help farmers and ranchers make actionable plans a year in advance.

Via

Access Paper or Ask Questions

OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters

Aug 30, 2024

Zexin Chen, Chengxi Li, Xiangyu Xie, Parijat Dube

Figure 1 for OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters

Figure 2 for OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters

Figure 3 for OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters

Figure 4 for OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters

Abstract:This paper explores the potential of a small, domain-specific language model trained exclusively on sports-related data. We investigate whether extensive training data with specially designed small model structures can overcome model size constraints. The study introduces the OnlySports collection, comprising OnlySportsLM, OnlySports Dataset, and OnlySports Benchmark. Our approach involves: 1) creating a massive 600 billion tokens OnlySports Dataset from FineWeb, 2) optimizing the RWKV architecture for sports-related tasks, resulting in a 196M parameters model with 20-layer, 640-dimension structure, 3) training the OnlySportsLM on part of OnlySports Dataset, and 4) testing the resultant model on OnlySports Benchmark. OnlySportsLM achieves a 37.62%/34.08% accuracy improvement over previous 135M/360M state-of-the-art models and matches the performance of larger models such as SomlLM 1.7B and Qwen 1.5B in the sports domain. Additionally, the OnlySports collection presents a comprehensive workflow for building high-quality, domain-specific language models, providing a replicable blueprint for efficient AI development across various specialized fields.

* 13 pages, 4 figures, 4 tables

Via

Access Paper or Ask Questions