Michael J. Smith

UniverseTBD

Dargana: fine-tuning EarthPT for dynamic tree canopy mapping from space

Apr 24, 2025

Michael J. Smith, Luke Fleming, James E. Geach, Ryan J. Roberts, Freddie Kalaitzis, James Banister

Abstract: We present Dargana, a fine-tuned variant of the EarthPT time-series foundation model that achieves specialisation using <3% of its pre-training data volume and 5% of its pre-training compute. Dargana is fine-tuned to generate regularly updated classification of tree canopy cover at 10 m resolution, distinguishing conifer and broadleaved tree types. Using Cornwall, UK, as a test case, the model achieves a pixel-level ROC-AUC of 0.98 and a PR-AUC of 0.83 on unseen satellite imagery. Dargana can identify fine structures like hedgerows and coppice below the training sample limit, and can track temporal changes to canopy cover such as new woodland establishment. Our results demonstrate how pre-trained Large Observation Models like EarthPT can be specialised for granular, dynamic land cover monitoring from space, providing a valuable, scalable tool for natural capital management and conservation.

* 9 pages, 6 figures, spotlight at `Tackling Climate Change with Machine Learning', ICLR 2025
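
The two pixel-level metrics quoted in the abstract can be illustrated with a small sketch. This is not the authors' evaluation code: the function names and the toy labels and scores are assumptions made for illustration, and PR-AUC is approximated here by average precision, which is a common convention but may differ from the integration used in the paper. ROC-AUC is computed as the fraction of (positive, negative) pixel pairs that the classifier ranks correctly.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outranks a random negative.

    labels are 0/1 integers, scores are classifier outputs. Ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision at the rank of each positive.

    Assumes scores are distinct; tied scores are ordered arbitrarily.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Toy example (hypothetical values): 1 = tree canopy pixel, 0 = other.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
print(roc_auc(toy_labels, toy_scores), average_precision(toy_labels, toy_scores))
```

Reporting both numbers is useful because average precision is generally more sensitive than ROC-AUC to errors on a minority class.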

AstroLLaVA: towards the unification of astronomical data and natural language

Apr 11, 2025

Sharaf Zaman, Michael J. Smith, Pranav Khetarpal, Rishabh Chakrabarty, Michele Ginolfi, Marc Huertas-Company, Maja Jabłońska, Sandor Kruk, Matthieu Le Lain, Sergio José Rodríguez Méndez(+1 more)

Abstract: We present AstroLLaVA, a vision language model for astronomy that enables interaction with astronomical imagery through natural dialogue. By fine-tuning the LLaVA model on a diverse dataset of $\sim$30k images with captions and question-answer pairs sourced from NASA's `Astronomy Picture of the Day', the European Southern Observatory, and the NASA/ESA Hubble Space Telescope, we create a model capable of answering open-ended questions about astronomical concepts depicted visually. Our two-stage fine-tuning process adapts the model to both image captioning and visual question answering in the astronomy domain. We demonstrate AstroLLaVA's performance on an astronomical visual question answering benchmark and release the model weights, code, and training set to encourage further open source work in this space. Finally, we suggest a roadmap towards general astronomical data alignment with pre-trained language models, and provide an open space for collaboration towards this end for interested researchers.

* 8 pages, 3 figures, accepted to SCI-FM@ICLR 2025. Code at https://w3id.org/UniverseTBD/AstroLLaVA

A Survey on Hypothesis Generation for Scientific Discovery in the Era of Large Language Models

Apr 07, 2025

Atilla Kaan Alkan, Shashwat Sourav, Maja Jablonska, Simone Astarita, Rishabh Chakrabarty, Nikhil Garuda, Pranav Khetarpal, Maciej Pióro, Dimitrios Tanoglidis, Kartheik G. Iyer(+7 more)

Abstract: Hypothesis generation is a fundamental step in scientific discovery, yet it is increasingly challenged by information overload and disciplinary fragmentation. Recent advances in Large Language Models (LLMs) have sparked growing interest in their potential to enhance and automate this process. This paper presents a comprehensive survey of hypothesis generation with LLMs by (i) reviewing existing methods, from simple prompting techniques to more complex frameworks, and proposing a taxonomy that categorizes these approaches; (ii) analyzing techniques for improving hypothesis quality, such as novelty boosting and structured reasoning; (iii) providing an overview of evaluation strategies; and (iv) discussing key challenges and future directions, including multimodal integration and human-AI collaboration. Our survey aims to serve as a reference for researchers exploring LLMs for hypothesis generation.

* 9 pages (+2 pages of references), 2 figures

Towards more efficient agricultural practices via transformer-based crop type classification

Nov 04, 2024

E. Ulises Moya-Sánchez, Yazid S. Mikail, Daisy Nyang'anyi, Michael J. Smith, Isabella Smythe

Abstract: Machine learning has great potential to increase crop production and resilience to climate change. Accurate maps of where crops are grown are a key input to a number of downstream policy and research applications. In this proposal, we present preliminary work showing that it is possible to accurately classify crops from time series derived from Sentinel 1 and 2 satellite imagery in Mexico using a pixel-based binary crop/non-crop time series transformer model. We also find preliminary evidence that meta-learning approaches supplemented with data from similar agro-ecological zones may improve model performance. Due to these promising results, we propose further development of this method with the goal of accurate multi-class crop classification in Jalisco, Mexico via meta-learning with a dataset comprising similar agro-ecological zones.

pathfinder: A Semantic Framework for Literature Review and Knowledge Discovery in Astronomy

Aug 02, 2024

Kartheik G. Iyer, Mikaeel Yunus, Charles O'Neill, Christine Ye, Alina Hyk, Kiera McCormick, Ioana Ciuca, John F. Wu, Alberto Accomazzi, Simone Astarita(+20 more)

Abstract: The exponential growth of astronomical literature poses significant challenges for researchers navigating and synthesizing general insights or even domain-specific knowledge. We present Pathfinder, a machine learning framework designed to enable literature review and knowledge discovery in astronomy, focusing on semantic searching with natural language instead of syntactic searches with keywords. Utilizing state-of-the-art large language models (LLMs) and a corpus of 350,000 peer-reviewed papers from the Astrophysics Data System (ADS), Pathfinder offers an innovative approach to scientific inquiry and literature exploration. Our framework couples advanced retrieval techniques with LLM-based synthesis to search astronomical literature by semantic context as a complement to currently existing methods that use keywords or citation graphs. It addresses complexities of jargon, named entities, and temporal aspects through time-based and citation-based weighting schemes. We demonstrate the tool's versatility through case studies, showcasing its application in various research scenarios. The system's performance is evaluated using custom benchmarks, including single-paper and multi-paper tasks. Beyond literature review, Pathfinder offers unique capabilities for reformatting answers in ways that are accessible to various audiences (e.g. in a different language or as simplified text), visualizing research landscapes, and tracking the impact of observatories and methodologies. This tool represents a significant advancement in applying AI to astronomical research, aiding researchers at all career stages in navigating modern astronomy literature.

* 25 pages, 9 figures, submitted to AAS journals. Comments are welcome, and the tools mentioned are available online at https://pfdr.app

AstroPT: Scaling Large Observation Models for Astronomy

May 23, 2024

Michael J. Smith, Ryan J. Roberts, Eirini Angeloudi, Marc Huertas-Company

Abstract: This work presents AstroPT, an autoregressive pretrained transformer developed with astronomical use-cases in mind. The AstroPT models presented here have been pretrained on 8.6 million $512 \times 512$ pixel $grz$-band galaxy postage stamp observations from the DESI Legacy Survey DR8. We train a selection of foundation models of increasing size from 1 million to 2.1 billion parameters, and find that AstroPT follows a similar saturating log-log scaling law to textual models. We also find that the models' performance on downstream tasks, as measured by linear probing, improves with model size up to the model parameter saturation point. We believe that collaborative community development paves the best route towards realising an open source `Large Observation Model' -- a model trained on data taken from the observational sciences at the scale seen in natural language processing. To this end, we release the source code, weights, and dataset for AstroPT under the MIT license, and invite potential collaborators to join us in collectively building and researching these models.

* 12 pages, 4 figures, 1 table. Code available at https://github.com/Smith42/astroPT

AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets

Jan 05, 2024

Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charlie O'Neill, Maja Jablonska, Zechang Sun, Michael J. Smith(+4 more)

Abstract: We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training. By employing a compact 7B-parameter LLaMA-2 model and focusing exclusively on a curated set of astronomy corpora -- comprising abstracts, introductions, and conclusions -- we achieve notable improvements in specialized topic comprehension. While general LLMs like GPT-4 excel in broader question-answering scenarios due to superior reasoning capabilities, our findings suggest that continual pre-training with limited resources can still enhance model performance on specialized topics. Additionally, we present an extension of AstroLLaMA: the fine-tuning of the 7B LLaMA model on a domain-specific conversational dataset, culminating in the release of the chat-enabled AstroLLaMA for community use. Comprehensive quantitative benchmarking is currently in progress and will be detailed in an upcoming full paper. The model, AstroLLaMA-Chat, is now available at https://huggingface.co/universeTBD, providing the first open-source conversational AI tool tailored for the astronomy community.

* 4 pages, 1 figure, model is available at https://huggingface.co/universeTBD, published in RNAAS

EarthPT: a foundation model for Earth Observation

Sep 13, 2023

Michael J. Smith, Luke Fleming, James E. Geach

Abstract: We introduce EarthPT -- an Earth Observation (EO) pretrained transformer. EarthPT is a 700 million parameter decoding transformer foundation model trained in an autoregressive self-supervised manner and developed specifically with EO use-cases in mind. We demonstrate that EarthPT is an effective forecaster that can accurately predict future pixel-level surface reflectances across the 400-2300 nm range well into the future. For example, forecasts of the evolution of the Normalised Difference Vegetation Index (NDVI) have a typical error of approximately 0.05 (over a natural range of -1 to 1) at the pixel level over a five month test set horizon, out-performing simple phase-folded models based on historical averaging. We also demonstrate that embeddings learnt by EarthPT hold semantically meaningful information and could be exploited for downstream tasks such as highly granular, dynamic land use classification. Excitingly, we note that the abundance of EO data provides us with -- in theory -- quadrillions of training tokens. Therefore, if we assume that EarthPT follows neural scaling laws akin to those derived for Large Language Models (LLMs), there is currently no data-imposed limit to scaling EarthPT and other similar `Large Observation Models.'

* 7 pages, 4 figures, submitted to NeurIPS CCAI workshop
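
For context on the index named in the abstract, NDVI is conventionally computed from near-infrared and red surface reflectances as (NIR - Red) / (NIR + Red), which is why it ranges from -1 to 1. The sketch below is illustrative only and is not taken from the EarthPT code: the function names and reflectance values are assumptions, and mean absolute error is used as one plausible reading of "typical error", since the abstract does not specify the metric.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index for a single pixel.

    nir and red are non-negative surface reflectances.
    Returns a value in [-1, 1], or 0.0 when both bands are zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def mean_absolute_error(predicted, observed):
    """Average absolute difference between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical reflectances for one vegetated pixel at two time steps.
observed = [ndvi(0.45, 0.05), ndvi(0.40, 0.08)]
forecast = [ndvi(0.43, 0.06), ndvi(0.42, 0.07)]
print(mean_absolute_error(forecast, observed))
```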

Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy

Nov 07, 2022

Michael J. Smith, James E. Geach

Abstract: In recent years, deep learning has infiltrated every field it has touched, reducing the need for specialist knowledge and automating the process of knowledge discovery from data. This review argues that astronomy is no different, and that we are currently in the midst of a deep learning revolution that is transforming the way we do astronomy. We trace the history of astronomical connectionism from the early days of multilayer perceptrons, through the second wave of convolutional and recurrent neural networks, to the current third wave of self-supervised and unsupervised deep learning. We then predict that we will soon enter a fourth wave of astronomical connectionism, in which finetuned versions of an all-encompassing 'foundation' model will replace expertly crafted deep learning models. We argue that such a model can only be brought about through a symbiotic relationship between astronomy and connectionism, whereby astronomy provides high quality multimodal data to train the foundation model, and in turn the foundation model is used to advance astronomical research.

* 60 pages, 269 references, 29 figures. Review submitted to Royal Society Open Science. Comments and feedback welcome

Realistic galaxy image simulation via score-based generative models

Nov 02, 2021

Michael J. Smith, James E. Geach, Ryan A. Jackson, Nikhil Arora, Connor Stone, Stéphane Courteau

Abstract: We show that a Denoising Diffusion Probabilistic Model (DDPM), a class of score-based generative model, can be used to produce realistic yet fake images that mimic observations of galaxies. Our method is tested with Dark Energy Spectroscopic Instrument grz imaging of galaxies from the Photometry and Rotation curve OBservations from Extragalactic Surveys (PROBES) sample and galaxies selected from the Sloan Digital Sky Survey. Subjectively, the generated galaxies are highly realistic when compared with samples from the real dataset. We quantify the similarity by borrowing from the deep generative learning literature, using the `Fr\'echet Inception Distance' to test for subjective and morphological similarity. We also introduce the `Synthetic Galaxy Distance' metric to compare the emergent physical properties (such as total magnitude, colour and half light radius) of a ground truth parent and synthesised child dataset. We argue that the DDPM approach produces sharper and more realistic images than other generative methods such as Adversarial Networks (with the downside of more costly inference), and could be used to produce large samples of synthetic observations tailored to a specific imaging survey. We demonstrate two potential uses of the DDPM: (1) accurate in-painting of occluded data, such as satellite trails, and (2) domain transfer, where new input images can be processed to mimic the properties of the DDPM training set. Here we `DESI-fy' cartoon images as a proof of concept for domain transfer. Finally, we suggest potential applications for score-based approaches that could motivate further research on this topic within the astronomical community.

* 10 pages, 8 figures. Code: https://github.com/smith42 . Follow the Twitter bot @ThisIsNotAnApod for DDPM-generated APODs
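
For reference, the Fréchet Inception Distance mentioned in the abstract is usually defined as the Fréchet distance between two Gaussians fitted to Inception-network features of the real and generated images. The symbols below are notation chosen here, not the paper's: $\mu_r, \Sigma_r$ are the feature mean and covariance of the real set, and $\mu_g, \Sigma_g$ those of the generated set. Whether the paper uses exactly this form is an assumption.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

A lower value indicates that the two feature distributions are closer, and the distance is zero when the two fitted Gaussians coincide.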
