Qiong Luo

Improving Out-of-Domain Robustness with Targeted Augmentation in Frequency and Pixel Spaces

May 18, 2025

Ruoqi Wang, Haitao Wang, Shaojie Guo, Qiong Luo

Abstract: Out-of-domain (OOD) robustness under domain adaptation settings, where labeled source data and unlabeled target data come from different distributions, is a key challenge in real-world applications. A common approach to improving OOD robustness is through data augmentations. However, in real-world scenarios, models trained with generic augmentations can only improve marginally when generalized under distribution shifts toward unlabeled target domains. While dataset-specific targeted augmentations can address this issue, they typically require expert knowledge and extensive prior data analysis to identify the nature of the datasets and domain shift. To address these challenges, we propose Frequency-Pixel Connect, a domain-adaptation framework that enhances OOD robustness by introducing a targeted augmentation in both the frequency space and pixel space. Specifically, we mix the amplitude spectrum and pixel content of a source image and a target image to generate augmented samples that introduce domain diversity while preserving the semantic structure of the source image. Unlike previous targeted augmentation methods that are both dataset-specific and limited to the pixel space, Frequency-Pixel Connect is dataset-agnostic, enabling broader and more flexible applicability beyond natural image datasets. We further analyze the effectiveness of Frequency-Pixel Connect by evaluating the performance of our method connecting same-class cross-domain samples while separating different-class examples. We demonstrate that Frequency-Pixel Connect significantly improves cross-domain connectivity and outperforms previous generic methods on four diverse real-world benchmarks across vision, medical, audio, and astronomical domains, and it also outperforms other dataset-specific targeted augmentation methods.
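The mixing step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name and the mixing weights `lam_amp` and `lam_pix` are hypothetical, and the sketch simply assumes that the source phase carries the semantic structure while amplitude and a small pixel blend carry target-domain appearance.

```python
import numpy as np

def mix_frequency_and_pixel(source, target, lam_amp=0.5, lam_pix=0.1):
    """Blend a target image's amplitude spectrum and pixels into a source image.

    lam_amp and lam_pix are hypothetical mixing weights, not values from the paper.
    Images are arrays of shape (H, W) or (H, W, C) with identical shapes.
    """
    source = np.asarray(source, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if source.shape != target.shape:
        raise ValueError("source and target must have the same shape")

    # Frequency space: keep the source phase and interpolate the
    # amplitude spectrum toward that of the target image.
    f_src = np.fft.fft2(source, axes=(0, 1))
    f_tgt = np.fft.fft2(target, axes=(0, 1))
    amplitude = (1.0 - lam_amp) * np.abs(f_src) + lam_amp * np.abs(f_tgt)
    phase = np.angle(f_src)
    freq_mixed = np.real(np.fft.ifft2(amplitude * np.exp(1j * phase), axes=(0, 1)))

    # Pixel space: a mixup-style convex combination with the target image.
    return (1.0 - lam_pix) * freq_mixed + lam_pix * target
```

Setting both weights to zero returns the source image unchanged, which is a convenient sanity check before using the function as a training-time augmentation.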

JRE-L: Journalist, Reader, and Editor LLMs in the Loop for Science Journalism for the General Audience

Jan 28, 2025

Gongyao Jiang, Xinran Shi, Qiong Luo

Abstract: Science journalism reports current scientific discoveries to non-specialists, aiming to enable public comprehension of the state of the art. This task is challenging as the audience often lacks specific knowledge about the presented research. We propose a JRE-L framework that integrates three LLMs mimicking the writing-reading-feedback-revision loop. In JRE-L, one LLM acts as the journalist, another LLM as the general public reader, and the third LLM as an editor. The journalist's writing is iteratively refined by feedback from the reader and suggestions from the editor. Our experiments demonstrate that by leveraging the collaboration of two 7B and one 1.8B open-source LLMs, we can generate articles that are more accessible than those generated by existing methods, including prompting single advanced models such as GPT-4 and other LLM-collaboration strategies. Our code is publicly available at github.com/Zzoay/JRE-L.

* arXiv admin note: substantial text overlap with arXiv:2407.09756
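The writing-reading-feedback-revision loop in the JRE-L abstract can be sketched as plain control flow. The three roles are passed in as ordinary callables, so no particular LLM API is assumed. The function name, the argument order, and the fixed number of rounds are illustrative assumptions, not details taken from the released code.

```python
from typing import Callable

def refine_article(paper: str,
                   journalist: Callable[[str, str], str],
                   reader: Callable[[str], str],
                   editor: Callable[[str, str, str], str],
                   rounds: int = 3) -> str:
    """Iteratively refine a draft using reader feedback and editor suggestions.

    journalist(paper, suggestions) -> draft
    reader(draft)                  -> feedback from a general-audience reader
    editor(paper, draft, feedback) -> revision suggestions
    """
    suggestions = ""  # no guidance exists before the first draft
    draft = journalist(paper, suggestions)
    for _ in range(rounds):
        feedback = reader(draft)                      # what the reader found unclear
        suggestions = editor(paper, draft, feedback)  # how the draft should change
        draft = journalist(paper, suggestions)        # rewrite with the suggestions
    return draft
```

In practice each callable would wrap a prompt to one model. Keeping the roles as parameters makes it easy to swap a smaller model in as the reader, as the abstract describes.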

Effective Fine-Tuning of Vision-Language Models for Accurate Galaxy Morphology Analysis

Nov 29, 2024

Ruoqi Wang, Haitao Wang, Qiong Luo

Abstract: Galaxy morphology analysis involves classifying galaxies by their shapes and structures. For this task, directly training domain-specific models on large, annotated astronomical datasets is effective but costly. In contrast, fine-tuning vision foundation models on a smaller set of astronomical images is more resource-efficient but generally results in lower accuracy. To harness the benefits of both approaches and address their shortcomings, we propose GalaxAlign, a novel method that fine-tunes pre-trained foundation models to achieve high accuracy on astronomical tasks. Specifically, our method extends a contrastive learning architecture to align three types of data in fine-tuning: (1) a set of schematic symbols representing galaxy shapes and structures, (2) textual labels of these symbols, and (3) galaxy images. This way, GalaxAlign not only eliminates the need for expensive pretraining but also enhances the effectiveness of fine-tuning. Extensive experiments on galaxy classification and similarity search demonstrate that our method effectively fine-tunes general pre-trained models for astronomical tasks by incorporating domain-specific multi-modal knowledge.
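One common way to align three kinds of embeddings contrastively is to average a symmetric InfoNCE loss over the three pairings. The sketch below assumes exactly that. The abstract does not spell out GalaxAlign's loss, so the pairwise averaging, the function names, and the temperature default are all assumptions for illustration.

```python
import numpy as np

def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE loss between two batches of paired embeddings.

    Row i of `a` and row i of `b` form a positive pair; all other rows
    in the batch act as negatives. The temperature is a hypothetical default.
    """
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature

    def cross_entropy_diag(z):
        # Numerically stable log-softmax, then pick the matching pair.
        z = z - z.max(axis=1, keepdims=True)
        log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))

def tri_modal_loss(image_emb, symbol_emb, text_emb, temperature=0.07):
    """Average the pairwise losses over image, symbol, and text embeddings."""
    return (info_nce(image_emb, symbol_emb, temperature)
            + info_nce(image_emb, text_emb, temperature)
            + info_nce(symbol_emb, text_emb, temperature)) / 3.0
```

The loss is small when matching rows across the three inputs point in the same direction and grows when the pairing is shuffled.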

LLM-Collaboration on Automatic Science Journalism for the General Audience

Jul 13, 2024

Gongyao Jiang, Xinran Shi, Qiong Luo

Abstract: Science journalism reports current scientific discoveries to non-specialists, aiming to enable public comprehension of the state of the art. However, this task can be challenging as the audience often lacks specific knowledge about the presented research. To address this challenge, we propose a framework that integrates three LLMs mimicking the real-world writing-reading-feedback-revision workflow, with one LLM acting as the journalist, a smaller LLM as the general public reader, and the third LLM as an editor. The journalist's writing is iteratively refined by feedback from the reader and suggestions from the editor. Our experiments demonstrate that by leveraging the collaboration of two 7B and one 1.8B open-source LLMs, we can generate articles that are more accessible than those generated by existing methods, including advanced models such as GPT-4.

* Under review

VisRec: A Semi-Supervised Approach to Radio Interferometric Data Reconstruction

Mar 01, 2024

Ruoqi Wang, Haitao Wang, Qiong Luo, Feng Wang, Hejun Wu

Abstract: Radio telescopes produce visibility data about celestial objects, but these data are sparse and noisy. As a result, images created from raw visibility data are of low quality. Recent studies have used deep learning models to reconstruct visibility data to get cleaner images. However, these methods rely on a substantial amount of labeled training data, which requires significant labeling effort from radio astronomers. Addressing this challenge, we propose VisRec, a model-agnostic semi-supervised learning approach to the reconstruction of visibility data. Specifically, VisRec consists of both a supervised learning module and an unsupervised learning module. In the supervised learning module, we introduce a set of data augmentation functions to produce diverse training examples. In comparison, the unsupervised learning module in VisRec augments unlabeled data and uses reconstructions from non-augmented visibility data as pseudo-labels for training. This hybrid approach allows VisRec to effectively leverage both labeled and unlabeled data. This way, VisRec performs well even when labeled data is scarce. Our evaluation results show that VisRec outperforms all baseline methods in reconstruction quality, robustness against common observation perturbation, and generalizability to different telescope configurations.
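The two modules described in the VisRec abstract can be combined into one objective, sketched below with NumPy and plain callables. The mean-squared-error criterion, the `weight` parameter, and the assumption that `augment` preserves the label (for example, added noise) are illustrative choices and are not taken from the paper.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

def semi_supervised_loss(model, augment, labeled_x, labeled_y, unlabeled_x, weight=1.0):
    """Supervised loss on augmented labeled data plus a pseudo-label loss.

    model(x)   -> reconstruction of the input visibilities
    augment(x) -> perturbed copy of x, assumed here to preserve the label
    weight     -> hypothetical balance between the two terms
    """
    # Supervised module: augmented labeled inputs against their ground truth.
    supervised = mse(model(augment(labeled_x)), labeled_y)

    # Unsupervised module: the reconstruction of the NON-augmented input
    # serves as the pseudo-label for the augmented input. In a real training
    # loop the pseudo-label would be treated as a constant (no gradient).
    pseudo_label = model(unlabeled_x)
    unsupervised = mse(model(augment(unlabeled_x)), pseudo_label)

    return supervised + weight * unsupervised
```

The unsupervised term is zero only when the model's reconstruction is unaffected by the augmentation, which is the consistency the pseudo-labels are meant to encourage.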

Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models

Nov 07, 2023

Longteng Zhang, Xiang Liu, Zeyu Li, Xinglin Pan, Peijie Dong, Ruibo Fan, Rui Guo, Xin Wang, Qiong Luo, Shaohuai Shi (+1 more)

Abstract: Large Language Models (LLMs) have seen great advances in both academia and industry, and their popularity has resulted in numerous open-source frameworks and techniques for accelerating LLM pre-training, fine-tuning, and inference. Training and deploying LLMs is expensive, as it requires considerable computing resources and memory; hence many efficient approaches have been developed for improving system pipelines as well as operators. However, the runtime performance can vary significantly across hardware and software stacks, which makes it difficult to choose the best configuration. In this work, we aim to benchmark the performance from both macro and micro perspectives. First, we benchmark the end-to-end performance of pre-training, fine-tuning, and serving LLMs in different sizes, i.e., 7, 13, and 70 billion parameters (7B, 13B, and 70B), on three 8-GPU platforms with and without individual optimization techniques, including ZeRO, quantization, recomputation, and FlashAttention. Then, we dive deeper to provide a detailed runtime analysis of the sub-modules, including computing and communication operators in LLMs. For end users, our benchmark and findings help better understand different optimization techniques, training and inference frameworks, together with hardware platforms in choosing configurations for deploying LLMs. For researchers, our in-depth module-wise analyses discover potential opportunities for future work to further optimize the runtime performance of LLMs.
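A micro-level measurement of the kind the abstract mentions, timing one operator or sub-module in isolation, usually follows a warm-up-then-repeat pattern. The helper below is a generic sketch of that pattern using only the standard library. It is not the paper's benchmark harness, and the warm-up and repeat counts are arbitrary defaults.

```python
import statistics
import time

def benchmark(fn, warmup=2, repeats=5):
    """Time a zero-argument callable and report simple statistics in seconds."""
    # Warm-up runs absorb one-time costs such as caching or lazy initialisation.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "runs": len(samples),
    }
```

When the timed work runs asynchronously on a GPU, the callable would also need to synchronise the device before returning; otherwise only the launch overhead is measured.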

A Transformer-Conditioned Neural Fields Pipeline with Polar Coordinate Representation for Astronomical Radio Interferometric Data Reconstruction

Aug 28, 2023

Ruoqi Wang, Qiong Luo, Feng Wang

Abstract: In radio astronomy, visibility data, which are measurements of wave signals from radio telescopes, are transformed into images for observation of distant celestial objects. However, these resultant images usually contain both real sources and artifacts, due to signal sparsity and other factors. One way to obtain cleaner images is to reconstruct samples into dense forms before imaging. Unfortunately, existing visibility reconstruction methods may miss some components of the frequency data, so blurred object edges and persistent artifacts remain in the images. Furthermore, the computation overhead is high on irregular visibility samples due to the data skew. To address these problems, we propose PolarRec, a reconstruction method for interferometric visibility data, which consists of a transformer-conditioned neural fields pipeline with a polar coordinate representation. This representation matches the way in which telescopes observe a celestial area as the Earth rotates. We further propose a Radial Frequency Loss function, using radial coordinates in the polar coordinate system to correlate with the frequency information, to help reconstruct complete visibility. We also group visibility sample points by angular coordinates in the polar coordinate system, and use groups as the granularity for subsequent encoding with a Transformer encoder. Consequently, our method can capture the inherent characteristics of visibility data effectively and efficiently. Our experiments demonstrate that PolarRec markedly improves imaging results by faithfully reconstructing all frequency components in the visibility domain while significantly reducing the computation cost.
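The polar representation and angular grouping mentioned in the PolarRec abstract can be illustrated with a short NumPy sketch. It converts Cartesian sample coordinates, assumed here to be called `u` and `v`, into radius and angle, then assigns each sample to an equal-width angular bin. The bin count and the equal-width binning are assumptions; the paper's actual grouping scheme may differ.

```python
import numpy as np

def to_polar(u, v):
    """Convert Cartesian sample coordinates to (radius, angle in radians)."""
    u = np.asarray(u, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    radius = np.hypot(u, v)      # distance from the origin of the Fourier plane
    theta = np.arctan2(v, u)     # angle in [-pi, pi]
    return radius, theta

def group_by_angle(theta, num_groups):
    """Assign each angle to one of `num_groups` equal-width angular bins."""
    fraction = (np.asarray(theta) + np.pi) / (2.0 * np.pi)  # map to [0, 1]
    index = (fraction * num_groups).astype(int)
    # theta == pi maps to fraction 1.0, so clamp it into the last bin.
    return np.minimum(index, num_groups - 1)
```

Each group of samples could then be encoded as one token, which is the granularity the abstract describes for the Transformer encoder.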

A Conditional Denoising Diffusion Probabilistic Model for Radio Interferometric Image Reconstruction

May 16, 2023

Ruoqi Wang, Zhuoyang Chen, Qiong Luo, Feng Wang

Abstract: In radio astronomy, signals from radio telescopes are transformed into images of observed celestial objects, or sources. However, these images, called dirty images, contain real sources as well as artifacts due to signal sparsity and other factors. Therefore, radio interferometric image reconstruction is performed on dirty images, aiming to produce clean images in which artifacts are reduced and real sources are recovered. So far, existing methods have had limited success in recovering faint sources, preserving detailed structures, and eliminating artifacts. In this paper, we present VIC-DDPM, a Visibility and Image Conditioned Denoising Diffusion Probabilistic Model. Our main idea is to use both the original visibility data in the spectral domain and dirty images in the spatial domain to guide the image generation process with DDPM. This way, we can leverage DDPM to generate fine details and eliminate noise, while utilizing visibility data to separate signals from noise and retaining spatial information in dirty images. We have conducted experiments in comparison with both traditional methods and recent deep-learning-based approaches. Our results show that our method significantly improves the resulting images by reducing artifacts, preserving fine details, and recovering dim sources. This advancement further facilitates radio astronomical data analysis tasks on celestial phenomena.

* 8 pages
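How sparse sampling turns a clean sky into a dirty image can be shown with a toy NumPy example. It treats the 2-D Fourier transform of a small sky model as the full set of visibilities, keeps only the samples selected by a mask, and transforms back. This is an idealised sketch that ignores gridding, weighting, and noise; the function name and the mask are hypothetical and unrelated to VIC-DDPM itself.

```python
import numpy as np

def dirty_image(sky, sample_mask):
    """Image recovered from sparsely sampled Fourier components of `sky`.

    sky         : 2-D array, the true brightness distribution
    sample_mask : 2-D array of 0/1 marking which Fourier samples were observed
    """
    visibilities = np.fft.fft2(sky)        # idealised full set of measurements
    sampled = visibilities * sample_mask   # unobserved samples are set to zero
    # The inverse transform of an incomplete spectrum is complex in general;
    # the real part is kept as the image.
    return np.real(np.fft.ifft2(sampled))
```

With a mask of all ones the sky is recovered exactly. With a sparse mask, a single point source spreads into the side-lobe artifacts that reconstruction methods aim to remove.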
