University of Washington, Seattle
Abstract: Data-driven scientific discovery requires the iterative integration of scientific domain knowledge, statistical expertise, and an understanding of data semantics to make nuanced analytical decisions, e.g., about which variables, transformations, and statistical models to consider. LM-based agents equipped with planning, memory, and code execution capabilities have the potential to support data-driven science. However, evaluating agents on such open-ended tasks is challenging due to multiple valid approaches, partially correct steps, and different ways to express the same decisions. To address these challenges, we present BLADE, a benchmark to automatically evaluate agents' multifaceted approaches to open-ended research questions. BLADE consists of 12 datasets and research questions drawn from existing scientific literature, with ground truth collected from independent analyses by expert data scientists and researchers. To automatically evaluate agent responses, we developed corresponding computational methods to match different representations of analyses to this ground truth. Though language models possess considerable world knowledge, our evaluation shows that they are often limited to basic analyses. However, agents capable of interacting with the underlying data demonstrate improved, but still non-optimal, diversity in their analytical decision making. Our work enables the evaluation of agents for data-driven science and provides researchers deeper insights into agents' analysis approaches.
Abstract: Molecule-and-text cross-modal representation learning has emerged as a promising direction for enhancing the quality of molecular representations, thereby improving performance in various scientific fields, including drug discovery and materials science. Existing studies adopt a global alignment approach to learn knowledge from different modalities. These global alignment approaches fail to capture fine-grained information, such as molecular fragments and their corresponding textual descriptions, which is crucial for downstream tasks. Furthermore, such information cannot be modeled with a similar global alignment strategy because paired data with local part annotations are scarce in existing datasets. In this paper, we propose Atomas, a multi-modal molecular representation learning framework that jointly learns representations from SMILES strings and text. We design a Hierarchical Adaptive Alignment model to concurrently learn the fine-grained fragment correspondence between the two modalities and align the fragment representations at three levels. Additionally, Atomas's end-to-end training framework incorporates both molecule understanding and molecule generation tasks, thereby supporting a wider range of downstream tasks. In the retrieval task, Atomas exhibits robust generalization ability and outperforms the baseline by 30.8% in recall@1 on average. In the generation task, Atomas achieves state-of-the-art results in both molecule captioning and molecule generation. Moreover, the visualization of the Hierarchical Adaptive Alignment model further confirms the chemical significance of our approach. Our code can be found at https://anonymous.4open.science/r/Atomas-03C3.
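To make the notion of cross-modal alignment concrete, the sketch below shows a generic symmetric InfoNCE-style objective over already-paired embeddings in NumPy. It is an illustration only: Atomas's Hierarchical Adaptive Alignment learns the fragment correspondence itself and aligns at three levels, whereas this toy assumes the pairing is given.

```python
import numpy as np

def info_nce(emb_a, emb_b, temperature=0.07):
    """Symmetric InfoNCE loss for n paired embeddings (illustrative only).

    emb_a, emb_b: (n, d) arrays, e.g., fragment embeddings from a molecule
    encoder and embeddings of their matched text spans. Matched pairs share
    the same row index.
    """
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                    # pairwise cosine similarities
    idx = np.arange(len(a))                           # positives lie on the diagonal
    row_log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    col_log_softmax = logits - np.log(np.exp(logits).sum(axis=0, keepdims=True))
    return -(row_log_softmax[idx, idx].mean() + col_log_softmax[idx, idx].mean()) / 2
```

In a hierarchical setting, a loss of this shape could in principle be applied at, e.g., token, fragment, and molecule granularities; the actual losses and correspondence mechanism in Atomas are more involved than this toy.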
Abstract: The core challenge of de novo protein design lies in creating proteins with specific functions or properties, guided by certain conditions. Current models explore generating proteins using structural and evolutionary guidance, which provide only indirect conditions concerning functions and properties. However, textual annotations of proteins, especially annotations of protein domains, which directly describe a protein's high-level functionalities, properties, and their correlation with target amino acid sequences, remain unexplored in the context of protein design tasks. In this paper, we propose Protein-Annotation Alignment Generation (PAAG), a multi-modality protein design framework that integrates textual annotations extracted from protein databases for controllable generation in sequence space. Specifically, within a multi-level alignment module, PAAG can explicitly generate proteins containing specific domains conditioned on the corresponding domain annotations, and can even design novel proteins with flexible combinations of different kinds of annotations. Our experimental results underscore the superiority of the aligned protein representations from PAAG across 7 prediction tasks. Furthermore, PAAG demonstrates a nearly sixfold increase in generation success rate (24.7% vs. 4.7% for the zinc finger domain, and 54.3% vs. 8.7% for the immunoglobulin domain) compared to the existing model.
Abstract: The set of local modes and the ridge lines estimated from a dataset are important summary characteristics of the data-generating distribution. In this work, we consider estimating the local modes and ridges from point cloud data in a product space of two or more Euclidean/directional metric spaces. Specifically, we generalize the well-known (subspace constrained) mean shift algorithm to the product space setting and illuminate some pitfalls in this generalization. We derive the algorithmic convergence of the proposed method, provide practical guidelines on the implementation, and demonstrate its effectiveness on both simulated and real datasets.
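As a rough illustration of what a mean shift step in such a product space can look like, the sketch below updates a Euclidean component and a directional component jointly, using a Gaussian kernel on the former and a von Mises-type kernel on the latter; the kernel choices, bandwidths, and constants are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def product_space_mean_shift_step(data_euc, data_dir, x_euc, x_dir, h_euc=1.0, h_dir=0.3):
    """One illustrative mean shift step in a product of a Euclidean and a directional space.

    data_euc: (n, p) Euclidean coordinates; data_dir: (n, q) unit vectors on a sphere.
    The joint kernel weight is a product of per-component kernels, and the
    directional update is re-normalized back onto the unit sphere.
    """
    w = (np.exp(-0.5 * np.sum((data_euc - x_euc) ** 2, axis=1) / h_euc**2)   # Gaussian part
         * np.exp(data_dir @ x_dir / h_dir**2))                              # von Mises part
    x_euc_new = (w @ data_euc) / w.sum()           # weighted Euclidean mean
    m_dir = w @ data_dir
    x_dir_new = m_dir / np.linalg.norm(m_dir)      # project back onto the sphere
    return x_euc_new, x_dir_new
```

Iterating a step of this shape from a query point (or from each observation, for mode clustering) moves it toward a local mode of the corresponding product-space KDE.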
Abstract: This paper studies linear convergence of the subspace constrained mean shift (SCMS) algorithm, a well-known algorithm for identifying a density ridge defined by a kernel density estimator. By arguing that the SCMS algorithm is a special variant of a subspace constrained gradient ascent (SCGA) algorithm with an adaptive step size, we derive linear convergence of such SCGA algorithms. While the existing research focuses mainly on density ridges in the Euclidean space, we generalize density ridges and the SCMS algorithm to directional data. In particular, we establish a stability theorem for density ridges with directional data and prove the linear convergence of our proposed directional SCMS algorithm.
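To fix ideas, the sketch below shows one SCMS-style step for Euclidean data with a Gaussian kernel: the ordinary mean shift vector is projected onto the eigenvectors of an estimated density Hessian orthogonal to the ridge directions. Constants and the specific Hessian estimate are illustrative; the paper's exact formulation and its directional counterpart differ in the details.

```python
import numpy as np

def scms_step(data, x, h=1.0, ridge_dim=1):
    """One illustrative SCMS step with a Gaussian kernel (Euclidean data).

    data: (n, d) observations; x: (d,) current point; ridge_dim: ridge dimension r.
    The mean shift vector is projected onto the span of the d - r Hessian
    eigenvectors with the smallest eigenvalues.
    """
    diff = data - x                                         # (n, d)
    w = np.exp(-0.5 * np.sum(diff**2, axis=1) / h**2)       # Gaussian kernel weights
    ms = (w @ data) / w.sum() - x                           # ordinary mean shift vector
    # density Hessian estimate at x, up to a positive constant
    H = (diff * w[:, None]).T @ diff / (w.sum() * h**4) - np.eye(len(x)) / h**2
    _, eigvec = np.linalg.eigh(H)                           # eigenvalues in ascending order
    V = eigvec[:, : len(x) - ridge_dim]                     # directions orthogonal to the ridge
    return x + V @ (V.T @ ms)                               # subspace constrained update
```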
Abstract: The directional mean shift (DMS) algorithm is a nonparametric method for seeking local modes of densities defined by kernel density estimators on the unit hypersphere. In this paper, we show that any DMS iteration can be viewed as a generalized Expectation-Maximization (EM) algorithm; in particular, when the von Mises kernel is applied, it becomes an exact EM algorithm. Under the (generalized) EM framework, we provide a new proof of the ascending property of density estimates and demonstrate the global convergence of directional mean shift sequences. Finally, we offer new insight into the linear convergence of the DMS algorithm.
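For concreteness, here is a schematic form of the iteration discussed here (the notation is ours, not necessarily the paper's): with observations X_1, ..., X_n on the unit hypersphere, bandwidth h, and the von Mises kernel, the DMS update re-normalizes a weighted mean of the data, and the weights play the role of E-step responsibilities in the EM view.

```latex
% Schematic DMS update with the von Mises kernel (illustrative notation):
\[
  w_i^{(s)} \;=\; \frac{\exp\!\big(\mathbf{y}_s^{\top}\mathbf{X}_i / h^2\big)}
                       {\sum_{j=1}^{n}\exp\!\big(\mathbf{y}_s^{\top}\mathbf{X}_j / h^2\big)}
  \qquad\text{(E-step analogue)},
\]
\[
  \mathbf{y}_{s+1} \;=\; \frac{\sum_{i=1}^{n} w_i^{(s)}\,\mathbf{X}_i}
                              {\bigl\lVert\sum_{i=1}^{n} w_i^{(s)}\,\mathbf{X}_i\bigr\rVert}
  \qquad\text{(M-step analogue: the weighted mean re-normalized onto the sphere)}.
\]
```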
Abstract: Directional data consist of observations distributed on a (hyper)sphere, and appear in many applied fields, such as astronomy, ecology, and environmental science. This paper studies both statistical and computational problems of kernel smoothing for directional data. We generalize the classical mean shift algorithm to directional data, which allows us to identify local modes of the directional kernel density estimator (KDE). The statistical convergence rates of the directional KDE and its derivatives are derived, and the problem of mode estimation is examined. We also prove the ascending property of our directional mean shift algorithm and investigate a general problem of gradient ascent on the unit hypersphere. To demonstrate the applicability of our proposed algorithm, we evaluate it as a mode clustering method on both simulated and real-world datasets.
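A minimal sketch of one directional mean shift run, assuming NumPy arrays of unit vectors and a von Mises-type kernel; the bandwidth, tolerance, and kernel choice are illustrative assumptions rather than the paper's exact specification.

```python
import numpy as np

def directional_mean_shift(data, query, bandwidth=0.3, max_iter=200, tol=1e-8):
    """Shift a query point toward a local mode of a directional KDE (illustrative).

    data:  (n, d) array of unit vectors (observations on the hypersphere)
    query: (d,) unit vector used as the starting point
    """
    x = query / np.linalg.norm(query)
    for _ in range(max_iter):
        weights = np.exp(data @ x / bandwidth**2)   # von Mises-type kernel weights
        m = weights @ data                          # weighted vector mean of the data
        x_new = m / np.linalg.norm(m)               # project back onto the unit sphere
        if np.linalg.norm(x_new - x) < tol:         # stop when the update stabilizes
            return x_new
        x = x_new
    return x
```

For mode clustering, the iteration would be started from every observation, and points whose sequences converge to the same limit are grouped into one cluster.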
Abstract: Dose reduction in computed tomography (CT) is essential for decreasing radiation risk in clinical applications. Iterative reconstruction is one of the most promising ways to compensate for the increased noise due to the reduction of photon flux. Unlike most existing prior-driven algorithms, which rely on manually designed prior functions or supervised learning schemes, in this work we integrate data consistency as a conditional term into an iterative generative model for low-dose CT. First, a score-based generative network is used for unsupervised distribution learning, and the gradient of the generative density prior is learned from normal-dose images. Then, annealed Langevin dynamics is employed to update the trained priors with a conditional scheme, i.e., the distance between the reconstructed image and the manifold is minimized along with data fidelity during reconstruction. Experimental comparisons demonstrate the noise reduction and detail preservation abilities of the proposed method.
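To sketch the kind of update this describes, the code below runs annealed Langevin dynamics with a learned score as the prior term and a linear data-fidelity gradient as the conditional term. The forward operator A, the score network interface, the step sizes, and the weighting are all assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def conditional_annealed_langevin(x, score_fn, A, y, sigmas,
                                  steps_per_level=10, step_scale=1e-5, data_weight=1.0):
    """Illustrative annealed Langevin reconstruction with a data-consistency term.

    x:        initial image estimate (flattened array)
    score_fn: callable (x, sigma) -> estimated score grad_x log p_sigma(x),
              e.g., a score network trained on normal-dose images (assumed given)
    A, y:     linear forward operator (2-D array) and measured data
    sigmas:   decreasing sequence of noise levels
    """
    rng = np.random.default_rng(0)
    for sigma in sigmas:
        alpha = step_scale * (sigma / sigmas[-1]) ** 2        # level-dependent step size
        for _ in range(steps_per_level):
            prior_grad = score_fn(x, sigma)                   # learned generative prior
            fidelity_grad = A.T @ (A @ x - y)                 # gradient of the data-fidelity term
            noise = np.sqrt(2 * alpha) * rng.standard_normal(x.shape)
            x = x + alpha * (prior_grad - data_weight * fidelity_grad) + noise
    return x
```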