Xiaomeng Yang

VidGen-1M: A Large-Scale Dataset for Text-to-video Generation

Aug 05, 2024
Via arXiv

EVALALIGN: Supervised Fine-Tuning Multimodal LLMs with Human-Aligned Data for Evaluating Text-to-Image Models

Jun 27, 2024
Via arXiv

EvalAlign: Evaluating Text-to-Image Models through Precision Alignment of Multimodal Large Models with Supervised Fine-Tuning to Human Annotations

Jun 24, 2024
Via arXiv

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

Apr 12, 2024
Via arXiv

IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition

Dec 19, 2023
Via arXiv

End-to-end Story Plot Generator

Oct 13, 2023
Via arXiv

Learning Personalized Story Evaluation

Oct 10, 2023
Via arXiv

TorchRL: A data-driven decision-making library for PyTorch

Jun 01, 2023
Via arXiv

Masked and Permuted Implicit Context Learning for Scene Text Recognition

May 25, 2023
Via arXiv

Modeling Scattering Coefficients using Self-Attentive Complex Polynomials with Image-based Representation

Jan 10, 2023
Via arXiv