Zhaoyang Liu

What is Wrong with Perplexity for Long-context Language Modeling?

Oct 31, 2024
Via arXiv

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions

Jul 30, 2024
Via arXiv

VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks

Jun 12, 2024
Via arXiv

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling

Jun 06, 2024
Via arXiv

LLMs Meet Multimodal Generation and Editing: A Survey

May 29, 2024
Via arXiv

Paths of A Million People: Extracting Life Trajectories from Wikipedia

May 25, 2024
Via arXiv

Linear Gaussian Bounding Box Representation and Ring-Shaped Rotated Convolution for Oriented Object Detection

Nov 14, 2023
Via arXiv

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Oct 30, 2023
Via arXiv

Data-Juicer: A One-Stop Data Processing System for Large Language Models

Sep 05, 2023
Via arXiv

InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language

May 11, 2023
Via arXiv