Zhaoyang Liu

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Dec 06, 2024
Via arXiv

ONION: Physics-Informed Deep Learning Model for Line Integral Diagnostics Across Fusion Devices

Nov 27, 2024
Via arXiv

What is Wrong with Perplexity for Long-context Language Modeling?

Oct 31, 2024
Via arXiv

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions

Jul 30, 2024
Via arXiv

VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks

Jun 12, 2024
Via arXiv

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling

Jun 06, 2024
Via arXiv

LLMs Meet Multimodal Generation and Editing: A Survey

May 29, 2024
Via arXiv

Paths of A Million People: Extracting Life Trajectories from Wikipedia

May 25, 2024
Via arXiv

Linear Gaussian Bounding Box Representation and Ring-Shaped Rotated Convolution for Oriented Object Detection

Nov 14, 2023
Via arXiv

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Oct 30, 2023
Via arXiv