Longyue Wang

Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement

Dec 05, 2024
Via arXiv

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Nov 21, 2024
Via arXiv

Large Language Models as Code Executors: An Exploratory Study

Oct 10, 2024
Via arXiv

Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation

Aug 19, 2024
Via arXiv

LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference

Jun 26, 2024
Via arXiv

VideoVista: A Versatile Benchmark for Video Understanding and Reasoning

Jun 17, 2024
Via arXiv

(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts

May 20, 2024
Via arXiv

Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts

May 18, 2024
Via arXiv

VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context

May 08, 2024
Via arXiv

On the Information Redundancy in Non-Autoregressive Translation

May 04, 2024
Via arXiv