
Jiawei Wang

Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding

Jan 14, 2025

Cascaded Self-Evaluation Augmented Training for Efficient Multimodal Large Language Models

Jan 10, 2025

DeepSeek-V3 Technical Report

Dec 27, 2024

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Dec 13, 2024

CausalMob: Causal Human Mobility Prediction with LLMs-derived Human Intentions toward Public Events

Dec 03, 2024

VQ-SGen: A Vector Quantized Stroke Representation for Sketch Generation

Add code
Nov 25, 2024
Viaarxiv icon

Tarsier: Recipes for Training and Evaluating Large Video Description Models

Jun 30, 2024

DLAFormer: An End-to-End Transformer For Document Layout Analysis

May 20, 2024

AMCEN: An Attention Masking-based Contrastive Event Network for Two-stage Temporal Knowledge Graph Reasoning

May 16, 2024

EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech

Mar 17, 2024