Yifan Yang

Real-Time Neural-Enhancement for Online Cloud Gaming

Jan 12, 2025
Via arXiv

Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers

Dec 23, 2024
Via arXiv

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training

Dec 20, 2024
Via arXiv

SP$^2$T: Sparse Proxy Attention for Dual-stream Point Transformer

Dec 16, 2024
Via arXiv

MageBench: Bridging Large Multimodal Models to Agents

Dec 05, 2024
Via arXiv

BIGCity: A Universal Spatiotemporal Model for Unified Trajectory and Traffic State Data Analysis

Dec 01, 2024
Via arXiv

A Comparative Study of LLM-based ASR and Whisper in Low Resource and Code Switching Scenario

Dec 01, 2024
Via arXiv

k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning

Nov 26, 2024
Via arXiv

LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation

Nov 26, 2024
Via arXiv

Navigating Spatial Inequities in Freight Truck Crash Severity via Counterfactual Inference in Los Angeles

Nov 26, 2024
Via arXiv