Qiao Liang

Arden

PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models

Jul 08, 2024
Via arXiv

CT3D++: Improving 3D Object Detection with Keypoint-induced Channel-wise Transformer

Jun 12, 2024
Via arXiv

RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception

May 17, 2024
Via arXiv

ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases

Jun 08, 2023
Via arXiv

A Language Agnostic Multilingual Streaming On-Device ASR System

Aug 29, 2022
Via arXiv

Streaming Intended Query Detection using E2E Modeling for Continued Conversation

Aug 29, 2022
Via arXiv

Turn-Taking Prediction for Natural Conversational Speech

Aug 29, 2022
Via arXiv

TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents

Jul 14, 2022
Via arXiv

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

Apr 20, 2022
Via arXiv

Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition

Apr 13, 2022
Via arXiv