Hanchong Zhang

StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification

Nov 11, 2024
Via arXiv

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Jul 15, 2024
Via arXiv

AGILE: A Novel Framework of LLM Agents

May 23, 2024
Via arXiv

CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions

May 04, 2024
Via arXiv

A BiRGAT Model for Multi-intent Spoken Language Understanding with Hierarchical Semantic Frames

Feb 28, 2024
Via arXiv

ASTormer: An AST Structure-aware Transformer Decoder for Text-to-SQL

Oct 28, 2023
Via arXiv

ACT-SQL: In-Context Learning for Text-to-SQL with Automatically-Generated Chain-of-Thought

Oct 26, 2023
Via arXiv

CSS: A Large-scale Cross-schema Chinese Text-to-SQL Medical Dataset

May 25, 2023
Via arXiv

On the Structural Generalization in Text-to-SQL

Jan 21, 2023
Via arXiv