Chao Huang

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting

Apr 09, 2025
Via arXiv

Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1)

Apr 04, 2025
Via arXiv

AD-GPT: Large Language Models in Alzheimer's Disease

Apr 03, 2025
Via arXiv

Urban Computing in the Era of Large Language Models

Apr 02, 2025
Via arXiv

FreSca: Unveiling the Scaling Space in Diffusion Models

Apr 02, 2025
Via arXiv

Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs

Mar 07, 2025
Via arXiv

AutoTestForge: A Multidimensional Automated Testing Framework for Natural Language Processing Models

Mar 07, 2025
Via arXiv

Self-Adjust Softmax

Feb 25, 2025
Via arXiv

MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents

Feb 09, 2025
Via arXiv

VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos

Feb 03, 2025
Via arXiv