Karan Sikka

Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification

Jul 02, 2024
Via arXiv

A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval

Nov 30, 2023
Via arXiv

DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback

Nov 16, 2023
Via arXiv

Demonstrations Are All You Need: Advancing Offensive Content Paraphrasing using In-Context Learning

Oct 16, 2023
Via arXiv

SayNav: Grounding Large Language Models for Dynamic Planning to Navigation in New Environments

Sep 22, 2023
Via arXiv

Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models

Sep 08, 2023
Via arXiv

TIJO: Trigger Inversion with Joint Optimization for Defending Multimodal Backdoored Models

Aug 07, 2023
Via arXiv

Multilingual Content Moderation: A Case Study on Reddit

Feb 19, 2023
Via arXiv

Dual-Key Multimodal Backdoors for Visual Question Answering

Dec 14, 2021
Via arXiv

Challenges in Procedural Multimodal Machine Comprehension: A Novel Way To Benchmark

Oct 22, 2021
Via arXiv