Yun-hsuan Sung

Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems

Apr 04, 2024

Frank Palma Gomez, Ramon Sanabria, Yun-hsuan Sung, Daniel Cer, Siddharth Dalmia, Gustavo Hernandez Abrego

Abstract: Large language models (LLMs) are trained on text-only data that go far beyond the languages with paired speech and text data. At the same time, Dual Encoder (DE) based retrieval systems project queries and documents into the same embedding space and have demonstrated their success in retrieval and bi-text mining. To match speech and text in many languages, we propose using LLMs to initialize multi-modal DE retrieval systems. Unlike traditional methods, our system doesn't require speech data during LLM pre-training and can exploit LLM's multilingual text understanding capabilities to match speech and text in languages unseen during retrieval training. Our multi-modal LLM-based retrieval system is capable of matching speech and text in 102 languages despite only training on 21 languages. Our system outperforms previous systems trained explicitly on all 102 languages. We achieve a 10% absolute improvement in Recall@1 averaged across these languages. Additionally, our model demonstrates cross-lingual speech and text matching, which is further enhanced by readily available machine translation data.
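For readers unfamiliar with the metric, the following is a minimal sketch of how Recall@1 could be computed for a dual-encoder retrieval setup like the one the abstract describes. It assumes each query (e.g. a speech embedding) is paired with the document (e.g. a text embedding) at the same index and that similarity is a plain dot product; the function names and the toy vectors are illustrative and do not come from the paper.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def recall_at_1(query_embs, doc_embs):
    """Fraction of queries whose top-scoring document is the paired one.

    Assumption: query i is paired with document i.
    """
    hits = 0
    for i, q in enumerate(query_embs):
        scores = [dot(q, d) for d in doc_embs]
        best = max(range(len(scores)), key=scores.__getitem__)
        hits += int(best == i)
    return hits / len(query_embs)


# Toy 2-d embeddings, made up for illustration only.
speech = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.2]]
text = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.6]]

# The third query scores highest against the first document, so it is a miss.
print(recall_at_1(speech, text))
```

In this toy case two of the three queries retrieve their paired document first, so the value is 2/3.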

Knowledge Prompts: Injecting World Knowledge into Language Models through Soft Prompts

Oct 10, 2022

Cicero Nogueira dos Santos, Zhe Dong, Daniel Cer, John Nham, Siamak Shakeri, Jianmo Ni, Yun-hsuan Sung

Abstract: Soft prompts have been recently proposed as a tool for adapting large frozen language models (LMs) to new tasks. In this work, we repurpose soft prompts to the task of injecting world knowledge into LMs. We introduce a method to train soft prompts via self-supervised learning on data from knowledge bases. The resulting soft knowledge prompts (KPs) are task independent and work as an external memory of the LMs. We perform qualitative and quantitative experiments and demonstrate that: (1) KPs can effectively model the structure of the training data; (2) KPs can be used to improve the performance of LMs in different knowledge intensive tasks.

Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax

Feb 22, 2019

Yinfei Yang, Gustavo Hernandez Abrego, Steve Yuan, Mandy Guo, Qinlan Shen, Daniel Cer, Yun-hsuan Sung, Brian Strope, Ray Kurzweil

Abstract: In this paper, we present an approach to learn multilingual sentence embeddings using a bi-directional dual-encoder with additive margin softmax. The embeddings are able to achieve state-of-the-art results on the United Nations (UN) parallel corpus retrieval task. In all the languages tested, the system achieves P@1 of 86% or higher. We use pairs retrieved by our approach to train NMT models that achieve similar performance to models trained on gold pairs. We explore simple document-level embeddings constructed by averaging our sentence embeddings. On the UN document-level retrieval task, document embeddings achieve around 97% on P@1 for all experimented language pairs. Lastly, we evaluate the proposed model on the BUCC mining task. The learned embeddings with raw cosine similarity scores achieve competitive results compared to current state-of-the-art models, and with a second-stage scorer we achieve a new state-of-the-art level on this task.
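As a rough illustration of the loss named in the abstract, here is a minimal sketch of a bi-directional additive margin softmax over in-batch negatives. The abstract does not spell out the formulation, so several things here are assumptions: similarity is a dot product of already-normalized embeddings, the other items in the batch serve as negatives, the two directions are simply summed, and the margin value is an arbitrary placeholder. All function names are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def ams_loss(src, tgt, margin):
    """One direction: softmax over all targets for each source.

    The margin is subtracted only from the score of the true pair (i, i),
    which makes the true pair harder to rank first during training.
    """
    n = len(src)
    total = 0.0
    for i in range(n):
        logits = [dot(src[i], tgt[j]) - (margin if j == i else 0.0)
                  for j in range(n)]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]  # negative log-probability of the true pair
    return total / n


def bidirectional_ams_loss(src, tgt, margin=0.3):
    """Sum of source-to-target and target-to-source losses (assumed form)."""
    return ams_loss(src, tgt, margin) + ams_loss(tgt, src, margin)


# Toy normalized sentence embeddings for two translation pairs (made up).
src = [[1.0, 0.0], [0.0, 1.0]]
tgt = [[0.8, 0.6], [0.6, 0.8]]
print(bidirectional_ams_loss(src, tgt, margin=0.0))
print(bidirectional_ams_loss(src, tgt, margin=0.3))
```

Under these assumptions, a larger margin yields a larger loss for the same embeddings, so the encoder is pushed to separate true pairs from in-batch negatives by at least that margin.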

Efficient Natural Language Response Suggestion for Smart Reply

May 01, 2017

Matthew Henderson, Rami Al-Rfou, Brian Strope, Yun-hsuan Sung, Laszlo Lukacs, Ruiqi Guo, Sanjiv Kumar, Balint Miklos, Ray Kurzweil

Abstract: This paper presents a computationally efficient machine-learned method for natural language response suggestion. Feed-forward neural networks using n-gram embedding features encode messages into vectors which are optimized to give message-response pairs a high dot-product value. An optimized search finds response suggestions. The method is evaluated in a large-scale commercial e-mail application, Inbox by Gmail. Compared to a sequence-to-sequence approach, the new system achieves the same quality at a small fraction of the computational requirements and latency.

Conversational Contextual Cues: The Case of Personalization and History for Response Ranking

Jun 01, 2016

Rami Al-Rfou, Marc Pickett, Javier Snaider, Yun-hsuan Sung, Brian Strope, Ray Kurzweil

Abstract: We investigate the task of modeling open-domain, multi-turn, unstructured, multi-participant, conversational dialogue. We specifically study the effect of incorporating different elements of the conversation. Unlike previous efforts, which focused on modeling messages and responses, we extend the modeling to long context and participant's history. Our system does not rely on handwritten rules or engineered features; instead, we train deep neural networks on a large conversational dataset. In particular, we exploit the structure of Reddit comments and posts to extract 2.1 billion messages and 133 million conversations. We evaluate our models on the task of predicting the next response in a conversation, and we find that modeling both context and participants improves prediction accuracy.

* 10 pages, 6 figures
