Zhufeng Pan

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024

Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-Baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser (+659 more)

Abstract: In this report, we present the latest model of the Gemini family, Gemini 1.5 Pro, a highly compute-efficient multimodal mixture-of-experts model capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. Gemini 1.5 Pro achieves near-perfect recall on long-context retrieval tasks across modalities, improves the state-of-the-art in long-document QA, long-video QA and long-context ASR, and matches or surpasses Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5 Pro's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 2.1 (200k) and GPT-4 Turbo (128k). Finally, we highlight surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.

Who Blames or Endorses Whom? Entity-to-Entity Directed Sentiment Extraction in News Text

Jun 22, 2021

Kunwoo Park, Zhufeng Pan, Jungseock Joo

Abstract: Understanding who blames or supports whom in news text is a critical research question in computational social science. Traditional methods and datasets for sentiment analysis are, however, not suitable for the domain of political text, as they do not consider the direction of sentiments expressed between entities. In this paper, we propose a novel NLP task of identifying directed sentiment relationships between political entities in a given news document, which we call directed sentiment extraction. From a million-scale news corpus, we construct a dataset of news sentences in which sentiment relations between political entities are manually annotated. We present a simple but effective approach for utilizing a pretrained transformer, which infers the target class by predicting multiple question-answering tasks and combining the outcomes. We demonstrate the utility of our proposed method for social science research questions by analyzing positive and negative opinions between political entities in two major events: the 2016 U.S. presidential election and COVID-19. The newly proposed problem, data, and method will facilitate future studies on interdisciplinary NLP methods and applications.

* Published in Findings of ACL 2021 (long paper). The manuscript has been slightly revised since the camera-ready version.

Interpretable Real-Time Win Prediction for Honor of Kings, a Popular Mobile MOBA Esport

Sep 04, 2020

Zelong Yang, Zhufeng Pan, Yan Wang, Deng Cai, Shuming Shi, Shao-Lun Huang, Xiaojiang Liu

Abstract: With the rapid prevalence and explosive development of MOBA esports (Multiplayer Online Battle Arena electronic sports), many research efforts have been devoted to automatically predicting game results (win prediction). While this task has great potential in various applications such as esports live streaming and game commentator AI systems, previous studies suffer from two major limitations: 1) insufficient real-time input features and high-quality training data; 2) non-interpretable inference processes of the black-box prediction models. To mitigate these issues, we collect and release a large-scale dataset that contains real-time game records with rich input features of the popular MOBA game Honor of Kings. For interpretable predictions, we propose a Two-Stage Spatial-Temporal Network (TSSTN) that can not only provide accurate real-time win predictions but also attribute the final prediction results to the contributions of different features for interpretability. Experimental results and applications in real-world live streaming scenarios show that the proposed TSSTN model is effective in both prediction accuracy and interpretability.

* 8 pages, 6 figures
