Roman Koshkin

LLMs Are Zero-Shot Context-Aware Simultaneous Translators

Jun 19, 2024

Roman Koshkin, Katsuhito Sudoh, Satoshi Nakamura

Abstract: The advent of transformers has fueled progress in machine translation. More recently, large language models (LLMs) have come into the spotlight thanks to their generality and strong performance in a wide range of language tasks, including translation. Here we show that open-source LLMs perform on par with or better than some state-of-the-art baselines in simultaneous machine translation (SiMT) tasks, zero-shot. We also demonstrate that injection of minimal background information, which is easy with an LLM, brings further performance gains, especially on challenging technical subject matter. This highlights LLMs' potential for building the next generation of massively multilingual, context-aware and terminologically accurate SiMT systems that require no resource-intensive training or fine-tuning.

TransLLaMa: LLM-based Simultaneous Translation System

Feb 07, 2024

Roman Koshkin, Katsuhito Sudoh, Satoshi Nakamura

Abstract: Decoder-only large language models (LLMs) have recently demonstrated impressive capabilities in text generation and reasoning. Nonetheless, they have limited applications in simultaneous machine translation (SiMT), currently dominated by encoder-decoder transformers. This study demonstrates that, after fine-tuning on a small dataset comprising causally aligned source and target sentence pairs, a pre-trained open-source LLM can control input segmentation directly by generating a special "wait" token. This obviates the need for a separate policy and enables the LLM to perform English-German and English-Russian SiMT tasks with BLEU scores that are comparable to those of specific state-of-the-art baselines. We also evaluated closed-source models such as GPT-4, which displayed encouraging results in performing the SiMT task without prior training (zero-shot), indicating a promising avenue for enhancing future SiMT systems.
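
The "wait" token idea in this abstract can be illustrated with a toy read/write loop. This is only a minimal sketch, not the authors' implementation: `toy_model`, the `<WAIT>` string, the four-word lexicon, and the function names are all hypothetical stand-ins. A real system would call a fine-tuned LLM where `toy_model` is called here.

```python
WAIT = "<WAIT>"  # hypothetical spelling of the special "wait" token


def toy_model(source_prefix, target_prefix):
    """Hypothetical stand-in for a fine-tuned LLM.

    Emits the next target word if the source word it depends on has
    already been read; otherwise emits WAIT to request more input.
    The word-for-word lexicon is an illustrative assumption.
    """
    lexicon = {"i": "ich", "see": "sehe", "a": "einen", "dog": "Hund"}
    next_index = len(target_prefix)
    if next_index < len(source_prefix):
        return lexicon[source_prefix[next_index]]
    return WAIT


def simultaneous_translate(source_words, model):
    """Interleave READ and WRITE actions, driven only by the model's output."""
    remaining = list(source_words)  # words not yet revealed to the model
    source_prefix = []              # words the model has seen so far
    target = []                     # translation emitted so far
    actions = []                    # trace of READ/WRITE decisions
    while True:
        token = model(source_prefix, target)
        if token == WAIT:
            if not remaining:
                break  # source exhausted and the model has nothing to add
            source_prefix.append(remaining.pop(0))
            actions.append("READ")
        else:
            target.append(token)
            actions.append("WRITE")
    return target, actions


if __name__ == "__main__":
    words, trace = simultaneous_translate(["i", "see", "a", "dog"], toy_model)
    print(" ".join(words))
    print(trace)
```

The point of the sketch is that no separate policy component decides when to read more source text: the same model output that carries target words also carries the segmentation decision.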

convSeq: Fast and Scalable Method for Detecting Patterns in Spike Data

Feb 02, 2024

Roman Koshkin, Tomoki Fukai

Abstract: Spontaneous neural activity, crucial in memory, learning, and spatial navigation, often manifests itself as repetitive spatiotemporal patterns. Despite their importance, analyzing these patterns in large neural recordings remains challenging due to a lack of efficient and scalable detection methods. Addressing this gap, we introduce convSeq, an unsupervised method that employs backpropagation for optimizing spatiotemporal filters that effectively identify these neural patterns. Our method's performance is validated on various synthetic data and real neural recordings, revealing spike sequences with unprecedented scalability and efficiency. Significantly surpassing existing methods in speed, convSeq sets a new standard for analyzing spontaneous neural activity, potentially advancing our understanding of information processing in neural circuits.
