Thomas Delteil

Reducing Distraction in Long-Context Language Models by Focused Learning

Nov 08, 2024

Zijun Wu, Bingyuan Liu, Ran Yan, Lei Chen, Thomas Delteil

Abstract: Recent advancements in Large Language Models (LLMs) have significantly enhanced their capacity to process long contexts. However, effectively utilizing this long context remains a challenge due to the issue of distraction, where irrelevant information dominates lengthy contexts, causing LLMs to lose focus on the most relevant segments. To address this, we propose a novel training method that enhances LLMs' ability to discern relevant information through a unique combination of retrieval-based data augmentation and contrastive learning. Specifically, during fine-tuning with long contexts, we employ a retriever to extract the most relevant segments, serving as augmented inputs. We then introduce an auxiliary contrastive learning objective to explicitly ensure that outputs from the original context and the retrieved sub-context are closely aligned. Extensive experiments on long single-document and multi-document QA benchmarks demonstrate the effectiveness of our proposed method.
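The abstract does not spell out the form of the auxiliary contrastive objective. As a rough illustration only, the sketch below uses a generic InfoNCE-style loss (a swapped-in, standard technique, not necessarily the authors' formulation) to pull together a representation computed from the full context and one computed from the retrieved sub-context of the same example. The function names, the `temperature` and `weight` parameters, and the use of plain vectors as "outputs" are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(full_reprs, sub_reprs, temperature=0.1):
    """Average InfoNCE loss over a batch (hypothetical helper).

    full_reprs[i] is the representation from the original long context and
    sub_reprs[i] the one from the retrieved sub-context of the same example.
    Pair (i, i) is the positive; every other sub_reprs[j] acts as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(full_reprs):
        logits = [cosine(anchor, s) / temperature for s in sub_reprs]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - logits[i]
    return total / len(full_reprs)


def combined_loss(task_loss, full_reprs, sub_reprs, weight=0.5):
    """Fine-tuning loss plus the weighted auxiliary term (weight is assumed)."""
    return task_loss + weight * info_nce(full_reprs, sub_reprs)
```

With this form, the auxiliary term is small when each full-context representation is closest to its own sub-context representation and grows when it is closer to another example's.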

MATrIX -- Modality-Aware Transformer for Information eXtraction

May 17, 2022

Thomas Delteil, Edouard Belval, Lei Chen, Luis Goncalves, Vijay Mahadevan

Abstract: We present MATrIX, a Modality-Aware Transformer for Information eXtraction in the Visual Document Understanding (VDU) domain. VDU covers information extraction from visually rich documents such as forms, invoices, receipts, tables, graphs, presentations, or advertisements. In these, text semantics and visual information supplement each other to provide a global understanding of the document. MATrIX is pre-trained in an unsupervised way with specifically designed tasks that require the use of multi-modal information (spatial, visual, or textual). We consider the spatial and text modalities all at once in a single token set. To make the attention more flexible, we use a learned modality-aware relative bias in the attention mechanism to modulate the attention between the tokens of different modalities. We evaluate MATrIX on three different datasets, each with strong baselines.
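To make the idea of a modality-aware attention bias concrete, here is a minimal sketch under simplifying assumptions: a single attention head, integer modality ids per token, and a bias table indexed only by the (query modality, key modality) pair. The relative-position component implied by "relative bias" is omitted, and the names `attention_weights`, `modalities`, and `bias` are hypothetical rather than taken from the paper. In training the bias table would be learned; here it is passed in as fixed numbers.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys, modalities, bias):
    """Scaled dot-product attention weights with an additive modality bias.

    queries, keys: one vector per token, all of the same dimension.
    modalities:    modality id of each token (e.g. 0 and 1 for two modalities).
    bias:          bias[a][b] is added to the logit whenever a token of
                   modality a attends to a token of modality b.
    """
    dim = len(queries[0])
    weights = []
    for i, q in enumerate(queries):
        logits = []
        for j, k in enumerate(keys):
            score = sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            logits.append(score + bias[modalities[i]][modalities[j]])
        weights.append(softmax(logits))
    return weights
```

A negative off-diagonal entry in `bias` damps attention across modalities, while a zero table recovers ordinary scaled dot-product attention.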

A Computationally Efficient Pipeline Approach to Full Page Offline Handwritten Text Recognition

Oct 01, 2019

Jonathan Chung, Thomas Delteil

Abstract: Offline handwriting recognition with deep neural networks is usually limited to words or lines due to large computational costs. In this paper, a less computationally expensive full page offline handwritten text recognition framework is introduced. This framework includes a pipeline that locates handwritten text with an object detection neural network and recognises the text within the detected regions using features extracted with a multi-scale convolutional neural network (CNN) fed into a bidirectional long short-term memory (LSTM) network. This framework achieves error rates comparable to state-of-the-art frameworks while using less memory and time. The results in this paper demonstrate the potential of this framework, and future work can investigate production-ready and deployable handwritten text recognisers.
