Jordi Torres

In-Context Bias Propagation in LLM-Based Tabular Data Generation

Jun 11, 2025

Pol G. Recasens, Alberto Gutierrez, Jordi Torres, Josep Ll. Berral, Anisa Halimi, Kieran Fraser

Abstract:Large Language Models (LLMs) are increasingly used for synthetic tabular data generation through in-context learning (ICL), offering a practical solution for data augmentation in data-scarce scenarios. While prior work has shown the potential of LLMs to improve downstream task performance through augmenting underrepresented groups, these benefits often assume access to a subset of unbiased in-context examples, representative of the real dataset. In real-world settings, however, data is frequently noisy and demographically skewed. In this paper, we systematically study how statistical biases within in-context examples propagate to the distribution of synthetic tabular data, showing that even mild in-context biases lead to global statistical distortions. We further introduce an adversarial scenario where a malicious contributor can inject bias into the synthetic dataset via a subset of in-context examples, ultimately compromising the fairness of downstream classifiers for a targeted and protected subgroup. Our findings demonstrate a new vulnerability associated with LLM-based data generation pipelines that rely on in-context prompts in sensitive domains.

* Paper accepted at ICML 2025 workshop DIG-BUG

Mind the Memory Gap: Unveiling GPU Bottlenecks in Large-Batch LLM Inference

Mar 11, 2025

Pol G. Recasens, Ferran Agullo, Yue Zhu, Chen Wang, Eun Kyung Lee, Olivier Tardieu, Jordi Torres, Josep Ll. Berral

Abstract:Large language models have been widely adopted across different tasks, but their auto-regressive generation nature often leads to inefficient resource utilization during inference. While batching is commonly used to increase throughput, performance gains plateau beyond a certain batch size, especially with smaller models, a phenomenon that existing literature typically explains as a shift to the compute-bound regime. In this paper, through an in-depth GPU-level analysis, we reveal that large-batch inference remains memory-bound, with most GPU compute capabilities underutilized due to DRAM bandwidth saturation as the primary bottleneck. To address this, we propose a Batching Configuration Advisor (BCA) that optimizes memory allocation, reducing GPU memory requirements with minimal impact on throughput. The freed memory and underutilized GPU compute capabilities can then be leveraged by concurrent workloads. Specifically, we use model replication to improve serving throughput and GPU utilization. Our findings challenge conventional assumptions about LLM inference, offering new insights and practical strategies for improving resource utilization, particularly for smaller language models.

* Pol G. Recasens, Ferran Agullo: equal contribution
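
The memory-bound versus compute-bound distinction in the abstract above can be illustrated with a standard roofline-style estimate. This is only a minimal sketch, not the paper's GPU-level analysis or its Batching Configuration Advisor; the function names and the accelerator figures (`PEAK`, `BW`) are hypothetical values chosen for illustration.

```python
def attainable_flops(arithmetic_intensity, peak_flops, mem_bandwidth):
    """Roofline bound: the lower of peak compute and intensity * bandwidth."""
    return min(peak_flops, arithmetic_intensity * mem_bandwidth)


def regime(arithmetic_intensity, peak_flops, mem_bandwidth):
    """Classify a kernel by comparing its intensity with the ridge point."""
    # Ridge point: FLOPs per byte at which the compute and memory limits meet.
    ridge = peak_flops / mem_bandwidth
    return "memory-bound" if arithmetic_intensity < ridge else "compute-bound"


# Hypothetical accelerator (assumed numbers, not taken from the paper):
PEAK = 100e12  # 100 TFLOP/s peak compute
BW = 2e12      # 2 TB/s DRAM bandwidth

print(regime(10.0, PEAK, BW))            # intensity below the ridge point of 50
print(regime(200.0, PEAK, BW))           # intensity above the ridge point
print(attainable_flops(10.0, PEAK, BW))  # capped by memory bandwidth
```

Under this simple model, a workload whose arithmetic intensity stays below the ridge point cannot reach peak compute no matter how large the batch is, which is the kind of situation the abstract describes.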

FRIDA: Free-Rider Detection using Privacy Attacks

Oct 07, 2024

Pol G. Recasens, Ádám Horváth, Alberto Gutierrez-Torre, Jordi Torres, Josep Ll. Berral, Balázs Pejó

Abstract:Federated learning is increasingly popular as it enables multiple parties with limited datasets and resources to train a high-performing machine learning model collaboratively. However, similarly to other collaborative systems, federated learning is vulnerable to free-riders -- participants who do not contribute to the training but still benefit from the shared model. Free-riders not only compromise the integrity of the learning process but also slow down the convergence of the global model, resulting in increased costs for the honest participants. To address this challenge, we propose FRIDA: free-rider detection using privacy attacks, a framework that leverages inference attacks to detect free-riders. Unlike traditional methods that only capture the implicit effects of free-riding, FRIDA directly infers details of the underlying training datasets, revealing characteristics that indicate free-rider behaviour. Through extensive experiments, we demonstrate that membership and property inference attacks are effective for this purpose. Our evaluation shows that FRIDA outperforms state-of-the-art methods, especially in non-IID settings.

Towards Pareto Optimal Throughput in Small Language Model Serving

Apr 04, 2024

Pol G. Recasens, Yue Zhu, Chen Wang, Eun Kyung Lee, Olivier Tardieu, Alaa Youssef, Jordi Torres, Josep Ll. Berral

Abstract:Large language models (LLMs) have revolutionized the state-of-the-art of many different natural language processing tasks. Although serving LLMs is computationally and memory demanding, the rise of Small Language Models (SLMs) offers new opportunities for resource-constrained users, who now are able to serve small models with cutting-edge performance. In this paper, we present a set of experiments designed to benchmark SLM inference at performance and energy levels. Our analysis provides a new perspective in serving, highlighting that the small memory footprint of SLMs allows for reaching the Pareto-optimal throughput within the resource capacity of a single accelerator. In this regard, we present an initial set of findings demonstrating how model replication can effectively improve resource utilization for serving SLMs.

* It is going to be published at EuroMLSys'24

Sign Language Translation from Instructional Videos

Apr 14, 2023

Laia Tarrés, Gerard I. Gállego, Amanda Duarte, Jordi Torres, Xavier Giró-i-Nieto

Abstract:The advances in automatic sign language translation (SLT) to spoken languages have been mostly benchmarked with datasets of limited size and restricted domains. Our work advances the state of the art by providing the first baseline results on How2Sign, a large and broad dataset. We train a Transformer over I3D video features, using the reduced BLEU as a reference metric for validation, instead of the widely used BLEU score. We report a result of 8.03 on the BLEU score, and publish the first open-source implementation of its kind to promote further advances.

* Paper accepted at WiCV @CVPR23

Tackling Low-Resourced Sign Language Translation: UPC at WMT-SLT 22

Dec 02, 2022

Laia Tarrés, Gerard I. Gállego, Xavier Giró-i-Nieto, Jordi Torres

Abstract:This paper describes the system developed at the Universitat Politècnica de Catalunya for the Workshop on Machine Translation 2022 Sign Language Translation Task, in particular, for the sign-to-text direction. We use a Transformer model implemented with the Fairseq modeling toolkit. We have experimented with the vocabulary size, data augmentation techniques and pretraining the model with the PHOENIX-14T dataset. Our system obtains a BLEU score of 0.50 on the test set, improving the organizers' baseline by 0.38 BLEU. We note the poor results for both the baseline and our system and, thus, the unreliability of our findings.

Topic Detection in Continuous Sign Language Videos

Sep 01, 2022

Alvaro Budria, Laia Tarres, Gerard I. Gallego, Francesc Moreno-Noguer, Jordi Torres, Xavier Giro-i-Nieto

Abstract:Significant progress has been made recently on challenging tasks in automatic sign language understanding, such as sign language recognition, translation and production. However, these works have focused on datasets with relatively few samples, short recordings and limited vocabulary and signing space. In this work, we introduce the novel task of sign language topic detection. We base our experiments on How2Sign, a large-scale video dataset spanning multiple semantic domains. We provide strong baselines for the task of topic detection and present a comparison between different visual features commonly used in the domain of sign language.

* Presented as an extended abstract in the "AVA: Accessibility, Vision, and Autonomy Meet" CVPR 2022 Workshop

Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation

Oct 29, 2021

Josep Lluis Berral, Oriol Aranda, Juan Luis Dominguez, Jordi Torres

Abstract:Most research on novel techniques for 3D Medical Image Segmentation (MIS) is currently done using Deep Learning with GPU accelerators. The principal challenge of such techniques is that a single input can easily exhaust computing resources and require prohibitive amounts of time to be processed. Distributing deep learning and scaling it over computing devices is a real need for progress in this research field. Conventional distribution of neural networks consists of data parallelism, where data is scattered over resources (e.g., GPUs) to parallelize the training of the model. However, experiment parallelism is also an option, where different training processes are parallelized across resources. While the first option is much more common in 3D image segmentation, the second provides a pipeline design with less dependence among parallelized processes, allowing overhead reduction and more potential scalability. In this work we present a design for distributed deep learning training pipelines, focusing on multi-node and multi-GPU environments, where the two different distribution approaches are deployed and benchmarked. We take as proof of concept the 3D U-Net architecture, using the MSD Brain Tumor Segmentation dataset, a state-of-the-art problem in medical image segmentation with high computing and space requirements. Using the BSC MareNostrum supercomputer as the benchmarking environment, we use TensorFlow and Ray as neural network training and experiment distribution platforms. We evaluate the experiment speed-up, showing the potential for scaling out on GPUs and nodes. We also compare the different parallelism techniques, showing how experiment distribution better leverages such resources through scaling. Finally, we provide the implementation of the design open to the community, along with the non-trivial steps and methodology for adapting and deploying a MIS case such as the one presented here.

* 7 pages, 4 figures, scientific report, official code: https://github.com/HiEST/DistMIS
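
The experiment parallelism described in the abstract above, where independent training runs are spread across resources, can be sketched as follows. This is a minimal illustration that swaps in Python's standard-library thread pool for Ray and a made-up scoring function for real training; `run_trial`, `experiment_parallel_search`, and the hyperparameter grid are hypothetical names and values, not taken from the paper or its repository.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_trial(config):
    """Hypothetical stand-in for one full training run; returns (config, score)."""
    lr, batch_size = config
    # Toy objective, made up for illustration: best at lr=0.01, batch_size=4.
    score = -abs(lr - 0.01) - 0.001 * abs(batch_size - 4)
    return config, score


def experiment_parallel_search(configs, workers=4):
    """Run independent trials concurrently and return the best (config, score)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each trial is independent, so no synchronization is needed between them.
        results = list(pool.map(run_trial, configs))
    return max(results, key=lambda r: r[1])


# Assumed grid of learning rates and batch sizes.
grid = list(product([0.1, 0.01, 0.001], [2, 4, 8]))
best_config, best_score = experiment_parallel_search(grid)
print(best_config)
```

Because the trials never exchange gradients or parameters, the only coordination point is collecting the results, which is the reduced dependence among parallelized processes that the abstract attributes to this approach.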

RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation

Oct 01, 2020

Miriam Bellver, Carles Ventura, Carina Silberer, Ioannis Kazakos, Jordi Torres, Xavier Giro-i-Nieto

Abstract:The task of video object segmentation with referring expressions (language-guided VOS) is to, given a linguistic phrase and a video, generate binary masks for the object to which the phrase refers. Our work argues that existing benchmarks used for this task are mainly composed of trivial cases, in which referents can be identified with simple phrases. Our analysis relies on a new categorization of the phrases in the DAVIS-2017 and Actor-Action datasets into trivial and non-trivial REs, with the non-trivial REs annotated with seven RE semantic categories. We leverage this data to analyze the results of RefVOS, a novel neural network that obtains competitive results for the task of language-guided image segmentation and state of the art results for language-guided VOS. Our study indicates that the major challenges for the task are related to understanding motion and static actions.

Mask-guided sample selection for Semi-Supervised Instance Segmentation

Aug 25, 2020

Miriam Bellver, Amaia Salvador, Jordi Torres, Xavier Giro-i-Nieto

Abstract:Image segmentation methods are usually trained with pixel-level annotations, which require significant human effort to collect. The most common solution to address this constraint is to implement weakly-supervised pipelines trained with lower forms of supervision, such as bounding boxes or scribbles. Another option is to use semi-supervised methods, which leverage a large amount of unlabeled data and a limited number of strongly-labeled samples. In this second setup, samples to be strongly-annotated can be selected randomly or with an active learning mechanism that chooses the ones that will maximize the model performance. In this work, we propose a sample selection approach to decide which samples to annotate for semi-supervised instance segmentation. Our method consists of first predicting pseudo-masks for the unlabeled pool of samples, together with a score predicting the quality of the mask. This score is an estimate of the Intersection Over Union (IoU) of the segment with the ground truth mask. We study which samples are better to annotate given the quality score, and show how our approach outperforms a random selection, leading to improved performance for semi-supervised instance segmentation with low annotation budgets.

* Preprint submitted to Multimedia Tools and Applications
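
The quality score mentioned in the abstract above estimates the Intersection Over Union (IoU) between a predicted mask and the ground truth. The sketch below shows only how IoU itself is computed for two binary masks, not how the paper's model predicts it; the function name `mask_iou`, the list-of-lists mask representation, and the convention for two empty masks are assumptions made for illustration.

```python
def mask_iou(pred, truth):
    """IoU between two binary masks given as equally sized 2D lists of 0/1."""
    if len(pred) != len(truth) or any(
        len(row_p) != len(row_t) for row_p, row_t in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Small made-up example: 2 overlapping pixels out of 4 covered by either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(mask_iou(pred, truth))  # 0.5
```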
