Abstract:A wide range of deep Natural Language Processing (NLP) models integrate continuous, low-dimensional representations of words and documents. Surprisingly, very few models study representation learning for authors. Such representations can be used for many NLP tasks, such as author identification and classification, or in recommendation systems. A strong limitation of existing works is that they do not explicitly capture writing style, making them hardly applicable to literary data. We therefore propose a new architecture based on the Variational Information Bottleneck (VIB) that learns embeddings for both authors and documents under a stylistic constraint. Our model fine-tunes a pre-trained document encoder. We encourage the detection of writing style by adding predefined stylistic features, making the representation axes interpretable with respect to writing style indicators. We evaluate our method on three datasets: a literary corpus extracted from the Gutenberg Project, the Blog Authorship Corpus, and IMDb62. We show that it matches or outperforms strong, recent baselines in authorship attribution while capturing the authors' stylistic aspects much more accurately.
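A minimal sketch of the kind of architecture this abstract describes: a pre-trained document encoder fine-tuned through a variational information bottleneck, with one head for authorship attribution and an auxiliary head predicting predefined stylistic features. Module names, dimensions, loss weights, and the feature set are illustrative assumptions, not the authors' exact implementation.

```python
# Hypothetical VIB sketch: a stochastic bottleneck over a document encoder,
# with an auxiliary stylistic-feature head. All hyperparameters are assumed.
import torch
import torch.nn as nn

class VIBAuthorEncoder(nn.Module):
    def __init__(self, doc_encoder, hidden_dim=768, latent_dim=64,
                 n_authors=100, n_style_features=20):
        super().__init__()
        self.doc_encoder = doc_encoder                     # e.g. a pre-trained transformer
        self.mu = nn.Linear(hidden_dim, latent_dim)        # posterior mean
        self.logvar = nn.Linear(hidden_dim, latent_dim)    # posterior log-variance
        self.author_head = nn.Linear(latent_dim, n_authors)        # authorship attribution
        self.style_head = nn.Linear(latent_dim, n_style_features)  # predefined style indicators

    def forward(self, docs):
        h = self.doc_encoder(docs)                         # document representation
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        return self.author_head(z), self.style_head(z), mu, logvar

def vib_loss(author_logits, style_pred, mu, logvar, author_ids, style_targets, beta=1e-3):
    """Author cross-entropy + stylistic-feature regression + KL to a standard Gaussian."""
    ce = nn.functional.cross_entropy(author_logits, author_ids)
    mse = nn.functional.mse_loss(style_pred, style_targets)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return ce + mse + beta * kl
```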
Abstract:Recent studies introduced effective compression techniques for Large Language Models (LLMs) via post-training quantization or low-bit weight representation. Although quantized weights offer storage efficiency and allow for faster inference, existing works have indicated that quantization might compromise performance and exacerbate biases in LLMs. This study investigates the confidence and calibration of quantized models, considering factors such as language model type and scale as contributors to quantization loss. Firstly, we reveal that 4-bit quantization with GPTQ results in a decrease in confidence regarding true labels, with varying impacts observed across language models. Secondly, we observe that the impact on confidence fluctuates across model scales. Finally, we propose an explanation for quantization loss based on confidence levels, indicating that quantization disproportionately affects samples on which the full model already exhibited low confidence.
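The measurement the abstract relies on can be illustrated with a small sketch: for each example, record the probability a model assigns to the gold label, then compare full-precision and 4-bit (e.g. GPTQ-quantized) checkpoints. The model and tokenizer handles, the single-token label assumption, and the dataset format are assumptions for illustration only.

```python
# Sketch of the confidence comparison: probability mass on the gold label,
# computed for a full-precision and a quantized model over the same examples.
import torch

@torch.no_grad()
def true_label_confidence(model, tokenizer, prompt, gold_label):
    """Probability the model assigns to the first token of the gold label."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    logits = model(**inputs).logits[0, -1]               # next-token distribution
    probs = torch.softmax(logits, dim=-1)
    gold_id = tokenizer(gold_label, add_special_tokens=False).input_ids[0]
    return probs[gold_id].item()

def confidence_gaps(full_model, quant_model, tokenizer, dataset):
    """Per-example (full confidence, confidence drop after quantization).
    Large drops tend to concentrate on examples where the full model was
    already uncertain -- the paper's third observation."""
    gaps = []
    for prompt, gold in dataset:
        c_full = true_label_confidence(full_model, tokenizer, prompt, gold)
        c_quant = true_label_confidence(quant_model, tokenizer, prompt, gold)
        gaps.append((c_full, c_full - c_quant))
    return gaps
```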
Abstract:In this paper, we describe the University of Lyon 2 submission to the Strict-Small track of the BabyLM competition. The shared task emphasizes small-scale language modelling from scratch on limited-size data and human language acquisition. The dataset released for the Strict-Small track contains 10M words, which is comparable to children's vocabulary size. We approach the task with an architecture search, minimizing masked language modelling loss on the shared task data. Having found an optimal configuration, we introduce two small language models (LMs) submitted for evaluation: a 4-layer encoder with 8 attention heads and a 6-layer decoder with 12 heads, which we term Bebeshka and Zlata, respectively. Despite being half the scale of the baseline LMs, our proposed models achieve comparable performance. We further explore the applicability of small-scale language models to tasks involving moral judgment, aligning their predictions with human values. These findings highlight the potential of compact LMs for practical language understanding tasks.
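For concreteness, a compact masked language model of the kind described (4 layers, 8 attention heads) can be instantiated roughly as below with Hugging Face Transformers. The hidden size, vocabulary size, and intermediate size are assumptions, not the submitted models' exact settings.

```python
# Illustrative configuration of a small 4-layer / 8-head masked LM encoder.
from transformers import RobertaConfig, RobertaForMaskedLM

config = RobertaConfig(
    vocab_size=20_000,          # small vocabulary for a 10M-word corpus (assumed)
    num_hidden_layers=4,        # encoder depth from the abstract
    num_attention_heads=8,      # attention heads from the abstract
    hidden_size=256,            # must be divisible by the number of heads (assumed)
    intermediate_size=1024,
    max_position_embeddings=514,
)
model = RobertaForMaskedLM(config)
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```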
Abstract:Most real-world networks evolve over time. The existing literature proposes models for dynamic networks that are either unlabeled or assumed to have a single membership structure. On the other hand, a newer family of Mixed Membership Stochastic Block Models (MMSBM) makes it possible to model static labeled networks under the assumption of mixed-membership clustering. In this work, we propose to extend this latter class of models to infer dynamic labeled networks under a mixed-membership assumption. Our approach takes the form of a temporal prior on the model's parameters. It relies on the single assumption that dynamics are not abrupt. We show that our method significantly differs from existing approaches and allows modeling more complex systems: dynamic labeled networks. We demonstrate the robustness of our method with several experiments on both synthetic and real-world datasets. A key interest of our approach is that it needs very little training data to yield good results. The performance gain under challenging conditions broadens the range of possible applications of automated learning tools, as in social sciences, where small datasets are a major obstacle to the introduction of machine learning methods.
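A hypothetical illustration of what a "dynamics are not abrupt" temporal prior can look like: membership parameters at time t are encouraged to stay close to those at t-1. The quadratic penalty form and its weight below are assumptions, not the paper's exact prior.

```python
# Toy temporal-smoothness penalty on time-sliced mixed-membership parameters.
import numpy as np

def temporal_smoothness_penalty(theta, strength=1.0):
    """theta: array of shape (T, N, K) -- memberships of N nodes over K groups
    at T time steps. Returns a scalar penalizing abrupt changes between steps."""
    diffs = np.diff(theta, axis=0)          # change between consecutive time steps
    return strength * np.sum(diffs ** 2)

# Smooth trajectories are penalized less than abrupt ones.
rng = np.random.default_rng(0)
smooth = np.cumsum(0.01 * rng.standard_normal((10, 5, 3)), axis=0)
abrupt = rng.standard_normal((10, 5, 3))
print(temporal_smoothness_penalty(smooth) < temporal_smoothness_penalty(abrupt))
```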
Abstract:The publication time of a document carries relevant information about its semantic content. The Dirichlet-Hawkes process has been proposed to jointly model textual information and publication dynamics. This approach has been used successfully in several recent works and extended to tackle specific challenging problems, typically short texts or entangled publication dynamics. However, the prior in its current form does not allow for complex publication dynamics. In particular, inferred topics are independent of each other: a publication about finance is assumed to have no influence on publications about politics, for instance. In this work, we develop the Multivariate Powered Dirichlet-Hawkes Process (MPDHP), which relaxes this assumption: publications about various topics can now influence each other. We detail and overcome the technical challenges that arise from considering interacting topics. We conduct a systematic evaluation of MPDHP on a range of synthetic datasets to define its application domain and limitations. Finally, we develop a use case of MPDHP on Reddit data. By the end of this article, the interested reader will know how and when to use MPDHP, and when not to.
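The key ingredient the abstract describes can be sketched as a multivariate Hawkes intensity in which past publications on topic j raise the intensity of topic k through a cross-influence matrix. The exponential kernel and all parameter values below are illustrative assumptions, not the MPDHP's exact specification.

```python
# Sketch of a multivariate Hawkes intensity with cross-topic influence.
import numpy as np

def topic_intensity(k, t, history, alpha, beta=1.0, mu=0.1):
    """history: list of (timestamp, topic) events observed before time t.
    alpha[j, k]: how strongly topic j excites topic k."""
    lam = mu                                            # base rate of topic k
    for t_i, j in history:
        if t_i < t:
            lam += alpha[j, k] * beta * np.exp(-beta * (t - t_i))
    return lam

# Two interacting topics (e.g. finance publications influencing politics).
alpha = np.array([[0.5, 0.3],     # topic 0 excites itself and topic 1
                  [0.0, 0.4]])    # topic 1 only excites itself
history = [(0.2, 0), (0.7, 0), (1.1, 1)]
print(topic_intensity(k=1, t=1.5, history=history, alpha=alpha))
```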
Abstract:Information spread on networks can be efficiently modeled by considering three features: documents' content, their time of publication relative to other publications, and the position of the spreader in the network. Most previous works model at most two of these jointly, or rely on heavily parametric approaches. Building on the recent Dirichlet-Point processes literature, we introduce the Houston (Hidden Online User-Topic Network) model, which jointly considers all of these features in a non-parametric, unsupervised framework. It infers dynamic topic-dependent underlying diffusion networks in a continuous-time setting, along with the topics themselves. It is unsupervised: it takes as input an unlabeled stream of triplets of the form \textit{(time of publication, information's content, spreading entity)}. Online inference is conducted using a sequential Monte Carlo algorithm that scales linearly with the size of the dataset. Our approach yields substantial improvements over existing baselines on both cluster recovery and subnetwork inference tasks.
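A skeleton of the kind of streaming sequential Monte Carlo update the abstract mentions: each particle carries a hypothesis (clusters, diffusion edges), is reweighted by the likelihood of the incoming triplet, and the particle set is resampled when it degenerates. The particle state, likelihood, and update functions are placeholders, not the Houston model's actual components; processing each observation costs a fixed number of particle operations, hence the linear scaling.

```python
# Generic sequential Monte-Carlo step over a stream of (time, content, spreader) triplets.
import numpy as np

def smc_step(particles, weights, triplet, likelihood, update, ess_threshold=0.5):
    """One online update for a new observation."""
    weights = weights * np.array([likelihood(p, triplet) for p in particles])
    weights /= weights.sum()
    ess = 1.0 / np.sum(weights ** 2)                    # effective sample size
    if ess < ess_threshold * len(particles):            # resample if degenerate
        idx = np.random.choice(len(particles), size=len(particles), p=weights)
        particles = [particles[i] for i in idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    particles = [update(p, triplet) for p in particles] # extend each hypothesis
    return particles, weights
```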
Abstract:Explainable AI (XAI) is an important and developing area, but remains relatively understudied for clustering. We propose an explainable-by-design clustering approach that finds not only clusters but also exemplars to explain each cluster. The use of exemplars for understanding is supported by the exemplar-based school of concept definition in psychology. We show that finding a small set of exemplars to explain even a single cluster is computationally intractable; hence, the overall problem is challenging. We develop an approximation algorithm that provides provable performance guarantees with respect to both clustering quality and the number of exemplars used. This basic algorithm explains all the instances in every cluster, whilst another approximation algorithm uses a bounded number of exemplars to allow simpler explanations and provably covers a large fraction of all instances. Experimental results show that our work is useful in domains involving difficult-to-understand deep embeddings of images and text.
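As a rough illustration of the "bounded number of exemplars covering a large fraction of instances" idea, the sketch below greedily picks, for one cluster, the point that covers the most uncovered members within a radius. Greedy coverage is a standard approximation device; this is not necessarily the paper's algorithm, and the radius is an assumed parameter.

```python
# Greedy exemplar selection for a single cluster of embedded points (illustrative).
import numpy as np

def greedy_exemplars(X, budget, radius):
    """X: (n, d) embeddings of one cluster. Returns exemplar indices and coverage."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    covered = np.zeros(len(X), dtype=bool)
    exemplars = []
    for _ in range(budget):
        gains = (dists <= radius) & ~covered        # points each candidate would newly cover
        best = int(gains.sum(axis=1).argmax())
        if gains[best].sum() == 0:                  # nothing left to cover
            break
        exemplars.append(best)
        covered |= gains[best]
    return exemplars, covered.mean()                # fraction of instances explained
```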
Abstract:Most models of online information diffusion rely on the assumption that pieces of information spread independently of each other. However, several works have pointed out the need to investigate the role of interactions in real-world processes, and have highlighted possible difficulties in doing so: interactions are sparse and brief. In response, recent advances have developed models that account for interactions in the underlying publication dynamics. In this article, we propose to extend and apply one such model to determine whether interactions between news headlines on Reddit play a significant role in their underlying publication mechanisms. After conducting an in-depth case study on 100,000 news headlines from 2019, we recover state-of-the-art conclusions about interactions and conclude that they play a minor role in this dataset.
Abstract:Recent years have seen renewed interest in the use of stochastic block modeling (SBM) for recommender systems. These models are seen as a flexible alternative to tensor decomposition techniques, able to handle labeled data. Recent works proposed to tackle discrete recommendation problems via SBMs by considering larger contexts as input data and by adding second-order interactions between the elements of a context. In this work, we show that these models are all special cases of a single global framework: the Serialized Interacting Mixed membership Stochastic Block Model (SIMSBM). It allows modeling an arbitrarily large context as well as an arbitrarily high order of interactions. We demonstrate that SIMSBM generalizes several recent SBM-based baselines. Moreover, we demonstrate that our formulation allows for increased predictive power on six real-world datasets.
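The quantity such mixed-membership block models predict can be sketched as follows: each context element has a membership vector over latent groups, and an outcome distribution is obtained by combining them through a block tensor. The shapes below correspond to a two-element context (one pairwise interaction); the framework described in the abstract allows arbitrarily many context elements and interaction orders. All names and sizes are illustrative assumptions.

```python
# Sketch of the outcome probability in a two-element mixed-membership block model.
import numpy as np

def outcome_probability(theta_a, theta_b, block_tensor):
    """theta_a, theta_b: membership vectors over K groups (each sums to 1).
    block_tensor: (K, K, O) array giving P(outcome | group pair)."""
    return np.einsum("i,j,ijo->o", theta_a, theta_b, block_tensor)

K, O = 3, 4
rng = np.random.default_rng(1)
block = rng.dirichlet(np.ones(O), size=(K, K))     # each group pair -> outcome distribution
theta_user = np.array([0.7, 0.2, 0.1])
theta_item = np.array([0.1, 0.1, 0.8])
print(outcome_probability(theta_user, theta_item, block))   # a distribution over O outcomes
```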
Abstract:The textual content of a document and its publication date are intertwined. For example, the publication of a news article on a given topic is influenced by previous publications on similar issues, according to underlying temporal dynamics. However, it can be challenging to retrieve meaningful information when the textual content itself conveys little. Furthermore, the textual content of a document is not always correlated with its temporal dynamics. We develop a method, the Powered Dirichlet-Hawkes process (PDHP), that clusters textual documents according to both their content and publication time. PDHP yields significantly better results than state-of-the-art models when either temporal information or textual content is weakly informative. PDHP also relaxes the hypothesis that textual content and temporal dynamics are perfectly correlated. We demonstrate that PDHP generalizes previous work, such as DHP and UP. Finally, we illustrate a possible application on a real-world dataset from Reddit.
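The trade-off PDHP formalizes can be sketched as follows: a new document joins cluster c with probability proportional to the temporal intensity of c raised to a power r, times the likelihood of its words under c's language model, where r controls how much temporal dynamics weigh against content (r = 1 behaves like a DHP-style prior, r = 0 ignores time altogether). The intensity and likelihood values below are placeholders, not the exact model.

```python
# Sketch of a powered temporal prior combined with a textual likelihood.
import numpy as np

def cluster_assignment_probs(intensities, text_likelihoods, r=1.0):
    """intensities, text_likelihoods: per-cluster scores for one incoming document."""
    scores = np.power(intensities, r) * text_likelihoods
    return scores / scores.sum()

intensities = np.array([2.0, 0.5, 0.1])          # temporal evidence per cluster
text_likelihoods = np.array([1e-4, 5e-4, 1e-5])  # textual evidence per cluster
for r in (0.0, 1.0, 2.0):                        # content-only, balanced, time-heavy
    print(r, cluster_assignment_probs(intensities, text_likelihoods, r).round(3))
```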