ENAC
Abstract:We introduce FewTopNER, a novel framework that integrates few-shot named entity recognition (NER) with topic-aware contextual modeling to address the challenges of cross-lingual and low-resource scenarios. FewTopNER leverages a shared multilingual encoder based on XLM-RoBERTa, augmented with language-specific calibration mechanisms, to generate robust contextual embeddings. The architecture comprises a prototype-based entity recognition branch, employing BiLSTM and Conditional Random Fields for sequence labeling, and a topic modeling branch that extracts document-level semantic features through hybrid probabilistic and neural methods. A cross-task bridge facilitates dynamic bidirectional attention and feature fusion between entity and topic representations, thereby enhancing entity disambiguation by incorporating global semantic context. Empirical evaluations on multilingual benchmarks across English, French, Spanish, German, and Italian demonstrate that FewTopNER significantly outperforms existing state-of-the-art few-shot NER models. In particular, the framework achieves improvements of 2.5-4.0 percentage points in F1 score and exhibits enhanced topic coherence, as measured by normalized pointwise mutual information. Ablation studies further confirm the critical contributions of the shared encoder and cross-task integration mechanisms to the overall performance. These results underscore the efficacy of incorporating topic-aware context into few-shot NER and highlight the potential of FewTopNER for robust cross-lingual applications in low-resource settings.
Abstract:It is well established that to ensure or certify the robustness of a neural network, its Lipschitz constant plays a prominent role. However, its calculation is NP-hard. In this note, by taking into account activation regions at each layer as new constraints, we propose new quadratically constrained MIP formulations for the neural network Lipschitz estimation problem. The solutions of these problems give lower bounds and upper bounds of the Lipschitz constant and we detail conditions when they coincide with the exact Lipschitz constant.
Abstract:Undoubtedly that the Bidirectional Encoder representations from Transformers is the most powerful technique in making Natural Language Processing tasks such as Named Entity Recognition, Question & Answers or Sentiment Analysis, however, the use of traditional techniques remains a major potential for the improvement of recent models, in particular word tokenization techniques and embeddings, but also the improvement of neural network architectures which are now the core of each architecture. recent. In this paper, we conduct a comparative study between Fine-Tuning the Bidirectional Encoder Representations from Transformers and a method of concatenating two embeddings to boost the performance of a stacked Bidirectional Long Short-Term Memory-Bidirectional Gated Recurrent Units model; these two approaches are applied in the context of sentiment analysis of shopping places in Morocco. A search for the best learning rate was made at the level of the two approaches, and a comparison of the best optimizers was made for each sentence embedding combination with regard to the second approach.
Abstract:We address the issue of binary classification in Banach spaces in presence of uncertainty. We show that a number of results from classical support vector machines theory can be appropriately generalised to their robust counterpart in Banach spaces. These include the Representer Theorem, strong duality for the associated Optimization problem as well as their geometric interpretation. Furthermore, we propose a game theoretic interpretation by expressing a Nash equilibrium problem formulation for the more general problem of finding the closest points in two closed convex sets when the underlying space is reflexive and smooth.