Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Stefan Schmid

Centroid Approximation for Byzantine-Tolerant Federated Learning

Jun 18, 2025

Mélanie Cambus, Darya Melnyk, Tijana Milentijević, Stefan Schmid

Abstract:Federated learning allows each client to keep its data locally when training machine learning models in a distributed setting. Significant recent research established the requirements that the input must satisfy in order to guarantee convergence of the training loop. This line of work uses averaging as the aggregation rule for the training models. In particular, we are interested in whether federated learning is robust to Byzantine behavior, and observe and investigate a tradeoff between the average/centroid and the validity conditions from distributed computing. We show that the various validity conditions alone do not guarantee a good approximation of the average. Furthermore, we show that reaching good approximation does not give good results in experimental settings due to possible Byzantine outliers. Our main contribution is the first lower bound of $\min\{\frac{n-t}{t},\sqrt{d}\}$ on the centroid approximation under box validity that is often considered in the literature, where $n$ is the number of clients, $t$ the upper bound on the number of Byzantine faults, and $d$ is the dimension of the machine learning model. We complement this lower bound by an upper bound of $2\min\{n,\sqrt{d}\}$, by providing a new analysis for the case $n<d$. In addition, we present a new algorithm that achieves a $\sqrt{2d}$-approximation under convex validity, which also proves that the existing lower bound in the literature is tight. We show that all presented bounds can also be achieved in the distributed peer-to-peer setting. We complement our analytical results with empirical evaluations in federated stochastic gradient descent and federated averaging settings.

* 19 pages, 10 figures

Via

Access Paper or Ask Questions

MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning

Apr 09, 2025

Ylli Sadikaj, Hongkuan Zhou, Lavdim Halilaj, Stefan Schmid, Steffen Staab, Claudia Plant

Abstract:Precise optical inspection in industrial applications is crucial for minimizing scrap rates and reducing the associated costs. Besides merely detecting if a product is anomalous or not, it is crucial to know the distinct type of defect, such as a bent, cut, or scratch. The ability to recognize the "exact" defect type enables automated treatments of the anomalies in modern production lines. Current methods are limited to solely detecting whether a product is defective or not without providing any insights on the defect type, nevertheless detecting and identifying multiple defects. We propose MultiADS, a zero-shot learning approach, able to perform Multi-type Anomaly Detection and Segmentation. The architecture of MultiADS comprises CLIP and extra linear layers to align the visual- and textual representation in a joint feature space. To the best of our knowledge, our proposal, is the first approach to perform a multi-type anomaly segmentation task in zero-shot learning. Contrary to the other baselines, our approach i) generates specific anomaly masks for each distinct defect type, ii) learns to distinguish defect types, and iii) simultaneously identifies multiple defect types present in an anomalous product. Additionally, our approach outperforms zero/few-shot learning SoTA methods on image-level and pixel-level anomaly detection and segmentation tasks on five commonly used datasets: MVTec-AD, Visa, MPDD, MAD and Real-IAD.

Via

Access Paper or Ask Questions

Approximate Agreement Algorithms for Byzantine Collaborative Learning

Apr 02, 2025

Tijana Milentijević, Mélanie Cambus, Darya Melnyk, Stefan Schmid

Abstract:In Byzantine collaborative learning, $n$ clients in a peer-to-peer network collectively learn a model without sharing their data by exchanging and aggregating stochastic gradient estimates. Byzantine clients can prevent others from collecting identical sets of gradient estimates. The aggregation step thus needs to be combined with an efficient (approximate) agreement subroutine to ensure convergence of the training process. In this work, we study the geometric median aggregation rule for Byzantine collaborative learning. We show that known approaches do not provide theoretical guarantees on convergence or gradient quality in the agreement subroutine. To satisfy these theoretical guarantees, we present a hyperbox algorithm for geometric median aggregation. We practically evaluate our algorithm in both centralized and decentralized settings under Byzantine attacks on non-i.i.d. data. We show that our geometric median-based approaches can tolerate sign-flip attacks better than known mean-based approaches from the literature.

Via

Access Paper or Ask Questions

Predicting the Road Ahead: A Knowledge Graph based Foundation Model for Scene Understanding in Autonomous Driving

Mar 24, 2025

Hongkuan Zhou, Stefan Schmid, Yicong Li, Lavdim Halilaj, Xiangtong Yao, Wei cao

Abstract:The autonomous driving field has seen remarkable advancements in various topics, such as object recognition, trajectory prediction, and motion planning. However, current approaches face limitations in effectively comprehending the complex evolutions of driving scenes over time. This paper proposes FM4SU, a novel methodology for training a symbolic foundation model (FM) for scene understanding in autonomous driving. It leverages knowledge graphs (KGs) to capture sensory observation along with domain knowledge such as road topology, traffic rules, or complex interactions between traffic participants. A bird's eye view (BEV) symbolic representation is extracted from the KG for each driving scene, including the spatio-temporal information among the objects across the scenes. The BEV representation is serialized into a sequence of tokens and given to pre-trained language models (PLMs) for learning an inherent understanding of the co-occurrence among driving scene elements and generating predictions on the next scenes. We conducted a number of experiments using the nuScenes dataset and KG in various scenarios. The results demonstrate that fine-tuned models achieve significantly higher accuracy in all tasks. The fine-tuned T5 model achieved a next scene prediction accuracy of 86.7%. This paper concludes that FM4SU offers a promising foundation for developing more comprehensive models for scene understanding in autonomous driving.

Via

Access Paper or Ask Questions

An Optimization Driven Link SINR Assurance in RIS-assisted Indoor Networks

Dec 28, 2024

Cao Vien Phung, Max Franke, Ehsan Tohidi, June Heinemann, Andre Drummond, Stefan Schmid, Slawomir Stanczak, Admela Jukan

Abstract:Future smart factories are expected to deploy applications over high-performance indoor wireless channels in the millimeter-wave (mmWave) bands, which on the other hand are susceptible to high path losses and Line-of Sight (LoS) blockages. Low-cost Reconfigurable Intelligent Surfaces (RISs) can provide great opportunities in such scenarios, due to its ability to alleviate LoS link blockages. In this paper, we formulate a combinatorial optimization problem, solved with Integer Linear Programming (ILP) to optimally maintain connectivity by solving the problem of allocating RIS to robots in a wireless indoor network. Our model exploits the characteristic of nulling interference from RISs by tuning RIS reflection coefficients. We further consider Quality-of-Service (QoS) at receivers in terms of Signal-to-Interference-plus-Noise Ratio (SINR) and connection outages due to insufficient transmission quality service. Numerical results for optimal solutions and heuristics show the benefits of optimally deploying RISs by providing continuous connectivity through SINR, which significantly reduces outages due to link quality.

* This paper is uploaded here for research community, thus it is for non-commercial purposes

Via

Access Paper or Ask Questions

Visual Representation Learning Guided By Multi-modal Prior Knowledge

Oct 21, 2024

Hongkuan Zhou, Lavdim Halilaj, Sebastian Monka, Stefan Schmid, Yuqicheng Zhu, Bo Xiong, Steffen Staab

Abstract:Despite the remarkable success of deep neural networks (DNNs) in computer vision, they fail to remain high-performing when facing distribution shifts between training and testing data. In this paper, we propose Knowledge-Guided Visual representation learning (KGV), a distribution-based learning approach leveraging multi-modal prior knowledge, to improve generalization under distribution shift. We use prior knowledge from two distinct modalities: 1) a knowledge graph (KG) with hierarchical and association relationships; and 2) generated synthetic images of visual elements semantically represented in the KG. The respective embeddings are generated from the given modalities in a common latent space, i.e., visual embeddings from original and synthetic images as well as knowledge graph embeddings (KGEs). These embeddings are aligned via a novel variant of translation-based KGE methods, where the node and relation embeddings of the KG are modeled as Gaussian distributions and translations respectively. We claim that incorporating multi-model prior knowledge enables more regularized learning of image representations. Thus, the models are able to better generalize across different data distributions. We evaluate KGV on different image classification tasks with major or minor distribution shifts, namely road sign classification across datasets from Germany, China, and Russia, image classification with the mini-ImageNet dataset and its variants, as well as the DVM-CAR dataset. The results demonstrate that KGV consistently exhibits higher accuracy and data efficiency than the baselines across all experiments.

Via

Access Paper or Ask Questions

GermanPartiesQA: Benchmarking Commercial Large Language Models for Political Bias and Sycophancy

Jul 25, 2024

Jan Batzner, Volker Stocker, Stefan Schmid, Gjergji Kasneci

Abstract:LLMs are changing the way humans create and interact with content, potentially affecting citizens' political opinions and voting decisions. As LLMs increasingly shape our digital information ecosystems, auditing to evaluate biases, sycophancy, or steerability has emerged as an active field of research. In this paper, we evaluate and compare the alignment of six LLMs by OpenAI, Anthropic, and Cohere with German party positions and evaluate sycophancy based on a prompt experiment. We contribute to evaluating political bias and sycophancy in multi-party systems across major commercial LLMs. First, we develop the benchmark dataset GermanPartiesQA based on the Voting Advice Application Wahl-o-Mat covering 10 state and 1 national elections between 2021 and 2023. In our study, we find a left-green tendency across all examined LLMs. We then conduct our prompt experiment for which we use the benchmark and sociodemographic data of leading German parliamentarians to evaluate changes in LLMs responses. To differentiate between sycophancy and steerabilty, we use 'I am [politician X], ...' and 'You are [politician X], ...' prompts. Against our expectations, we do not observe notable differences between prompting 'I am' and 'You are'. While our findings underscore that LLM responses can be ideologically steered with political personas, they suggest that observed changes in LLM outputs could be better described as personalization to the given context rather than sycophancy.

* 12 pages

Via

Access Paper or Ask Questions

Credence: Augmenting Datacenter Switch Buffer Sharing with ML Predictions

Jan 05, 2024

Vamsi Addanki, Maciej Pacut, Stefan Schmid

Abstract:Packet buffers in datacenter switches are shared across all the switch ports in order to improve the overall throughput. The trend of shrinking buffer sizes in datacenter switches makes buffer sharing extremely challenging and a critical performance issue. Literature suggests that push-out buffer sharing algorithms have significantly better performance guarantees compared to drop-tail algorithms. Unfortunately, switches are unable to benefit from these algorithms due to lack of support for push-out operations in hardware. Our key observation is that drop-tail buffers can emulate push-out buffers if the future packet arrivals are known ahead of time. This suggests that augmenting drop-tail algorithms with predictions about the future arrivals has the potential to significantly improve performance. This paper is the first research attempt in this direction. We propose Credence, a drop-tail buffer sharing algorithm augmented with machine-learned predictions. Credence can unlock the performance only attainable by push-out algorithms so far. Its performance hinges on the accuracy of predictions. Specifically, Credence achieves near-optimal performance of the best known push-out algorithm LQD (Longest Queue Drop) with perfect predictions, but gracefully degrades to the performance of the simplest drop-tail algorithm Complete Sharing when the prediction error gets arbitrarily worse. Our evaluations show that Credence improves throughput by $1.5$x compared to traditional approaches. In terms of flow completion times, we show that Credence improves upon the state-of-the-art approaches by up to $95\%$ using off-the-shelf machine learning techniques that are also practical in today's hardware. We believe this work opens several interesting future work opportunities both in systems and theory that we discuss at the end of this paper.

Via

Access Paper or Ask Questions

Heterogeneous Graph-based Trajectory Prediction using Local Map Context and Social Interactions

Nov 30, 2023

Daniel Grimm, Maximilian Zipfl, Felix Hertlein, Alexander Naumann, Jürgen Lüttin, Steffen Thoma, Stefan Schmid, Lavdim Halilaj, Achim Rettinger, J. Marius Zöllner

Figure 1 for Heterogeneous Graph-based Trajectory Prediction using Local Map Context and Social Interactions

Figure 2 for Heterogeneous Graph-based Trajectory Prediction using Local Map Context and Social Interactions

Figure 3 for Heterogeneous Graph-based Trajectory Prediction using Local Map Context and Social Interactions

Figure 4 for Heterogeneous Graph-based Trajectory Prediction using Local Map Context and Social Interactions

Abstract:Precisely predicting the future trajectories of surrounding traffic participants is a crucial but challenging problem in autonomous driving, due to complex interactions between traffic agents, map context and traffic rules. Vector-based approaches have recently shown to achieve among the best performances on trajectory prediction benchmarks. These methods model simple interactions between traffic agents but don't distinguish between relation-type and attributes like their distance along the road. Furthermore, they represent lanes only by sequences of vectors representing center lines and ignore context information like lane dividers and other road elements. We present a novel approach for vector-based trajectory prediction that addresses these shortcomings by leveraging three crucial sources of information: First, we model interactions between traffic agents by a semantic scene graph, that accounts for the nature and important features of their relation. Second, we extract agent-centric image-based map features to model the local map context. Finally, we generate anchor paths to enforce the policy in multi-modal prediction to permitted trajectories only. Each of these three enhancements shows advantages over the baseline model HoliGraph.

* Accepted on IEEE ITSC 2023

Via

Access Paper or Ask Questions

Relation-based Motion Prediction using Traffic Scene Graphs

Nov 24, 2022

Maximilian Zipfl, Felix Hertlein, Achim Rettinger, Steffen Thoma, Lavdim Halilaj, Juergen Luettin, Stefan Schmid, Cory Henson

Figure 1 for Relation-based Motion Prediction using Traffic Scene Graphs

Figure 2 for Relation-based Motion Prediction using Traffic Scene Graphs

Figure 3 for Relation-based Motion Prediction using Traffic Scene Graphs

Figure 4 for Relation-based Motion Prediction using Traffic Scene Graphs

Abstract:Representing relevant information of a traffic scene and understanding its environment is crucial for the success of autonomous driving. Modeling the surrounding of an autonomous car using semantic relations, i.e., how different traffic participants relate in the context of traffic rule based behaviors, is hardly been considered in previous work. This stems from the fact that these relations are hard to extract from real-world traffic scenes. In this work, we model traffic scenes in a form of spatial semantic scene graphs for various different predictions about the traffic participants, e.g., acceleration and deceleration. Our learning and inference approach uses Graph Neural Networks (GNNs) and shows that incorporating explicit information about the spatial semantic relations between traffic participants improves the predicdtion results. Specifically, the acceleration prediction of traffic participants is improved by up to 12% compared to the baselines, which do not exploit this explicit information. Furthermore, by including additional information about previous scenes, we achieve 73% improvements.

Via

Access Paper or Ask Questions