Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sahin Albayrak

Technische Universität Berlin, Berlin, Germany

APR-Transformer: Initial Pose Estimation for Localization in Complex Environments through Absolute Pose Regression

May 14, 2025

Srinivas Ravuri, Yuan Xu, Martin Ludwig Zehetner, Ketan Motlag, Sahin Albayrak

Abstract:Precise initialization plays a critical role in the performance of localization algorithms, especially in the context of robotics, autonomous driving, and computer vision. Poor localization accuracy is often a consequence of inaccurate initial poses, particularly noticeable in GNSS-denied environments where GPS signals are primarily relied upon for initialization. Recent advances in leveraging deep neural networks for pose regression have led to significant improvements in both accuracy and robustness, especially in estimating complex spatial relationships and orientations. In this paper, we introduce APR-Transformer, a model architecture inspired by state-of-the-art methods, which predicts absolute pose (3D position and 3D orientation) using either image or LiDAR data. We demonstrate that our proposed method achieves state-of-the-art performance on established benchmark datasets such as the Radar Oxford Robot-Car and DeepLoc datasets. Furthermore, we extend our experiments to include our custom complex APR-BeIntelli dataset. Additionally, we validate the reliability of our approach in GNSS-denied environments by deploying the model in real-time on an autonomous test vehicle. This showcases the practical feasibility and effectiveness of our approach. The source code is available at:https://github.com/GT-ARC/APR-Transformer.

* 8 pages with 6 figures

Via

Access Paper or Ask Questions

Building Blocks for Robust and Effective Semi-Supervised Real-World Object Detection

Mar 24, 2025

Moussa Kassem Sbeyti, Nadja Klein, Azarm Nowzad, Fikret Sivrikaya, Sahin Albayrak

Abstract:Semi-supervised object detection (SSOD) based on pseudo-labeling significantly reduces dependence on large labeled datasets by effectively leveraging both labeled and unlabeled data. However, real-world applications of SSOD often face critical challenges, including class imbalance, label noise, and labeling errors. We present an in-depth analysis of SSOD under real-world conditions, uncovering causes of suboptimal pseudo-labeling and key trade-offs between label quality and quantity. Based on our findings, we propose four building blocks that can be seamlessly integrated into an SSOD framework. Rare Class Collage (RCC): a data augmentation method that enhances the representation of rare classes by creating collages of rare objects. Rare Class Focus (RCF): a stratified batch sampling strategy that ensures a more balanced representation of all classes during training. Ground Truth Label Correction (GLC): a label refinement method that identifies and corrects false, missing, and noisy ground truth labels by leveraging the consistency of teacher model predictions. Pseudo-Label Selection (PLS): a selection method for removing low-quality pseudo-labeled images, guided by a novel metric estimating the missing detection rate while accounting for class rarity. We validate our methods through comprehensive experiments on autonomous driving datasets, resulting in up to 6% increase in SSOD performance. Overall, our investigation and novel, data-centric, and broadly applicable building blocks enable robust and effective SSOD in complex, real-world scenarios. Code is available at https://mos-ks.github.io/publications.

* Transactions on Machine Learning Research, 2025
* Accepted to Transactions on Machine Learning Research (TMLR). OpenReview: https://openreview.net/forum?id=vRYt8QLKqK

Via

Access Paper or Ask Questions

Optimizing AI-Assisted Code Generation

Dec 14, 2024

Simon Torka, Sahin Albayrak

Abstract:In recent years, the rise of AI-assisted code-generation tools has significantly transformed software development. While code generators have mainly been used to support conventional software development, their use will be extended to powerful and secure AI systems. Systems capable of generating code, such as ChatGPT, OpenAI Codex, GitHub Copilot, and AlphaCode, take advantage of advances in machine learning (ML) and natural language processing (NLP) enabled by large language models (LLMs). However, it must be borne in mind that these models work probabilistically, which means that although they can generate complex code from natural language input, there is no guarantee for the functionality and security of the generated code. However, to fully exploit the considerable potential of this technology, the security, reliability, functionality, and quality of the generated code must be guaranteed. This paper examines the implementation of these goals to date and explores strategies to optimize them. In addition, we explore how these systems can be optimized to create safe, high-performance, and executable artificial intelligence (AI) models, and consider how to improve their accessibility to make AI development more inclusive and equitable.

Via

Access Paper or Ask Questions

ALPACA -- Adaptive Learning Pipeline for Comprehensive AI

Dec 14, 2024

Simon Torka, Sahin Albayrak

Abstract:The advancement of AI technologies has greatly increased the complexity of AI pipelines as they include many stages such as data collection, pre-processing, training, evaluation and visualisation. To provide effective and accessible AI solutions, it is important to design pipelines for different user groups such as experts, professionals from different fields and laypeople. Ease of use and trust play a central role in the acceptance of AI systems. The presented system, ALPACA (Adaptive Learning Pipeline for Advanced Comprehensive AI Analysis), offers a comprehensive AI pipeline that addresses the needs of diverse user groups. ALPACA integrates visual and code-based development and facilitates all key phases of the AI pipeline. Its architecture is based on Celery (with Redis backend) for efficient task management, MongoDB for seamless data storage and Kubernetes for cloud-based scalability and resource utilisation. Future versions of ALPACA will support modern techniques such as federated and continuous learning as well as explainable AI methods to further improve security, usability and trustworthiness. The application is demonstrated by an Android app for similarity recognition, which emphasises ALPACA's potential for use in everyday life.

Via

Access Paper or Ask Questions

LaFAM: Unsupervised Feature Attribution with Label-free Activation Maps

Jul 09, 2024

Aray Karjauv, Sahin Albayrak

Abstract:Convolutional Neural Networks (CNNs) are known for their ability to learn hierarchical structures, naturally developing detectors for objects, and semantic concepts within their deeper layers. Activation maps (AMs) reveal these saliency regions, which are crucial for many Explainable AI (XAI) methods. However, the direct exploitation of raw AMs in CNNs for feature attribution remains underexplored in literature. This work revises Class Activation Map (CAM) methods by introducing the Label-free Activation Map (LaFAM), a streamlined approach utilizing raw AMs for feature attribution without reliance on labels. LaFAM presents an efficient alternative to conventional CAM methods, demonstrating particular effectiveness in saliency map generation for self-supervised learning while maintaining applicability in supervised learning scenarios.

Via

Access Paper or Ask Questions

Cost-Sensitive Uncertainty-Based Failure Recognition for Object Detection

Apr 26, 2024

Moussa Kassem Sbeyti, Michelle Karg, Christian Wirth, Nadja Klein, Sahin Albayrak

Abstract:Object detectors in real-world applications often fail to detect objects due to varying factors such as weather conditions and noisy input. Therefore, a process that mitigates false detections is crucial for both safety and accuracy. While uncertainty-based thresholding shows promise, previous works demonstrate an imperfect correlation between uncertainty and detection errors. This hinders ideal thresholding, prompting us to further investigate the correlation and associated cost with different types of uncertainty. We therefore propose a cost-sensitive framework for object detection tailored to user-defined budgets on the two types of errors, missing and false detections. We derive minimum thresholding requirements to prevent performance degradation and define metrics to assess the applicability of uncertainty for failure recognition. Furthermore, we automate and optimize the thresholding process to maximize the failure recognition rate w.r.t. the specified budget. Evaluation on three autonomous driving datasets demonstrates that our approach significantly enhances safety, particularly in challenging scenarios. Leveraging localization aleatoric uncertainty and softmax-based entropy only, our method boosts the failure recognition rate by 36-60\% compared to conventional approaches. Code is available at https://mos-ks.github.io/publications.

* Accepted with an oral presentation at UAI 2024

Via

Access Paper or Ask Questions

Overcoming the Limitations of Localization Uncertainty: Efficient & Exact Non-Linear Post-Processing and Calibration

Jun 15, 2023

Moussa Kassem Sbeyti, Michelle Karg, Christian Wirth, Azarm Nowzad, Sahin Albayrak

Abstract:Robustly and accurately localizing objects in real-world environments can be challenging due to noisy data, hardware limitations, and the inherent randomness of physical systems. To account for these factors, existing works estimate the aleatoric uncertainty of object detectors by modeling their localization output as a Gaussian distribution $\mathcal{N}(\mu,\,\sigma^{2})\,$, and training with loss attenuation. We identify three aspects that are unaddressed in the state of the art, but warrant further exploration: (1) the efficient and mathematically sound propagation of $\mathcal{N}(\mu,\,\sigma^{2})\,$ through non-linear post-processing, (2) the calibration of the predicted uncertainty, and (3) its interpretation. We overcome these limitations by: (1) implementing loss attenuation in EfficientDet, and proposing two deterministic methods for the exact and fast propagation of the output distribution, (2) demonstrating on the KITTI and BDD100K datasets that the predicted uncertainty is miscalibrated, and adapting two calibration methods to the localization task, and (3) investigating the correlation between aleatoric uncertainty and task-relevant error sources. Our contributions are: (1) up to five times faster propagation while increasing localization performance by up to 1\%, (2) up to fifteen times smaller expected calibration error, and (3) the predicted uncertainty is found to correlate with occlusion, object distance, detection accuracy, and image quality.

* This preprint has not undergone any post-submission improvements or corrections. Accepted to ECML-PKDD 2023

Via

Access Paper or Ask Questions

Adaptive Stochastic Optimisation of Nonconvex Composite Objectives

Nov 21, 2022

Weijia Shao, Fikret Sivrikaya, Sahin Albayrak

Abstract:In this paper, we propose and analyse a family of generalised stochastic composite mirror descent algorithms. With adaptive step sizes, the proposed algorithms converge without requiring prior knowledge of the problem. Combined with an entropy-like update-generating function, these algorithms perform gradient descent in the space equipped with the maximum norm, which allows us to exploit the low-dimensional structure of the decision sets for high-dimensional problems. Together with a sampling method based on the Rademacher distribution and variance reduction techniques, the proposed algorithms guarantee a logarithmic complexity dependence on dimensionality for zeroth-order optimisation problems.

* arXiv admin note: substantial text overlap with arXiv:2208.04579

Via

Access Paper or Ask Questions

Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives

Aug 14, 2022

Weijia Shao, Sahin Albayrak

Figure 1 for Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives

Figure 2 for Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives

Figure 3 for Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives

Figure 4 for Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives

Abstract:In this paper, we propose and analyze algorithms for zeroth-order optimization of non-convex composite objectives, focusing on reducing the complexity dependence on dimensionality. This is achieved by exploiting the low dimensional structure of the decision set using the stochastic mirror descent method with an entropy alike function, which performs gradient descent in the space equipped with the maximum norm. To improve the gradient estimation, we replace the classic Gaussian smoothing method with a sampling method based on the Rademacher distribution and show that the mini-batch method copes with the non-Euclidean geometry. To avoid tuning hyperparameters, we analyze the adaptive stepsizes for the general stochastic mirror descent and show that the adaptive version of the proposed algorithm converges without requiring prior knowledge about the problem.

Via

Access Paper or Ask Questions

Optimistic Optimisation of Composite Objective with Exponentiated Update

Aug 08, 2022

Weijia Shao, Fikret Sivrikaya, Sahin Albayrak

Figure 1 for Optimistic Optimisation of Composite Objective with Exponentiated Update

Figure 2 for Optimistic Optimisation of Composite Objective with Exponentiated Update

Figure 3 for Optimistic Optimisation of Composite Objective with Exponentiated Update

Figure 4 for Optimistic Optimisation of Composite Objective with Exponentiated Update

Abstract:This paper proposes a new family of algorithms for the online optimisation of composite objectives. The algorithms can be interpreted as the combination of the exponentiated gradient and $p$-norm algorithm. Combined with algorithmic ideas of adaptivity and optimism, the proposed algorithms achieve a sequence-dependent regret upper bound, matching the best-known bounds for sparse target decision variables. Furthermore, the algorithms have efficient implementations for popular composite objectives and constraints and can be converted to stochastic optimisation algorithms with the optimal accelerated rate for smooth objectives.

Via

Access Paper or Ask Questions