Abstract:Face recognition and verification are two computer vision tasks whose performances have advanced with the introduction of deep representations. However, ethical, legal, and technical challenges due to the sensitive nature of face data and biases in real-world training datasets hinder their development. Generative AI addresses privacy by creating fictitious identities, but fairness problems remain. Using the existing DCFace SOTA framework, we introduce a new controlled generation pipeline that improves fairness. Through classical fairness metrics and a proposed in-depth statistical analysis based on logit models and ANOVA, we show that our generation pipeline improves fairness more than other bias mitigation approaches while slightly improving raw performance.
Abstract:The effectiveness of Recommender Systems (RS) is closely tied to the quality and distinctiveness of user profiles, yet despite many advancements in raw performance, the sensitivity of RS to user profile quality remains under-researched. This paper introduces novel information-theoretic measures for understanding recommender systems: a "surprise" measure quantifying users' deviations from popular choices, and a "conditional surprise" measure capturing user interaction coherence. We evaluate 7 recommendation algorithms across 9 datasets, revealing the relationships between our measures and standard performance metrics. Using a rigorous statistical framework, our analysis quantifies how much user profile density and information measures impact algorithm performance across domains. By segmenting users based on these measures, we achieve improved performance with reduced data and show that simpler algorithms can match complex ones for low-coherence users. Additionally, we employ our measures to analyze how well different recommendation algorithms maintain the coherence and diversity of user preferences in their predictions, providing insights into algorithm behavior. This work advances the theoretical understanding of user behavior and practical heuristics for personalized recommendation systems, promoting more efficient and adaptive architectures.