Abstract:The construction industry faces high risks due to frequent accidents, often leaving workers in perilous situations where rapid response is critical. Traditional safety monitoring methods, including wearable sensors and GPS, often fail under obstructive or indoor conditions. This research introduces a novel real-time scream detection and localization system tailored for construction sites, especially in low-resource environments. Integrating Wav2Vec2 and Enhanced ConvNet models for accurate scream detection, coupled with the GCC-PHAT algorithm for robust time delay estimation under reverberant conditions, followed by a gradient descent-based approach to achieve precise position estimation in noisy environments. Our approach combines these concepts to achieve high detection accuracy and rapid localization, thereby minimizing false alarms and optimizing emergency response. Preliminary results demonstrate that the system not only accurately detects distress calls amidst construction noise but also reliably identifies the caller's location. This solution represents a substantial improvement in worker safety, with the potential for widespread application across high-risk occupational environments. The scripts used for training, evaluation of scream detection, position estimation, and integrated framework will be released at: https://github.com/Anmol2059/construction_safety.