Abstract:This work describes a computer vision system that enables pervasive mapping and monitoring of human attention. The key contribution is that our methodology enables full 3D recovery of the gaze pointer, human view frustum and associated human centered measurements directly into an automatically computed 3D model in real-time. We apply RGB-D SLAM and descriptor matching methodologies for the 3D modeling, localization and fully automated annotation of ROIs (regions of interest) within the acquired 3D model. This innovative methodology will open new avenues for attention studies in real world environments, bringing new potential into automated processing for human factors technologies.
Abstract:The study of human factors in the frame of interaction studies has been relevant for usability engi-neering and ergonomics for decades. Today, with the advent of wearable eye-tracking and Google glasses, monitoring of human factors will soon become ubiquitous. This work describes a computer vision system that enables pervasive mapping and monitoring of human attention. The key contribu-tion is that our methodology enables full 3D recovery of the gaze pointer, human view frustum and associated human centred measurements directly into an automatically computed 3D model in real-time. We apply RGB-D SLAM and descriptor matching methodologies for the 3D modelling, locali-zation and fully automated annotation of ROIs (regions of interest) within the acquired 3D model. This innovative methodology will open new avenues for attention studies in real world environments, bringing new potential into automated processing for human factors technologies.
Abstract:Crowd monitoring and analysis in mass events are highly important technologies to support the security of attending persons. Proposed methods based on terrestrial or airborne image/video data often fail in achieving sufficiently accurate results to guarantee a robust service. We present a novel framework for estimating human count, density and motion from video data based on custom tailored object detection techniques, a regression based density estimate and a total variation based optical flow extraction. From the gathered features we present a detailed accuracy analysis versus ground truth measurements. In addition, all information is projected into world coordinates to enable a direct integration with existing geo-information systems. The resulting human counts demonstrate a mean error of 4% to 9% and thus represent a most efficient measure that can be robustly applied in security critical services.