Inertial motion capture systems widely use low-cost inertial measurement units (IMUs) to obtain the orientation of human body segments, but these sensors alone cannot estimate link positions. This research therefore uses a simultaneous localization and mapping (SLAM) method in conjunction with inertial data fusion to estimate link positions. SLAM localizes a camera by tracking it within a map of the environment that is reconstructed concurrently. This paper proposes quaternion-based extended and square-root unscented Kalman filter (EKF and SRUKF) algorithms for pose estimation. The Kalman filters correct errors using measurements based on SLAM position data, multi-link biomechanical constraints, and vertical referencing. In addition to the sensor biases, the fusion algorithm estimates link geometries, allowing biomechanical constraints to be imposed without a priori knowledge of sensor positions. An optical tracking system is used as the ground-truth reference to experimentally evaluate the performance of the proposed algorithms in various scenarios of human arm movement. The proposed algorithms achieve accuracies of up to 5.87 cm in position and 1.1 deg in attitude estimation. Compared to the EKF, the SRUKF converges more smoothly and at a higher rate, but is 2.4 times more computationally demanding. After convergence, the SRUKF is up to 17% less accurate than the EKF in position estimation and up to 36% more accurate in attitude estimation. Replacing SLAM with an absolute position measurement method reduces the position and attitude estimation errors by 80% and 40% for the EKF, and by 60% and 6% for the SRUKF, respectively.
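
To make the fusion scheme concrete, the sketch below shows a minimal quaternion-based error-state EKF that propagates pose with IMU samples and corrects drift with a 3-D position fix, such as one obtained from SLAM. This is an illustrative sketch only, not the paper's implementation: the proposed filters additionally estimate sensor biases and link geometries and fuse biomechanical-constraint and vertical-reference measurements, all of which are omitted here. Every class name, noise value, and measurement model below is an assumption.

```python
# Minimal error-state quaternion EKF sketch (illustrative; all values assumed).
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])

def quat_mul(q, r):
    """Hamilton product of two quaternions [w, x, y, z]."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def quat_to_rot(q):
    """Rotation matrix (body -> world) from a unit quaternion."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def skew(v):
    """Skew-symmetric matrix such that skew(a) @ b == cross(a, b)."""
    return np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])

class PoseEKF:
    """Error state: [dp (3), dv (3), dtheta (3)]; nominal state: p, v, q."""
    def __init__(self):
        self.p = np.zeros(3)                # position in world frame
        self.v = np.zeros(3)                # velocity in world frame
        self.q = np.array([1.0, 0, 0, 0])   # body-to-world quaternion
        self.P = np.eye(9) * 0.1            # error-state covariance
        self.Q = np.diag([1e-4]*3 + [1e-3]*3 + [1e-5]*3)  # process noise (assumed)

    def predict(self, acc, gyro, dt):
        """Propagate nominal state with one IMU sample; propagate covariance."""
        R = quat_to_rot(self.q)
        a_world = R @ acc + GRAVITY         # accelerometer measures specific force
        self.p = self.p + self.v * dt + 0.5 * a_world * dt**2
        self.v = self.v + a_world * dt
        # Integrate angular rate as a small rotation quaternion.
        dq = np.concatenate(([1.0], 0.5 * gyro * dt))
        self.q = quat_mul(self.q, dq)
        self.q /= np.linalg.norm(self.q)
        # Linearized error-state transition matrix.
        F = np.eye(9)
        F[0:3, 3:6] = np.eye(3) * dt
        F[3:6, 6:9] = -R @ skew(acc) * dt
        F[6:9, 6:9] = np.eye(3) - skew(gyro) * dt
        self.P = F @ self.P @ F.T + self.Q * dt

    def correct_position(self, z, r_meas=0.05**2):
        """Fuse a 3-D position fix (e.g., a SLAM position estimate)."""
        H = np.zeros((3, 9))
        H[:, 0:3] = np.eye(3)               # measurement observes position only
        S = H @ self.P @ H.T + np.eye(3) * r_meas
        K = self.P @ H.T @ np.linalg.inv(S)
        dx = K @ (z - self.p)
        # Inject the estimated error back into the nominal state.
        self.p = self.p + dx[0:3]
        self.v = self.v + dx[3:6]
        dq = np.concatenate(([1.0], 0.5 * dx[6:9]))
        self.q = quat_mul(self.q, dq)
        self.q /= np.linalg.norm(self.q)
        self.P = (np.eye(9) - K @ H) @ self.P

# Example usage with synthetic stationary data (purely illustrative):
ekf = PoseEKF()
for _ in range(100):
    ekf.predict(acc=np.array([0.0, 0.0, 9.81]), gyro=np.zeros(3), dt=0.01)
    ekf.correct_position(z=np.zeros(3))
```

The same prediction/correction structure carries over to the SRUKF variant, which replaces the linearized covariance propagation with sigma-point propagation through square-root covariance factors; that substitution is what the abstract's smoothness, convergence, and 2.4x runtime comparisons refer to.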