This paper studies a near-field multiple-input multiple-output (MIMO) radar sensing system, in which the transceivers with massive antennas aim to localize multiple near-field targets in the three-dimensional (3D) space over unknown cluttered environments. We consider a spherical wavefront propagation with both channel phase and amplitude variations over different antennas. Under this setup, the unknown parameters include the 3D coordinates and complex reflection coefficients of the targets, as well as the noise and interference covariance matrix. First, by considering general transmit signal waveforms, we derive the Fisher information matrix (FIM) corresponding to the 3D coordinates and the complex reflection coefficients of the targets and accordingly obtain the Cram\'er-Rao bound (CRB) for the 3D coordinates. This provides a performance bound for 3D near-field target localization. For the special single-target case, we obtain the CRB in an analytical form, and analyze its asymptotic scaling behaviors with respect to the target distance and antenna size of the transceiver. Next, to facilitate practical localization, we propose two estimators to localize targets based on the maximum likelihood (ML) criterion, namely the 3D approximate cyclic optimization (3D-ACO) and the 3D cyclic optimization with white Gaussian noise (3D-CO-WGN), respectively. Numerical results validate the asymptotic CRB analysis and show that the consideration of varying channel amplitudes is vital to achieve accurate CRB and localization when the targets are close to the transceivers. It is also shown that the proposed estimators achieve localization performance close to the derived CRB under various cluttered environments, thus validating their effectiveness in practical implementation. Furthermore, it is shown that transmit waveforms have a significant impact on CRB and the localization performance.