Reconfigurable intelligent surfaces (RISs) have been proposed as a key enabler for improving signal coverage and mitigating frequent blockages in millimeter wave (mmWave) multiple-input multiple-output (MIMO) communications. However, channel state information (CSI) acquisition is one of the major challenges for the practical deployment of RISs. A passive RIS without any baseband processing capability complicates channel estimation (CE), since the individual channels or the cascaded channel can be estimated only at the base station (BS) via uplink training or at the mobile station (MS) via downlink training. To facilitate CSI acquisition, we focus on the hybrid RIS architecture, in which a small number of elements are active and can receive and process pilot signals at the RIS. CE is performed in two stages using atomic norm minimization to recover the channel parameters, i.e., the angles of departure (AoDs), angles of arrival (AoAs), and propagation path gains. Simulation results show that the proposed scheme can outperform CE with a passive RIS under the same training overhead. We also study the theoretical performance limits in terms of mean square error (MSE) via Cram\'er-Rao lower bound (CRLB) analyses.
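
For orientation, the two tools named above can be written in their generic textbook forms. The notation is assumed here for illustration and is not taken from the paper: $\mathbf{a}(f)$ denotes an array response vector at normalized spatial frequency $f$, $c_k \in \mathbb{C}$ are path coefficients, $\boldsymbol{\eta}$ collects the unknown real-valued channel parameters, and $\mathbf{J}(\boldsymbol{\eta})$ is the Fisher information matrix. The atomic norm of a signal $\mathbf{x}$ is the smallest total coefficient magnitude over all decompositions of $\mathbf{x}$ into array responses,
\begin{equation}
  \|\mathbf{x}\|_{\mathcal{A}} = \inf \Big\{ \sum_{k} |c_k| \,:\, \mathbf{x} = \sum_{k} c_k \, \mathbf{a}(f_k), \; f_k \in [0,1) \Big\},
\end{equation}
and, for any unbiased estimator $\hat{\boldsymbol{\eta}}$ of $\boldsymbol{\eta}$, the MSE is bounded from below by the CRLB,
\begin{equation}
  \mathbb{E}\big\{ \|\hat{\boldsymbol{\eta}} - \boldsymbol{\eta}\|_2^2 \big\} \geq \operatorname{tr}\big( \mathbf{J}^{-1}(\boldsymbol{\eta}) \big).
\end{equation}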