The active reconfigurable intelligent surface (RIS) has attracted significant attention as a recently proposed RIS architecture. Because it can amplify the incident signals, an active RIS can mitigate the multiplicative fading effect inherent in passive RIS-aided systems. In this paper, we consider an active RIS-aided uplink multi-user massive multiple-input multiple-output (MIMO) system in the presence of phase noise at the active RIS. Specifically, we employ a two-timescale scheme in which the beamforming at the base station (BS) is designed based on the instantaneous aggregated channel state information (CSI), while the phase shifts at the active RIS are designed based on the statistical CSI, so that the feedback overhead and computational complexity can be significantly reduced. The aggregated channel, composed of the cascaded and direct channels, is estimated using the linear minimum mean square error (LMMSE) technique. Based on the estimated channel, we derive a closed-form expression for a lower bound on the achievable rate. The power scaling laws of the active RIS-aided system are then investigated based on the theoretical expressions. When the transmit power of each user is scaled down by the number of BS antennas M or the number of reflecting elements N, we find that the thermal noise causes the lower bound on the achievable rate to approach zero as M or N grows to infinity. Moreover, an optimization approach based on the genetic algorithm (GA) is introduced to tackle the phase shift optimization problem. Numerical results reveal that the active RIS can greatly enhance the performance of the considered system under various settings.
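To make the GA-based phase shift search more concrete, the following is a minimal sketch of a genetic algorithm over a vector of N phase shifts. It is an illustration under stated assumptions, not the algorithm or objective used in this paper: the fitness function is a toy surrogate, |sum_n c_n exp(j theta_n)|^2, with hypothetical fixed complex coefficients c_n standing in for statistical channel terms, and the function names, operators (elitism, tournament selection, uniform crossover, Gaussian mutation), and parameter values are all illustrative choices.

```python
import cmath
import math
import random


def fitness(theta, coeffs):
    """Toy surrogate objective |sum_n c_n * exp(j*theta_n)|^2.

    Assumption: this stands in for the rate lower bound, which is not reproduced here.
    """
    return abs(sum(c * cmath.exp(1j * t) for c, t in zip(coeffs, theta))) ** 2


def tournament(pop, scores, rng, k=3):
    """Return the fittest of k individuals drawn at random without replacement."""
    picks = rng.sample(range(len(pop)), k)
    return pop[max(picks, key=lambda i: scores[i])]


def ga_phase_search(coeffs, pop_size=40, generations=150, n_elite=2,
                    mutation_rate=0.1, mutation_std=0.3, seed=0):
    """Search for phase shifts that maximize the toy fitness (hypothetical helper).

    Returns (best_theta, best_score, history), where history holds the best
    score of each generation plus the final population.
    """
    rng = random.Random(seed)
    n = len(coeffs)
    # Each individual is a list of N phase shifts in radians.
    pop = [[rng.uniform(0.0, 2 * math.pi) for _ in range(n)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(ind, coeffs) for ind in pop]
        order = sorted(range(pop_size), key=lambda i: scores[i], reverse=True)
        history.append(scores[order[0]])
        # Elitism: copy the best individuals unchanged into the next generation.
        next_pop = [pop[i][:] for i in order[:n_elite]]
        while len(next_pop) < pop_size:
            p1 = tournament(pop, scores, rng)
            p2 = tournament(pop, scores, rng)
            # Uniform crossover: each phase comes from either parent.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Gaussian mutation, wrapped back onto the 2*pi phase circle.
            child = [(t + rng.gauss(0.0, mutation_std)) % (2 * math.pi)
                     if rng.random() < mutation_rate else t
                     for t in child]
            next_pop.append(child)
        pop = next_pop
    scores = [fitness(ind, coeffs) for ind in pop]
    best = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best])
    return pop[best], scores[best], history
```

For this toy objective the optimum is known in closed form (theta_n = -arg(c_n), giving (sum_n |c_n|)^2), so the GA output can be checked against that bound. The rate lower bound has no such simple maximizer in general, which is one reason a heuristic search is used.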