Although the reconfigurable intelligent surface (RIS) is a promising technology for shaping the propagation environment, its single-layer structure inherently limits the number of beam steering patterns it can realize. Building on the recent, revolutionary stacked intelligent metasurface (SIM) technology, we propose deploying a SIM not only at the base station (BS) in a massive multiple-input multiple-output (mMIMO) setup but also in the intermediate space between the BS and the users, to further adjust the environment as needed. For convenience, we call the former BS SIM (BSIM) and the latter channel SIM (CSIM). Hence, we achieve wave-based combining at the BS and wave-based configuration in the intermediate space. Specifically, we propose a channel estimation method with reduced overhead, which is crucial for SIM-assisted communications. Next, we derive the uplink sum spectral efficiency (SE) in closed form in terms of statistical channel state information (CSI). Notably, we optimize the phase shifts of both the BSIM and the CSIM simultaneously by using the projected gradient ascent method (PGAM). Compared to previous works on SIMs, we study the uplink transmission, an mMIMO setup, channel estimation in a single phase, a second SIM in the intermediate space, and simultaneous optimization of the two SIMs. Simulation results show the impact of various parameters on the sum SE and demonstrate the superiority of our optimization approach over the alternating optimization (AO) method.