Reconfigurable intelligent surfaces (RISs) are expected to be a key technology in the sixth generation of wireless systems. In this work, we study the application of RISs in a multi-user passive localization scenario with one transmitter (Tx) and multiple asynchronous receivers (Rxs) at known locations. We aim to estimate the locations of multiple users, each equipped with an RIS. The RISs only reflect the signal from the Tx to the Rxs and do not act as active transceivers. Each Rx receives the signal from the Tx directly (LOS path) and via reflection from the RISs (NLOS paths). We show that the users' 3D positions can be estimated with submeter accuracy over a large area around the Tx, using the LOS and NLOS time-of-arrival measurements at the Rxs. We do so by developing the signal model, deriving the Cramér-Rao bounds, and devising an estimator that attains these bounds. Furthermore, by orthogonalizing the RIS phase profiles across users, we circumvent inter-path interference.
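
To illustrate why the LOS and NLOS time-of-arrival measurements at an asynchronous Rx can be combined, the following minimal Python sketch assumes that each Rx has an unknown clock offset that affects both of its paths equally. This is an assumption made here for illustration, not a statement of the signal model developed in the paper. Under that assumption, the difference between the NLOS and LOS arrival times does not depend on the offset and equals the excess path length via the user's RIS divided by the speed of light. The function names and coordinates are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def toa_measurements(tx, rx, user, clock_offset):
    """Return (LOS TOA, NLOS TOA) in seconds at one receiver.

    clock_offset is the receiver's unknown clock offset in seconds,
    assumed here to add equally to both paths (illustrative assumption).
    """
    tx, rx, user = (np.asarray(p, dtype=float) for p in (tx, rx, user))
    los = np.linalg.norm(rx - tx) / C + clock_offset
    nlos = (np.linalg.norm(user - tx) + np.linalg.norm(rx - user)) / C + clock_offset
    return los, nlos


def excess_path_length(los_toa, nlos_toa):
    """Extra distance in meters travelled via the RIS compared with the LOS path."""
    return C * (nlos_toa - los_toa)


# Hypothetical geometry in meters: one Tx, one Rx, one RIS-equipped user.
tx = [0.0, 0.0, 5.0]
rx = [20.0, 0.0, 3.0]
user = [8.0, 6.0, 1.5]

# The same geometry observed by a synchronized and by an offset receiver clock.
los_a, nlos_a = toa_measurements(tx, rx, user, clock_offset=0.0)
los_b, nlos_b = toa_measurements(tx, rx, user, clock_offset=3.7e-6)

excess_a = excess_path_length(los_a, nlos_a)
excess_b = excess_path_length(los_b, nlos_b)
```

In this sketch, `excess_a` and `excess_b` agree up to floating-point error even though the raw arrival times differ by the clock offset. Each such difference constrains the user to an ellipsoid with foci at the Tx and the Rx, so several Rxs at known locations can jointly constrain a 3D position.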