We study the initial beam acquisition problem in millimeter wave (mm-wave) networks from the perspective of best arm identification in multi-armed bandits (MABs). For stationary environments, we propose a novel algorithm called concurrent beam exploration (CBE), in which multiple beams are grouped based on their beam indices and activated simultaneously to detect the presence of the user. The best beam is then identified using a Hamming decoding strategy. For the case of orthogonal and highly directional thin beams, we characterize the performance of CBE in terms of the probabilities of missed detection and false alarm in a beam group (BG). Leveraging this, we derive the probability of beam selection error and prove that CBE outperforms state-of-the-art strategies in this metric. Then, for abruptly changing environments, e.g., in the case of moving blockages, we characterize the performance of the classical sequential halving (SH) algorithm. In particular, we derive conditions on the distribution of the change under which the beam selection error is exponentially bounded. When the change is restricted to a subset of the beams, we devise a strategy called K-sequential halving and exhaustive search (K-SHES), which yields an improved bound on the beam selection error compared to SH. This policy is particularly useful when a near-optimal beam becomes optimal during the beam-selection procedure due to abruptly changing channel conditions. Finally, we demonstrate the efficacy of the proposed scheme by employing it in a tandem beam refinement and data transmission scheme.
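For readers unfamiliar with the classical SH baseline referred to above, the following is a minimal sketch of textbook sequential halving in a stationary setting. It is not the analysis or the K-SHES policy of this work, and it does not model abrupt changes. The function name `sequential_halving`, the callback `pull`, and the example gains are illustrative assumptions: `pull(i)` stands in for one noisy measurement (for instance, received power) on beam `i`.

```python
import math
import random


def sequential_halving(pull, num_beams, budget):
    """Return the index of the beam estimated to be best.

    pull(i) is a hypothetical callback that returns one noisy reward
    sample for beam i; budget is the total number of samples allowed.
    """
    survivors = list(range(num_beams))
    num_rounds = max(1, math.ceil(math.log2(num_beams)))
    for _ in range(num_rounds):
        if len(survivors) == 1:
            break
        # Split the budget evenly across rounds, then across surviving beams.
        pulls_per_beam = max(1, budget // (len(survivors) * num_rounds))
        means = {
            i: sum(pull(i) for _ in range(pulls_per_beam)) / pulls_per_beam
            for i in survivors
        }
        # Keep the better half (rounded up) according to the empirical mean.
        survivors.sort(key=lambda i: means[i], reverse=True)
        survivors = survivors[: math.ceil(len(survivors) / 2)]
    return survivors[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed mean gains for eight beams; beam 2 is the best by construction.
    true_gain = [0.1, 0.2, 0.9, 0.3, 0.25, 0.15, 0.4, 0.35]
    best = sequential_halving(
        lambda i: rng.gauss(true_gain[i], 0.1), len(true_gain), 800
    )
    print("selected beam:", best)
```

Each round spends an equal share of the budget and discards the worse half of the surviving beams, so the beams that survive longer receive more samples per beam.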