Reconfigurable Intelligent Surface (RIS) is a revolutionizing approach to provide cost-effective yet energy-efficient communications. The transmit beamforming of the base station (BS) and discrete phase shifts of the RIS are jointly optimized to provide high quality of service. However, existing works ignore the high dependence between the large number of phase shifts and estimate them separately, consequently, easily getting trapped into local optima. To investigate the number and distribution of local optima, we conduct a fitness landscape analysis on the sum rate maximization problems. Two landscape features, the fitness distribution correlation and autocorrelation, are employed to investigate the ruggedness of landscape. The investigation results indicate that the landscape exhibits a rugged, multi-modal structure, i.e., has many local peaks, particularly in the cases with large-scale RISs. To handle the multi-modal landscape structure, we propose a novel niching genetic algorithm to solve the sum rate maximization problem. Particularly, a niching technique, nearest-better clustering, is incorporated to partition the population into several neighborhood species, thereby locating multiple local optima and enhance the global search ability. We also present a minimum species size to further improve the convergence speed. Simulation results demonstrate that our method achieves significant capacity gains compared to existing algorithms, particularly in the cases with large-scale RISs.