This paper focuses on the design of beamforming codebooks that maximize the average normalized beamforming gain for any underlying channel distribution. While the existing techniques use statistical channel models, we utilize a model-free data-driven approach with foundations in machine learning to generate beamforming codebooks that adapt to the surrounding propagation conditions. The key technical contribution lies in reducing the codebook design problem to an unsupervised clustering problem on a Grassmann manifold where the cluster centroids form the finite-sized beamforming codebook for the channel state information (CSI), which can be efficiently solved using K-means clustering. This approach is extended to develop a remarkably efficient procedure for designing product codebooks for full-dimension (FD) multiple-input multiple-output (MIMO) systems with uniform planar array (UPA) antennas. Simulation results demonstrate the capability of the proposed design criterion in learning the codebooks, reducing the codebook size and producing noticeably higher beamforming gains compared to the existing state-of-the-art CSI quantization techniques.