Position-aided beam selection methods have been shown to be an effective approach to achieving high beamforming gain while limiting the overhead and latency of initial access in millimeter wave (mmWave) communications. Most research in the area, however, has focused on vehicular applications, where the orientation of the user terminal (UT) is mostly fixed at each position in the environment. This paper proposes a location- and orientation-based beam selection method to enable context information (CI)-based beam alignment in applications where the UT can take an arbitrary orientation at each location. We propose three network structures with different numbers of trainable parameters, suited to different training dataset sizes. A professional 3-dimensional ray tracing tool is used to generate datasets for an IEEE standard indoor scenario. Numerical results show that the proposed networks outperform both the generalized inverse fingerprinting (GIFP) method, a CI-aided benchmark, and hierarchical beam search, a non-CI-based approach. Moreover, compared to the GIFP method, the proposed deep learning-based beam selection shows higher robustness to differing line-of-sight blockage probabilities in the training and test datasets and lower sensitivity to inaccuracies in the position and orientation information.