Nowadays, automation is a critical topic due to its significant impacts on the productivity of construction projects. Utilizing automation in this industry brings about great results, such as remarkable improvements in the efficiency, quality, and safety of construction activities. The scope of automation in construction includes a wide range of stages, and monitoring construction projects is no exception. Additionally, it is of great importance in project management since an accurate and timely assessment of project progress enables managers to quickly identify deviations from the schedule and take the required actions at the right time. In this stage, one of the most important tasks is to daily keep track of the project progress, which is very time-consuming and labor-intensive, but automation has facilitated and accelerated this task. It also eliminated or at least decreased the risk of many dangerous tasks. In this way, the first step of construction automation is to detect used materials in a project site automatically. In this paper, a novel deep learning architecture is utilized, called Vision Transformer (ViT), for detecting and classifying construction materials. To evaluate the applicability and performance of the proposed method, it is trained and tested on three large imbalanced datasets, namely Construction Material Library (CML) and Building Material Dataset (BMD), used in the previous papers, as well as a new dataset created by combining them. The achieved results revealed an accuracy of 100 percent in all parameters and also in each material category. It is believed that the proposed method provides a novel and robust tool for detecting and classifying different material types.