Convolutional neural networks (CNNs) have become a popular research topic in finger vein recognition because of their powerful image feature representation. However, most researchers focus on improving network performance by increasing CNN depth and width, which often incurs a high computational cost. Moreover, pixels differ in importance not only across channels but also across positions within the same channel. To reduce the computational cost while accounting for these differences in pixel importance, we propose a lightweight convolutional neural network with a convolutional block attention module (CBAM) for finger vein recognition, which captures visual structures more accurately through an attention mechanism. First, image sequences are fed into the lightweight CNN we designed to extract improved visual features. The network then learns to assign feature weights adaptively with the help of the CBAM. Experiments on two publicly available databases demonstrate that the proposed method achieves stable, highly accurate, and robust performance in multimodal finger recognition.
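To make the two kinds of pixel importance concrete, the following is a minimal NumPy sketch of how a CBAM-style block can reweight a feature map: a channel attention step first, then a spatial attention step. It is an illustration under stated assumptions, not the network proposed here. The function names, the reduction ratio `r = 4`, the 7x7 spatial kernel, the tensor sizes, and the random (untrained) weights are all hypothetical choices made for the example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Per-channel weights for feat of shape (C, H, W).

    w1: (C // r, C) and w2: (C, C // r) form a shared two-layer MLP
    (hypothetical, untrained weights in this sketch).
    Returns weights of shape (C, 1, 1) with values in (0, 1).
    """
    avg = feat.mean(axis=(1, 2))  # global average pooling -> (C,)
    mx = feat.max(axis=(1, 2))    # global max pooling     -> (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]


def spatial_attention(feat, kernel):
    """Per-position weights for feat of shape (C, H, W).

    kernel: (2, k, k) with odd k, applied as a zero-padded
    cross-correlation over the stacked channel-wise average and max maps.
    Returns weights of shape (1, H, W) with values in (0, 1).
    """
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = feat.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)[None]


def cbam(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to feat."""
    feat = feat * channel_attention(feat, w1, w2)
    return feat * spatial_attention(feat, kernel)


# Illustrative sizes only; these are assumptions, not values from the paper.
rng = np.random.default_rng(0)
C, H, W, r, k = 8, 16, 16, 4, 7
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, k, k)) * 0.1

refined = cbam(feat, w1, w2, kernel)
print(refined.shape)
```

Because both attention maps pass through a sigmoid, every weight lies strictly between 0 and 1, so the block only rescales features and adds few parameters relative to the backbone. In a trained network, `w1`, `w2`, and `kernel` would be learned jointly with the convolutional layers.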