Detecting faults in seismic data is a crucial step in seismic structural interpretation, reservoir characterization, and well placement. Some recent works treat it as an image segmentation task. Image segmentation requires a large number of labels, and 3D seismic data is particularly costly to annotate because of its complex structure and heavy noise, so annotation demands expert experience and a huge workload. In this study, we present the {\lambda}-BCE and {\lambda}-smooth L1 losses, which effectively train a 3D CNN using only some labeled slices of the 3D seismic data, so that the model can learn to segment 3D seismic data from a few 2D slices. To fully extract information from limited data and suppress seismic noise, we propose an attention module that can be trained with active supervision and embedded in the network. The attention heatmap target is generated from the original label and supervises the attention module through the {\lambda}-smooth L1 loss. Experiments demonstrate the effectiveness of our loss functions and attention module. They also show that our method can extract 3D seismic features from labels on a few 2D slices, and that its segmentation performance is state-of-the-art. Using only 3.3% of the labels, we achieve performance similar to that obtained with all labels. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
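The idea of training a 3D network from labels on only a few 2D slices can be illustrated with a masked loss. The sketch below is not the paper's definition of the {\lambda}-BCE or {\lambda}-smooth L1 loss, which this abstract does not give. It assumes, purely for illustration, that {\lambda} acts as a per-voxel 0/1 mask that is 1 on labeled slices and 0 elsewhere, so that unlabeled voxels contribute nothing to the loss. The function names `slice_mask`, `masked_bce`, and `masked_smooth_l1` are hypothetical, and NumPy stands in for whatever framework the authors used.

```python
import numpy as np


def slice_mask(shape, labeled_indices, axis=0):
    """Build a 0/1 mask that is 1 on the labeled slices along `axis` (assumed role of lambda)."""
    mask = np.zeros(shape, dtype=np.float64)
    index = [slice(None)] * len(shape)
    index[axis] = list(labeled_indices)
    mask[tuple(index)] = 1.0
    return mask


def masked_bce(pred, target, mask, eps=1e-7):
    """Binary cross-entropy averaged over labeled voxels only."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    bce = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    n_labeled = mask.sum()
    if n_labeled == 0:
        return 0.0
    return float((mask * bce).sum() / n_labeled)


def masked_smooth_l1(pred, target, mask):
    """Standard smooth L1 (threshold 1) averaged over labeled voxels only."""
    diff = np.abs(pred - target)
    per_voxel = np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)
    n_labeled = mask.sum()
    if n_labeled == 0:
        return 0.0
    return float((mask * per_voxel).sum() / n_labeled)


# Illustrative volume: 30 slices of 8 x 8 voxels with one labeled slice.
# The 1-in-30 ratio is chosen only so the labeled fraction is close to 3.3%;
# the paper's actual slice-sampling scheme is not stated in this abstract.
rng = np.random.default_rng(0)
shape = (30, 8, 8)
target = (rng.random(shape) > 0.9).astype(np.float64)  # synthetic fault labels
pred = rng.random(shape)                               # synthetic network output
mask = slice_mask(shape, labeled_indices=[14])

bce_loss = masked_bce(pred, target, mask)
l1_loss = masked_smooth_l1(pred, target, mask)
```

Under this assumption the network still processes the full 3D volume, but gradients would flow only from the labeled slices, which is one way a model could learn volumetric features from sparse 2D annotation.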