This study aims to establish a computer-aided diagnostic system for lung lesions using bronchoscope endobronchial ultrasound (EBUS) to assist physicians in identifying lesion areas. During EBUS-transbronchial needle aspiration (EBUS-TBNA) procedures, physicians rely on grayscale ultrasound images to determine the location of lesions. However, these images often contain significant noise and can be influenced by surrounding tissues or blood vessels, making interpretation challenging. Previous research has lacked the application of object detection models to EBUS-TBNA, and there has been no well-defined solution for annotating the EBUS-TBNA dataset. In related studies on ultrasound images, although models have been successful in capturing target regions for their respective tasks, their training and predictions have been based on two-dimensional images, limiting their ability to leverage temporal features for improved predictions. This study introduces a three-dimensional image-based object detection model. It utilizes an attention mechanism to capture temporal correlations and we will implements a filtering mechanism to select relevant information from previous frames. Subsequently, a teacher-student model training approach is employed to optimize the model further, leveraging unlabeled data. To mitigate the impact of poor-quality pseudo-labels on the student model, we will add a special Gaussian Mixture Model (GMM) to ensure the quality of pseudo-labels.