Abstract:One of the major challenges for the agricultural industry today is the uncertainty in manual labor availability and the associated cost. Automated flower and fruit density estimation, localization, and counting could help streamline harvesting, yield estimation, and crop-load management strategies such as flower and fruitlet thinning. This article proposes a deep regression-based network, AgRegNet, to estimate density, count, and location of flower and fruit in tree fruit canopies without explicit object detection or polygon annotation. Inspired by popular U-Net architecture, AgRegNet is a U-shaped network with an encoder-to-decoder skip connection and modified ConvNeXt-T as an encoder feature extractor. AgRegNet can be trained based on information from point annotation and leverages segmentation information and attention modules (spatial and channel) to highlight relevant flower and fruit features while suppressing non-relevant background features. Experimental evaluation in apple flower and fruit canopy images under an unstructured orchard environment showed that AgRegNet achieved promising accuracy as measured by Structural Similarity Index (SSIM), percentage Mean Absolute Error (pMAE) and mean Average Precision (mAP) to estimate flower and fruit density, count, and centroid location, respectively. Specifically, the SSIM, pMAE, and mAP values for flower images were 0.938, 13.7%, and 0.81, respectively. For fruit images, the corresponding values were 0.910, 5.6%, and 0.93. Since the proposed approach relies on information from point annotation, it is suitable for sparsely and densely located objects. This simplified technique will be highly applicable for growers to accurately estimate yields and decide on optimal chemical and mechanical flower thinning practices.