Cone-beam computed tomography (CBCT) has become a vital imaging technique in various medical fields but scatter artifacts are a major limitation in CBCT scanning. This challenge is exacerbated by the use of large flat panel 2D detectors. The scatter-to-primary ratio increases significantly with the increase in the size of FOV being scanned. Several deep learning methods, particularly U-Net architectures, have shown promising capabilities in estimating the scatter directly from the CBCT projections. However, the influence of varying FOV sizes on these deep learning models remains unexplored. Having a single neural network for the scatter estimation of varying FOV projections can be of significant importance towards real clinical applications. This study aims to train and evaluate the performance of a U-Net network on a simulated dataset with varying FOV sizes. We further propose a new method (Aux-Net) by providing auxiliary information, such as FOV size, to the U-Net encoder. We validate our method on 30 different FOV sizes and compare it with the U-Net. Our study demonstrates that providing auxiliary information to the network enhances the generalization capability of the U-Net. Our findings suggest that this novel approach outperforms the baseline U-Net, offering a significant step towards practical application in real clinical settings where CBCT systems are employed to scan a wide range of FOVs.