Positron Emission Tomography (PET) is a vital imaging modality widely used in clinical diagnosis and preclinical research but faces limitations in image resolution and signal-to-noise ratio due to inherent physical degradation factors. Current deep learning-based denoising methods face challenges in adapting to the variability of clinical settings, influenced by factors such as scanner types, tracer choices, dose levels, and acquisition times. In this work, we proposed a novel 3D ControlNet-based denoising method for whole-body PET imaging. We first pre-trained a 3D Denoising Diffusion Probabilistic Model (DDPM) using a large dataset of high-quality normal-dose PET images. Following this, we fine-tuned the model on a smaller set of paired low- and normal-dose PET images, integrating low-dose inputs through a 3D ControlNet architecture, thereby making the model adaptable to denoising tasks in diverse clinical settings. Experimental results based on clinical PET datasets show that the proposed framework outperformed other state-of-the-art PET image denoising methods both in visual quality and quantitative metrics. This plug-and-play approach allows large diffusion models to be fine-tuned and adapted to PET images from diverse acquisition protocols.