In recent years, neural networks have been used to solve phase retrieval problems in imaging with superior accuracy and speed than traditional techniques, especially in the presence of noise. However, in the context of interferometric imaging, phase noise has been largely unaddressed by existing neural network architectures. Such noise arises naturally in an interferometer due to mechanical instabilities or atmospheric turbulence, limiting measurement acquisition times and posing a challenge in scenarios with limited light intensity, such as remote sensing. Here, we introduce a 3D-2D Phase Retrieval U-Net (PRUNe) that takes noisy and randomly phase-shifted interferograms as inputs, and outputs a single 2D phase image. A 3D downsampling convolutional encoder captures correlations within and between frames to produce a 2D latent space, which is upsampled by a 2D decoder into a phase image. We test our model against a state-of-the-art singular value decomposition algorithm and find PRUNe reconstructions consistently show more accurate and smooth reconstructions, with a x2.5 - 4 lower mean squared error at multiple signal-to-noise ratios for interferograms with low (< 1 photon/pixel) and high (~100 photons/pixel) signal intensity. Our model presents a faster and more accurate approach to perform phase retrieval in extremely low light intensity interferometry in presence of phase noise, and will find application in other multi-frame noisy imaging techniques.