Full waveform inversion (FWI) updates a velocity model by minimizing the misfit between observed and simulated data. However, discretization errors in numerical modeling and incomplete seismic data acquisition introduce noise that propagates through the adjoint operator and degrades the velocity gradient, and with it the accuracy of the inversion. To mitigate the influence of this noise on the gradient, we use a convolutional neural network (CNN) to refine the velocity model before the forward simulation, with the aim of suppressing noise and providing a more accurate velocity update direction. The same data misfit loss updates both the velocity and the network parameters, so the procedure is self-supervised. We propose two implementation schemes, which differ in whether the velocity update passes through the CNN. In both schemes, the velocity representation is extended (VRE) with a neural network in addition to the grid-based velocities. We therefore refer to this general approach as VRE-FWI. Synthetic and real data tests demonstrate that the proposed VRE-FWI achieves higher velocity inversion accuracy than traditional FWI, at a marginal additional computational cost of approximately 1%.
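
One way to write down the joint update described above is sketched here. The notation is an assumption introduced for illustration and does not come from the source: $v$ denotes the grid-based velocities, $\theta$ the CNN weights, $\mathcal{N}_\theta$ the CNN, $F$ the forward-modeling operator, $d_{\mathrm{obs}}$ the observed data, and $\alpha$, $\beta$ step sizes. The last two lines are one plausible reading of the two schemes, not necessarily the exact update rules used in the paper.

```latex
% Assumed notation (not from the source): v = grid-based velocities,
% \theta = CNN weights, \mathcal{N}_\theta = CNN, F = forward-modeling operator,
% d_{obs} = observed data, \alpha and \beta = step sizes.
\begin{align}
  m &= \mathcal{N}_\theta(v)
     && \text{refined model fed to the forward simulation}, \\
  J(v,\theta) &= \tfrac{1}{2}\,\bigl\| F(m) - d_{\mathrm{obs}} \bigr\|_2^2
     && \text{shared data misfit}, \\
  g &= \frac{\partial J}{\partial m}
     = \Bigl(\frac{\partial F}{\partial m}\Bigr)^{\!\top}\bigl(F(m) - d_{\mathrm{obs}}\bigr)
     && \text{adjoint-state gradient at } m, \\
  \theta &\leftarrow \theta - \beta\,
     \Bigl(\frac{\partial \mathcal{N}_\theta}{\partial \theta}\Bigr)^{\!\top} g
     && \text{network update}, \\
  v &\leftarrow v - \alpha\,
     \Bigl(\frac{\partial \mathcal{N}_\theta}{\partial v}\Bigr)^{\!\top} g
     && \text{velocity update passing through the CNN}, \\
  v &\leftarrow v - \alpha\, g
     && \text{velocity update bypassing the CNN (assumed form)}.
\end{align}
```

In this sketch no labeled velocity models are needed: both $v$ and $\theta$ are driven only by the data residual $F(m) - d_{\mathrm{obs}}$, which is what makes the procedure self-supervised.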