This paper introduces a convolutional autoencoder (CAE) architecture for peak-to-average power ratio (PAPR) reduction and waveform design in orthogonal frequency division multiplexing (OFDM) systems. The proposed architecture integrates a PAPR reduction block and a non-linear high power amplifier (HPA) model. We apply gradual loss learning for multi-objective optimization. We analyze the model's performance by examining the bit error rate (BER), the PAPR, and the spectral response, and by comparing them with those of common PAPR reduction algorithms.
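
For readers unfamiliar with the metric, the following minimal sketch illustrates how the PAPR of a single OFDM symbol can be computed as the ratio of peak instantaneous power to mean power, expressed in dB. It is an illustration only and not the proposed CAE: the QPSK mapping, the 64-subcarrier size, the absence of oversampling, and the helper names `idft`, `papr_db`, and `random_qpsk` are all assumptions made for this example.

```python
import cmath
import math
import random


def idft(freq_symbols):
    """Naive inverse DFT: maps N subcarrier symbols to N time-domain samples."""
    n = len(freq_symbols)
    return [
        sum(freq_symbols[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr_db(samples):
    """PAPR in dB: peak instantaneous power divided by mean power."""
    powers = [abs(s) ** 2 for s in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def random_qpsk(n, rng):
    """Draw n unit-power QPSK symbols (hypothetical mapping for illustration)."""
    a = 1 / math.sqrt(2)
    return [complex(rng.choice((-a, a)), rng.choice((-a, a))) for _ in range(n)]


rng = random.Random(0)              # fixed seed so the run is repeatable
symbols = random_qpsk(64, rng)      # assumed: 64 subcarriers, all carrying data
ofdm_symbol = idft(symbols)         # time-domain OFDM symbol, no oversampling

print(f"PAPR of the constant-modulus QPSK symbols: {papr_db(symbols):.2f} dB")
print(f"PAPR of the time-domain OFDM symbol:       {papr_db(ofdm_symbol):.2f} dB")
```

The constant-modulus QPSK symbols have a PAPR of about 0 dB, whereas the time-domain signal after the IDFT generally exhibits a noticeably higher PAPR, bounded above by 10 log10(N) dB for N subcarriers. This growth is what PAPR reduction schemes aim to limit before the signal reaches the HPA.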