Sparse time-frequency (T-F) representations have been an important research topic for more than several decades. Among them, optimization-based methods (in particular, extensions of basis pursuit) allow us to design the representations through objective functions. Since acoustic signal processing utilizes models of spectrogram, the flexibility of optimization-based T-F representations is helpful for adjusting the representation for each application. However, acoustic applications often require models of \textit{magnitude} of T-F representations obtained by discrete Gabor transform (DGT). Adjusting a T-F representation to such a magnitude model (e.g., smoothness of magnitude of DGT coefficients) results in a non-convex optimization problem that is difficult to solve. In this paper, instead of tackling difficult non-convex problems, we propose a convex optimization-based framework that realizes a T-F representation whose magnitude has characteristics specified by the user. We analyzed the properties of the proposed method and provide numerical examples of sparse T-F representations having, e.g., low-rank or smooth magnitude, which have not been realized before.