Ultra-widefield (UWF) fundus images are replacing traditional fundus images in screening, detection, prediction, and treatment of complications related to myopia because their much broader visual range is advantageous for highly myopic eyes. Spherical equivalent (SE) is extensively used as the main myopia outcome measure, and axial length (AL) has drawn increasing interest as an important ocular component for assessing myopia. Cutting-edge studies show that SE and AL are strongly correlated. Using the joint information from SE and AL is potentially better than using either separately. In the deep learning community, though there is research on multiple-response tasks with a 3D image biomarker, dependence among responses is only sporadically taken into consideration. Inspired by the spirit that information extracted from the data by statistical methods can improve the prediction accuracy of deep learning models, we formulate a class of multivariate response regression models with a higher-order tensor biomarker, for the bivariate tasks of regression-classification and regression-regression. Specifically, we propose a copula-enhanced convolutional neural network (CeCNN) framework that incorporates the dependence between responses through a Gaussian copula (with parameters estimated from a warm-up CNN) and uses the induced copula-likelihood loss with the backbone CNNs. We establish the statistical framework and algorithms for the aforementioned two bivariate tasks. We show that the CeCNN has better prediction accuracy after adding the dependency information to the backbone models. The modeling and the proposed CeCNN algorithm are applicable beyond the UWF scenario and can be effective with other backbones beyond ResNet and LeNet.