In recent years, increasingly large datasets with two different sets of features measured for each sample have become prevalent in many areas of biology. For example, a recently developed method called Patch-seq provides single-cell RNA sequencing data together with electrophysiological measurements of the same neurons. However, efficient and interpretable analysis of such paired data remains a challenge. As a tool for exploration and visualization of Patch-seq data, we introduce neural networks with a two-dimensional bottleneck, trained to predict electrophysiological measurements from gene expression. To make the model biologically interpretable and to perform gene selection, we enforce sparsity with a group lasso penalty, followed by pruning of the input units and subsequent fine-tuning. We applied this method to a recent dataset with $>$1000 neurons from mouse motor cortex and found that the resulting bottleneck model had the same predictive performance as a full-rank linear model with much higher latent dimensionality. Exploring the two-dimensional latent space in terms of neural types showed that the nonlinear bottleneck approach led to much better visualizations and higher biological interpretability.
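
The sparsity step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions. The first-layer weights are stored as one row per input gene. The group lasso penalty is the sum of the Euclidean norms of those rows, scaled by a strength parameter. Pruning keeps the input units with the largest row norms, a criterion chosen here for illustration because the text above does not specify one. The names `row_norms`, `group_lasso_penalty`, `select_genes`, and the toy matrix `W1` are hypothetical.

```python
import math

def row_norms(weights):
    """L2 norm of each row; row j holds all outgoing weights of input unit (gene) j."""
    return [math.sqrt(sum(w * w for w in row)) for row in weights]

def group_lasso_penalty(weights, strength):
    """Group lasso term: strength times the sum of per-gene row norms (assumed grouping)."""
    return strength * sum(row_norms(weights))

def select_genes(weights, n_keep):
    """Indices of the n_keep input units with the largest row norms, in ascending order.

    Ranking by row norm is an assumed pruning criterion, used here for illustration.
    """
    norms = row_norms(weights)
    ranked = sorted(range(len(norms)), key=lambda j: norms[j], reverse=True)
    return sorted(ranked[:n_keep])

# Toy first-layer weights: 4 genes (rows) feeding 2 hidden units (columns).
W1 = [
    [3.0, 4.0],
    [0.0, 0.0],
    [0.6, 0.8],
    [0.0, 0.1],
]

penalty = group_lasso_penalty(W1, strength=0.5)  # added to the prediction loss during training
kept = select_genes(W1, n_keep=2)                # genes that survive pruning
pruned_W1 = [W1[j] for j in kept]                # reduced input layer, to be fine-tuned
```

Because the penalty acts on whole rows rather than on individual weights, it tends to shrink all outgoing weights of an uninformative gene together, which is what makes pruning entire input units natural before fine-tuning the reduced network.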