Cancer is one of the most life-threatening diseases worldwide, and head and neck (H&N) cancer is a prevalent type with hundreds of thousands of new cases recorded each year. Clinicians use medical imaging modalities such as computed tomography and positron emission tomography to detect the presence of a tumor, and they combine that information with clinical data for patient prognosis. The process is mostly challenging and time-consuming. Machine learning and deep learning can automate these tasks to help clinicians with highly promising results. This work studies two approaches for H&N tumor segmentation: (i) exploration and comparison of vision transformer (ViT)-based and convolutional neural network-based models; and (ii) proposal of a novel 2D perspective to working with 3D data. Furthermore, this work proposes two new architectures for the prognosis task. An ensemble of several models predicts patient outcomes (which won the HECKTOR 2021 challenge prognosis task), and a ViT-based framework concurrently performs patient outcome prediction and tumor segmentation, which outperforms the ensemble model.