Recent randomised clinical trials have shown that patients with ischaemic stroke {due to occlusion of a large intracranial blood vessel} benefit from endovascular thrombectomy. However, predicting outcome of treatment in an individual patient remains a challenge. We propose a novel deep learning approach to directly exploit multimodal data (clinical metadata information, imaging data, and imaging biomarkers extracted from images) to estimate the success of endovascular treatment. We incorporate an attention mechanism in our architecture to model global feature inter-dependencies, both channel-wise and spatially. We perform comparative experiments using unimodal and multimodal data, to predict functional outcome (modified Rankin Scale score, mRS) and achieve 0.75 AUC for dichotomised mRS scores and 0.35 classification accuracy for individual mRS scores.