We develop new algorithms for simultaneously learning multiple tasks (e.g., image classification, depth estimation) and for adapting to unseen task/domain distributions within those high-level tasks (e.g., different environments). First, we learn common representations underlying all tasks. We then propose an attention mechanism that dynamically specializes the network, at runtime, for each task. Our approach weights each feature map of the backbone network according to its relevance to a particular task. To achieve this, the attention module learns task representations during training, which are then used to compute the attention weights. Our method improves performance on new, previously unseen environments and is 1.5x faster than standard meta-learning methods using similar architectures. We highlight performance improvements for multi-task meta-learning across four tasks (image classification, depth, vanishing point, and surface normal estimation), each evaluated over 10 to 25 test domains/environments, a result that could not be achieved with standard meta-learning techniques such as MAML.
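To make the mechanism concrete, the sketch below shows one way such task-conditioned feature-map weighting could look in PyTorch: a learned per-task embedding is projected to one sigmoid weight per channel, and each backbone feature map is scaled by its weight. The module name `TaskAttention`, the embedding-based task representation, and all dimensions are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (assumed, not the paper's exact architecture): a
# task-conditioned channel-attention module that re-weights backbone
# feature maps by their relevance to a given task.
import torch
import torch.nn as nn

class TaskAttention(nn.Module):
    def __init__(self, num_tasks: int, num_channels: int, embed_dim: int = 64):
        super().__init__()
        # Task representations, learned jointly with the rest of the network.
        self.task_embed = nn.Embedding(num_tasks, embed_dim)
        # Project a task representation to one weight in (0, 1) per feature map.
        self.to_weights = nn.Sequential(
            nn.Linear(embed_dim, num_channels),
            nn.Sigmoid(),
        )

    def forward(self, features: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        # features: (batch, channels, H, W) from the shared backbone;
        # task_id:  (batch,) integer task indices.
        w = self.to_weights(self.task_embed(task_id))  # (batch, channels)
        # Scale each feature map by its task-relevance weight.
        return features * w[:, :, None, None]

# Usage: specialize shared backbone features for one task at runtime.
attn = TaskAttention(num_tasks=4, num_channels=256)
feats = torch.randn(8, 256, 32, 32)                  # stand-in backbone output
task_ids = torch.full((8,), 2, dtype=torch.long)     # all samples from task 2
specialized = attn(feats, task_ids)                  # (8, 256, 32, 32)
```

Because the backbone is shared and only the lightweight gating depends on the task, the same network can be specialized per task at inference without duplicating feature extraction.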