Deep neural networks trained to predict cellular events from DNA sequence have emerged as tools for elucidating the biological mechanisms underlying the associations identified in genome-wide association studies. To improve training, previous works have commonly used multi-task learning (MTL) when trained networks were needed for multiple profiles differing in either event modality or cell type. All existing works adopted a simple MTL framework in which all tasks share a single feature extraction network. Such a strategy, although effective to a certain extent, leads to substantial negative transfer: for a large portion of tasks, models obtained through MTL perform worse than those obtained through single-task learning. Methods have been developed to address negative transfer in other domains, such as computer vision. However, these methods are generally difficult to scale to a large number of tasks. In this paper, we propose a highly scalable task grouping framework that addresses negative transfer by jointly training only tasks that are potentially beneficial to each other. The proposed method exploits the network weights associated with task-specific classification heads, which can be cheaply obtained by one-time joint training of all tasks. Our results on a dataset consisting of 367 epigenetic profiles demonstrate the effectiveness of the proposed approach and its superiority over baseline methods.
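
To make the idea of grouping tasks by their classification-head weights concrete, the following is a minimal sketch rather than the method proposed here. It assumes that each task head is a single linear layer whose weight vector is available after joint training, that similarity between tasks is measured by cosine similarity of these vectors, and that groups are formed greedily with a fixed cutoff. The function name `group_tasks_by_head_weights`, the `threshold` parameter, and the toy weights are all hypothetical and chosen only for illustration.

```python
import numpy as np


def group_tasks_by_head_weights(head_weights, threshold=0.5):
    """Greedily group tasks whose head weight vectors are similar.

    head_weights: array of shape (n_tasks, n_features), one row per task head
        (assumed to come from one joint training run over all tasks).
    threshold: hypothetical cosine-similarity cutoff for sharing a group.
    Returns a list of groups, each a list of task indices in ascending order.
    """
    w = np.asarray(head_weights, dtype=float)

    # Normalize each row so that the dot product equals cosine similarity.
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    unit = w / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T

    unassigned = list(range(w.shape[0]))
    groups = []
    while unassigned:
        # Start a new group from the lowest-index task not yet assigned.
        group = [unassigned.pop(0)]
        for t in list(unassigned):
            # A task joins only if it is similar to every current member.
            if all(sim[t, m] >= threshold for m in group):
                group.append(t)
                unassigned.remove(t)
        groups.append(group)
    return groups


# Toy example: four tasks with two-dimensional head weights (made-up values).
heads = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
])
groups = group_tasks_by_head_weights(heads, threshold=0.5)
print(groups)  # [[0, 1], [2, 3]]
```

In this sketch, each resulting group would then be trained jointly as its own multi-task model. Because the similarity matrix is computed from weights produced by a single joint training run, the cost of grouping grows with the number of task pairs rather than with the number of candidate groupings that would otherwise need to be trained and compared.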