Object detection in natural images has achieved remarkable results over the years. However, similar progress has not yet been observed in aerial object detection, owing to several challenges such as high-resolution images, variation in instance scale, and class imbalance. We evaluate the performance of two-stage, one-stage, and attention-based object detectors on the iSAID dataset. Furthermore, we describe the modifications and analyses performed for the different models: (a) for the two-stage detector, we introduce a weighted attention-based FPN, a class-balanced sampler, and a density prediction head; (b) for the one-stage detector, we use a weighted focal loss and introduce an FPN; (c) for the attention-based detector, we compare single-scale and multi-scale attention and demonstrate the effect of different backbones. Finally, we present a comparative study highlighting the pros and cons of the different models in the aerial imagery setting.