In recent years, there has been an upsurge in a new form of entertainment medium: memes. Although seemingly innocuous, memes have crossed the line into online harassment of women and fostered bias against them. To help alleviate this problem, we propose an early fusion model for identifying misogynistic memes and predicting the type of misogyny, with which we participated in SemEval-2022 Task 5. The model takes as input a meme image and its text transcription, together with a target vector. Since a key challenge of this task is combining different modalities to predict misogyny, our model relies on pretrained contextual representations from several state-of-the-art transformer-based language models and on pretrained image models to obtain an effective image representation. Our model achieved results competitive with those of the other participating teams on both SubTask-A and SubTask-B and significantly outperformed the baselines.
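As a rough illustration of what early fusion means here, the sketch below concatenates a text feature vector and an image feature vector before a single classifier sees them. It is not the system described above: the tiny hand-written vectors stand in for pretrained encoder outputs, and the names `early_fusion`, `FusionClassifier`, and `n_labels`, the random untrained weights, and the label count are all assumptions made for the example.

```python
import math
import random


def early_fusion(text_vec, image_vec):
    """Early fusion: join the per-modality features into one vector
    before any classification happens."""
    return list(text_vec) + list(image_vec)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class FusionClassifier:
    """Hypothetical classifier: one linear layer with sigmoid outputs.

    Weights are random and untrained; a real model would learn them.
    """

    def __init__(self, in_dim, n_labels, seed=0):
        rng = random.Random(seed)
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(n_labels)
        ]
        self.bias = [0.0] * n_labels

    def predict_proba(self, fused):
        # One independent probability per label.
        return [
            sigmoid(sum(w * x for w, x in zip(row, fused)) + b)
            for row, b in zip(self.weights, self.bias)
        ]


# Stand-ins for encoder outputs (assumed values, not real embeddings).
text_vec = [0.2, -0.1, 0.4]   # e.g. from a language model
image_vec = [0.5, 0.3]        # e.g. from an image model

fused = early_fusion(text_vec, image_vec)
clf = FusionClassifier(in_dim=len(fused), n_labels=3)  # label count is a placeholder
probs = clf.predict_proba(fused)
```

The point of the design is that the classifier operates on the joint vector, so it can in principle pick up interactions between the text and the image, which separate per-modality classifiers combined only at the end would not see.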