Convolutional Neural Networks (CNNs) have seen great success in the recent past, driven by the advent of faster GPUs and memory access. CNNs are powerful because they learn features from data in layers, mirroring the structure of V1 features in the human visual cortex. A major bottleneck, however, is that CNNs are very large and have a high memory footprint, so they cannot be deployed on devices with limited storage such as mobile phones and IoT devices. In this work, we study the model complexity versus accuracy trade-off on the MNIST dataset and present a concrete framework for handling this problem, given the worst-case accuracy that a system can tolerate. We reduce model complexity by 236 times and memory footprint by 19.5 times compared to the base model while still meeting the worst-case accuracy threshold.