Steven Soisson

Classification of Protein Crystallization X-Ray Images Using Major Convolutional Neural Network Architectures

May 11, 2018

Soheil Ghafurian, Peter Orth, Corey Strickland, Hua Su, Sangita Patel, Steven Soisson, Belma Dogdas

Abstract: Generating protein crystals is necessary for studying protein molecular function and structure. This is done empirically by running large numbers of crystallization trials and inspecting them regularly for forming crystals. To avoid missing hard-won crystals, the trial X-ray images are inspected manually rather than with existing machine learning methods, which are less accurate. To achieve higher accuracy for automation, we applied some of the most successful convolutional neural networks (ResNet, Inception, VGG, and AlexNet) to 10-way classification of the X-ray images. These networks gave substantially higher classification accuracy than two simpler networks previously proposed for this purpose. ResNet achieved the best accuracy (81.43%), which corresponds to a missed crystal rate of 5.9%. This rate could be lowered to less than 0.1% by using a top-3 classification strategy. Our dataset consisted of 486,000 internally annotated images, which were augmented to more than a million to address class imbalance. We also provide a label-wise analysis of the results, identifying the main sources of error and inaccuracy.
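
The abstract does not spell out how the top-3 strategy is scored, so the following is only a minimal sketch of one plausible reading: a true-crystal image counts as missed when none of the classifier's k highest-scoring labels is a crystal label. The function names, the toy scores, and the choice of which label indices count as "crystal" are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Set


def top_k_classes(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def missed_crystal_rate(
    all_scores: Sequence[Sequence[float]],
    true_labels: Sequence[int],
    crystal_classes: Set[int],
    k: int = 1,
) -> float:
    """Fraction of true-crystal images with no crystal class in the top k.

    Assumption: this definition of a "miss" is illustrative only; the
    paper may define its missed crystal rate differently.
    """
    crystal_total = 0
    missed = 0
    for scores, label in zip(all_scores, true_labels):
        if label not in crystal_classes:
            continue  # only images that truly contain a crystal can be missed
        crystal_total += 1
        if not crystal_classes.intersection(top_k_classes(scores, k)):
            missed += 1
    return missed / crystal_total if crystal_total else 0.0


# Toy example with 4 classes instead of 10; class 0 plays the crystal label.
toy_scores = [
    [0.70, 0.20, 0.05, 0.05],  # crystal, ranked first
    [0.30, 0.40, 0.20, 0.10],  # crystal, ranked second
    [0.10, 0.60, 0.20, 0.10],  # not a crystal, ignored by the metric
    [0.05, 0.50, 0.30, 0.15],  # crystal, ranked last
]
toy_labels = [0, 0, 1, 0]
toy_crystal_classes = {0}

rate_top1 = missed_crystal_rate(toy_scores, toy_labels, toy_crystal_classes, k=1)
rate_top3 = missed_crystal_rate(toy_scores, toy_labels, toy_crystal_classes, k=3)
```

On this toy data, widening the check from the top-1 to the top-3 labels lowers the missed rate from two of three crystal images to one of three. That mirrors the direction of the trade-off described in the abstract: fewer missed crystals, at the cost of flagging more images for manual review.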
