Abstract:Analysis of 1H-NMR spectra is often hindered by large variations that occur during the collection of these spectra. Large solvent and standard peaks, base line drift and negative peaks (due to improper phasing) are among some of these variations. Furthermore, some instrument dependent alterations, such as incorrect shimming, are also embedded in the recorded spectrum. The unpredictable nature of these alterations of the signal has rendered the automated and instrument independent computer analysis of these spectra unreliable. In this paper, a novel method of extracting the information content of a signal (in this paper, frequency domain 1H-NMR spectrum), called the frequency-information transformation (FIT), is presented and compared to a previously used method (SPUTNIK). FIT can successfully extract the relevant information to a pattern matching task present in a signal, while discarding the remainder of a signal by transforming a Fourier transformed signal into an information spectrum (IS). This technique exhibits the ability of decreasing the inter-class correlation coefficients while increasing the intra-class correlation coefficients. Different spectra of the same molecule, in other words, will resemble more to each other while the spectra of different molecules will look more different from each other. This feature allows easier automated identification and analysis of molecules based on their spectral signatures using computer algorithms.
Abstract:This study reports the development of a pattern recognition search engine for a World Wide Web-based database of gas chromatography-electron impact mass spectra (GC-EIMS) of partially methylated Alditol Acetates (PMAAs). Here, we also report comparative results for two pattern recognition techniques that were employed for this study. The first technique is a statistical technique using Bayesian classifiers and Parzen density estimators. The second technique involves an artificial neural network module trained with reinforcement learning. We demonstrate here that both systems perform well in identifying spectra with small amounts of noise. Both system's performance degrades with degrading signal-to-noise ratio (SNR). When dealing with partial spectra (missing data), the artificial neural network system performs better. The developed system is implemented on the world wide web, and is intended to identify PMAAs using submitted spectra of these molecules recorded on any GC-EIMS instrument. The system, therefore, is insensitive to instrument and column dependent variations in GC-EIMS spectra.
Abstract:A new neural network architecture (PSCNN) is developed to improve performance and speed of such networks. The architecture has all the advantages of the previous models such as self-organization and possesses some other superior characteristics such as input parallelism and decision making based on consensus. Due to the properties of this network, it was studied with respect to implementation on a Parallel Processor (Ncube Machine) as well as a regular sequential machine. The architecture self organizes its own modules in a way to maximize performance. Since it is completely parallel, both recall and learning procedures are very fast. The performance of the network was compared to the Backpropagation networks in problems of language perception, remote sensing and binary logic (Exclusive-Or). PSCNN showed superior performance in all cases studied.
Abstract:Proton nuclear magnetic resonance (1H-NMR) is a widely used tool for chemical structural analysis. However, 1H-NMR spectra suffer from natural aberrations that render computer-assisted automated identification of these spectra difficult, and at times impossible. Previous efforts have successfully implemented instrument dependent or conditional identification of these spectra. In this paper, we report the first instrument independent computer-assisted automated identification system for a group of complex carbohydrates known as the xyloglucan oligosaccharides. The developed system is also implemented on the world wide web (http://www.ccrc.uga.edu) as part of an identification package called the CCRC-Net and is intended to recognize any submitted 1H-NMR spectrum of these structures with reasonable signal-to-noise ratio, recorded on any 500 MHz NMR instrument. The system uses Artificial Neural Networks (ANNs) technology and is insensitive to the instrument and environment-dependent variations in 1H-NMR spectroscopy. In this paper, comparative results of the ANN engine versus a multidimensional Bayes' classifier is also presented.