Abstract:Because of their abundance of amino acids, polysaccharides, and many other nutrients that benefit human beings, mushrooms are deservedly popular as dietary cuisine both worldwide and in China. However, if people eat poisonous fungi by mistake, they may suffer from nausea, vomiting, mental disorder, acute anemia, or even death. Each year in China, there are around 8000 people became sick, and 70 died as a result of eating toxic mushrooms by mistake. It is counted that there are thousands of kinds of mushrooms among which only around 900 types are edible, thus without specialized knowledge, the probability of eating toxic mushrooms by mistake is very high. Most people deem that the only characteristic of poisonous mushrooms is a bright colour, however, some kinds of them do not correspond to this trait. In order to prevent people from eating these poisonous mushrooms, we propose to use deep learning methods to indicate whether a mushroom is toxic through analyzing hundreds of edible and toxic mushrooms smartphone pictures. We crowdsource a mushroom image dataset that contains 250 images of poisonous mushrooms and 200 images of edible mushrooms. The Convolutional Neural Network (CNN) is a specialized type of artificial neural networks that use a mathematical operation called convolution in place of general matrix multiplication in at least one of their layers, which can generate a relatively precise result by analyzing a huge amount of images, and thus is very suitable for our research. The experimental results demonstrate that the proposed model has high credibility and can provide a decision-making basis for the selection of edible fungi, so as to reduce the morbidity and mortality caused by eating poisonous mushrooms. We also open source our hand collected mushroom image dataset so that peer researchers can also deploy their own model to advance poisonous mushroom identification.
Abstract:Transgender community is experiencing a huge disparity in mental health conditions compared with the general population. Interpreting the social medial data posted by transgender people may help us understand the sentiments of these sexual minority groups better and apply early interventions. In this study, we manually categorize 300 social media comments posted by transgender people to the sentiment of negative, positive, and neutral. 5 machine learning algorithms and 2 deep neural networks are adopted to build sentiment analysis classifiers based on the annotated data. Results show that our annotations are reliable with a high Cohen's Kappa score over 0.8 across all three classes. LSTM model yields an optimal performance of accuracy over 0.85 and AUC of 0.876. Our next step will focus on using advanced natural language processing algorithms on a larger annotated dataset.
Abstract:The global pandemic caused by COVID-19 affects our lives in all aspects. As of September 11, more than 28 million people have tested positive for COVID-19 infection, and more than 911,000 people have lost their lives in this virus battle. Some patients can not receive appropriate medical treatment due the limits of hospitalization volume and shortage of ICU beds. An estimated future hospitalization is critical so that medical resources can be allocated as needed. In this study, we propose to use 4 recurrent neural networks to infer hospitalization change for the following week compared with the current week. Results show that sequence to sequence model with attention achieves a high accuracy of 0.938 and AUC of 0.850 in the hospitalization prediction. Our work has the potential to predict the hospitalization need and send a warning to medical providers and other stakeholders when a re-surge initializes.
Abstract:ICU readmission is associated with longer hospitalization, mortality and adverse outcomes. An early recognition of ICU re-admission can help prevent patients from worse situation and lower treatment cost. As the abundance of Electronics Health Records (EHR), it is popular to design clinical decision tools with machine learning technique manipulating on healthcare large scale data. We designed data-driven predictive models to estimate the risk of ICU readmission. The discharge summary of each hospital admission was carefully represented by natural language processing techniques. Unified Medical Language System (UMLS) was further used to standardize inconsistency of discharge summaries. 5 machine learning classifiers were adopted to construct predictive models. The best configuration yielded a competitive AUC of 0.748. Our work suggests that natural language processing of discharge summaries is capable to send clinicians warning of unplanned 30-day readmission upon discharge.
Abstract:The prediction of epileptic seizure has always been extremely challenging in medical domain. However, as the development of computer technology, the application of machine learning introduced new ideas for seizure forecasting. Applying machine learning model onto the predication of epileptic seizure could help us obtain a better result and there have been plenty of scientists who have been doing such works so that there are sufficient medical data provided for researchers to do training of machine learning models.