Barriers to accessing mental health assessments including cost and stigma continues to be an impediment in mental health diagnosis and treatment. Machine learning approaches based on speech samples could help in this direction. In this work, we develop machine learning solutions to diagnose anxiety disorders from audio journals of patients. We work on a novel anxiety dataset (provided through collaboration with Kintsugi Mindful Wellness Inc.) and experiment with several models of varying complexity utilizing audio, text and a combination of multiple modalities. We show that the multi-modal and audio embeddings based approaches achieve good performance in the task achieving an AUC ROC score of 0.68-0.69.