Johan Ravn

Occode: an end-to-end machine learning pipeline for transcription of historical population censuses

Jun 07, 2021

Bjørn-Richard Pedersen, Einar Holsbø, Trygve Andersen, Nikita Shvetsov, Johan Ravn, Hilde Leikny Sommerseth, Lars Ailo Bongo

Abstract: Machine learning approaches achieve high accuracy for text recognition and are therefore increasingly used for the transcription of handwritten historical sources. However, using machine learning in production requires a streamlined end-to-end machine learning pipeline that scales to the dataset size, and a model that achieves high accuracy with few manual transcriptions. In addition, the correctness of the model results must be verified. This paper describes the lessons we learned while developing, tuning, and using the Occode end-to-end machine learning pipeline to transcribe 7.3 million rows of handwritten occupation codes in the Norwegian 1950 population census. We achieve an accuracy of 97% for the automatically transcribed codes, and we send 3% of the codes for manual verification. We verify that the occupation code distribution found in our result matches the distribution found in our training data, which should be representative of the census as a whole. We believe our approach and lessons learned are useful for other transcription projects that plan to use machine learning in production. The source code is available at: https://github.com/uit-hdl/rhd-codes
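The distribution check mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two distributions were compared, so the total variation distance used here is an assumption chosen for illustration, and the function names and sample codes are hypothetical rather than taken from the Occode repository.

```python
from collections import Counter


def relative_frequencies(codes):
    """Map each occupation code to its share of all rows."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


def total_variation_distance(p, q):
    """Half the sum of absolute differences between two distributions.

    0.0 means the distributions are identical, 1.0 means they are disjoint.
    This metric is an assumption; the paper may use a different comparison.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical toy data: codes from manual training labels vs. model output.
training_codes = ["111", "111", "222", "333"]
predicted_codes = ["111", "222", "222", "333"]

tvd = total_variation_distance(
    relative_frequencies(training_codes),
    relative_frequencies(predicted_codes),
)
print(f"total variation distance: {tvd:.2f}")
```

A small distance would suggest the transcribed codes follow roughly the same distribution as the training labels; what counts as "small" depends on the project and is not specified in the abstract.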

Convolutional neural network for breathing phase detection in lung sounds

Mar 25, 2019

Cristina Jácome, Johan Ravn, Einar Holsbø, Juan Carlos Aviles-Solis, Hasse Melbye, Lars Ailo Bongo

Abstract: We applied deep learning to create an algorithm for breathing phase detection in lung sound recordings, and we compared the breathing phases detected by the algorithm with those manually annotated by two experienced lung sound researchers. Our algorithm uses a convolutional neural network with spectrograms as the features, removing the need to specify features explicitly. We trained and evaluated the algorithm using three subsets that are larger than previously seen in the literature. We evaluated the performance of the method in two ways. First, a discrete count of agreed breathing phases (using 50% overlap between a pair of boxes) shows a mean agreement with lung sound experts of 97% for inspiration and 87% for expiration. Second, the fraction of time of agreement (in seconds) gives higher pseudo-kappa values for inspiration (0.73-0.88) than for expiration (0.63-0.84), showing an average sensitivity of 97% and an average specificity of 84%. With both evaluation methods, the agreement between the annotators and the algorithm shows human-level performance for the algorithm. The developed algorithm is valid for detecting breathing phases in lung sound recordings.
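The first evaluation method, counting agreed breathing phases at 50% overlap, can be sketched as follows. The abstract does not define how overlap is measured, so this sketch assumes intersection-over-union on time intervals and a simple greedy one-to-one matching; both choices, along with the function names and sample intervals, are hypothetical and may differ from the paper's procedure.

```python
def overlap_ratio(a, b):
    """Intersection-over-union of two (start, end) intervals in seconds.

    Using IoU as the overlap measure is an assumption for illustration.
    """
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def count_agreed_phases(reference, predicted, threshold=0.5):
    """Count reference phases matched by an unused predicted phase.

    Each predicted interval can be matched at most once (greedy matching).
    """
    used = set()
    agreed = 0
    for ref in reference:
        for i, pred in enumerate(predicted):
            if i not in used and overlap_ratio(ref, pred) >= threshold:
                used.add(i)
                agreed += 1
                break
    return agreed


# Hypothetical annotations: (start, end) of inspiration phases in seconds.
annotator_phases = [(0.0, 1.0), (2.0, 3.0)]
algorithm_phases = [(0.1, 1.1), (2.6, 3.4)]

agreed = count_agreed_phases(annotator_phases, algorithm_phases)
print(f"agreed phases: {agreed} of {len(annotator_phases)}")
```

In this toy data the first pair overlaps well above the threshold while the second does not, so only one of the two annotated phases counts as agreed.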
