The free energy principle (FEP), an encompassing framework and a unified brain theory, has been widely applied to problems in fields such as cognitive science, neuroscience, social interaction, and hermeneutics. As a computational model deeply rooted in mathematics and statistics, the FEP poses an optimization problem based on variational Bayes, which in practice is solved by either dynamic programming or expectation maximization. However, extending the FEP to machine learning and implementing such models with neural networks appears to remain a bottleneck. This paper makes a preliminary attempt at bridging the FEP and machine learning via a classical neural network model, the Helmholtz machine. As a variational machine learning model, the Helmholtz machine is optimized by minimizing its free energy, the same objective as in the FEP. Although the Helmholtz machine is not temporal, it offers an ideal parallel to the vanilla FEP and the hierarchical model of the brain, under which active inference and predictive coding can be formulated coherently. Besides a detailed theoretical discussion, the paper presents a preliminary experiment to validate the hypothesis. Fine-tuning the trained neural network through active inference raises the model's accuracy above 99\%. Meanwhile, as a result of active sampling, the data distribution is continuously deformed toward a salience that conforms to the model's representation.