Many biological and cognitive systems do not operate deep within one regime of activity or another. Instead, they exploit critical surfaces poised at transitions in their parameter space. The pervasiveness of criticality in natural systems suggests that general principles may induce this behaviour. However, conceptual models explaining how embodied agents propel themselves towards these critical points are lacking. In this paper, we present a learning model that drives an embodied Boltzmann Machine towards critical behaviour by maximizing the heat capacity of the network. We test and corroborate the model by implementing an embodied agent in the mountain car benchmark, controlled by a Boltzmann Machine that adjusts its weights according to the model. We find that the neural controller reaches a point of criticality, which coincides with a transition between two regimes of the agent's behaviour and at which the synergistic information between its sensors and its hidden and motor neurons is maximized. Finally, we discuss the potential of our learning model for studying the contribution of criticality to the behaviour of embodied living systems in scenarios not necessarily constrained by the biological restrictions of the examples of criticality found in nature.