We present a generic algorithm for extracting probabilistic (Bayesian) information about model parameters from data. The algorithm propagates an ensemble of particles in the product space of model parameters and model outputs. Each particle update consists of a random jump in parameter space, followed by a simulation of a model output and a Metropolis acceptance/rejection step based on a comparison of the simulated output with the data. The distance of a particle from the data is interpreted as an energy, and the algorithm reduces the associated temperature of the ensemble in such a way that entropy production is minimized. If this simulated annealing is not too fast compared to the mixing speed in parameter space, the parameter marginal of the ensemble approaches the Bayesian posterior distribution. Annealing is adaptive and depends on certain extensive thermodynamic quantities that can easily be measured at runtime. In the general case, we propose annealing with a constant entropy production rate, which is optimal as long as annealing is not too fast. For the practically relevant special case of no prior knowledge, we derive an optimal fast annealing schedule with a non-constant entropy production rate. The algorithm does not require the density of the model likelihood to be evaluated, which makes it attractive for Bayesian parameter inference with stochastic models, whose likelihood functions are typically very high-dimensional integrals.
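As a rough illustration of the particle update described above (random jump, simulation, Metropolis step on the distance to the data), the following Python sketch may help. It is a minimal sketch under stated assumptions, not the implementation of the algorithm: the names `anneal_particles`, `jump_scale` and `cooling` are hypothetical, the toy Gaussian model and uniform prior are invented for the example, and the fixed geometric cooling is a stand-in for the adaptive, entropy-based annealing schedule, which is not reproduced here.

```python
import numpy as np

def anneal_particles(y_obs, simulate, distance, log_prior, theta_init,
                     n_steps=20000, jump_scale=0.5, t_start=1.0,
                     cooling=0.9997, seed=0):
    """Sketch: ensemble of particles updated by jump, simulation, Metropolis step.

    ASSUMPTION: geometric cooling replaces the adaptive schedule of the text.
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta_init, dtype=float)  # shape (n_particles, dim)
    n_particles, dim = theta.shape
    # Energy of a particle = distance of its simulated output to the data.
    # Only the energy of the output is stored, not the output itself.
    energy = np.array([distance(simulate(th, rng), y_obs) for th in theta])
    temp = t_start
    for _ in range(n_steps):
        i = rng.integers(n_particles)          # pick one particle
        proposal = theta[i] + jump_scale * rng.standard_normal(dim)
        lp_new = log_prior(proposal)
        if np.isfinite(lp_new):                # skip jumps outside the prior support
            e_new = distance(simulate(proposal, rng), y_obs)
            # Symmetric jump: acceptance uses prior ratio and energy difference.
            log_alpha = lp_new - log_prior(theta[i]) - (e_new - energy[i]) / temp
            if np.log(rng.uniform()) < log_alpha:
                theta[i] = proposal
                energy[i] = e_new
        temp *= cooling                        # ASSUMPTION: fixed geometric cooling
    return theta, energy, temp

# Hypothetical toy problem: one parameter, noisy output, flat prior on [-5, 5].
def simulate(th, rng):
    return th[0] + 0.1 * rng.standard_normal()

def distance(x, y):
    return abs(x - y)

def log_prior(th):
    return 0.0 if np.all(np.abs(th) <= 5.0) else -np.inf

theta_init = np.random.default_rng(1).uniform(-5.0, 5.0, size=(200, 1))
theta, energy, temp = anneal_particles(1.0, simulate, distance, log_prior, theta_init)
print(theta.mean(), energy.mean(), temp)
```

Note that the likelihood density is never evaluated: only forward simulations and distances to the data enter the acceptance step. In this toy setting the particles should concentrate around the observed value as the temperature falls; how closely the final ensemble resembles the posterior depends on the cooling speed relative to the mixing in parameter space.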