A kidney exchange program, also called a kidney paired donation program, can be viewed as a repeated, dynamic trading and allocation mechanism. This suggests that a dynamic algorithm for transplant exchange selection may have superior performance in comparison to the repeated use of a static algorithm. We confirm this hypothesis using a full scale simulation of the Canadian Kidney Paired Donation Program: learning algorithms, that attempt to learn optimal patient-donor weights in advance via dynamic simulations, do lead to improved outcomes. Specifically, our learning algorithms, designed with the objective of fairness (that is, equity in terms of transplant accessibility across cPRA groups), also lead to an increased number of transplants and shorter average waiting times. Indeed, our highest performing learning algorithm improves egalitarian fairness by 10% whilst also increasing the number of transplants by 6% and decreasing waiting times by 24%. However, our main result is much more surprising. We find that the most critical factor in determining the performance of a kidney exchange program is not the judicious assignment of positive weights (rewards) to patient-donor pairs. Rather, the key factor in increasing the number of transplants, decreasing waiting times and improving group fairness is the judicious assignment of a negative weight (penalty) to the small number of non-directed donors in the kidney exchange program.