Designing Public Transport (PT) networks able to satisfy mobility needs of people is essential to reduce the number of individual vehicles on the road, and thus pollution and congestion. Urban sustainability is thus tightly coupled to an efficient PT. Current approaches on Transport Network Design (TND) generally aim to optimize generalized cost, i.e., a unique number including operator and users' costs. Since we intend quality of PT as the capability of satisfying mobility needs, we focus instead on PT accessibility, i.e., the ease of reaching surrounding points of interest via PT. PT accessibility is generally unequally distributed in urban regions: suburbs generally suffer from poor PT accessibility, which condemns residents therein to be dependent on their private cars. We thus tackle the problem of designing bus lines so as to minimize the inequality in the geographical distribution of accessibility. We combine state-of-the-art Message Passing Neural Networks (MPNN) and Reinforcement Learning. We show the efficacy of our method against metaheuristics (classically used in TND) in a use case representing in simplified terms the city of Montreal.