The paper studies distributed Dictionary Learning (DL) problems in which the learning task is distributed over a multi-agent network with time-varying (nonsymmetric) connectivity. This formulation is relevant, for instance, in big-data scenarios where massive amounts of data are collected/stored at different spatial locations and it is infeasible to aggregate and/or process all the data at a fusion center, due to resource limitations, communication overhead, or privacy considerations. We develop a general distributed algorithmic framework for the (nonconvex) DL problem and establish its asymptotic convergence. The new method hinges on Successive Convex Approximation (SCA) techniques, coupled with i) a gradient tracking mechanism, instrumental in locally estimating the missing global information; and ii) a consensus step, serving as a mechanism to distribute the computations among the agents. To the best of our knowledge, this is the first distributed algorithm with provable convergence for the DL problem and, more generally, for bi-convex optimization problems over (time-varying) directed graphs.
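For intuition only, the following is a minimal single-machine simulation of the three ingredients named above: an SCA step on each (convex) block of the bi-convex DL objective, a gradient tracking recursion, and a consensus averaging step. It is a sketch, not the paper's algorithm: it uses a fixed undirected ring with doubly stochastic weights rather than the time-varying directed graphs treated in the paper, a plain projected-gradient surrogate for the dictionary SCA step, and synthetic data; all sizes, step sizes, and helper names (`proj_unit_columns`, `soft`, `local_grad`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem sizes: agents, signal dim, atoms, samples per agent
N, d, K, n_i = 4, 20, 10, 30
lam, alpha, T = 0.1, 0.05, 200        # l1 weight, dictionary step size, iterations

# Synthetic local data Y[i], held privately by agent i (never aggregated)
Y = [rng.standard_normal((d, n_i)) for _ in range(N)]

# Doubly stochastic consensus weights on a fixed ring graph
# (a simplification; the paper allows time-varying directed graphs)
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = 0.5
    W[i, (i + 1) % N] = 0.25
    W[i, (i - 1) % N] = 0.25

def proj_unit_columns(D):
    """Project each dictionary atom onto the unit ball (a common DL constraint)."""
    norms = np.maximum(np.linalg.norm(D, axis=0), 1.0)
    return D / norms

def soft(Z, t):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

# Per-agent variables: dictionary copy D[i], sparse codes X[i], gradient tracker G[i]
D = [proj_unit_columns(rng.standard_normal((d, K))) for _ in range(N)]
X = [np.zeros((K, n_i)) for _ in range(N)]

def local_grad(i):
    """Gradient of the local fit 0.5*||Y_i - D_i X_i||_F^2 w.r.t. D_i."""
    return (D[i] @ X[i] - Y[i]) @ X[i].T

G = [local_grad(i) for i in range(N)]     # trackers initialized at local gradients
g_old = [g.copy() for g in G]

for t in range(T):
    # 1) SCA on the sparse codes: the X-block is convex for fixed D_i,
    #    so one ISTA (proximal-gradient) step solves a convex surrogate.
    for i in range(N):
        step = 1.0 / (np.linalg.norm(D[i].T @ D[i], 2) + 1e-9)
        grad_X = D[i].T @ (D[i] @ X[i] - Y[i])
        X[i] = soft(X[i] - step * grad_X, step * lam)

    # 2) SCA on the dictionary, using the *tracked* gradient G[i] as a
    #    local estimate of the network-wide gradient (the missing global info).
    D_half = [proj_unit_columns(D[i] - alpha * G[i]) for i in range(N)]

    # 3) Consensus step: mix the local dictionary copies over the graph.
    D = [sum(W[i, j] * D_half[j] for j in range(N)) for i in range(N)]

    # 4) Gradient tracking: consensus on the trackers plus a local correction,
    #    so each G[i] follows the average of the agents' local gradients.
    g_new = [local_grad(i) for i in range(N)]
    G = [sum(W[i, j] * G[j] for j in range(N)) + g_new[i] - g_old[i]
         for i in range(N)]
    g_old = g_new
```

Under this setup, the tracking recursion preserves the invariant that the average of the G[i] equals the average of the current local gradients, which is what lets each agent act on global first-order information without any fusion center.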