We show that marginals of subchains of length $t$ of any finitely correlated translation invariant state on a chain can be learned, in trace distance, with $O(t^2)$ copies -- with an explicit dependence on local dimension, memory dimension and spectral properties of a certain map constructed from the state -- and computational complexity polynomial in $t$. The algorithm requires only the estimation of a marginal of a controlled size, in the worst case bounded by a multiple of the minimum bond dimension, from which it reconstructs a translation invariant matrix product operator. In the analysis, a central role is played by the theory of operator systems. A refined error bound can be proven for $C^*$-finitely correlated states, which have an operational interpretation in terms of sequential quantum channels applied to the memory system. We can also obtain an analogous error bound for a class of matrix product density operators reconstructible by local marginals. In this case, a linear number of marginals must be estimated, obtaining a sample complexity of $\tilde{O}(t^3)$. The learning algorithm also works for states that are only close to a finitely correlated state, with the potential of providing competitive algorithms for other interesting families of states.