A key step in quantum machine learning with classical inputs is the design of an embedding circuit that maps inputs to a quantum state. This paper studies a transfer learning setting in which the classical-to-quantum embedding is carried out by an arbitrary parametric quantum circuit that is pre-trained on data from a source task. At run time, a binary classifier is then optimized using data from the target task of interest. Using an information-theoretic approach, we show that the average excess risk, or optimality gap, can be bounded in terms of two R\'enyi mutual information terms between the classical input and the quantum embedding under the source and target tasks, as well as in terms of a measure of similarity between the source and target tasks that is related to the trace distance. The main theoretical results are validated on a simple binary classification example.