The ability of an agent to perform well in new and unseen environments is a crucial aspect of intelligence. In machine learning, this ability is known as strong or out-of-distribution generalization. However, merely considering differences in data distributions is insufficient to fully capture differences between environments. In the present paper, we investigate out-of-variable generalization, which refers to an agent's ability to handle new situations that involve variables never jointly observed before. We argue that such ability is also important for AI-driven scientific discovery: humans, too, explore 'Nature' by probing, observing and measuring subsets of variables at any one time. Mathematically, out-of-variable generalization requires the efficient re-use of past marginal knowledge, i.e., knowledge over subsets of variables. We study this problem, focusing on prediction tasks that involve observing overlapping, yet distinct, sets of causal parents. We show that the residual distribution in one environment encodes the partial derivative of the true generating function with respect to the unobserved causal parent. Hence, learning from the residuals enables zero-shot prediction even though we never observe the outcome variable in the other environment.
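The claim about residuals can be illustrated with a minimal numerical sketch. Here we assume a hypothetical generating mechanism that is locally linear in the unobserved parent, y = g(x1) + c * x2 + noise, where c plays the role of the partial derivative with respect to the unobserved parent x2; the variable names, the choice g = sin, and the recovery of |c| from the residual variance are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed generating mechanism (locally linear in the unobserved parent x2):
#   y = sin(x1) + c * x2 + noise,  with  c = dy/dx2 the partial derivative.
c = 1.5
n = 100_000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)          # causal parent NOT observed in environment 1
noise = rng.normal(scale=0.1, size=n)
y = np.sin(x1) + c * x2 + noise

# Environment 1 observes only (x1, y).  Approximate E[y | x1] by binned
# means and form the residual y - E[y | x1].
bins = np.digitize(x1, np.linspace(-3, 3, 60))
cond_mean = np.empty(n)
for b in np.unique(bins):
    mask = bins == b
    cond_mean[mask] = y[mask].mean()
residual = y - cond_mean

# Since x2 ~ N(0, 1), Var(residual) ≈ c**2 + Var(noise): the residual
# distribution encodes the magnitude of the unobserved parent's effect.
c_hat = np.sqrt(max(residual.var() - 0.1**2, 0.0))
print(round(c_hat, 2))  # close to the true partial derivative c = 1.5
```

The point of the sketch is only that the residual distribution, computed without ever observing x2, carries quantitative information about how f depends on x2; that information is what permits zero-shot prediction in a second environment where x2 is observed but y is not.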