In many multi-microphone algorithms, an estimate of the relative transfer functions (RTFs) of the desired speaker is required. Recently, a computationally efficient RTF vector estimation method was proposed for acoustic sensor networks, assuming that the spatial coherence (SC) of the noise component between a local microphone array and multiple external microphones is low. Aiming to optimize the output signal-to-noise ratio (SNR), this method linearly combines multiple RTF vector estimates, where the complex-valued weights are computed using a generalized eigenvalue decomposition (GEVD). In this paper, we perform a theoretical bias analysis for the SC-based RTF vector estimation method with multiple external microphones. Assuming a certain model for the noise field, we derive an analytical expression for the weights, showing that the optimal model-based weights are real-valued and depend only on the input SNR in the external microphones. Simulations with real-world recordings show good agreement between the GEVD-based and the model-based weights. Nevertheless, the results also indicate that in practice, estimation errors occur that the model-based weights cannot account for.
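To illustrate the GEVD step mentioned above, the following is a minimal sketch and not the implementation of the method itself. It assumes that the output SNR can be written as a generalized Rayleigh quotient w^H A w / w^H B w, in which case the maximizing weight vector is the principal generalized eigenvector of the matrix pair (A, B). The matrices `A` and `B`, the function names, and the normalization of the combined vector to a reference microphone are hypothetical choices made for illustration only; how these matrices are built from the microphone signals is not specified here.

```python
import numpy as np


def principal_generalized_eigenvector(A, B):
    """Return (lam, w) maximizing (w^H A w) / (w^H B w).

    Assumption: A is Hermitian and B is Hermitian positive definite.
    Both are hypothetical stand-ins for the numerator and denominator
    of an output-SNR expression.
    """
    L = np.linalg.cholesky(B)             # B = L L^H
    L_inv = np.linalg.inv(L)
    C = L_inv @ A @ L_inv.conj().T        # whitened problem: C v = lam v
    C = 0.5 * (C + C.conj().T)            # enforce Hermitian symmetry numerically
    eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
    v = eigvecs[:, -1]                    # eigenvector of the largest eigenvalue
    w = L_inv.conj().T @ v                # map back: w = L^{-H} v
    return eigvals[-1], w


def combine_rtf_estimates(H, w, ref=0):
    """Linearly combine the columns of H (one RTF vector estimate per column)
    with complex weights w, then normalize so that the entry at the
    (hypothetical) reference microphone index `ref` equals 1."""
    h = H @ w
    return h / h[ref]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    Y = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    A = X @ X.conj().T                    # random Hermitian PSD matrix
    B = Y @ Y.conj().T + 3 * np.eye(3)    # random Hermitian PD matrix
    lam, w = principal_generalized_eigenvector(A, B)
    H = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
    print("largest generalized eigenvalue:", lam)
    print("combined RTF vector:", combine_rtf_estimates(H, w))
```

The weights obtained this way are complex-valued in general. The Cholesky-based whitening is used here only to keep the sketch dependent on NumPy alone; a dedicated generalized Hermitian eigensolver would serve the same purpose.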