Recent large-scale Text-To-Image (T2I) models such as DALLE-3 demonstrate great potential in new applications, but also face unprecedented fairness challenges. Prior studies revealed gender biases in single-person image generation, but T2I model applications might require portraying two or more people simultaneously. Potential biases in this setting remain unexplored, leading to fairness-related risks in usage. To study these underlying facets of gender biases in T2I models, we propose a novel Paired Stereotype Test (PST) bias evaluation framework. PST prompts the model to generate two individuals in the same image. They are described with two social identities that are stereotypically associated with the opposite gender. Biases can then be measured by the level of conformation to gender stereotypes in generated images. Using PST, we evaluate DALLE-3 from 2 perspectives: biases in gendered occupation and biases in organizational power. Despite seemingly fair or even anti-stereotype single-person generations, PST still unveils gendered occupational and power associations. Moreover, compared to single-person settings, DALLE-3 generates noticeably more masculine figures under PST for individuals with male-stereotypical identities. PST is therefore effective in revealing underlying gender biases in DALLE-3 that single-person settings cannot capture. Our findings reveal the complicated patterns of gender biases in modern T2I models, further highlighting the critical fairness challenges in multimodal generative systems.