Artificial intelligence (AI) is increasingly used to synthesize visuals, text, and audio. These AI-based works, often produced by neural networks, are entering the mainstream market as digital paintings, songs, books, and other media. We conceptualize both existing and future human-in-the-loop (HITL) approaches for creative applications, with the aim of developing more expressive, nuanced, and multimodal models. In particular, we ask: how can our expertise as curators and collaborators be encoded in AI models in an interactive manner? We examine and speculate on the long-term implications for models, interfaces, and machine creativity. Our selection, creation, and interpretation of AI art inherently reflect our emotional responses, cultures, and contexts. The proposed HITL approaches may therefore help algorithms learn creative processes that are difficult to codify or quantify. We envision multimodal HITL processes in which text, visuals, sound, and other information are coupled together with automated analysis of humans and environments. Overall, these HITL approaches will increase interaction between humans and AI, and thus help future AI systems better understand our creative and emotional processes.