Abstract:Many unanswerable adversarial questions fool the question-answer (QA) system with some plausible answers. Building a robust, frequently asked questions (FAQ) chatbot needs a large amount of diverse adversarial examples. Recent question generation methods are ineffective at generating many high-quality and diverse adversarial question-answer pairs from unstructured text. We propose the diversity controllable semantically valid adversarial attacker (DCSA), a high-quality, diverse, controllable method to generate standard and adversarial samples with a semantic graph. The fluent and semantically generated QA pairs fool our passage retrieval model successfully. After that, we conduct a study on the robustness and generalization of the QA model with generated QA pairs among different domains. We find that the generated data set improves the generalizability of the QA model to the new target domain and the robustness of the QA model to detect unanswerable adversarial questions.