Abstract: Question difficulty estimation remains a multifaceted challenge in educational and assessment settings. Traditional approaches often focus on surface-level linguistic features or learner comprehension levels, neglecting the interplay of factors that contribute to question complexity. This paper presents a novel framework for domain-specific question difficulty estimation that combines NLP techniques with knowledge graph analysis. We introduce four key parameters: Topic Retrieval Cost, Topic Salience, Topic Coherence, and Topic Superficiality, each capturing a distinct facet of question complexity within a given subject domain. These parameters are operationalized through topic modelling, knowledge graph analysis, and information retrieval techniques. A model trained on these features demonstrates the efficacy of our approach in predicting question difficulty. The framework paves the way for more effective question generation, assessment design, and adaptive learning systems across diverse academic disciplines.