Abstract:Manually generating access control policies from an organization's high-level requirement specifications poses significant challenges. It requires laborious efforts to sift through multiple documents containing such specifications and translate their access requirements into access control policies. Also, the complexities and ambiguities of these specifications often result in errors by system administrators during the translation process, leading to data breaches. However, the automated policy generation frameworks designed to help administrators in this process are unreliable due to limitations, such as the lack of domain adaptation. Therefore, to improve the reliability of access control policy generation, we propose RAGent, a novel retrieval-based access control policy generation framework based on language models. RAGent identifies access requirements from high-level requirement specifications with an average state-of-the-art F1 score of 87.9%. Through retrieval augmented generation, RAGent then translates the identified access requirements into access control policies with an F1 score of 77.9%. Unlike existing frameworks, RAGent generates policies with complex components like purposes and conditions, in addition to subjects, actions, and resources. Moreover, RAGent automatically verifies the generated policies and iteratively refines them through a novel verification-refinement mechanism, further improving the reliability of the process by 3%, reaching the F1 score of 80.6%. We also introduce three annotated datasets for developing access control policy generation frameworks in the future, addressing the data scarcity of the domain.
Abstract:Administrator-centered access control failures can cause data breaches, putting organizations at risk of financial loss and reputation damage. Existing graphical policy configuration tools and automated policy generation frameworks attempt to help administrators configure and generate access control policies by avoiding such failures. However, graphical policy configuration tools are prone to human errors, making them unusable. On the other hand, automated policy generation frameworks are prone to erroneous predictions, making them unreliable. Therefore, to find ways to improve their usability and reliability, we conducted a Systematic Literature Review analyzing 49 publications, to identify those tools, frameworks, and their limitations. Identifying those limitations will help develop effective access control policy generation solutions while avoiding access control failures.