Abstract:Accurate specification of standard occupational classification (SOC) code is critical to the success of many U.S. work visa applications. Determination of correct SOC code relies on careful study of job requirements and comparison to definitions given by the U.S. Bureau of Labor Statistics, which is often a tedious activity. In this paper, we apply methods from natural language processing (NLP) to computationally determine SOC code based on job description. We implement and empirically evaluate a broad variety of predictive models with respect to quality of prediction and training time, and identify models best suited for this task.
Abstract:In this paper, we consider the problem of organizing supporting documents vital to U.S. work visa petitions, as well as responding to Requests For Evidence (RFE) issued by the U.S.~Citizenship and Immigration Services (USCIS). Typically, both processes require a significant amount of repetitive manual effort. To reduce the burden of mechanical work, we apply machine learning methods to automate these processes, with humans in the loop to review and edit output for submission. In particular, we use an ensemble of image and text classifiers to categorize supporting documents. We also use a text classifier to automatically identify the types of evidence being requested in an RFE, and used the identified types in conjunction with response templates and extracted fields to assemble draft responses. Empirical results suggest that our approach achieves considerable accuracy while significantly reducing processing time.