Abstract: Open peer review is a growing trend in academic publishing. Public access to peer review data can benefit both the academic and publishing communities. It also supports studies on review comment generation and, further, the realization of automated scholarly paper review. However, most existing peer review datasets do not cover the whole peer review process. In addition, their data are not diverse enough, as they are mainly collected from the field of computer science. These two drawbacks need to be addressed to unlock more opportunities for related studies. In response, we construct MOPRD, a multidisciplinary open peer review dataset. This dataset consists of paper metadata, multiple manuscript versions, review comments, meta-reviews, authors' rebuttal letters, and editorial decisions. Moreover, we design a modular guided review comment generation method based on MOPRD. Experiments show that our method delivers better performance, as indicated by both automatic metrics and human evaluation. We also explore other potential applications of MOPRD, including meta-review generation, editorial decision prediction, author rebuttal generation, and scientometric analysis. MOPRD provides strong support for further studies in peer review-related research and other applications.
Abstract: Peer review is a widely accepted mechanism for research evaluation, playing a pivotal role in scholarly publishing. However, criticisms have long been leveled at this mechanism, mostly because of its inefficiency and subjectivity. Recent years have seen the application of artificial intelligence (AI) in assisting the peer review process. Nonetheless, with the involvement of humans, such limitations remain inevitable. In this review paper, we propose the concept of automated scholarly paper review (ASPR) and review the relevant literature and technologies to discuss the possibility of achieving a full-scale computerized review process. We further look into the challenges facing ASPR with existing technologies. On the basis of the review and discussion, we conclude that relevant research and technologies already exist for each stage of ASPR. This verifies that ASPR can be realized in the long term as the relevant technologies continue to develop. The major difficulties in its realization lie in imperfect document parsing and representation, inadequate data, defective human-computer interaction, and flawed deep logical reasoning. In the foreseeable future, ASPR and peer review will coexist in a mutually reinforcing manner before ASPR is able to fully take over the reviewing workload from humans.