The electronic design automation (EDA) community has been actively exploring machine learning for very-large-scale-integrated computer aided design (VLSI CAD). Many studies have explored learning based techniques for cross-stage prediction tasks in the design flow to achieve faster design convergence. Although building machine learning (ML) models usually requires a large amount of data, most studies can only generate small internal datasets for validation due to the lack of large public datasets. In this essay, we present the first open-source dataset for machine learning tasks in VLSI CAD called CircuitNet. The dataset consists of more than 10K samples extracted from versatile runs of commercial design tools based on 6 open-source RISC-V designs.