Photoacoustic imaging (PAI) is an emerging non-invasive imaging modality combining the advantages of deep ultrasound penetration and high optical contrast. Image reconstruction is an essential topic in PAI, which is unfortunately an ill-posed problem due to the complex and unknown optical/acoustic parameters in tissue. Conventional algorithms used in PAI (e.g., delay-and-sum) provide a fast solution while many artifacts remain, especially for linear array probe with limited-view issue. Convolutional neural network (CNN) has shown state-of-the-art results in computer vision, and more and more work based on CNN has been studied in medical image processing recently. In this paper, we present a non-iterative scheme filling the gap between existing direct-processing and post-processing methods, and propose a new framework Y-Net: a CNN architecture to reconstruct the PA image by optimizing both raw data and beamformed images once. The network connected two encoders with one decoder path, which optimally utilizes more information from raw data and beamformed image. The results of the test set showed good performance compared with conventional reconstruction algorithms and other deep learning methods. Our method is also validated with experiments both in-vitro and in vivo, which still performs better than other existing methods. The proposed Y-Net architecture also has high potential in medical image reconstruction for other imaging modalities beyond PAI.