Computational optical imaging (COI) systems have enabled the acquisition of high-dimensional signals through optical coding elements (OCEs). OCEs encode the high-dimensional signal in one or more snapshots, which are subsequently decoded using computational algorithms. Currently, COI systems are optimized through an end-to-end (E2E) approach, where the OCEs are modeled as a layer of a neural network and the remaining layers perform a specific imaging task. However, the performance of COI systems optimized through E2E is limited by the physical constraints imposed by these systems. This paper proposes a knowledge distillation (KD) framework for the design of highly physically constrained COI systems. This approach employs the KD methodology, which consists of a teacher-student relationship, where a high-performance, unconstrained COI system (the teacher), guides the optimization of a physically constrained system (the student) characterized by a limited number of snapshots. We validate the proposed approach, using a binary coded apertures single pixel camera for monochromatic and multispectral image reconstruction. Simulation results demonstrate the superiority of the KD scheme over traditional E2E optimization for the designing of highly physically constrained COI systems.