Hyperspectral cameras generate a large amount of data due to the presence of hundreds of spectral bands as opposed to only three channels (red, green, and blue) in traditional cameras. This requires a significant amount of data transmission between the hyperspectral image sensor and a processor used to classify/detect/track the images, frame by frame, expending high energy and causing bandwidth and security bottlenecks. To mitigate this problem, we propose a form of processing-in-pixel (PIP) that leverages advanced CMOS technologies to enable the pixel array to perform a wide range of complex operations required by the modern convolutional neural networks (CNN) for hyperspectral image recognition (HSI). Consequently, our PIP-optimized custom CNN layers effectively compress the input data, significantly reducing the bandwidth required to transmit the data downstream to the HSI processing unit. This reduces the average energy consumption associated with pixel array of cameras and the CNN processing unit by 25.06x and 3.90x respectively, compared to existing hardware implementations. Our custom models yield average test accuracies within 0.56% of the baseline models for the standard HSI benchmarks.