The massive Multiple-Input Multiple-Output (mMIMO) concept has recently been moving toward extreme scales to address the envisioned requirements of next-generation networks. However, simply scaling up conventional architectures would result in significant cost and power consumption. To address this, metasurface-based transceivers, consisting of microstrips of metamaterials, have recently emerged as an efficient enabler of extreme mMIMO systems. In this paper, we consider metasurface-based receivers with a $1$-bit Analog-to-Digital Converter (ADC) per microstrip and develop an analytical framework for the optimization of the analog and digital combining matrices. Our numerical results, including comparisons with fully digital, infinite-resolution MIMO, provide useful insights into the role of various system parameters.