Emerging Internet-of-Things sensing applications rely on ultra low-power (ULP) microcontroller units (MCUs) that wirelessly transmit data to the cloud. Typical MCUs nowadays consist of generic blocks, except for the protocol-specific radios implemented in hardware. Hardware radios however slow down the evolution of wireless protocols due to retrocompatiblity concerns. In this work, we explore a software-defined radio architecture by demonstrating a LoRa transceiver running on custom ULP MCU codenamed SleepRider with an ARM Cortex-M4 CPU. In SleepRider MCU, we offload the generic baseband operations (e.g., low-pass filtering) to a reconfigurable digital front-end block and use the Cortex-M4 CPU to perform the protocol-specific computations. Our software implementation of the LoRa physical layer only uses the native SIMD instructions of the Cortex-M4 to achieve real-time transmission and reception of LoRa packets. SleepRider MCU has been fabricated in a 28nm FDSOI CMOS technology and is used in a testbed to experimentally validate the software implementation. Experimental results show that the proposed software-defined radio requires only a CPU frequency of 20 MHz to correctly receive a LoRa packet, with an ultra-low power consumption of 0.42 mW on average.