Network delays, throughput bottlenecks and privacy issues push Artificial Intelligence of Things (AIoT) designers towards evaluating the feasibility of moving model training and execution (inference) as near as possible to the terminals. Meanwhile, results from the TinyML community demonstrate that, in some cases, it is possible to execute model inference directly on the terminals themselves, even if these are small microcontroller-based devices. However, to date, researchers and practitioners in the domain lack convenient all-in-one toolkits to help them evaluate the feasibility of moving execution of arbitrary models to arbitrary low-power IoT hardware. To this effect, we present in this paper U-TOE, a universal toolkit we designed to facilitate the task of AIoT designers and researchers, by combining functionalities from a low-power embedded OS, a generic model transpiler and compiler, an integrated performance measurement module, and an open-access remote IoT testbed. We provide an open source implementation of U-TOE and we demonstrate its use to experimentally evaluate the performance of a wide variety of models, on a wide variety of low-power boards, based on popular microcontroller architectures (ARM Cortex-M and RISC-V). U-TOE thus allows easily reproducible and customisable comparative evaluation experiments in this domain, on a wide variety of IoT hardware all-at-once. The availability of a toolkit such as U-TOE is desirable to accelerate the field of AIoT, towards fully exploiting the potential of edge computing.