Abstract:In-memory computing is becoming a popular architecture for deep-learning hardware accelerators recently due to its highly parallel computing, low power, and low area cost. However, in-RRAM computing (IRC) suffered from large device variation and numerous nonideal effects in hardware. Although previous approaches including these effects in model training successfully improved variation tolerance, they only considered part of the nonideal effects and relatively simple classification tasks. This paper proposes a joint hardware and software optimization strategy to design a hardware-robust IRC macro for object detection. We lower the cell current by using a low word-line voltage to enable a complete convolution calculation in one operation that minimizes the impact of nonlinear addition. We also implement ternary weight mapping and remove batch normalization for better tolerance against device variation, sense amplifier variation, and IR drop problem. An extra bias is included to overcome the limitation of the current sensing range. The proposed approach has been successfully applied to a complex object detection task with only 3.85\% mAP drop, whereas a naive design suffers catastrophic failure under these nonideal effects.