As a massive number of the Internet of Things (IoT) devices are deployed, the security and privacy issues in IoT arouse more and more attention. The IoT attacks are causing tremendous loss to the IoT networks and even threatening human safety. Compared to traditional networks, IoT networks have unique characteristics, which make the attack detection more challenging. First, the heterogeneity of platforms, protocols, software, and hardware exposes various vulnerabilities. Second, in addition to the traditional high-rate attacks, the low-rate attacks are also extensively used by IoT attackers to obfuscate the legitimate and malicious traffic. These low-rate attacks are challenging to detect and can persist in the networks. Last, the attackers are evolving to be more intelligent and can dynamically change their attack strategies based on the environment feedback to avoid being detected, making it more challenging for the defender to discover a consistent pattern to identify the attack. In order to adapt to the new characteristics in IoT attacks, we propose a reinforcement learning-based attack detection model that can automatically learn and recognize the transformation of the attack pattern. Therefore, we can continuously detect IoT attacks with less human intervention. In this paper, we explore the crucial features of IoT traffics and utilize the entropy-based metrics to detect both the high-rate and low-rate IoT attacks. Afterward, we leverage the reinforcement learning technique to continuously adjust the attack detection threshold based on the detection feedback, which optimizes the detection and the false alarm rate. We conduct extensive experiments over a real IoT attack dataset and demonstrate the effectiveness of our IoT attack detection framework.