Polar codes have attracted significant attention in channel coding because they can approach the capacity of binary-input discrete memoryless channels (B-DMCs), which makes them attractive for reliable and efficient transmission. However, existing decoders often struggle to balance hardware area against performance. Stochastic computing offers a way to simplify circuits, and previous work has applied it to polar decoding. A common problem with these designs is the performance degradation caused by the introduction of correlation. This paper presents an Efficient Correlated Stochastic Polar Decoder (ECS-PD) that fundamentally addresses the `hold-state' issue, preventing it from increasing as the correlation computation progresses. We propose two optimization strategies that reduce iteration latency, increase throughput, and simplify the circuit to improve hardware efficiency. These optimizations reduce the number of iterations by 25.2% at $E_b/N_0 = 3$ dB. Compared with other efficient designs, the proposed ECS-PD achieves higher throughput and is 2.7 times more hardware-efficient than the min-sum decoder.