Abstract:Learning Automaton (LA) is an adaptive self-organized model that improves its action-selection through interaction with an unknown environment. LA with finite action set can be classified into two main categories: fixed and variable structure. Furthermore, variable action-set learning automaton (VASLA) is one of the main subsets of variable structure learning automaton. In this paper, we propose VDHLA, a novel hybrid learning automaton model, which is a combination of fixed structure and variable action set learning automaton. In the proposed model, variable action set learning automaton can increase, decrease, or leave unchanged the depth of fixed structure learning automaton during the action switching phase. In addition, the depth of the proposed model can change in a symmetric (SVDHLA) or asymmetric (AVDHLA) manner. To the best of our knowledge, it is the first hybrid model that intelligently changes the depth of fixed structure learning automaton. Several computer simulations are conducted to study the performance of the proposed model with respect to the total number of rewards and action switching in stationary and non-stationary environments. The proposed model is compared with FSLA and VSLA. In order to determine the performance of the proposed model in a practical application, the selfish mining attack which threatens the incentive-compatibility of a proof-of-work based blockchain environment is considered. The proposed model is applied to defend against the selfish mining attack in Bitcoin and compared with the tie-breaking mechanism, which is a well-known defense. Simulation results in all environments have shown the superiority of the proposed model.