Abstract: Deep reinforcement learning excels in various domains but lacks generalizability and interpretability. Programmatic RL methods (Trivedi et al., 2021; Liu et al., 2023) reformulate solving RL tasks as synthesizing interpretable programs that can be executed in the environments. Despite encouraging results, these methods are limited to short-horizon tasks. On the other hand, representing RL policies using state machines (Inala et al., 2020) can inductively generalize to long-horizon tasks; however, it struggles to scale up to acquire diverse and complex behaviors. This work proposes Program Machine Policies (POMPs), which combine the advantages of programmatic RL and state machine policies, allowing them to represent complex behaviors and address long-horizon tasks. Specifically, we introduce a method that can retrieve a set of effective, diverse, and compatible programs. Then, we use these programs as modes of a state machine and learn a transition function to switch among mode programs, which captures long-horizon repetitive behaviors. Our proposed framework outperforms programmatic RL and deep RL baselines on various tasks and demonstrates the ability to inductively generalize to even longer horizons without any fine-tuning. Ablation studies justify the effectiveness of our proposed search algorithm for retrieving a set of programs as modes.
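To make the structure described above concrete, the following is a minimal sketch of a state machine whose modes are short programs and whose transition function selects the next mode. It is an illustration under stated assumptions, not the authors' implementation: the `Corridor` environment, the two mode programs, and the hand-written `transition` rule are all hypothetical. In the paper the mode programs are retrieved by search and the transition function is learned.

```python
from typing import Callable, Dict, Optional


class Corridor:
    """Hypothetical toy environment: one marker per cell in a 1-D corridor."""

    def __init__(self, length: int) -> None:
        self.length = length
        self.pos = 0
        self.markers = [True] * length
        self.collected = 0

    def step_right(self) -> None:
        if self.pos < self.length - 1:
            self.pos += 1

    def pick(self) -> None:
        if self.markers[self.pos]:
            self.markers[self.pos] = False
            self.collected += 1

    def at_end(self) -> bool:
        return self.pos == self.length - 1

    def marker_here(self) -> bool:
        return self.markers[self.pos]


# Mode programs: short, interpretable routines (hand-written here, assumed
# to stand in for programs retrieved by a search procedure).
def pick_mode(env: Corridor) -> None:
    env.pick()


def move_mode(env: Corridor) -> None:
    env.step_right()


MODES: Dict[str, Callable[[Corridor], None]] = {"pick": pick_mode, "move": move_mode}
TERMINATE = "terminate"


def transition(mode: Optional[str], env: Corridor) -> str:
    """Hand-written stand-in for a learned transition function.

    It receives the current mode and the environment state; this toy rule
    only needs the state, so `mode` is ignored.
    """
    if env.marker_here():
        return "pick"
    if env.at_end():
        return TERMINATE
    return "move"


def run_policy(env: Corridor, max_mode_calls: int = 10_000) -> int:
    """Execute mode programs until the machine reaches the terminate mode."""
    mode = transition(None, env)
    calls = 0
    while mode != TERMINATE and calls < max_mode_calls:
        MODES[mode](env)      # run the current mode program
        calls += 1
        mode = transition(mode, env)
    return env.collected
```

Because the same two modes are reused in a loop, the policy behaves identically on a corridor of 5 cells and one of 200 cells, which is the kind of repetitive long-horizon structure a state machine can capture without retraining.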
Abstract: Phase aberration is an inherent side effect of ultrasound imaging caused by the inhomogeneous speed of sound in human tissue, resulting in focusing errors and reduced image contrast. This work introduces a phase aberration correction technique that leverages a point spread function (PSF) restoration filter. A convolutional neural network (CNN) is used to estimate phase-aberrated PSFs and design the restoration filter. In addition, we incorporate coherence index weighting, derived from the restoration filtering, to further suppress sidelobe energy. Evaluation on Field II-simulated phantoms showed clearer cyst borders and reduced sidelobe energy leakage after PSF restoration and filter-derived coherence weighting, improving image contrast and quality.
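The abstract does not state the form of the restoration filter, so the sketch below is only one plausible reading: a regularized (Wiener-style) inverse filter in the frequency domain that maps an aberrated PSF back toward a reference PSF. Everything here is an assumption made for illustration, including the function names, the regularization constant `eps`, and the synthetic 1-D PSFs. In the described method the aberrated PSF would come from the CNN estimate rather than being constructed by hand.

```python
import numpy as np


def restoration_filter(psf_aberrated: np.ndarray,
                       psf_ideal: np.ndarray,
                       eps: float = 1e-6) -> np.ndarray:
    """Frequency-domain filter mapping the aberrated PSF toward the ideal one.

    Assumed Wiener-style form: H_ideal * conj(H_ab) / (|H_ab|^2 + eps).
    `eps` (hypothetical value) keeps the division stable where |H_ab| is small.
    """
    H_ab = np.fft.fft(psf_aberrated)
    H_id = np.fft.fft(psf_ideal)
    return H_id * np.conj(H_ab) / (np.abs(H_ab) ** 2 + eps)


def apply_filter(signal: np.ndarray, filt: np.ndarray) -> np.ndarray:
    """Apply the filter by circular convolution via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(signal) * filt))


# Synthetic 1-D example: a Gaussian PSF and a displaced copy standing in for
# an aberrated PSF (a pure shift is a linear phase error in frequency).
n = 128
x = np.arange(n)
ideal = np.exp(-0.5 * ((x - 64) / 2.0) ** 2)
ideal /= ideal.sum()
aberrated = np.roll(ideal, 5)

filt = restoration_filter(aberrated, ideal)
restored = apply_filter(aberrated, filt)
```

In this toy case the filter undoes the displacement, so `restored` closely matches `ideal`. Real aberration also distorts the PSF shape and varies across the image, which this sketch does not model.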