Finite element discretizations of problems in computational physics often rely on adaptive mesh refinement (AMR) to preferentially resolve regions containing important features during simulation. However, these spatial refinement strategies are often heuristic and rely on domain-specific knowledge or trial and error. We treat adaptive mesh refinement as a local, sequential decision-making problem under incomplete information, formulating AMR as a partially observable Markov decision process. Using a deep reinforcement learning approach, we train policy networks for AMR strategy directly from numerical simulation. The training process requires neither an exact solution nor a high-fidelity ground-truth solution of the partial differential equation at hand, nor a pre-computed training dataset. The local nature of the reinforcement learning formulation allows policy networks to be trained inexpensively on problems much smaller than those on which they are deployed. The methodology is not specific to any particular partial differential equation, problem dimension, or numerical discretization, and can flexibly incorporate diverse problem physics. To demonstrate this flexibility, we apply the approach to a broad set of partial differential equations, using a variety of high-order discontinuous Galerkin and hybridizable discontinuous Galerkin finite element discretizations. We show that the resulting deep reinforcement learning policies are competitive with common AMR heuristics, generalize well across problem classes, and strike a favorable balance between accuracy and cost, often achieving higher accuracy per problem degree of freedom.
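
As a rough illustration of the element-local formulation described above, the sketch below shows a minimal policy network that maps per-element observations to probabilities over refine/do-nothing/coarsen actions. The observation features, network sizes, and action ordering are illustrative assumptions, not the architecture or feature set used in this work.

```python
import torch
import torch.nn as nn

# Hypothetical element-local AMR policy: maps a small observation vector
# computed on a single element (e.g., a local error indicator, element size,
# and inter-element jumps -- illustrative features only) to a distribution
# over {refine, do nothing, coarsen}.
class ElementPolicy(nn.Module):
    def __init__(self, obs_dim: int = 8, hidden: int = 64, n_actions: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.distributions.Categorical:
        # obs: (n_elements, obs_dim) batch of per-element observations.
        # The same network is applied independently to every element, which is
        # what allows training on small meshes and deployment on larger ones.
        logits = self.net(obs)
        return torch.distributions.Categorical(logits=logits)

# Usage: sample refinement decisions for a batch of 100 elements.
policy = ElementPolicy()
obs = torch.randn(100, 8)        # placeholder per-element observations
actions = policy(obs).sample()   # assumed ordering: 0 = refine, 1 = do nothing, 2 = coarsen
```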