In this paper, we study a new latency optimization problem for blockchain-based federated learning (BFL) in multi-server edge computing. In this system model, distributed mobile devices (MDs) communicate with a set of edge servers (ESs) to handle both machine learning (ML) model training and block mining simultaneously. To assist ML model training on resource-constrained MDs, we develop an offloading strategy that enables each MD to transmit its data to one of the associated ESs. We then propose a new decentralized ML model aggregation solution at the edge layer based on a consensus mechanism, which builds a global ML model via peer-to-peer (P2P) blockchain communications. We formulate latency-aware BFL as an optimization problem that minimizes the system latency by jointly optimizing the data offloading decisions, MDs' transmit power, channel bandwidth allocation for data offloading, MDs' computational resource allocation, and hash power allocation. To handle the mixed action space of discrete offloading and continuous allocation variables, we propose a novel deep reinforcement learning scheme built on a parameterized advantage actor-critic (A2C) algorithm. Additionally, we theoretically characterize the convergence properties of the proposed BFL system in terms of the aggregation delay, mini-batch size, and number of P2P communication rounds. Numerical evaluations demonstrate the superior performance of our proposed scheme over existing approaches in terms of model training efficiency, convergence rate, and system latency.
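To make the hybrid discrete-continuous action idea concrete, the following is a minimal PyTorch sketch of a parameterized A2C agent, not the paper's implementation: a shared trunk feeds a categorical head for the discrete offloading decision and a Gaussian head for the continuous allocation variables, and a single advantage-weighted update covers both action parts. All names, dimensions, and hyperparameters here (`HybridA2C`, `num_servers`, `cont_dim`, the coefficient values) are illustrative assumptions; the paper's actual state and action definitions are not specified in this abstract.

```python
# A minimal sketch of parameterized A2C over a hybrid action space:
# one discrete offloading choice (which ES) plus continuous allocations
# (transmit power, bandwidth, CPU, hash power). Illustrative only.
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal

class HybridA2C(nn.Module):
    def __init__(self, state_dim, num_servers, cont_dim, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.discrete_head = nn.Linear(hidden, num_servers)  # offloading target (ES index)
        self.mu_head = nn.Linear(hidden, cont_dim)           # means of allocation variables
        self.log_std = nn.Parameter(torch.zeros(cont_dim))   # state-independent log std
        self.value_head = nn.Linear(hidden, 1)               # state-value baseline V(s)

    def forward(self, state):
        h = self.trunk(state)
        disc = Categorical(logits=self.discrete_head(h))
        # sigmoid keeps continuous action means in (0, 1), i.e. allocation fractions
        cont = Normal(torch.sigmoid(self.mu_head(h)), self.log_std.exp())
        return disc, cont, self.value_head(h).squeeze(-1)

def a2c_update(model, optimizer, state, disc_a, cont_a, ret,
               value_coef=0.5, entropy_coef=0.01):
    """One A2C step: advantage-weighted policy loss over BOTH action parts."""
    disc_dist, cont_dist, value = model(state)
    advantage = ret - value                                   # A(s,a) ~ R - V(s)
    log_prob = disc_dist.log_prob(disc_a) + cont_dist.log_prob(cont_a).sum(-1)
    policy_loss = -(log_prob * advantage.detach()).mean()
    value_loss = advantage.pow(2).mean()
    entropy = (disc_dist.entropy() + cont_dist.entropy().sum(-1)).mean()
    loss = policy_loss + value_coef * value_loss - entropy_coef * entropy
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key design point the sketch illustrates is that both the categorical and Gaussian heads share one advantage estimate, so the discrete offloading decision and the continuous resource allocations are trained jointly rather than by separate learners.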