Conventional kernel adaptive filtering (KAF) uses a prescribed, positive-definite, nonlinear function to define the Reproducing Kernel Hilbert Space (RKHS), where the optimal solution to the mean square error estimation problem is approximated by search techniques. Instead, this paper proposes to embed the full statistics of the input data in the kernel definition, obtaining the first analytical solution for nonlinear regression and nonlinear adaptive filtering applications. We call this solution the Functional Wiener Filter (FWF). Conceptually, the methodology is an extension of Parzen's work on the autocorrelation RKHS to nonlinear functional spaces. We derive an extended functional Wiener equation and present its solution in an explicit, finite-dimensional, data-dependent RKHS. We further explain the requirements for computing the analytical solution in the RKHS, which go beyond traditional methodologies based on the kernel trick. The FWF analytic solution to the nonlinear minimum mean square error problem achieves better accuracy than other kernel-based algorithms on synthetic, stationary data. On real-world time series, it matches the accuracy of KAF while its complexity remains constant with respect to the number of training samples. At evaluation time, it is as computationally efficient as the Wiener solution, albeit with a larger number of dimensions than the linear case. We also show how the difference equation learned from data by the FWF can be extracted, enabling system identification and thereby extending the applicability of the FWF beyond optimal nonlinear filtering.
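For orientation, a minimal sketch of the analogy the abstract invokes, with illustrative notation that is assumed here rather than taken from the paper: the classical Wiener filter solves a normal equation in the input space analytically, and the extended functional Wiener equation poses the analogous problem over a feature map in an RKHS.

\begin{align*}
% Classical Wiener-Hopf normal equation and its analytic solution:
\mathbf{R}\,\mathbf{w}^{*} &= \mathbf{p}, &
\mathbf{w}^{*} &= \mathbf{R}^{-1}\mathbf{p}, &
\mathbf{R} &= \mathbb{E}\!\left[\mathbf{x}_n \mathbf{x}_n^{\top}\right], \;
\mathbf{p} = \mathbb{E}\!\left[d_n \mathbf{x}_n\right].\\
% Illustrative RKHS analog (feature map \varphi and operators assumed,
% not the paper's notation): the same normal equation over functionals.
R_{\varphi}\,\Omega &= p_{\varphi}, & &&
R_{\varphi} &= \mathbb{E}\!\left[\varphi(\mathbf{x}_n)\otimes\varphi(\mathbf{x}_n)\right], \;
p_{\varphi} = \mathbb{E}\!\left[d_n\,\varphi(\mathbf{x}_n)\right].
\end{align*}

When the RKHS is explicit and finite dimensional, as the abstract states for the FWF, the second equation admits a closed-form solution of the same inverse-correlation form as the first, rather than requiring iterative search.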