When the regression function belongs to the standard smoothness classes consisting of univariate functions whose derivatives up to the $(\gamma+1)$th order are bounded by a common constant everywhere or almost everywhere, it is well known that the minimax optimal rate of convergence in mean squared error (MSE) is $\left(\frac{\sigma^{2}}{n}\right)^{\frac{2\gamma+2}{2\gamma+3}}$ when $\gamma$ is finite and the sample size $n\rightarrow\infty$. From a nonasymptotic viewpoint that considers finite $n$, this paper shows that, for the standard H\"older and Sobolev classes, the minimax optimal rate is $\frac{\sigma^{2}\left(\gamma\vee1\right)}{n}$ when $\frac{n}{\sigma^{2}}\precsim\left(\gamma\vee1\right)^{2\gamma+3}$ and $\left(\frac{\sigma^{2}}{n}\right)^{\frac{2\gamma+2}{2\gamma+3}}$ when $\frac{n}{\sigma^{2}}\succsim\left(\gamma\vee1\right)^{2\gamma+3}$. To establish these results, we derive upper and lower bounds on the covering and packing numbers of the generalized H\"older class in which the $k$th derivative ($k=0,\ldots,\gamma$) is bounded from above by a parameter $R_{k}$ and the $\gamma$th derivative is $R_{\gamma+1}$-Lipschitz (and also of the generalized ellipsoid class of smooth functions). Our bounds sharpen the classical metric entropy results for the standard classes and make the dependence on $\gamma$ and $R_{k}$ explicit. Using the new entropy bounds, we derive the minimax optimal MSE rates under $R_{k}=1$, $R_{k}\leq\left(k-1\right)!$, and $R_{k}=k!$ (the latter two cases are motivated in our introduction), and thereby establish a couple of interesting results that cannot be shown with the existing entropy bounds in the literature. For the H\"older class of $d$-variate functions, our results suggest that the classical asymptotic rate $\left(\frac{\sigma^{2}}{n}\right)^{\frac{2\gamma+2}{2\gamma+2+d}}$ could be an underestimate of the MSE in finite samples.
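As a purely illustrative calculation (the choice $\sigma^{2}=1$, $\gamma=5$ is ours, and the constants hidden in $\precsim$ and $\succsim$ are ignored), the crossover point and the agreement of the two rate expressions there read
\[
\frac{n}{\sigma^{2}}\asymp\left(\gamma\vee1\right)^{2\gamma+3}=5^{13}\approx1.2\times10^{9},\qquad\left.\frac{\sigma^{2}\left(\gamma\vee1\right)}{n}\right|_{n=5^{13}}=5^{-12}=\left.\left(\frac{\sigma^{2}}{n}\right)^{\frac{2\gamma+2}{2\gamma+3}}\right|_{n=5^{13}},
\]
so for every $n<5^{13}$ the rate $\frac{\sigma^{2}\left(\gamma\vee1\right)}{n}=\frac{5}{n}$ exceeds the classical $n^{-\frac{12}{13}}$; that is, up to the unspecified constants, the classical asymptotic rate understates the minimax MSE over a very wide range of sample sizes.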