Optimizing the input probability distribution of a discrete-time channel is a standard step in the information-theoretic analysis of digital communication systems. Nevertheless, many practical communication systems transmit independent and uniformly distributed symbols drawn from regular constellation sets. The introduction of the probabilistic amplitude shaping architecture has renewed interest in using optimized probability distributions, i.e., probabilistic shaping. Traditionally, probabilistic shaping has been employed to reduce the transmit power required for a given information rate over additive noise channels. While this translates into substantial performance gains for optical fiber communication systems, the interaction of shaping and fiber nonlinearity has posed intriguing questions. First, probabilistic shaping appears to exacerbate nonlinear interference noise (NLIN) because shaped constellations have larger higher-order standardized moments. Shaping distributions for the fiber channel must therefore be optimized differently from those used for linear channels. Second, finite-length effects related to the memory of the nonlinear fiber channel have been observed, which suggests that the marginal input-symbol distribution is not the only consideration. This paper provides a tutorial-style discussion of probabilistic shaping for optical fiber communication. Since the distinguishing property of the channel is the signal-dependent NLIN, we speak of probabilistic shaping for nonlinearity tolerance. Our analysis builds on the first-order time-domain perturbation approximation of the nonlinear fiber channel and revisits the notion of linear and nonlinear shaping gain. We largely focus on probabilistic amplitude shaping with popular shaping methods. The concept of shaping via sequence selection is given special consideration, as it inherently optimizes a multivariate distribution for shaped constellations.