Fine-grained person perception such as body segmentation and pose estimation has been achieved with many 2D and 3D sensors such as RGB/depth cameras, radars (e.g., RF-Pose) and LiDARs. These sensors capture 2D pixels or 3D point clouds of person bodies with high spatial resolution, such that the existing Convolutional Neural Networks can be directly applied for perception. In this paper, we take one step forward to show that fine-grained person perception is possible even with 1D sensors: WiFi antennas. To our knowledge, this is the first work to perceive persons with pervasive WiFi devices, which is cheaper and power efficient than radars and LiDARs, invariant to illumination, and has little privacy concern comparing to cameras. We used two sets of off-the-shelf WiFi antennas to acquire signals, i.e., one transmitter set and one receiver set. Each set contains three antennas lined-up as a regular household WiFi router. The WiFi signal generated by a transmitter antenna, penetrates through and reflects on human bodies, furniture and walls, and then superposes at a receiver antenna as a 1D signal sample (instead of 2D pixels or 3D point clouds). We developed a deep learning approach that uses annotations on 2D images, takes the received 1D WiFi signals as inputs, and performs body segmentation and pose estimation in an end-to-end manner. Experimental results on over 100000 frames under 16 indoor scenes demonstrate that Person-in-WiFi achieved person perception comparable to approaches using 2D images.