We introduce a new pipeline for hand localization and fingertip detection. For RGB images captured from an egocentric vision mobile camera, hand and fingertip detection remains a challenging problem due to factors like background complexity and hand shape variety. To address these issues accurately and robustly, we build a large scale dataset named Ego-Fingertip and propose a bi-level cascaded pipeline of convolutional neural networks, namely, Attention-based Hand Detector as well as Multi-point Fingertip Detector. The proposed method significantly tackles challenges and achieves satisfactorily accurate prediction and real-time performance compared to previous hand and fingertip detection methods.