This paper focuses on robotic picking tasks in cluttered scenes. Because the objects are diverse and are placed in clutter, it is difficult to recognize them and estimate their poses before grasping. Here, we use U-Net, a convolutional neural network (CNN) architecture, to combine RGB images and depth information and predict the picking region directly, without object recognition or pose estimation. We compare the performance of different visual inputs to the network, namely RGB, RGB-D, and RGB-Points, and find that the RGB-Points input achieves a precision of 95.74%.
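
To make the three input variants concrete, the following is a minimal sketch of how such inputs might be assembled before being passed to a segmentation network. It rests on assumptions not stated above: that RGB-Points means the RGB channels concatenated with per-pixel XYZ coordinates back-projected from the depth map through a pinhole camera model, and that the intrinsics (fx, fy, cx, cy) are known. The function names `depth_to_points` and `build_input` and the default intrinsic values are hypothetical and serve only as illustration.

```python
import numpy as np


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W) into per-pixel XYZ coordinates (H, W, 3).

    Assumes a pinhole camera model; the intrinsics are hypothetical inputs.
    """
    h, w = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float32)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).astype(np.float32)


def build_input(rgb, depth, mode, intrinsics=(525.0, 525.0, 319.5, 239.5)):
    """Assemble an (H, W, C) network input for one of three assumed modes.

    "rgb"        -> 3 channels (colour only)
    "rgbd"       -> 4 channels (colour + raw depth)
    "rgb_points" -> 6 channels (colour + back-projected XYZ)
    The default intrinsics are placeholder values, not taken from the paper.
    """
    rgb = rgb.astype(np.float32) / 255.0  # scale colour to [0, 1]
    if mode == "rgb":
        return rgb
    if mode == "rgbd":
        d = depth.astype(np.float32)[..., None]
        return np.concatenate([rgb, d], axis=-1)
    if mode == "rgb_points":
        pts = depth_to_points(depth, *intrinsics)
        return np.concatenate([rgb, pts], axis=-1)
    raise ValueError(f"unknown mode: {mode!r}")
```

Under these assumptions, the only difference between the three variants seen by the network is the number of input channels (3, 4, or 6), so the same U-Net architecture can be reused by changing the width of its first convolution.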