We study classification or detection problems in which the label indicates only whether at least one instance of a class exists in a training sample. No further information, e.g., the number of instances of each class or their locations or relative order in the training data, is exploited. The model is learned by maximizing the likelihood of the event that, in a given training sample, instances of certain classes exist while no instances of the other classes exist. We use image recognition as the example task to develop our method, although it applies to data of higher or lower dimension without much modification. Our method can be used to train all-convolutional neural networks for object detection and localization, e.g., reading street view house numbers in images of varying sizes, without any further processing.
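To make the training objective concrete, the following is a minimal sketch of an existence-only log-likelihood. It rests on a simplifying assumption that is not stated in the text above: each position in the sample emits an independent probability per class, so the probability that a class is absent is the product of (1 - p) over positions. The function name `existence_log_likelihood` and its arguments are hypothetical, and the formulation developed in this work may differ.

```python
import math

def existence_log_likelihood(probs, present):
    """Log-likelihood of existence-only labels for one training sample.

    probs:   list of rows, one per position; probs[i][c] is the (assumed
             independent) probability that class c occurs at position i.
    present: list of booleans; present[c] is True if at least one
             instance of class c exists in the sample.
    """
    eps = 1e-7  # keeps logarithms finite when a probability hits 0 or 1
    total = 0.0
    for c, exists in enumerate(present):
        # log P(no instance of class c at any position), under independence
        log_absent = sum(
            math.log1p(-min(max(row[c], eps), 1.0 - eps)) for row in probs
        )
        if exists:
            # log P(at least one instance) = log(1 - exp(log_absent))
            total += math.log(-math.expm1(log_absent))
        else:
            total += log_absent
    return total

# Two positions, two classes; class 0 is present, class 1 is absent.
ll = existence_log_likelihood([[0.5, 0.2], [0.5, 0.1]], [True, False])
```

Maximizing this quantity over the parameters that produce `probs` rewards the model for placing probability on a present class at some position, without ever being told which position or how many instances there are.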