In undirected graphical models, learning the graph structure and learning the functions that relate the predictive variables (features) to the responses given the structure are two topics that have been widely investigated in machine learning and statistics. Learning a graphical model in two stages is problematic because the graph structure may change once the features are taken into account. The main contribution of this paper is a method that learns the graph structure and the functions on the graph simultaneously. General graphical models with binary outcomes conditioned on predictive variables are proved to be equivalent to the multivariate Bernoulli model. Reparameterizing the potential functions of the graphical model as conditional log odds ratios in the multivariate Bernoulli model offers an advantage in representing the conditional independence structure of the model. Additionally, we impose a structure penalty on groups of conditional log odds ratios to learn the graph structure. These groups of functions are designed with overlaps to enforce hierarchical function selection. In this way, we are able to shrink higher-order interactions to obtain a sparse graph structure. Simulation studies show that the method is able to recover the graph structure. An analysis of county data from the Census Bureau reveals interesting relations among unemployment rate, crime, and other variables.
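
As a minimal illustration of the reparameterization described above, consider the smallest case of two binary outcomes $Y_1, Y_2$ conditioned on features $x$. The symbols $p_{ab}(x)$, $f_1$, $f_2$, $f_{12}$, and $b$ are notation assumed here for the sketch and do not appear in the abstract; the general model has one such function for every subset of nodes.

```latex
% Assumed notation: p_{ab}(x) = P(Y_1 = a, Y_2 = b \mid x), all strictly positive.
\begin{aligned}
p(y_1, y_2 \mid x)
  &= p_{00}^{(1-y_1)(1-y_2)}\, p_{10}^{y_1(1-y_2)}\, p_{01}^{(1-y_1)y_2}\, p_{11}^{y_1 y_2}, \\
\log p(y_1, y_2 \mid x)
  &= f_1(x)\, y_1 + f_2(x)\, y_2 + f_{12}(x)\, y_1 y_2 - b(x), \\
f_1 &= \log\frac{p_{10}}{p_{00}}, \qquad
f_2 = \log\frac{p_{01}}{p_{00}}, \qquad
f_{12} = \log\frac{p_{11}\, p_{00}}{p_{10}\, p_{01}}, \\
b &= -\log p_{00} = \log\!\left(1 + e^{f_1} + e^{f_2} + e^{f_1 + f_2 + f_{12}}\right).
\end{aligned}
```

In this two-node case, $Y_1$ and $Y_2$ are conditionally independent given $x$ exactly when $f_{12}(x) = 0$, so shrinking the conditional log odds ratio $f_{12}$ to zero corresponds to removing the edge between the two nodes.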