We consider the problem of learning a type of lexical semantic knowledge that can be expressed as a binary relation between words, such as verb subcategorization (a verb-noun relation) and the compound noun phrase relation (a noun-noun relation). Specifically, we view this problem as an on-line learning problem in the sense of Littlestone's learning model, in which the learner's goal is to minimize the total number of prediction mistakes. In the computational learning theory literature, Goldman, Rivest, and Schapire, and subsequently Goldman and Warmuth, have considered the on-line learning problem for binary relations R : X × Y → {0, 1} in which one of the domain sets, X, can be partitioned into a relatively small number of types, namely clusters consisting of behaviorally indistinguishable members of X. In this paper, we extend this model by supposing that both of the sets X and Y can be partitioned into a small number of types, and we propose a family of prediction algorithms that are two-dimensional extensions of the weighted-majority-type algorithm Goldman and Warmuth proposed for the original model. We apply these algorithms to the learning problem for the `compound noun phrase' relation, in which one noun is related to another just in case the two can form a noun phrase together. Our experimental results show that all of our algorithms outperform Goldman and Warmuth's algorithm. We also analyze the performance of one of our algorithms theoretically, in the form of an upper bound on the worst-case number of prediction mistakes it makes.
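To make the on-line protocol concrete, the following is a minimal sketch of a weighted-majority-style predictor for a binary relation over two clustered domains: the learner guesses an entry, the true value is revealed, and weights are updated on mistakes. It is an illustrative reconstruction assembled from the abstract alone, not the paper's actual algorithm; the class name, the per-row and per-column weight tables, and the demotion factor beta are all assumptions made for this example.

    # Hypothetical sketch (not the paper's algorithm): a two-dimensional
    # weighted-majority predictor for a binary relation R : X x Y -> {0, 1}.
    # Each row keeps a weight for every other row, and each column for every
    # other column; a query (x, y) is answered by a weighted vote over the
    # previously observed entries, and weights that contributed to a wrong
    # vote are demoted multiplicatively, as in weighted majority.

    class TwoDimWeightedMajority:
        def __init__(self, n_rows, n_cols, beta=0.5):
            self.beta = beta                     # demotion factor in (0, 1)
            self.row_w = [[1.0] * n_rows for _ in range(n_rows)]
            self.col_w = [[1.0] * n_cols for _ in range(n_cols)]
            self.seen = {}                       # (x, y) -> observed bit

        def predict(self, x, y):
            # Weighted vote of all observed entries (x', y'): an entry's
            # influence is the product of its row-row and column-column weights.
            vote = [0.0, 0.0]
            for (xp, yp), bit in self.seen.items():
                vote[bit] += self.row_w[x][xp] * self.col_w[y][yp]
            return 1 if vote[1] > vote[0] else 0

        def update(self, x, y, true_bit):
            # Once the true value is revealed, demote the weights of every
            # observed entry that voted for the wrong answer, then record (x, y).
            for (xp, yp), bit in self.seen.items():
                if bit != true_bit:
                    self.row_w[x][xp] *= self.beta
                    self.col_w[y][yp] *= self.beta
            self.seen[(x, y)] = true_bit

    # Usage in the on-line protocol: guess, observe, update.
    learner = TwoDimWeightedMajority(n_rows=4, n_cols=4)
    guess = learner.predict(0, 1)        # learner's guess for R(0, 1)
    learner.update(0, 1, true_bit=1)     # truth revealed; weights adjusted

The intended effect is that pairs of behaviorally similar rows and columns retain high mutual weights, so predictions increasingly reflect the underlying type structure of the two domains.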