In this paper, we propose a new, scalable approach for the task of object based image search or object recognition. Despite the very large literature existing on the scalability issues in CBIR in the sense of retrieval approaches, the scalability of media and scalability of features remain an issue. In our work we tackle the problem of scalability and structural organization of features. The proposed features are nested local graphs built upon sets of SURF feature points with Delaunay triangulation. A Bag-of-Visual-Words (BoVW) framework is applied on these graphs, giving birth to a Bag-of-Graph-Words representation. The nested nature of the descriptors consists in scaling from trivial Delaunay graphs - isolated feature points - by increasing the number of nodes layer by layer up to graphs with maximal number of nodes. For each layer of graphs its proper visual dictionary is built. The experiments conducted on the SIVAL data set reveal that the graph features at different layers exhibit complementary performances on the same content. The nested approach, the combination of all existing layers, yields significant improvement of the object recognition performance compared to single level approaches.