Abstract
Thanks to the rapid development of the Internet and image capturing devices, the number of images available online has grown exponentially. Efficient indexing and retrieval methods are crucial for leveraging this web image dataset, and they have an important impact on a number of research areas such as image recognition, image retrieval and computer graphics. In this chapter, we review the currently popular image representations and the corresponding large-scale indexing technologies. For global representations, we review tree-based and hash-based index structures. For local features, which have recently received considerable attention for their invariance to lighting, scale and rotation, we review inverted list indexing and the related “long query problem”. We then introduce an image decomposition approach that converts the local feature representation from high-dimensional sparse feature vectors to (relatively) low-dimensional dense feature vectors with residual information. We also discuss a specially designed index structure that facilitates efficient storage and retrieval for this image representation. At the end of the chapter, we present extensive experimental results on a database of 2.3 million images to demonstrate the efficacy of the image decomposition approach.
Keywords: Image retrieval, image indexing, data-driven image understanding, inverted list, long query, search engine, image search, visual search, bag of words, dimension reduction, latent Dirichlet allocation.