Here's a link to the publisher's Web page for my paper on distance-based indexing, which I presented at SPIRE '05 in the first week of November 2005. It's © Springer-Verlag (the publisher of the printed version) and in deference to their wish for a one-year blackout to encourage purchase of the book, I've waited before posting the paper here; but now that the year is up I'm posting the 410K PDF file for people who don't have access to Springer-Verlag's online service. Abstract below; BIBTeX database of this paper and my references also available.
Data structures for similarity search are commonly evaluated on data in vector spaces, but distance-based data structures are also applicable to non-vector spaces with no natural concept of dimensionality. The intrinsic dimensionality statistic of Chávez and Navarro provides a way to compare the performance of similarity indexing and search algorithms across different spaces, and predict the performance of index data structures on non-vector spaces by relating them to equivalent vector spaces. We characterise its asymptotic behaviour, and give experimental results to calibrate these comparisons.
No comments yet.