Measuring the Difficulty of Distance-Based Indexing

3 November 2005 - updated 12 May 2008
[Site traffic Strip-O-Meter]

Click to censor the Strip-O-Meter.

Here's a link to the publisher's Web page for my paper on distance-based indexing, which I presented at SPIRE '05 in the first week of November 2005. It's © Springer-Verlag (the publisher of the printed version) and in deference to their wish for a one-year blackout to encourage purchase of the book, I've waited before posting the paper here; but now that the year is up I'm posting the 410K PDF file for people who don't have access to Springer-Verlag's online service. Abstract below; BIBTeX database of this paper and my references also available.

Data structures for similarity search are commonly evaluated on data in vector spaces, but distance-based data structures are also applicable to non-vector spaces with no natural concept of dimensionality. The intrinsic dimensionality statistic of Chávez and Navarro provides a way to compare the performance of similarity indexing and search algorithms across different spaces, and predict the performance of index data structures on non-vector spaces by relating them to equivalent vector spaces. We characterise its asymptotic behaviour, and give experimental results to calibrate these comparisons.

Comments

No comments yet.

New comments are disabled, pending transition to new site code.
Copyright 2005, 2008 Matthew Skala
Updates to this site: [RSS syndication file]