I recently gave an informal talk at the 2009 BIRS Workshop on Mathematics of String Spaces and Algorithmic Applications. This is mostly an expanded and retreaded version of my SISAP'08 "Reverse Similarity Search" talk. So why post another page about it? Because for the BIRS version of the talk, I was wearing a wire, and there's a voice recording (14Megabyte MP3 file, 50 minutes) which you can listen to while you follow along in the PDF slides. Here's the abstract:
We discuss the complexity of a class of decision problems derived from distance-based index data structures. Finding a point that would be stored in a given leaf of a data structure, or proving that no such point exists, may be polynomial-time or NP-complete depending on the underlying metric space. This problem arises from questions of security in robust hash or obfuscated recognition applications. The problem is analysed for spaces of strings and vectors with a variety of metrics: strings with Hamming distance; the usual (Levenshtein) edit distance; an edit distance we introduce called Superghost distance; arbitrary weighted tree metrics; and real vectors with Minkowski $L_p$ metrics (of which the Euclidean distance is a special case). We also summarize some results on distance permutations and blind substring search.