This paper was originally published as Technical Report No. 210 in January 2005 for the Ivan E. Seidenberg School of Computer Science and Information Systems (ISSCSIS), Pace University.

Similarity and dissimilarity measures play an important role in pattern classification and clustering. For a century, researchers have searched for a good measure. Here, we review, categorize, and evaluate various binary vector similarity and dissimilarity measures. One of the most contentious questions in selecting a similarity measure is whether it should include or exclude negative matches. Inner-product-based similarity measures consider only positive matches, whereas other conventional measures credit positive and negative matches equally. We therefore propose an enhanced similarity measure that gives variable credit, and we show that it is superior to conventional measures in iris biometric authentication and offline handwritten character recognition applications. Finally, the proposed similarity measure can be further boosted by applying weights, and we demonstrate that it outperforms the weighted Hamming distance.
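
The distinction between positive and negative matches can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the measure proposed in the report: the function names are hypothetical, and `variable_credit` with its parameter `sigma` is an assumed stand-in for the general idea of crediting negative matches by an adjustable amount, since the abstract does not give the actual formula.

```python
def match_counts(x, y):
    """Return (positive matches, negative matches) for two binary vectors."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    s11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)  # both bits are 1
    s00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)  # both bits are 0
    return s11, s00


def inner_product(x, y):
    """Inner-product similarity: counts positive matches only."""
    s11, _ = match_counts(x, y)
    return s11


def simple_matching(x, y):
    """Simple matching coefficient: credits both kinds of match equally."""
    s11, s00 = match_counts(x, y)
    return (s11 + s00) / len(x)


def variable_credit(x, y, sigma):
    """Hypothetical illustration only (not the report's formula):
    negative matches are credited by an adjustable factor sigma."""
    s11, s00 = match_counts(x, y)
    return s11 + sigma * s00


x = [1, 1, 0, 0, 0, 1]
y = [1, 0, 0, 0, 1, 1]
print(inner_product(x, y))         # 2 positive matches
print(simple_matching(x, y))       # (2 + 2) / 6
print(variable_credit(x, y, 0.5))  # 2 + 0.5 * 2 = 3.0
```

In this sketch, setting `sigma` to 0 ignores negative matches, as the inner product does, while setting it to 1 counts them on equal terms with positive matches.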