Fuzzy logic - Damerau-Levenshtein

Posted: Tue Jan 18, 2022 10:29 pm
by HSeguin
I want to match people's names, stored in 2 fields: first name and last name. On this site I found the very useful APL functions dist and fuzzy, based on the Levenshtein distance, which address this problem.

I have several sources containing people's names that I am trying to match. Even though most of this information was entered by the people themselves, it contains a multitude of typing errors. If the Levenshtein distance is low (e.g. 1), I would assume that there is a match.

The only problem is when two adjacent characters are swapped. With no other difference, the Levenshtein approach gives a distance of 2, because the swap counts as two substitutions. Ex: 'Hubert' dist 'Hubetr'.

I think this is a bit high, since in fact there is only 1 error.

I read that a variant, the Damerau-Levenshtein distance, handles exactly this case: it counts a transposition of two adjacent characters as a single edit.
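To make the difference concrete, here is a minimal sketch of what I mean. It is written in Python rather than APL, and it is not the dist function from this site. The names levenshtein, osa_distance and _edit_distance are my own, for illustration only. It implements the restricted form of Damerau-Levenshtein (often called optimal string alignment), which adds one extra case to the usual Levenshtein recurrence: an adjacent swap costs 1.

```python
def _edit_distance(a: str, b: str, allow_transposition: bool) -> int:
    """Dynamic-programming edit distance between strings a and b.

    d[i][j] is the distance between the first i characters of a
    and the first j characters of b.
    """
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i characters
    for j in range(n + 1):
        d[0][j] = j  # insert all j characters

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Extra case: the last two characters are swapped.
            if (
                allow_transposition
                and i > 1
                and j > 1
                and a[i - 1] == b[j - 2]
                and a[i - 2] == b[j - 1]
            ):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[m][n]


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (no transpositions)."""
    return _edit_distance(a, b, allow_transposition=False)


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    return _edit_distance(a, b, allow_transposition=True)


print(levenshtein("Hubert", "Hubetr"))   # 2
print(osa_distance("Hubert", "Hubetr"))  # 1
```

With this restricted form, 'Hubert' against 'Hubetr' comes out as 1 instead of 2, which is the behaviour I am after. As far as I understand, the unrestricted Damerau-Levenshtein distance can give a lower value in some cases where a swapped pair is also edited in between, but for simple typing errors in names the restricted form should be enough.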

Can anybody tell me if a modified version of the dist function exists somewhere?


Hubert Séguin