I have a list of first and last names in a CSV file, let's say:
Michael,Keaton
Matt,Damon
Jim,Carrey
I need to check input from a Typewriter node against this list to see whether someone is on it. If the name is typed exactly as it appears in the list, I get a hit with the Sift node, but if even one character is different I won't get a hit.
What is the best way to do an evaluation that returns a percentage of similarity between two strings?
Example “Michael,Keaton” vs “Michael,Keaton” 100%
Example “Michael,Keaton” vs “Michael,Weaton” 96%
Example “Michael,Keaton” vs “Michael” 47%
onlist.v4p (9.8 KB)
One thing you can try is the kNearestNeighbour (String) classifier in the machine learning pack:
for training, give each of the inputs its own class and see what the classifier returns.
Internally it uses a string-distance function: for two different strings, it counts the number of changes needed to turn one string into the other.
kNearestNeighbour_String_.v4p (7.7 KB)
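To illustrate the idea outside of vvvv, here is a minimal Python sketch of such a string distance turned into a percentage. It assumes the Levenshtein edit distance (insertions, deletions and substitutions each count as one step), which may not be the exact function the node uses. The names `levenshtein`, `similarity` and `best_match` are made up for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in b so the working rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in percent: 100 for identical strings, 0 when
    every character of the longer string has to change."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


def best_match(query: str, names: list[str]) -> str:
    """Return the list entry most similar to the query
    (a nearest-neighbour lookup with k = 1)."""
    return max(names, key=lambda name: similarity(query, name))


names = ["Michael,Keaton", "Matt,Damon", "Jim,Carrey"]
query = "Michael,Weaton"
match = best_match(query, names)
print(match, round(similarity(query, match), 1))
```

With this particular measure, “Michael,Weaton” scores about 93% against “Michael,Keaton” and “Michael” scores 50%, so the exact percentages depend on which distance function and normalisation you pick. In practice you would accept a hit when the best score is above some threshold that you tune yourself.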
And IIRC @microdee had some nodes to calculate string distances as well…
Yup, they are in mp.essentials and they are quite usable, although not the best design. In the future I’m planning to have a single node where you can select the algorithm, instead of a separate node for each one.