This wiki page describes the high-level component architecture of OpenMatchEngine.
Search
The Search component is responsible for identifying the likely matching records (a.k.a. potential candidates) of a record. The idea is to reduce the number of redundant comparisons by generating an array of fingerprint bands for a record, such that every likely matching record has at least one of its fingerprints inside one of those bands.
Each record has n fingerprints and m fingerprint bands. A fingerprint band is a range defined by a start fingerprint and an end fingerprint. The number of fingerprints and fingerprint bands differs from record to record. Every record has at least one of its own fingerprints falling within at least one of its own fingerprint bands.
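The band lookup described above can be sketched as follows. This is only an illustration under stated assumptions: fingerprints are modelled as plain integers, bands as inclusive (start, end) pairs, and the names Band, in_any_band and find_candidates are hypothetical rather than part of OpenMatchEngine. How fingerprints and bands are actually derived from a record is not covered here.

```python
from typing import Dict, List, Tuple

# Assumption: a band is an inclusive (start fingerprint, end fingerprint) pair.
Band = Tuple[int, int]


def in_any_band(fingerprints: List[int], bands: List[Band]) -> bool:
    """Return True if at least one fingerprint falls inside at least one band."""
    return any(start <= fp <= end for fp in fingerprints for start, end in bands)


def find_candidates(query_bands: List[Band], index: Dict[str, List[int]]) -> List[str]:
    """Return the ids of indexed records that have a fingerprint in the query's bands.

    `index` is a hypothetical in-memory map of record id -> fingerprints.
    """
    return [rid for rid, fps in index.items() if in_any_band(fps, query_bands)]


# Made-up fingerprints for three indexed records.
index = {"a": [10, 250], "b": [400], "c": [95, 900]}
# Made-up bands generated for the record being searched.
query_bands = [(90, 110), (240, 260)]
candidates = find_candidates(query_bands, index)
```

Only records "a" and "c" would be handed on to the Match component here; record "b" is never compared, which is the saving the bands are meant to provide.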
Match
The Match component is responsible for fuzzy matching between two records. The idea is to generate a similarity score (also called a confidence score or match score) for the two records being compared, based on how far apart they are.
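One common way to turn "how far apart" into a score is to normalise an edit distance. The sketch below assumes Levenshtein distance over plain strings and a score in the range 0.0 to 1.0. The page does not say which distance measure OpenMatchEngine uses, so both the measure and the function names are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions or substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def match_score(a: str, b: str) -> float:
    """Hypothetical similarity score: 1.0 for identical strings, 0.0 for nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty values are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

A caller would typically compare the score against a threshold to decide whether two records match; the choice of threshold is outside the scope of this sketch.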
RulesProcessor
The RulesProcessor component is responsible for applying various sets of rules to a record as it is being searched and matched. One example of a rule is to ignore "Inc." and "Corp" in a record.
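A rule such as the one above can be modelled as a function from a field value to a cleaned field value, with the processor applying each rule in turn. The following is a minimal sketch under that assumption; drop_tokens and apply_rules are hypothetical names and do not reflect the actual RulesProcessor interface.

```python
from typing import Callable, Iterable, List

# Assumption: a rule takes a field value and returns the rewritten value.
Rule = Callable[[str], str]


def drop_tokens(tokens: Iterable[str]) -> Rule:
    """Build a rule that removes the given words, ignoring case and trailing periods."""
    ignored = {t.lower().rstrip(".") for t in tokens}

    def rule(value: str) -> str:
        kept = [w for w in value.split() if w.lower().rstrip(".") not in ignored]
        return " ".join(kept)

    return rule


def apply_rules(value: str, rules: List[Rule]) -> str:
    """Apply every rule in order and return the final value."""
    for rule in rules:
        value = rule(value)
    return value


rules = [drop_tokens(["Inc.", "Corp"])]
cleaned = apply_rules("Acme Widgets Inc.", rules)
```

Because only whole words are compared, a name such as "Incorporated Metals" is left untouched by this rule.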
Rules
Rules represents a knowledge base of rules that must be processed as a record is being searched or matched.
Stabilization
The Stabilization component is responsible for stabilizing a record against phonetic and typographical errors.
NYSIIS and Double Metaphone are examples of stabilization algorithms.
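To show what a phonetic key looks like in practice, the sketch below implements American Soundex. Soundex is used here only because it is short enough to show in full; it is a swapped-in stand-in for illustration and is not one of the two algorithms named above. The function assumes names written in ASCII letters.

```python
# Soundex digit for each coded consonant; vowels, H, W and Y have no code.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the four-character American Soundex key, or "" if name has no letters."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    key = letters[0]
    last_code = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of identical codes
        code = _CODES.get(ch, "")
        if code and code != last_code:
            key += code
        last_code = code  # a vowel resets this, so a repeated code after it is kept
    return (key + "000")[:4]  # pad with zeros, then truncate to four characters
```

Spelling variants that sound alike, such as "Smith" and "Smyth" or "Robert" and "Rupert", receive the same key, so they can be treated as the same value when records are searched and matched.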