MITIGATION OF CONFLICTS BETWEEN CONTENT MATCHERS IN AUTOMATED DOCUMENT ANALYSIS
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | A method and apparatus for performing, by a processing device implementing a plurality of content matchers that each identify occurrences of respectively corresponding content types, automated document analysis of a document comprising a body of text, the method comprising: executing, by the processing device, each content matcher of the plurality of content matchers to identify, for each content matcher, a match in the body of the text and assigning a match strength to each match, where each match is an occurrence of a content type corresponding to the content matcher of the plurality of content matchers that identified the match; identifying, by the processing device, a conflict in content types between a first match assigned by a first content matcher of the plurality of content matchers and a second match assigned by a second content matcher of the plurality of content matchers, the first match having a first match strength and the second match having a second match strength; determining, by the processing device, whether either of the first match strength or the second match strength is greater than the other; responsive to a determination that neither of the first match strength and the second match strength is greater than the other, determining whether either of the first matcher rank or the second matcher rank is greater than the other; responsive to a determination that one of the first matcher rank and the second matcher rank is greater than the other, discarding the match of the first and second matches corresponding to the lesser of the first and second matcher ranks; and responsive to a determination that the first matcher rank and the second matcher rank are equal, determining, by the processing device, whether to discard both the first and the second matches or to keep both the first and second matches. |
---|
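
The tie-breaking cascade described in the summary (match strength first, then matcher rank, then a keep-both or discard-both decision) can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Match`, `TiePolicy`, and `resolve_conflict` are hypothetical, and two points are assumptions because the summary does not spell them out: that the weaker match is discarded when the strengths differ, and that the final keep-or-discard decision is driven by a configurable policy. How a conflict is detected in the first place is outside the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class TiePolicy(Enum):
    """Hypothetical policy for matches that tie on both strength and rank."""
    KEEP_BOTH = "keep_both"
    DISCARD_BOTH = "discard_both"


@dataclass(frozen=True)
class Match:
    """One occurrence of a content type found by a content matcher."""
    content_type: str
    text: str
    strength: float    # match strength assigned to this match
    matcher_rank: int  # rank of the matcher that produced it (higher wins)


def resolve_conflict(first, second, tie_policy=TiePolicy.DISCARD_BOTH):
    """Return the list of matches that survive a conflict between two matches."""
    # Step 1: compare match strengths.
    # Assumption: when one is stronger, the weaker match is discarded.
    if first.strength != second.strength:
        return [max(first, second, key=lambda m: m.strength)]
    # Step 2: strengths are equal, so compare matcher ranks and
    # discard the match that came from the lower-ranked matcher.
    if first.matcher_rank != second.matcher_rank:
        return [max(first, second, key=lambda m: m.matcher_rank)]
    # Step 3: ranks are equal too; keep both or discard both.
    # Assumption: the choice is made by a configurable policy.
    if tie_policy is TiePolicy.KEEP_BOTH:
        return [first, second]
    return []
```

For example, a date match and a phone-number match over the same text with equal strength are resolved in favor of the match whose matcher has the higher rank; if the ranks are also equal, the policy argument decides whether both survive or neither does.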