Profile of a dictionary compiled from scanning over one million words of surgical pathology narrative text
Published in: | Computers and Biomedical Research, 1980-08, Vol. 13 (4), p. 382-398 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | An anatomic pathology natural language dictionary (LEXICON) has evolved over a nine-year period as a result of scanning over one million words of narrative text from tissue examination request forms and surgical pathology reports. The text is parsed into individual words, which are looked up in LEXICON and flagged by action codes that determine their usage in constructing a KWIC index file and an on-line database retrievable by keywords. LEXICON now resides on an IBM 370/168 system and has survived several transfers between computer systems. After each batch of narrative text is scanned, an update program is used to modify LEXICON. LEXICON now contains 24,228 medical and nonmedical terms: 24.8% are errors (misspellings), 45.9% are keywords retrievable on and off line, and 52.2% are cross-referenced to a supplementary word. A preliminary study shows that many of the “nonmedical” terms in LEXICON carry significant medical information, and that there is considerable overlap of medical words among LEXICON, SNOMED, and ICDA-8. Our LEXICON appears to be an intermediate step in the process of evolving an algorithm capable of “understanding” medical narrative text. |
ISSN: | 0010-4809, 1090-2368 |
DOI: | 10.1016/0010-4809(80)90029-4 |