Automatically assigning semantic role labels to parts of documents

Bibliographic details
Main authors: Begun, Andrew Paul, Toprani, Bhaven, Jaffri, Taqi, Marti Orosa, Luis, Paoli, Jean, Taron, Michael, Sawicki, Marcin, Zhou, Xiaoquan, Pavlopoulou, Christina, Wu, Zhaofeng, Palmer, Michael, Gupta, Kush, Sarangi, Swagatika, Wadia, Zubin Rustom, Hoang, Andrew Minh, Pricoiu, Elena, Zhang, Yue, DeRose, Steven, Watson, David, Shehadeh, Manar, Paliakkara, Jerome George, Fan, Joshua Yongshin, Liu, Zhanlin, White, Eric
Format: Patent
Language: English
Description
Summary: Machine learning, artificial intelligence, and other computer-implemented methods are used to identify semantically important chunks in documents, automatically label them with appropriate datatypes and semantic roles, and use this enhanced information to assist authors and to support downstream processes. Chunk locations, datatypes, and semantic roles can often be determined automatically from what is here called "context": the combination of a chunk's formatting, structure, and content; those of adjacent or nearby content; overall patterns of occurrence in a document; and similarities of all these things across documents (mainly but not exclusively among documents in the same document set). Similarity is not limited to exact or fuzzy string or property comparisons; it may also include similarity of natural-language grammatical structure, ML (machine learning) techniques such as measuring the similarity of word, chunk, and other embeddings, and the datatypes and semantic roles of previously identified chunks.
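
The similarity-based labeling described in the summary might be sketched roughly as follows. This is an illustrative assumption, not the method claimed in the patent: the function names (embed, cosine, fuzzy, suggest_role), the role names, the sample sentences, and the equal weighting are all hypothetical, and a bag-of-words count vector stands in for the learned word or chunk embeddings the summary mentions. The sketch combines a fuzzy string comparison with an embedding-style cosine similarity and proposes, for a new chunk, the semantic role of the most similar previously labeled chunk.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def embed(text):
    """Toy stand-in for a learned embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fuzzy(a, b):
    """Fuzzy string similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest_role(chunk, labeled, weight=0.5):
    """Return (role, score) of the most similar previously labeled chunk.

    `weight` balances the embedding score against the fuzzy string score;
    the default of 0.5 is an arbitrary choice for illustration.
    """
    best_role, best_score = None, 0.0
    for text, role in labeled:
        score = (weight * cosine(embed(chunk), embed(text))
                 + (1 - weight) * fuzzy(chunk, text))
        if score > best_score:
            best_role, best_score = role, score
    return best_role, best_score


# Hypothetical, previously labeled chunks from other documents in the set.
labeled = [
    ("This Agreement is effective as of January 1, 2020", "effective_date"),
    ("This Agreement is governed by the laws of the State of New York",
     "governing_law"),
]

role, score = suggest_role(
    "This Agreement shall be governed by the laws of the State of Delaware",
    labeled,
)
print(role, round(score, 2))
```

In a fuller system, the other context signals the summary lists (formatting, structure, neighboring content, and patterns of occurrence) would presumably contribute further terms to the score; they are omitted here to keep the sketch short.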