Ontology-Based Information Extraction for Labeling Radical Online Content Using Distant Supervision
Published in: Information systems research, 2024-03, Vol. 35 (1), p. 203-225
Author: Etudo, Ugochukwu
Format: Article
Language: English
Subjects: Collective Action Framing Theory; distant supervision; Ideology; Knowledge representation; named entity recognition; Ontology; Propaganda; Radicalism; relation extraction; Social networks; terrorism; Websites
Online access: Full text
DOI: 10.1287/isre.2023.1223
ISSN: 1047-7047; EISSN: 1526-5536
Publisher: INFORMS (Linthicum)
Abstract:
Social media companies dedicate significant resources to creating machine-learning models that label harmful content on their platforms, including content promoting violent, extremist beliefs. These models must evolve to keep up with a dynamic threat landscape: as new violent ideologies emerge, existing models fail to detect them. Training fresh models for the task is risky (there are new model biases to understand), time consuming (a model needs to see many examples before it can predict new ones), and cost-ineffective. We propose an approach that prioritizes the evolution and representation of radical ideas by creating a computer program that explicitly keeps track of ideologies. We show how this program uses state-of-the-art deep-learning models to create human- and machine-readable representations of radical ideologies by automatically consuming content symbolic of those ideologies. Our approach validates the notion that violent ideologies differ in content but are homogeneous in structure. With just a few examples of content, the program creates powerful representations that can be used to automatically detect additional content with surprising accuracy. This process greatly reduces the time and resources necessary to adapt existing content-labeling models to the changing ideological and rhetorical landscape.
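As a rough illustration of the distant-supervision idea described above, the sketch below weakly labels posts using a small seed lexicon of ideology-symbolic terms and then trains an ordinary classifier on those weak labels. This is a minimal sketch, not the paper's pipeline: the seed terms, example posts, and the use of scikit-learn are assumptions made purely for illustration.

```python
# Minimal distant-supervision sketch (illustrative only; seed terms, posts,
# and the choice of scikit-learn are hypothetical, not the paper's design).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# A tiny "knowledge base" of terms symbolic of one (hypothetical) ideology.
# In an ontology-based pipeline these would come from ontology instances,
# not a hand-written list.
SEED_TERMS = {"caliphate", "crusader", "apostate"}

def weak_label(post: str) -> int:
    """Label a post 1 if it mentions any seed term, else 0 (no human annotation)."""
    tokens = set(post.lower().split())
    return int(bool(tokens & SEED_TERMS))

# Unlabeled posts harvested from a platform (toy examples).
posts = [
    "the caliphate will rise again",
    "join us against the crusader enemy",
    "lovely weather for a picnic today",
    "new phone arrived, battery life is great",
]
weak_labels = [weak_label(p) for p in posts]

# Train a conventional classifier on the weakly labeled corpus; it can then
# generalize beyond the literal seed terms.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(posts)
clf = LogisticRegression().fit(X, weak_labels)

# Classify an unseen post.
print(clf.predict(vectorizer.transform(["the apostates betrayed the caliphate"])))
```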
Radical, terroristic organizations pose threats to business, government, and society. The ubiquity of the modern Web and its participatory architecture have enabled such groups to become full-blown online propaganda machines. Today, radicalization that eventually leads to acts of terror occurs predominantly on the Web. Radical ideologies can be spread, in many cases unchecked, by malicious actors who take advantage of the frequently lax surveillance apparatus of online social platforms. This paper argues that an overlooked, essential first step to interdicting this threat is the large-scale, structured collection of knowledge regarding these ideologies in open machine-readable formats. Using Collective Action Framing Theory, this study develops a trio of design artifacts: the Terror Beliefs Ontology (TBO) for a general ontology of terroristic ideology, the Frame Discovery System (FDS) to automatically populate this ontology, and the Frame Resonance Detection System (FRDS) to accurately identify online personae or postings that espouse a radical ideology known to TBO. With a comprehensive evaluation, we demonstrate how these three instantiated design artifacts, working in concert, can automatically construct a knowledge representation of heterogeneous terroristic ideologies and accurately detect radical online postings. We offer the first design that can assign Web text to any radical ideology without the use of a hand-labeled training corpus.
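To make the division of labor among the three artifacts concrete, the sketch below uses a toy frame structure with diagnostic, prognostic, and motivational slots (following collective action framing theory) plus a crude overlap-based resonance score. The class, slot, and phrase names are hypothetical; this is a schematic stand-in, not the actual TBO schema or the FDS/FRDS implementations.

```python
# Schematic stand-in for an ontology-plus-resonance pipeline (not the paper's
# actual TBO/FDS/FRDS design). Frame slots loosely follow collective action
# framing theory; all names and example phrases are hypothetical.
from dataclasses import dataclass, field

@dataclass
class IdeologyFrame:
    name: str
    diagnostic: set[str] = field(default_factory=set)    # who/what is blamed
    prognostic: set[str] = field(default_factory=set)    # proposed remedy
    motivational: set[str] = field(default_factory=set)  # calls to action

    def populate(self, extracted: dict[str, list[str]]) -> None:
        """Mimics frame discovery: fold phrases extracted from seed content
        into the frame's slots (here supplied by hand for illustration)."""
        self.diagnostic |= set(extracted.get("diagnostic", []))
        self.prognostic |= set(extracted.get("prognostic", []))
        self.motivational |= set(extracted.get("motivational", []))

    def resonance(self, text: str) -> float:
        """Mimics resonance detection: fraction of non-empty frame slots
        echoed somewhere in the text."""
        text = text.lower()
        slots = [s for s in (self.diagnostic, self.prognostic, self.motivational) if s]
        hits = sum(any(phrase in text for phrase in slot) for slot in slots)
        return hits / max(1, len(slots))

frame = IdeologyFrame("hypothetical_movement")
frame.populate({
    "diagnostic": ["the corrupt government"],
    "prognostic": ["armed struggle"],
    "motivational": ["join your brothers"],
})
print(frame.resonance("only armed struggle can end the corrupt government"))  # ~0.67
```

In the paper's design, the ontology is populated automatically from seed propaganda (the subject terms point to named entity recognition and relation extraction), rather than from hand-specified phrases as in this toy example.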
History:
Olivia Sheng, Senior Editor; Huimin Zhao, Associate Editor.
Funding:
This work was partially supported by the Virginia Commonwealth University Presidential Research Quest (PeRQ) Fund.
Supplemental Material:
The e-companion is available at https://doi.org/10.1287/isre.2023.1223.