Are Foundation Models the Next-Generation Social Media Content Moderators?
Recent progress in artificial intelligence (AI) tools and systems has been significant, especially in their reasoning and efficiency. Notable examples include generative AI-based large language models (LLMs) like Generative Pre-trained Transformer 3.5 (GPT-3.5), GPT-4, and Gemini, among others. In o...
Published in: | IEEE intelligent systems 2024-11, Vol.39 (6), p.70-80 |
---|---|
Main authors: | Nadeem, Mohammad ; Javed, Laeeba ; Sohail, Shahab Saquib ; Cambria, Erik ; Hussain, Amir |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 80 |
---|---|
container_issue | 6 |
container_start_page | 70 |
container_title | IEEE intelligent systems |
container_volume | 39 |
creator | Nadeem, Mohammad ; Javed, Laeeba ; Sohail, Shahab Saquib ; Cambria, Erik ; Hussain, Amir |
description | Recent progress in artificial intelligence (AI) tools and systems has been significant, especially in their reasoning and efficiency. Notable examples include generative AI-based large language models (LLMs) like Generative Pre-trained Transformer 3.5 (GPT-3.5), GPT-4, and Gemini, among others. In our work, we evaluated the effectiveness of fine-tuned deep learning models compared to general-purpose LLMs in moderating image-based content. We used deep learning models such as convolutional neural networks, ResNet50, and VGG-16, trained them for violence detection on an image dataset, and tested them on a separate dataset. The same test dataset was also evaluated using Large Language and Vision Assistant (LLaVa) and GPT-4, two LLMs that can process images. The results demonstrate that the VGG-16 model had the highest accuracy at 0.94, while LLaVa had the lowest at 0.66. GPT-4 showed superiority over LLaVa with an accuracy value of 0.9242. LLaVa recorded the highest precision of all models. |
doi_str_mv | 10.1109/MIS.2024.3477109 |
format | Article |
fullrecord | <record><control><sourceid>ieee_RIE</sourceid><recordid>TN_cdi_ieee_primary_10779331</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10779331</ieee_id><sourcerecordid>10779331</sourcerecordid><originalsourceid>FETCH-LOGICAL-i106t-1068e6a45ebdf101902eea6ad05856470f52feb9831accf0a8401a271e0af27f3</originalsourceid><addsrcrecordid>eNotjE1PwzAQRH0AiVJ658DBfyBh13bi5ISqiH6ghh4K52qbrIVRiFFiJPj3RJTLPGneaIS4RUgRobyvt4dUgTKpNtZOxYWYYWYwwdyqK3E9ju8ASgMWM_G0HFiuwlffUvShl3VouRtlfGP5zN8xWXPPw1kdQuOpkzW3nmQV-sh9_NtPPgzjw424dNSNvPjnXLyuHl-qTbLbr7fVcpd4hDwmUxSck8n41DoELEExU04tZEWWGwsuU45PZaGRmsYBFQaQlEUGcso6PRd351_PzMfPwX_Q8HNEsLbUGvUv-jxJ2A</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Are Foundation Models the Next-Generation Social Media Content Moderators?</title><source>IEEE Electronic Library (IEL)</source><creator>Nadeem, Mohammad ; Javed, Laeeba ; Sohail, Shahab Saquib ; Cambria, Erik ; Hussain, Amir ; Cambria, Erik</creator><creatorcontrib>Nadeem, Mohammad ; Javed, Laeeba ; Sohail, Shahab Saquib ; Cambria, Erik ; Hussain, Amir ; Cambria, Erik</creatorcontrib><description>Recent progress in artificial intelligence (AI) tools and systems has been significant, especially in their reasoning and efficiency. Notable examples include generative AI-based large language models (LLMs) like Generative Pre-trained Transformer 3.5 (GPT-3.5), GPT-4, and Gemini, among others. In our work, we evaluated the effectiveness of fine-tuned deep learning models compared to general-purpose LLMs in moderating image-based content. We used deep learning models such as convolutional neural networks, ResNet50, and VGG-16, trained them for violence detection on an image dataset, and tested them on a separate dataset. The same test dataset was also evaluated using Large Language and Vision Assistant (LLaVa) and GPT-4, two LLMs that can process images. 
The results demonstrate that VGG-16 model had the highest accuracy at 0.94, while LLaVa had the lowest at 0.66. GPT-4 showed superiority over LLaVa with an accuracy value of 0.9242. LLaVa recorded the highest precision of all models.</description><identifier>ISSN: 1541-1672</identifier><identifier>DOI: 10.1109/MIS.2024.3477109</identifier><identifier>CODEN: IISYF7</identifier><language>eng</language><publisher>IEEE</publisher><subject>Accuracy ; Data models ; Deep learning ; Large language models ; Natural language processing ; Next generation networking ; Residual neural networks ; Social networking (online) ; Training ; Transformers</subject><ispartof>IEEE intelligent systems, 2024-11, Vol.39 (6), p.70-80</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0002-8080-082X ; 0009-0009-9258-881X ; 0000-0003-3664-5014 ; 0000-0002-3030-1280 ; 0000-0002-5944-7371</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10779331$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10779331$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Nadeem, Mohammad</creatorcontrib><creatorcontrib>Javed, Laeeba</creatorcontrib><creatorcontrib>Sohail, Shahab Saquib</creatorcontrib><creatorcontrib>Cambria, Erik</creatorcontrib><creatorcontrib>Hussain, Amir</creatorcontrib><creatorcontrib>Cambria, Erik</creatorcontrib><title>Are Foundation Models the Next-Generation Social Media Content Moderators?</title><title>IEEE intelligent systems</title><addtitle>MIS</addtitle><description>Recent progress in artificial intelligence (AI) tools and systems has been significant, especially in their reasoning and efficiency. 
Notable examples include generative AI-based large language models (LLMs) like Generative Pre-trained Transformer 3.5 (GPT-3.5), GPT-4, and Gemini, among others. In our work, we evaluated the effectiveness of fine-tuned deep learning models compared to general-purpose LLMs in moderating image-based content. We used deep learning models such as convolutional neural networks, ResNet50, and VGG-16, trained them for violence detection on an image dataset, and tested them on a separate dataset. The same test dataset was also evaluated using Large Language and Vision Assistant (LLaVa) and GPT-4, two LLMs that can process images. The results demonstrate that VGG-16 model had the highest accuracy at 0.94, while LLaVa had the lowest at 0.66. GPT-4 showed superiority over LLaVa with an accuracy value of 0.9242. LLaVa recorded the highest precision of all models.</description><subject>Accuracy</subject><subject>Data models</subject><subject>Deep learning</subject><subject>Large language models</subject><subject>Natural language processing</subject><subject>Next generation networking</subject><subject>Residual neural networks</subject><subject>Social networking (online)</subject><subject>Training</subject><subject>Transformers</subject><issn>1541-1672</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNotjE1PwzAQRH0AiVJ658DBfyBh13bi5ISqiH6ghh4K52qbrIVRiFFiJPj3RJTLPGneaIS4RUgRobyvt4dUgTKpNtZOxYWYYWYwwdyqK3E9ju8ASgMWM_G0HFiuwlffUvShl3VouRtlfGP5zN8xWXPPw1kdQuOpkzW3nmQV-sh9_NtPPgzjw424dNSNvPjnXLyuHl-qTbLbr7fVcpd4hDwmUxSck8n41DoELEExU04tZEWWGwsuU45PZaGRmsYBFQaQlEUGcso6PRd351_PzMfPwX_Q8HNEsLbUGvUv-jxJ2A</recordid><startdate>202411</startdate><enddate>202411</enddate><creator>Nadeem, Mohammad</creator><creator>Javed, Laeeba</creator><creator>Sohail, Shahab Saquib</creator><creator>Cambria, Erik</creator><creator>Hussain, Amir</creator><creator>Cambria, 
Erik</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><orcidid>https://orcid.org/0000-0002-8080-082X</orcidid><orcidid>https://orcid.org/0009-0009-9258-881X</orcidid><orcidid>https://orcid.org/0000-0003-3664-5014</orcidid><orcidid>https://orcid.org/0000-0002-3030-1280</orcidid><orcidid>https://orcid.org/0000-0002-5944-7371</orcidid></search><sort><creationdate>202411</creationdate><title>Are Foundation Models the Next-Generation Social Media Content Moderators?</title><author>Nadeem, Mohammad ; Javed, Laeeba ; Sohail, Shahab Saquib ; Cambria, Erik ; Hussain, Amir ; Cambria, Erik</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i106t-1068e6a45ebdf101902eea6ad05856470f52feb9831accf0a8401a271e0af27f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Accuracy</topic><topic>Data models</topic><topic>Deep learning</topic><topic>Large language models</topic><topic>Natural language processing</topic><topic>Next generation networking</topic><topic>Residual neural networks</topic><topic>Social networking (online)</topic><topic>Training</topic><topic>Transformers</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Nadeem, Mohammad</creatorcontrib><creatorcontrib>Javed, Laeeba</creatorcontrib><creatorcontrib>Sohail, Shahab Saquib</creatorcontrib><creatorcontrib>Cambria, Erik</creatorcontrib><creatorcontrib>Hussain, Amir</creatorcontrib><creatorcontrib>Cambria, Erik</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><jtitle>IEEE intelligent systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Nadeem, 
Mohammad</au><au>Javed, Laeeba</au><au>Sohail, Shahab Saquib</au><au>Cambria, Erik</au><au>Hussain, Amir</au><au>Cambria, Erik</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Are Foundation Models the Next-Generation Social Media Content Moderators?</atitle><jtitle>IEEE intelligent systems</jtitle><stitle>MIS</stitle><date>2024-11</date><risdate>2024</risdate><volume>39</volume><issue>6</issue><spage>70</spage><epage>80</epage><pages>70-80</pages><issn>1541-1672</issn><coden>IISYF7</coden><abstract>Recent progress in artificial intelligence (AI) tools and systems has been significant, especially in their reasoning and efficiency. Notable examples include generative AI-based large language models (LLMs) like Generative Pre-trained Transformer 3.5 (GPT-3.5), GPT-4, and Gemini, among others. In our work, we evaluated the effectiveness of fine-tuned deep learning models compared to general-purpose LLMs in moderating image-based content. We used deep learning models such as convolutional neural networks, ResNet50, and VGG-16, trained them for violence detection on an image dataset, and tested them on a separate dataset. The same test dataset was also evaluated using Large Language and Vision Assistant (LLaVa) and GPT-4, two LLMs that can process images. The results demonstrate that VGG-16 model had the highest accuracy at 0.94, while LLaVa had the lowest at 0.66. GPT-4 showed superiority over LLaVa with an accuracy value of 0.9242. LLaVa recorded the highest precision of all models.</abstract><pub>IEEE</pub><doi>10.1109/MIS.2024.3477109</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0002-8080-082X</orcidid><orcidid>https://orcid.org/0009-0009-9258-881X</orcidid><orcidid>https://orcid.org/0000-0003-3664-5014</orcidid><orcidid>https://orcid.org/0000-0002-3030-1280</orcidid><orcidid>https://orcid.org/0000-0002-5944-7371</orcidid></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1541-1672 |
ispartof | IEEE intelligent systems, 2024-11, Vol.39 (6), p.70-80 |
issn | 1541-1672 |
language | eng |
recordid | cdi_ieee_primary_10779331 |
source | IEEE Electronic Library (IEL) |
subjects | Accuracy ; Data models ; Deep learning ; Large language models ; Natural language processing ; Next generation networking ; Residual neural networks ; Social networking (online) ; Training ; Transformers |
title | Are Foundation Models the Next-Generation Social Media Content Moderators? |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T12%3A09%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Are%20Foundation%20Models%20the%20Next-Generation%20Social%20Media%20Content%20Moderators?&rft.jtitle=IEEE%20intelligent%20systems&rft.au=Nadeem,%20Mohammad&rft.date=2024-11&rft.volume=39&rft.issue=6&rft.spage=70&rft.epage=80&rft.pages=70-80&rft.issn=1541-1672&rft.coden=IISYF7&rft_id=info:doi/10.1109/MIS.2024.3477109&rft_dat=%3Cieee_RIE%3E10779331%3C/ieee_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10779331&rfr_iscdi=true |
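
The abstract reports that LLaVa had the lowest accuracy of the evaluated models yet the highest precision. The two metrics measure different things, so this is not a contradiction. The following is a minimal sketch of how that can happen. The `Confusion` class and every count in it are hypothetical illustrations, not figures or code from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts ("violent" is the positive class)."""
    tp: int  # violent images flagged as violent
    fp: int  # non-violent images flagged as violent
    tn: int  # non-violent images left unflagged
    fn: int  # violent images left unflagged

    def accuracy(self) -> float:
        # Share of all images that were classified correctly.
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total

    def precision(self) -> float:
        # Share of flagged images that really were violent.
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0


# Hypothetical counts, NOT taken from the article.
# A cautious model flags rarely: it never raises a false alarm,
# but it misses many violent images.
cautious = Confusion(tp=20, fp=0, tn=50, fn=30)
# A balanced model flags more often and makes a few errors of each kind.
balanced = Confusion(tp=45, fp=5, tn=45, fn=5)

for name, model in (("cautious", cautious), ("balanced", balanced)):
    print(f"{name}: accuracy={model.accuracy():.2f}, "
          f"precision={model.precision():.2f}")
```

In this invented case the cautious model scores higher on precision and lower on accuracy than the balanced one, because precision ignores the violent images a model fails to flag.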