CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data

Bibliographic Details
Published in: Industrial & engineering chemistry research 2024-10, Vol.63 (41), p.17585-17598
Main Authors: Jami, Harshitha Chandra, Singh, Pushp Raj, Kumar, Avan, Bakshi, Bhavik R., Ramteke, Manojkumar, Kodamana, Hariprasad
Format: Article
Language: eng
Subjects: Process Systems Engineering
Online Access: Full text
container_end_page 17598
container_issue 41
container_start_page 17585
container_title Industrial & engineering chemistry research
container_volume 63
creator Jami, Harshitha Chandra
Singh, Pushp Raj
Kumar, Avan
Bakshi, Bhavik R.
Ramteke, Manojkumar
Kodamana, Hariprasad
description As the rate of carbon dioxide emissions directly contributes to global warming, there have been various attempts in the research community to develop novel pathways that mitigate and control this impact. The outcomes of this research are primarily documented in articles, and finding effective solutions necessitates the ability to scan the scientific literature and extract relevant information. In this study, we propose a large language model (LLM), CCU-Llama, for extracting knowledge about carbon capture and utilization (CCU). To create CCU-Llama, we employ the Llama-2 LLM and apply pretraining and transfer-learning techniques using CCU research articles sourced from the Elsevier database via API. Thorough preprocessing eliminates irrelevant content from the extracted article text. The proposed LLM performs two major tasks: (i) the CCU-technology potential knowledge extraction task, which provides information about a technology and its impact using sentence pair extraction and sentence pairing classification, with accuracies of 0.835 and 0.779, respectively, and (ii) a chatbot-like interface task with visualization, which can respond to any query related to CCU. CCU-Llama outperforms ChatGPT in F1-score and IFnumeric with scores of 0.7366 and 0.301, respectively, compared to ChatGPT’s scores of 0.6822 and 0.003. This work would help to rapidly analyze the number of carbon source capture and utilization applications. Ultimately, extracted knowledge about CCU technologies can be used to guide the transition toward the goal of net-zero greenhouse gas emissions.
doi_str_mv 10.1021/acs.iecr.4c01656
format Article
fullrecord <record><control><sourceid>acs_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1021_acs_iecr_4c01656</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>a612015071</sourcerecordid><originalsourceid>FETCH-LOGICAL-a163t-8eaacc1c84addb1cfc52a0b2aa10afc27815155043e32f47eec6ebc42faa44d33</originalsourceid><addsrcrecordid>eNp1kE1PwzAMhiMEEmNw55gfQEuSJl3EbSpjIDpxgJ0r102mTF06pZlg_Hq6jysny_LzWvZDyD1nKWeCPwL2qTMYUomM5yq_ICOuBEsUk-qSjJjWOlFaq2ty0_drxphSUo7IpiiWSdnCBp7olL777rs1zcrQ2U8MgNF1npblgtou0AJCPbQFbOMuGAq-ocvoWvcLR6ze04Xzzq_oJzrjo7MOaemiCXDknyHCLbmy0Pbm7lzHZPky-ypek_Jj_lZMywR4nsVEGwBEjlpC09QcLSoBrBYAnIFFMdFccTV8lplMWDkxBnNToxQWQMomy8aEnfZi6Po-GFttg9tA2FecVQdd1aCrOuiqzrqGyMMpcpisu13ww4H_43_EaG_7</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data</title><source>American Chemical Society Journals</source><creator>Jami, Harshitha Chandra ; Singh, Pushp Raj ; Kumar, Avan ; Bakshi, Bhavik R. ; Ramteke, Manojkumar ; Kodamana, Hariprasad</creator><creatorcontrib>Jami, Harshitha Chandra ; Singh, Pushp Raj ; Kumar, Avan ; Bakshi, Bhavik R. ; Ramteke, Manojkumar ; Kodamana, Hariprasad</creatorcontrib><description>As the rate of carbon dioxide emissions directly contributes to global warming, there have been various attempts in the research community to develop novel pathways that mitigate and control this impact. These outcomes of their research are primarily documented in articles, and finding effective solutions necessitates the ability to scan the scientific literature and extract relevant information. In this study, we propose a large language model (LLM), CCU-Llama, for extracting knowledge about carbon capture and utilization (CCU). 
To create CCU-Llama, employ Llama-2 LLM and apply pretraining and transfer-learning techniques using CCU research articles sourced from the Elsevier database via API. Thorough preprocessing eliminates irrelevant content from the extracted article text. This proposed LLM model performs two major tasks: (i) the CCU-technology potential knowledge extraction task, which provides information about technology and its impact using sentence pair extraction and sentence pairing classification with accuracy of 0.835 and 0.779, respectively, and (ii) creating an interface like a chatbot with a visualization task that can respond to any query related to CCU. CCU-Llama outperforms ChatGPT in F1-score and IFnumeric with scores of 0.7366 and 0.301, respectively, compared to ChatGPT’s scores of 0.6822 and 0.003. This work would help to rapidly analyze the number of carbon source capture and utilization applications. Ultimately, extracted knowledge about CCU technologies can be used to guide the transition toward the goal of net-zero greenhouse gas emissions.</description><identifier>ISSN: 0888-5885</identifier><identifier>EISSN: 1520-5045</identifier><identifier>DOI: 10.1021/acs.iecr.4c01656</identifier><language>eng</language><publisher>American Chemical Society</publisher><subject>Process Systems Engineering</subject><ispartof>Industrial &amp; engineering chemistry research, 2024-10, Vol.63 (41), p.17585-17598</ispartof><rights>2024 American Chemical Society</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a163t-8eaacc1c84addb1cfc52a0b2aa10afc27815155043e32f47eec6ebc42faa44d33</cites><orcidid>0000-0002-6604-8408 ; 0000-0003-3166-2712 ; 
0000-0002-3837-8952</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://pubs.acs.org/doi/pdf/10.1021/acs.iecr.4c01656$$EPDF$$P50$$Gacs$$H</linktopdf><linktohtml>$$Uhttps://pubs.acs.org/doi/10.1021/acs.iecr.4c01656$$EHTML$$P50$$Gacs$$H</linktohtml><link.rule.ids>315,781,785,2766,27080,27928,27929,56742,56792</link.rule.ids></links><search><creatorcontrib>Jami, Harshitha Chandra</creatorcontrib><creatorcontrib>Singh, Pushp Raj</creatorcontrib><creatorcontrib>Kumar, Avan</creatorcontrib><creatorcontrib>Bakshi, Bhavik R.</creatorcontrib><creatorcontrib>Ramteke, Manojkumar</creatorcontrib><creatorcontrib>Kodamana, Hariprasad</creatorcontrib><title>CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data</title><title>Industrial &amp; engineering chemistry research</title><addtitle>Ind. Eng. Chem. Res</addtitle><description>As the rate of carbon dioxide emissions directly contributes to global warming, there have been various attempts in the research community to develop novel pathways that mitigate and control this impact. These outcomes of their research are primarily documented in articles, and finding effective solutions necessitates the ability to scan the scientific literature and extract relevant information. In this study, we propose a large language model (LLM), CCU-Llama, for extracting knowledge about carbon capture and utilization (CCU). To create CCU-Llama, employ Llama-2 LLM and apply pretraining and transfer-learning techniques using CCU research articles sourced from the Elsevier database via API. Thorough preprocessing eliminates irrelevant content from the extracted article text. 
This proposed LLM model performs two major tasks: (i) the CCU-technology potential knowledge extraction task, which provides information about technology and its impact using sentence pair extraction and sentence pairing classification with accuracy of 0.835 and 0.779, respectively, and (ii) creating an interface like a chatbot with a visualization task that can respond to any query related to CCU. CCU-Llama outperforms ChatGPT in F1-score and IFnumeric with scores of 0.7366 and 0.301, respectively, compared to ChatGPT’s scores of 0.6822 and 0.003. This work would help to rapidly analyze the number of carbon source capture and utilization applications. Ultimately, extracted knowledge about CCU technologies can be used to guide the transition toward the goal of net-zero greenhouse gas emissions.</description><subject>Process Systems Engineering</subject><issn>0888-5885</issn><issn>1520-5045</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp1kE1PwzAMhiMEEmNw55gfQEuSJl3EbSpjIDpxgJ0r102mTF06pZlg_Hq6jysny_LzWvZDyD1nKWeCPwL2qTMYUomM5yq_ICOuBEsUk-qSjJjWOlFaq2ty0_drxphSUo7IpiiWSdnCBp7olL777rs1zcrQ2U8MgNF1npblgtou0AJCPbQFbOMuGAq-ocvoWvcLR6ze04Xzzq_oJzrjo7MOaemiCXDknyHCLbmy0Pbm7lzHZPky-ypek_Jj_lZMywR4nsVEGwBEjlpC09QcLSoBrBYAnIFFMdFccTV8lplMWDkxBnNToxQWQMomy8aEnfZi6Po-GFttg9tA2FecVQdd1aCrOuiqzrqGyMMpcpisu13ww4H_43_EaG_7</recordid><startdate>20241004</startdate><enddate>20241004</enddate><creator>Jami, Harshitha Chandra</creator><creator>Singh, Pushp Raj</creator><creator>Kumar, Avan</creator><creator>Bakshi, Bhavik R.</creator><creator>Ramteke, Manojkumar</creator><creator>Kodamana, Hariprasad</creator><general>American Chemical 
Society</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-6604-8408</orcidid><orcidid>https://orcid.org/0000-0003-3166-2712</orcidid><orcidid>https://orcid.org/0000-0002-3837-8952</orcidid></search><sort><creationdate>20241004</creationdate><title>CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data</title><author>Jami, Harshitha Chandra ; Singh, Pushp Raj ; Kumar, Avan ; Bakshi, Bhavik R. ; Ramteke, Manojkumar ; Kodamana, Hariprasad</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a163t-8eaacc1c84addb1cfc52a0b2aa10afc27815155043e32f47eec6ebc42faa44d33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Process Systems Engineering</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jami, Harshitha Chandra</creatorcontrib><creatorcontrib>Singh, Pushp Raj</creatorcontrib><creatorcontrib>Kumar, Avan</creatorcontrib><creatorcontrib>Bakshi, Bhavik R.</creatorcontrib><creatorcontrib>Ramteke, Manojkumar</creatorcontrib><creatorcontrib>Kodamana, Hariprasad</creatorcontrib><collection>CrossRef</collection><jtitle>Industrial &amp; engineering chemistry research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jami, Harshitha Chandra</au><au>Singh, Pushp Raj</au><au>Kumar, Avan</au><au>Bakshi, Bhavik R.</au><au>Ramteke, Manojkumar</au><au>Kodamana, Hariprasad</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data</atitle><jtitle>Industrial &amp; engineering chemistry research</jtitle><addtitle>Ind. Eng. Chem. 
Res</addtitle><date>2024-10-04</date><risdate>2024</risdate><volume>63</volume><issue>41</issue><spage>17585</spage><epage>17598</epage><pages>17585-17598</pages><issn>0888-5885</issn><eissn>1520-5045</eissn><abstract>As the rate of carbon dioxide emissions directly contributes to global warming, there have been various attempts in the research community to develop novel pathways that mitigate and control this impact. These outcomes of their research are primarily documented in articles, and finding effective solutions necessitates the ability to scan the scientific literature and extract relevant information. In this study, we propose a large language model (LLM), CCU-Llama, for extracting knowledge about carbon capture and utilization (CCU). To create CCU-Llama, employ Llama-2 LLM and apply pretraining and transfer-learning techniques using CCU research articles sourced from the Elsevier database via API. Thorough preprocessing eliminates irrelevant content from the extracted article text. This proposed LLM model performs two major tasks: (i) the CCU-technology potential knowledge extraction task, which provides information about technology and its impact using sentence pair extraction and sentence pairing classification with accuracy of 0.835 and 0.779, respectively, and (ii) creating an interface like a chatbot with a visualization task that can respond to any query related to CCU. CCU-Llama outperforms ChatGPT in F1-score and IFnumeric with scores of 0.7366 and 0.301, respectively, compared to ChatGPT’s scores of 0.6822 and 0.003. This work would help to rapidly analyze the number of carbon source capture and utilization applications. 
Ultimately, extracted knowledge about CCU technologies can be used to guide the transition toward the goal of net-zero greenhouse gas emissions.</abstract><pub>American Chemical Society</pub><doi>10.1021/acs.iecr.4c01656</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-6604-8408</orcidid><orcidid>https://orcid.org/0000-0003-3166-2712</orcidid><orcidid>https://orcid.org/0000-0002-3837-8952</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0888-5885
ispartof Industrial & engineering chemistry research, 2024-10, Vol.63 (41), p.17585-17598
issn 0888-5885
1520-5045
language eng
recordid cdi_crossref_primary_10_1021_acs_iecr_4c01656
source American Chemical Society Journals
subjects Process Systems Engineering
title CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T13%3A32%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acs_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=CCU-Llama:%20A%20Knowledge%20Extraction%20LLM%20for%20Carbon%20Capture%20and%20Utilization%20by%20Mining%20Scientific%20Literature%20Data&rft.jtitle=Industrial%20&%20engineering%20chemistry%20research&rft.au=Jami,%20Harshitha%20Chandra&rft.date=2024-10-04&rft.volume=63&rft.issue=41&rft.spage=17585&rft.epage=17598&rft.pages=17585-17598&rft.issn=0888-5885&rft.eissn=1520-5045&rft_id=info:doi/10.1021/acs.iecr.4c01656&rft_dat=%3Cacs_cross%3Ea612015071%3C/acs_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
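
The `url` field above is an OpenURL (Z39.88-2004) link whose query string repeats the citation as `rft.*` key/value pairs. As a minimal sketch, assuming only Python's standard library, the pairs can be read back out as shown below. The helper name `parse_openurl_fields` and the shortened `QUERY` string are illustrative assumptions; `QUERY` copies only a subset of the pairs from the record, and the title fields are left out because the journal title in the original URL contains an unescaped `&` that a plain query-string parser would split on.

```python
from urllib.parse import parse_qs


def parse_openurl_fields(query: str) -> dict[str, str]:
    """Return the first value for each key of a key/value query string."""
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical shortened query: a subset of the pairs in the record's `url` field.
QUERY = (
    "rft.genre=article&rft.date=2024-10-04&rft.volume=63&rft.issue=41"
    "&rft.spage=17585&rft.epage=17598&rft.issn=0888-5885&rft.eissn=1520-5045"
    "&rft_id=info:doi/10.1021/acs.iecr.4c01656"
)

fields = parse_openurl_fields(QUERY)

# The DOI is carried as an info URI; strip the scheme prefix (Python 3.9+).
doi = fields["rft_id"].removeprefix("info:doi/")

# Inclusive page count derived from the start and end pages.
page_count = int(fields["rft.epage"]) - int(fields["rft.spage"]) + 1

print(doi, fields["rft.volume"], fields["rft.issue"], page_count)
```

The page count derived this way (14) agrees with the `tpages` value in the full record, which gives a quick consistency check on the parsed fields.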