CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data
As the rate of carbon dioxide emissions directly contributes to global warming, there have been various attempts in the research community to develop novel pathways that mitigate and control this impact. These outcomes of their research are primarily documented in articles, and finding effective sol...
Published in: | Industrial & engineering chemistry research 2024-10, Vol.63 (41), p.17585-17598 |
---|---|
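Main authors: | Jami, Harshitha Chandra; Singh, Pushp Raj; Kumar, Avan; Bakshi, Bhavik R.; Ramteke, Manojkumar; Kodamana, Hariprasad |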
Format: | Article |
Language: | eng |
Subjects: | Process Systems Engineering |
Online access: | Full text |
container_end_page | 17598 |
---|---|
container_issue | 41 |
container_start_page | 17585 |
container_title | Industrial & engineering chemistry research |
container_volume | 63 |
creator | Jami, Harshitha Chandra; Singh, Pushp Raj; Kumar, Avan; Bakshi, Bhavik R.; Ramteke, Manojkumar; Kodamana, Hariprasad |
description | As the rate of carbon dioxide emissions directly contributes to global warming, the research community has made various attempts to develop novel pathways that mitigate and control this impact. The outcomes of this research are primarily documented in articles, and finding effective solutions requires the ability to scan the scientific literature and extract relevant information. In this study, we propose a large language model (LLM), CCU-Llama, for extracting knowledge about carbon capture and utilization (CCU). To create CCU-Llama, we employ the Llama-2 LLM and apply pretraining and transfer-learning techniques using CCU research articles sourced from the Elsevier database via API. Thorough preprocessing eliminates irrelevant content from the extracted article text. The proposed model performs two major tasks: (i) the CCU-technology potential knowledge extraction task, which provides information about a technology and its impact using sentence pair extraction and sentence pairing classification, with accuracies of 0.835 and 0.779, respectively, and (ii) a chatbot-like interface with a visualization task that can respond to any query related to CCU. CCU-Llama outperforms ChatGPT in F1-score and IFnumeric, with scores of 0.7366 and 0.301, respectively, compared to ChatGPT’s scores of 0.6822 and 0.003. This work would help to rapidly analyze the number of carbon source capture and utilization applications. Ultimately, the extracted knowledge about CCU technologies can be used to guide the transition toward the goal of net-zero greenhouse gas emissions. |
doi_str_mv | 10.1021/acs.iecr.4c01656 |
format | Article |
fullrecord | <record><control><sourceid>acs_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1021_acs_iecr_4c01656</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>a612015071</sourcerecordid><originalsourceid>FETCH-LOGICAL-a163t-8eaacc1c84addb1cfc52a0b2aa10afc27815155043e32f47eec6ebc42faa44d33</originalsourceid><addsrcrecordid>eNp1kE1PwzAMhiMEEmNw55gfQEuSJl3EbSpjIDpxgJ0r102mTF06pZlg_Hq6jysny_LzWvZDyD1nKWeCPwL2qTMYUomM5yq_ICOuBEsUk-qSjJjWOlFaq2ty0_drxphSUo7IpiiWSdnCBp7olL777rs1zcrQ2U8MgNF1npblgtou0AJCPbQFbOMuGAq-ocvoWvcLR6ze04Xzzq_oJzrjo7MOaemiCXDknyHCLbmy0Pbm7lzHZPky-ypek_Jj_lZMywR4nsVEGwBEjlpC09QcLSoBrBYAnIFFMdFccTV8lplMWDkxBnNToxQWQMomy8aEnfZi6Po-GFttg9tA2FecVQdd1aCrOuiqzrqGyMMpcpisu13ww4H_43_EaG_7</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data</title><source>American Chemical Society Journals</source><creator>Jami, Harshitha Chandra ; Singh, Pushp Raj ; Kumar, Avan ; Bakshi, Bhavik R. ; Ramteke, Manojkumar ; Kodamana, Hariprasad</creator><creatorcontrib>Jami, Harshitha Chandra ; Singh, Pushp Raj ; Kumar, Avan ; Bakshi, Bhavik R. ; Ramteke, Manojkumar ; Kodamana, Hariprasad</creatorcontrib><description>As the rate of carbon dioxide emissions directly contributes to global warming, there have been various attempts in the research community to develop novel pathways that mitigate and control this impact. These outcomes of their research are primarily documented in articles, and finding effective solutions necessitates the ability to scan the scientific literature and extract relevant information. In this study, we propose a large language model (LLM), CCU-Llama, for extracting knowledge about carbon capture and utilization (CCU). 
To create CCU-Llama, employ Llama-2 LLM and apply pretraining and transfer-learning techniques using CCU research articles sourced from the Elsevier database via API. Thorough preprocessing eliminates irrelevant content from the extracted article text. This proposed LLM model performs two major tasks: (i) the CCU-technology potential knowledge extraction task, which provides information about technology and its impact using sentence pair extraction and sentence pairing classification with accuracy of 0.835 and 0.779, respectively, and (ii) creating an interface like a chatbot with a visualization task that can respond to any query related to CCU. CCU-Llama outperforms ChatGPT in F1-score and IFnumeric with scores of 0.7366 and 0.301, respectively, compared to ChatGPT’s scores of 0.6822 and 0.003. This work would help to rapidly analyze the number of carbon source capture and utilization applications. Ultimately, extracted knowledge about CCU technologies can be used to guide the transition toward the goal of net-zero greenhouse gas emissions.</description><identifier>ISSN: 0888-5885</identifier><identifier>EISSN: 1520-5045</identifier><identifier>DOI: 10.1021/acs.iecr.4c01656</identifier><language>eng</language><publisher>American Chemical Society</publisher><subject>Process Systems Engineering</subject><ispartof>Industrial & engineering chemistry research, 2024-10, Vol.63 (41), p.17585-17598</ispartof><rights>2024 American Chemical Society</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a163t-8eaacc1c84addb1cfc52a0b2aa10afc27815155043e32f47eec6ebc42faa44d33</cites><orcidid>0000-0002-6604-8408 ; 0000-0003-3166-2712 ; 
0000-0002-3837-8952</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://pubs.acs.org/doi/pdf/10.1021/acs.iecr.4c01656$$EPDF$$P50$$Gacs$$H</linktopdf><linktohtml>$$Uhttps://pubs.acs.org/doi/10.1021/acs.iecr.4c01656$$EHTML$$P50$$Gacs$$H</linktohtml><link.rule.ids>315,781,785,2766,27080,27928,27929,56742,56792</link.rule.ids></links><search><creatorcontrib>Jami, Harshitha Chandra</creatorcontrib><creatorcontrib>Singh, Pushp Raj</creatorcontrib><creatorcontrib>Kumar, Avan</creatorcontrib><creatorcontrib>Bakshi, Bhavik R.</creatorcontrib><creatorcontrib>Ramteke, Manojkumar</creatorcontrib><creatorcontrib>Kodamana, Hariprasad</creatorcontrib><title>CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data</title><title>Industrial & engineering chemistry research</title><addtitle>Ind. Eng. Chem. Res</addtitle><description>As the rate of carbon dioxide emissions directly contributes to global warming, there have been various attempts in the research community to develop novel pathways that mitigate and control this impact. These outcomes of their research are primarily documented in articles, and finding effective solutions necessitates the ability to scan the scientific literature and extract relevant information. In this study, we propose a large language model (LLM), CCU-Llama, for extracting knowledge about carbon capture and utilization (CCU). To create CCU-Llama, employ Llama-2 LLM and apply pretraining and transfer-learning techniques using CCU research articles sourced from the Elsevier database via API. Thorough preprocessing eliminates irrelevant content from the extracted article text. 
This proposed LLM model performs two major tasks: (i) the CCU-technology potential knowledge extraction task, which provides information about technology and its impact using sentence pair extraction and sentence pairing classification with accuracy of 0.835 and 0.779, respectively, and (ii) creating an interface like a chatbot with a visualization task that can respond to any query related to CCU. CCU-Llama outperforms ChatGPT in F1-score and IFnumeric with scores of 0.7366 and 0.301, respectively, compared to ChatGPT’s scores of 0.6822 and 0.003. This work would help to rapidly analyze the number of carbon source capture and utilization applications. Ultimately, extracted knowledge about CCU technologies can be used to guide the transition toward the goal of net-zero greenhouse gas emissions.</description><subject>Process Systems Engineering</subject><issn>0888-5885</issn><issn>1520-5045</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp1kE1PwzAMhiMEEmNw55gfQEuSJl3EbSpjIDpxgJ0r102mTF06pZlg_Hq6jysny_LzWvZDyD1nKWeCPwL2qTMYUomM5yq_ICOuBEsUk-qSjJjWOlFaq2ty0_drxphSUo7IpiiWSdnCBp7olL777rs1zcrQ2U8MgNF1npblgtou0AJCPbQFbOMuGAq-ocvoWvcLR6ze04Xzzq_oJzrjo7MOaemiCXDknyHCLbmy0Pbm7lzHZPky-ypek_Jj_lZMywR4nsVEGwBEjlpC09QcLSoBrBYAnIFFMdFccTV8lplMWDkxBnNToxQWQMomy8aEnfZi6Po-GFttg9tA2FecVQdd1aCrOuiqzrqGyMMpcpisu13ww4H_43_EaG_7</recordid><startdate>20241004</startdate><enddate>20241004</enddate><creator>Jami, Harshitha Chandra</creator><creator>Singh, Pushp Raj</creator><creator>Kumar, Avan</creator><creator>Bakshi, Bhavik R.</creator><creator>Ramteke, Manojkumar</creator><creator>Kodamana, Hariprasad</creator><general>American Chemical 
Society</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-6604-8408</orcidid><orcidid>https://orcid.org/0000-0003-3166-2712</orcidid><orcidid>https://orcid.org/0000-0002-3837-8952</orcidid></search><sort><creationdate>20241004</creationdate><title>CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data</title><author>Jami, Harshitha Chandra ; Singh, Pushp Raj ; Kumar, Avan ; Bakshi, Bhavik R. ; Ramteke, Manojkumar ; Kodamana, Hariprasad</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a163t-8eaacc1c84addb1cfc52a0b2aa10afc27815155043e32f47eec6ebc42faa44d33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Process Systems Engineering</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jami, Harshitha Chandra</creatorcontrib><creatorcontrib>Singh, Pushp Raj</creatorcontrib><creatorcontrib>Kumar, Avan</creatorcontrib><creatorcontrib>Bakshi, Bhavik R.</creatorcontrib><creatorcontrib>Ramteke, Manojkumar</creatorcontrib><creatorcontrib>Kodamana, Hariprasad</creatorcontrib><collection>CrossRef</collection><jtitle>Industrial & engineering chemistry research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jami, Harshitha Chandra</au><au>Singh, Pushp Raj</au><au>Kumar, Avan</au><au>Bakshi, Bhavik R.</au><au>Ramteke, Manojkumar</au><au>Kodamana, Hariprasad</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data</atitle><jtitle>Industrial & engineering chemistry research</jtitle><addtitle>Ind. Eng. Chem. 
Res</addtitle><date>2024-10-04</date><risdate>2024</risdate><volume>63</volume><issue>41</issue><spage>17585</spage><epage>17598</epage><pages>17585-17598</pages><issn>0888-5885</issn><eissn>1520-5045</eissn><abstract>As the rate of carbon dioxide emissions directly contributes to global warming, there have been various attempts in the research community to develop novel pathways that mitigate and control this impact. These outcomes of their research are primarily documented in articles, and finding effective solutions necessitates the ability to scan the scientific literature and extract relevant information. In this study, we propose a large language model (LLM), CCU-Llama, for extracting knowledge about carbon capture and utilization (CCU). To create CCU-Llama, employ Llama-2 LLM and apply pretraining and transfer-learning techniques using CCU research articles sourced from the Elsevier database via API. Thorough preprocessing eliminates irrelevant content from the extracted article text. This proposed LLM model performs two major tasks: (i) the CCU-technology potential knowledge extraction task, which provides information about technology and its impact using sentence pair extraction and sentence pairing classification with accuracy of 0.835 and 0.779, respectively, and (ii) creating an interface like a chatbot with a visualization task that can respond to any query related to CCU. CCU-Llama outperforms ChatGPT in F1-score and IFnumeric with scores of 0.7366 and 0.301, respectively, compared to ChatGPT’s scores of 0.6822 and 0.003. This work would help to rapidly analyze the number of carbon source capture and utilization applications. 
Ultimately, extracted knowledge about CCU technologies can be used to guide the transition toward the goal of net-zero greenhouse gas emissions.</abstract><pub>American Chemical Society</pub><doi>10.1021/acs.iecr.4c01656</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-6604-8408</orcidid><orcidid>https://orcid.org/0000-0003-3166-2712</orcidid><orcidid>https://orcid.org/0000-0002-3837-8952</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0888-5885 |
ispartof | Industrial & engineering chemistry research, 2024-10, Vol.63 (41), p.17585-17598 |
issn | 0888-5885 1520-5045 |
language | eng |
recordid | cdi_crossref_primary_10_1021_acs_iecr_4c01656 |
source | American Chemical Society Journals |
subjects | Process Systems Engineering |
title | CCU-Llama: A Knowledge Extraction LLM for Carbon Capture and Utilization by Mining Scientific Literature Data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T13%3A32%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acs_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=CCU-Llama:%20A%20Knowledge%20Extraction%20LLM%20for%20Carbon%20Capture%20and%20Utilization%20by%20Mining%20Scientific%20Literature%20Data&rft.jtitle=Industrial%20&%20engineering%20chemistry%20research&rft.au=Jami,%20Harshitha%20Chandra&rft.date=2024-10-04&rft.volume=63&rft.issue=41&rft.spage=17585&rft.epage=17598&rft.pages=17585-17598&rft.issn=0888-5885&rft.eissn=1520-5045&rft_id=info:doi/10.1021/acs.iecr.4c01656&rft_dat=%3Cacs_cross%3Ea612015071%3C/acs_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |