Text Mining in 19th-Century Essays for Investigating a Possible Collaborative Authorship Problem: John Stuart Mill and Harriet Taylor Mill
In this work, we use machine learning techniques to address a research question regarding the authorship of two famous nineteenth-century essays. On Liberty (1859) and The Subjection of Women (1869) were published under the name of John Stuart Mill, a widely studied nineteenth-century British...
Published in: | IEEE access 2022, Vol.10, p.20937-20947 |
---|---|
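Main authors: | Neocleous, Andreas; Kataliakos, Giorgos; Loizides, Antis |
Format: | Article |
Language: | eng |
Subjects: | Authorship; Authorship attribution; Classifiers; Collaboration; Feature extraction; feature selection; Machine learning; Mill, John Stuart (1806-1873); Questions; Reliability; Syntactics; Task analysis; text classification; Text mining; Training; Writing |
Online access: | Full text |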
container_end_page | 20947 |
---|---|
container_issue | |
container_start_page | 20937 |
container_title | IEEE access |
container_volume | 10 |
creator | Neocleous, Andreas; Kataliakos, Giorgos; Loizides, Antis |
description | In this work, we use machine learning techniques to address a research question regarding the authorship of two famous nineteenth-century essays. On Liberty (1859) and The Subjection of Women (1869) were published under the name of John Stuart Mill, a widely studied nineteenth-century British philosopher. Mill himself attributed them to collaboration with his wife and partner, Harriet Taylor Mill. More than 150 years later, the question remains whether the author of these two canonical texts in the history of political thought was solely John Stuart Mill. Experts are divided on taking John Stuart Mill's attribution at face value, since Harriet Taylor Mill had died in 1858. To address this question, we use a dataset consisting of essays by both authors to train three state-of-the-art classifiers that learn to distinguish the writing style of each author. We then use the trained models to attribute the two essays of disputed authorship to one of the two authors. The classifiers learn the two classes very well and return high accuracies on the validation set. On the test set, most of the models attribute the two essays to John Stuart Mill; however, a contribution by Harriet Taylor Mill is indicated for some chunks of text in both essays. These results, we conclude, explain why experts are divided on this particular research question. |
doi_str_mv | 10.1109/ACCESS.2022.3152201 |
format | Article |
coden | IAECCG |
collection | IEEE All-Society Periodicals Package (ASPP) 2005-present; IEEE Open Access Journals; IEEE All-Society Periodicals Package (ASPP) 1998-Present; IEEE Electronic Library (IEL); CrossRef; Computer and Information Systems Abstracts; Electronics & Communications Abstracts; Engineered Materials Abstracts; METADEX; Technology Research Database; Materials Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Academic; Computer and Information Systems Abstracts Professional; DOAJ Directory of Open Access Journals |
doaj_id | oai_doaj_org_article_30fe8b46b9d047d5a240769c81ca3c26 |
eissn | 2169-3536 |
genre | article |
ieee_id | 9715084 |
linktohtml | https://ieeexplore.ieee.org/document/9715084 |
oa | free_for_read |
orcidid | https://orcid.org/0000-0002-3587-6059; https://orcid.org/0000-0002-0730-2314 |
pqid | 2635045195 |
publisher | Piscataway: IEEE |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
toplevel | peer_reviewed; online_resources |
tpages | 11 |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2022, Vol.10, p.20937-20947 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_ieee_primary_9715084 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Authorship; Authorship attribution; Classifiers; Collaboration; Feature extraction; feature selection; Machine learning; Mill, John Stuart (1806-1873); Questions; Reliability; Syntactics; Task analysis; text classification; Text mining; Training; Writing |
title | Text Mining in 19th-Century Essays for Investigating a Possible Collaborative Authorship Problem: John Stuart Mill and Harriet Taylor Mill |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T07%3A47%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Text%20Mining%20in%2019th-Century%20Essays%20for%20Investigating%20a%20Possible%20Collaborative%20Authorship%20Problem:%20John%20Stuart%20Mill%20and%20Harriet%20Taylor%20Mill&rft.jtitle=IEEE%20access&rft.au=Neocleous,%20Andreas&rft.date=2022&rft.volume=10&rft.spage=20937&rft.epage=20947&rft.pages=20937-20947&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2022.3152201&rft_dat=%3Cproquest_ieee_%3E2635045195%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2635045195&rft_id=info:pmid/&rft_ieee_id=9715084&rft_doaj_id=oai_doaj_org_article_30fe8b46b9d047d5a240769c81ca3c26&rfr_iscdi=true |
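
The description above outlines a chunk-level attribution workflow: train on texts of known authorship, then label each chunk of a disputed text. The record does not name the three classifiers or the features used, so the following is only a minimal sketch under stated assumptions, not the authors' method. It substitutes a nearest-centroid rule over function-word frequencies; the word list, the chunk size, and the toy strings are all hypothetical placeholders (the strings are invented, not text by either author).

```python
from collections import Counter
import math
import re

# Hypothetical feature set: a few common English function words.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "it", "is", "which", "but")


def chunk_words(text, size):
    """Split text into consecutive chunks of `size` words; a short tail is dropped."""
    words = re.findall(r"[a-z']+", text.lower())
    return [words[i:i + size] for i in range(0, len(words) - size + 1, size)]


def features(words):
    """Relative frequency of each function word within one chunk."""
    counts = Counter(words)
    return [counts[w] / len(words) for w in FUNCTION_WORDS]


def train(labelled_texts, size):
    """Return {author: mean feature vector over that author's chunks}."""
    model = {}
    for author, text in labelled_texts.items():
        vectors = [features(c) for c in chunk_words(text, size)]
        model[author] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return model


def attribute(model, text, size):
    """Label every chunk of `text` with the author whose centroid is nearest."""
    labels = []
    for chunk in chunk_words(text, size):
        vec = features(chunk)
        labels.append(min(model, key=lambda a: math.dist(vec, model[a])))
    return labels


# Invented placeholder strings standing in for essays of known authorship.
toy_corpus = {
    "author_A": "the rule of the law is the rule of the state " * 20,
    "author_B": "but it is that which is to be and it is to be " * 20,
}
# Invented "disputed" text whose first half resembles A and second half B.
disputed = ("the law of the state is the rule of the land " * 5
            + "but it is to be that which it is " * 5)

model = train(toy_corpus, size=10)
labels = attribute(model, disputed, size=10)
print(Counter(labels))
```

Reporting a label per chunk rather than a single verdict mirrors the point made in the description: a text can be attributed mostly to one author while individual chunks still point to the other.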