Text Mining in 19th-Century Essays for Investigating a Possible Collaborative Authorship Problem: John Stuart Mill and Harriet Taylor Mill

In this work, we use machine learning techniques to address a research question regarding the authorship of two famous essays in the nineteenth century. On Liberty (1859) and The Subjection of Women (1869) were published under John Stuart Mill's name, a widely studied nineteenth-century British...

Bibliographic details
Published in: IEEE access 2022, Vol.10, p.20937-20947
Main authors: Neocleous, Andreas, Kataliakos, Giorgos, Loizides, Antis
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 20947
container_issue
container_start_page 20937
container_title IEEE access
container_volume 10
creator Neocleous, Andreas
Kataliakos, Giorgos
Loizides, Antis
description In this work, we use machine learning techniques to address a research question regarding the authorship of two famous nineteenth-century essays. On Liberty (1859) and The Subjection of Women (1869) were published under the name of John Stuart Mill, a widely studied nineteenth-century British philosopher. Mill himself attributed them to collaboration with his wife and partner, Harriet Taylor Mill. More than 150 years later, the question remains whether John Stuart Mill was the sole author of these two canonical texts in the history of political thought. Experts are divided on whether to take John Stuart Mill's attribution at face value, since Harriet Taylor Mill had died in 1858. To address this question, we use a dataset consisting of essays by both authors to train three state-of-the-art classifiers that learn to distinguish the writing style of each author. We then use the trained models to attribute the two essays of disputed authorship to one of the two authors. The classifiers learn the two classes very well and return high accuracies on the validation set. On the test set, most of the models attribute the two essays to John Stuart Mill; however, a contribution by Harriet Taylor Mill is indicated for some chunks of text in both essays. These results, we conclude, explain why experts are divided on this particular research question.
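The chunk-level attribution described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the record does not name the three classifiers or the features used, so the sketch swaps in a simple nearest-centroid classifier over function-word frequencies. The word list `FUNCTION_WORDS`, the chunk size, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math

# Hypothetical function-word list; the paper's actual features are not given in this record.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "is", "it", "which", "but"]


def chunk_words(text, size):
    """Split text into consecutive chunks of `size` words (the last chunk may be shorter)."""
    words = text.lower().split()
    return [words[i:i + size] for i in range(0, len(words), size)]


def features(words):
    """Relative frequency of each function word within one chunk."""
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train(labelled_texts, size):
    """Build one centroid per author from a dict mapping author -> known text."""
    return {
        author: centroid([features(chunk) for chunk in chunk_words(text, size)])
        for author, text in labelled_texts.items()
    }


def attribute(model, text, size):
    """Assign each chunk of a disputed text to the nearest author centroid and count votes."""
    votes = Counter()
    for chunk in chunk_words(text, size):
        vector = features(chunk)
        nearest = min(model, key=lambda author: math.dist(vector, model[author]))
        votes[nearest] += 1
    return votes
```

Calling `train` with known essays by each author and then `attribute` on a disputed essay yields a per-author count of chunks. A split vote across chunks is the kind of outcome the abstract reports for both essays, where most chunks go to one author and some to the other.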
doi_str_mv 10.1109/ACCESS.2022.3152201
format Article
fullrecord <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_ieee_primary_9715084</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9715084</ieee_id><doaj_id>oai_doaj_org_article_30fe8b46b9d047d5a240769c81ca3c26</doaj_id><sourcerecordid>2635045195</sourcerecordid><originalsourceid>FETCH-LOGICAL-c408t-8ed912180d02d3c392515363a2f1b6cb9dbd6f2a250b78cc18a88cd865819e0f3</originalsourceid><addsrcrecordid>eNpNUdFu2yAUtaZNWtX2C_qCtGdnXDA27C2ysjVTp1VK9oww4ITINRngavmFffVwXVXjBXTOuede7imKO8ArACw-r9t2s9utCCZkRYERguFdcUWgFiVltH7_3_tjcRvjCefDM8Saq-Lv3v5J6Icb3XhAbkQg0rFs7ZimcEGbGNUlot4HtB2fbUzuoNIsVOjRx-i6waLWD4PqfMjEs0XrKR19iEd3Ro_BZ_7pC_rujyPapUmFudEwIDUadK9CcDahvboM2X7Gb4oPvRqivX29r4tfXzf79r58-Plt264fSl1hnkpujQACHBtMDNVUEAb5a1SRHrpad8J0pu6JIgx3DdcauOJcG14zDsLinl4X28XXeHWS5-CeVLhIr5x8AXw4yDyq04OVFPeWd1WdTXHVGKZIhZtaaA5aUU3q7PVp8ToH_3vKG5InP4Uxjy8zy3DFQLCsootKh7y2YPu3roDlnKFcMpRzhvI1w1x1t1Q5a-1bhWiAYV7Rf3NemAw</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2635045195</pqid></control><display><type>article</type><title>Text Mining in 19th-Century Essays for Investigating a Possible Collaborative Authorship Problem: John Stuart Mill and Harriet Taylor Mill</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Neocleous, Andreas ; Kataliakos, Giorgos ; Loizides, Antis</creator><creatorcontrib>Neocleous, Andreas ; Kataliakos, Giorgos ; Loizides, Antis</creatorcontrib><description>In this work, we use machine learning techniques to address a research question regarding the authorship of two famous essays in the nineteenth century. On Liberty (1859) and The Subjection of Women (1869) were published under John Stuart Mill's name, a widely studied nineteenth-century British philosopher. 
Mill himself attributed them to collaboration with his wife and partner, Harriet Taylor Mill. More than 150 years later, the question remains whether the author of these two canonical texts in the history of political thought was solely John Stuart Mill. Experts are divided on taking John Stuart Mill's attribution at face value, since Harriet Taylor Mill had died in 1858. Addressing this question, we use a dataset consisted in essays of both authors, to train three state-of-the-art classifiers that are able to learn and distinguish the writing style of each author. Then, we use the models built to attribute the two famous essays of disputed authorship to one of the two. From the results, we conclude that the classifiers are able to learn the two classes very well, and they return high accuracies on the validation set. Regarding the test set, most of the models attribute the two essays to John Stuart Mill, however, the contribution of Harriet Taylor Mill is shown for some chunks of text of both essays. These results, we conclude, explain why experts are divided on this particular research question.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2022.3152201</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Authorship ; Authorship attribution ; Classifiers ; Collaboration ; Feature extraction ; feature selection ; Machine learning ; Mill, John Stuart (1806-1873) ; Questions ; Reliability ; Syntactics ; Task analysis ; text classification ; Text mining ; Training ; Writing</subject><ispartof>IEEE access, 2022, Vol.10, p.20937-20947</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c408t-8ed912180d02d3c392515363a2f1b6cb9dbd6f2a250b78cc18a88cd865819e0f3</citedby><cites>FETCH-LOGICAL-c408t-8ed912180d02d3c392515363a2f1b6cb9dbd6f2a250b78cc18a88cd865819e0f3</cites><orcidid>0000-0002-3587-6059 ; 0000-0002-0730-2314</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9715084$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,778,782,862,2098,4012,27616,27906,27907,27908,54916</link.rule.ids></links><search><creatorcontrib>Neocleous, Andreas</creatorcontrib><creatorcontrib>Kataliakos, Giorgos</creatorcontrib><creatorcontrib>Loizides, Antis</creatorcontrib><title>Text Mining in 19th-Century Essays for Investigating a Possible Collaborative Authorship Problem: John Stuart Mill and Harriet Taylor Mill</title><title>IEEE access</title><addtitle>Access</addtitle><description>In this work, we use machine learning techniques to address a research question regarding the authorship of two famous essays in the nineteenth century. On Liberty (1859) and The Subjection of Women (1869) were published under John Stuart Mill's name, a widely studied nineteenth-century British philosopher. Mill himself attributed them to collaboration with his wife and partner, Harriet Taylor Mill. More than 150 years later, the question remains whether the author of these two canonical texts in the history of political thought was solely John Stuart Mill. Experts are divided on taking John Stuart Mill's attribution at face value, since Harriet Taylor Mill had died in 1858. 
Addressing this question, we use a dataset consisted in essays of both authors, to train three state-of-the-art classifiers that are able to learn and distinguish the writing style of each author. Then, we use the models built to attribute the two famous essays of disputed authorship to one of the two. From the results, we conclude that the classifiers are able to learn the two classes very well, and they return high accuracies on the validation set. Regarding the test set, most of the models attribute the two essays to John Stuart Mill, however, the contribution of Harriet Taylor Mill is shown for some chunks of text of both essays. These results, we conclude, explain why experts are divided on this particular research question.</description><subject>Authorship</subject><subject>Authorship attribution</subject><subject>Classifiers</subject><subject>Collaboration</subject><subject>Feature extraction</subject><subject>feature selection</subject><subject>Machine learning</subject><subject>Mill, John Stuart (1806-1873)</subject><subject>Questions</subject><subject>Reliability</subject><subject>Syntactics</subject><subject>Task analysis</subject><subject>text classification</subject><subject>Text 
mining</subject><subject>Training</subject><subject>Writing</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUdFu2yAUtaZNWtX2C_qCtGdnXDA27C2ysjVTp1VK9oww4ITINRngavmFffVwXVXjBXTOuede7imKO8ArACw-r9t2s9utCCZkRYERguFdcUWgFiVltH7_3_tjcRvjCefDM8Saq-Lv3v5J6Icb3XhAbkQg0rFs7ZimcEGbGNUlot4HtB2fbUzuoNIsVOjRx-i6waLWD4PqfMjEs0XrKR19iEd3Ro_BZ_7pC_rujyPapUmFudEwIDUadK9CcDahvboM2X7Gb4oPvRqivX29r4tfXzf79r58-Plt264fSl1hnkpujQACHBtMDNVUEAb5a1SRHrpad8J0pu6JIgx3DdcauOJcG14zDsLinl4X28XXeHWS5-CeVLhIr5x8AXw4yDyq04OVFPeWd1WdTXHVGKZIhZtaaA5aUU3q7PVp8ToH_3vKG5InP4Uxjy8zy3DFQLCsootKh7y2YPu3roDlnKFcMpRzhvI1w1x1t1Q5a-1bhWiAYV7Rf3NemAw</recordid><startdate>2022</startdate><enddate>2022</enddate><creator>Neocleous, Andreas</creator><creator>Kataliakos, Giorgos</creator><creator>Loizides, Antis</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-3587-6059</orcidid><orcidid>https://orcid.org/0000-0002-0730-2314</orcidid></search><sort><creationdate>2022</creationdate><title>Text Mining in 19th-Century Essays for Investigating a Possible Collaborative Authorship Problem: John Stuart Mill and Harriet Taylor Mill</title><author>Neocleous, Andreas ; Kataliakos, Giorgos ; Loizides, Antis</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c408t-8ed912180d02d3c392515363a2f1b6cb9dbd6f2a250b78cc18a88cd865819e0f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Authorship</topic><topic>Authorship attribution</topic><topic>Classifiers</topic><topic>Collaboration</topic><topic>Feature extraction</topic><topic>feature selection</topic><topic>Machine learning</topic><topic>Mill, John Stuart (1806-1873)</topic><topic>Questions</topic><topic>Reliability</topic><topic>Syntactics</topic><topic>Task analysis</topic><topic>text classification</topic><topic>Text mining</topic><topic>Training</topic><topic>Writing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Neocleous, Andreas</creatorcontrib><creatorcontrib>Kataliakos, Giorgos</creatorcontrib><creatorcontrib>Loizides, Antis</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and 
Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Neocleous, Andreas</au><au>Kataliakos, Giorgos</au><au>Loizides, Antis</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Text Mining in 19th-Century Essays for Investigating a Possible Collaborative Authorship Problem: John Stuart Mill and Harriet Taylor Mill</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2022</date><risdate>2022</risdate><volume>10</volume><spage>20937</spage><epage>20947</epage><pages>20937-20947</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>In this work, we use machine learning techniques to address a research question regarding the authorship of two famous essays in the nineteenth century. On Liberty (1859) and The Subjection of Women (1869) were published under John Stuart Mill's name, a widely studied nineteenth-century British philosopher. Mill himself attributed them to collaboration with his wife and partner, Harriet Taylor Mill. More than 150 years later, the question remains whether the author of these two canonical texts in the history of political thought was solely John Stuart Mill. 
Experts are divided on taking John Stuart Mill's attribution at face value, since Harriet Taylor Mill had died in 1858. Addressing this question, we use a dataset consisted in essays of both authors, to train three state-of-the-art classifiers that are able to learn and distinguish the writing style of each author. Then, we use the models built to attribute the two famous essays of disputed authorship to one of the two. From the results, we conclude that the classifiers are able to learn the two classes very well, and they return high accuracies on the validation set. Regarding the test set, most of the models attribute the two essays to John Stuart Mill, however, the contribution of Harriet Taylor Mill is shown for some chunks of text of both essays. These results, we conclude, explain why experts are divided on this particular research question.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2022.3152201</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0002-3587-6059</orcidid><orcidid>https://orcid.org/0000-0002-0730-2314</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2022, Vol.10, p.20937-20947
issn 2169-3536
2169-3536
language eng
recordid cdi_ieee_primary_9715084
source IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Authorship
Authorship attribution
Classifiers
Collaboration
Feature extraction
feature selection
Machine learning
Mill, John Stuart (1806-1873)
Questions
Reliability
Syntactics
Task analysis
text classification
Text mining
Training
Writing
title Text Mining in 19th-Century Essays for Investigating a Possible Collaborative Authorship Problem: John Stuart Mill and Harriet Taylor Mill
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T07%3A47%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Text%20Mining%20in%2019th-Century%20Essays%20for%20Investigating%20a%20Possible%20Collaborative%20Authorship%20Problem:%20John%20Stuart%20Mill%20and%20Harriet%20Taylor%20Mill&rft.jtitle=IEEE%20access&rft.au=Neocleous,%20Andreas&rft.date=2022&rft.volume=10&rft.spage=20937&rft.epage=20947&rft.pages=20937-20947&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2022.3152201&rft_dat=%3Cproquest_ieee_%3E2635045195%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2635045195&rft_id=info:pmid/&rft_ieee_id=9715084&rft_doaj_id=oai_doaj_org_article_30fe8b46b9d047d5a240769c81ca3c26&rfr_iscdi=true