A simple way to improve multivariate analyses of paleoecological data sets
Multivariate methods such as cluster analysis and ordination are basic to paleoecology, but the messy nature of fossil occurrence data often makes it difficult to recover clear patterns. A recently described faunal similarity index based on the Forbes coefficient improves results when its complement...
Published in: | Paleobiology 2015-06, Vol.41 (3), p.377-386 |
---|---|
Main author: | Alroy, John |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 386 |
---|---|
container_issue | 3 |
container_start_page | 377 |
container_title | Paleobiology |
container_volume | 41 |
creator | Alroy, John |
description | Multivariate methods such as cluster analysis and ordination are basic to paleoecology, but the messy nature of fossil occurrence data often makes it difficult to recover clear patterns. A recently described faunal similarity index based on the Forbes coefficient improves results when its complement is employed as a distance metric. This index involves adding terms to the Forbes equation and ignoring one of the counts it employs (that of species found in neither of the samples under consideration). Analyses of simulated data matrices demonstrate its advantages. These matrices include large and small samples from two partially overlapping species pools. In a cluster analysis, the widely used Dice coefficient and the Euclidean distance metric both create groupings that reflect sample size, the Simpson index suggests large differences that do not exist, and the corrected Forbes index creates groupings based strictly on true faunal overlap. In a principal coordinates analysis (PCoA) the Forbes index almost removes the sample-size signal but other approaches create a second axis strongly dominated by sample size. Meanwhile, species lists of late Pleistocene mammals from the United States capture biogeographic signals that standard ordination methods do recover, but the adjusted Forbes coefficient spaces the points out more sensibly. Finally, when biome-scale lists for living mammals are added to the data set and extinct species are removed, correspondence analysis misleadingly separates out the biome lists, and PCoA based on the Dice coefficient places them to the edge of the cloud of fossil assemblage data points. PCoA based on the Forbes index places them in more reasonable positions. Thus, only the adjusted Forbes index is able to recover true biological patterns. These results suggest that the index may be useful in analyzing not only paleontological data sets but any data set that includes species lists having highly variable lengths. |
doi_str_mv | 10.1017/pab.2014.21 |
format | Article |
fullrecord | <record><control><sourceid>jstor_proqu</sourceid><recordid>TN_cdi_proquest_miscellaneous_1722170689</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><cupid>10_1017_pab_2014_21</cupid><jstor_id>44017937</jstor_id><sourcerecordid>44017937</sourcerecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1718188612</pqid></control><display><type>article</type><title>A simple way to improve multivariate analyses of paleoecological data sets</title><source>Jstor Complete Legacy</source><source>Cambridge University Press Journals Complete</source><creator>Alroy, John</creator><identifier>ISSN: 0094-8373</identifier><identifier>EISSN: 1938-5331</identifier><identifier>DOI: 10.1017/pab.2014.21</identifier><identifier>CODEN: PALBBM</identifier><language>eng</language><publisher>New York, USA: The Paleontological Society</publisher><ispartof>Paleobiology, 2015-06, Vol.41 (3), p.377-386</ispartof><rights>2015 The Paleontological Society. All rights reserved.</rights><rights>GeoRef, Copyright 2020, American Geosciences Institute. Reference includes data from GeoScienceWorld @Alexandria, VA @USA @United States. Abstract, Copyright, The Paleontological Society</rights><lds50>peer_reviewed</lds50></display><addata><au>Alroy, John</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A simple way to improve multivariate analyses of paleoecological data sets</atitle><jtitle>Paleobiology</jtitle><date>2015-06-01</date><volume>41</volume><issue>3</issue><spage>377</spage><epage>386</epage><pages>377-386</pages><issn>0094-8373</issn><eissn>1938-5331</eissn><coden>PALBBM</coden><cop>New York, USA</cop><pub>The Paleontological Society</pub><doi>10.1017/pab.2014.21</doi><tpages>10</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0094-8373 |
ispartof | Paleobiology, 2015-06, Vol.41 (3), p.377-386 |
issn | 0094-8373 1938-5331 |
language | eng |
recordid | cdi_proquest_miscellaneous_1722170689 |
source | Jstor Complete Legacy; Cambridge University Press Journals Complete |
subjects | Animal ecology ; Biomes ; Cenozoic ; Chordata ; Cluster analysis ; Datasets ; Extinct species ; faunal studies ; FEATURED ARTICLE ; Forbes index ; Fossils ; Mammalia ; Mammals ; Multivariate analysis ; North America ; Ordination ; Paleoecology ; Paleontology ; Pleistocene ; principal coordinates analysis ; Quaternary ; statistical analysis ; Synecology ; Tetrapoda ; United States ; upper Pleistocene ; Vertebrata ; vertebrate |
title | A simple way to improve multivariate analyses of paleoecological data sets |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T01%3A05%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20simple%20way%20to%20improve%20multivariate%20analyses%20of%20paleoecological%20data%20sets&rft.jtitle=Paleobiology&rft.au=Alroy,%20John&rft.date=2015-06-01&rft.volume=41&rft.issue=3&rft.spage=377&rft.epage=386&rft.pages=377-386&rft.issn=0094-8373&rft.eissn=1938-5331&rft.coden=PALBBM&rft_id=info:doi/10.1017/pab.2014.21&rft_dat=%3Cjstor_proqu%3E44017937%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1718188612&rft_id=info:pmid/&rft_cupid=10_1017_pab_2014_21&rft_jstor_id=44017937&rfr_iscdi=true |
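
The abstract in this record contrasts several similarity indices by how they respond to unequal sample sizes. As a rough illustration only, the sketch below computes the standard Dice, Simpson, and Forbes coefficients for two presence/absence species lists. The function names and the toy samples are hypothetical. The record does not give the additional terms of the corrected Forbes index, so they are not reproduced here; the sketch only follows the abstract in leaving out the count of species found in neither sample (d defaults to 0). Both samples are assumed to be non-empty.

```python
from math import isclose


def overlap_counts(sample_x, sample_y):
    """Return (a, b, c): species shared, only in x, only in y."""
    x, y = set(sample_x), set(sample_y)
    return len(x & y), len(x - y), len(y - x)


def dice(a, b, c):
    # Dice coefficient: 2a / (2a + b + c)
    return 2 * a / (2 * a + b + c)


def simpson(a, b, c):
    # Simpson index: a / (a + min(b, c))
    return a / (a + min(b, c))


def forbes(a, b, c, d=0):
    # Classic Forbes coefficient: a * n / ((a + b) * (a + c)).
    # d is the count of species found in neither sample; leaving it
    # at 0 ignores that count (an assumption based on the abstract,
    # not the full corrected index).
    n = a + b + c + d
    return a * n / ((a + b) * (a + c))


# Toy data: one pool of 20 species, a large sample that found all of
# them, and a small sample that found only 5 of the same species.
large = set(range(20))
small = set(range(5))

a, b, c = overlap_counts(large, small)
for name, index in (("Dice", dice), ("Simpson", simpson), ("Forbes", forbes)):
    similarity = index(a, b, c)
    # The complement of a similarity can serve as a distance.
    print(f"{name}: similarity={similarity:.2f}, distance={1 - similarity:.2f}")
```

In this toy case the small sample is a pure subset of the large one, yet the Dice similarity drops to 0.40 because of the size difference alone, while the Simpson and plain Forbes values stay at 1.00. That is the kind of sample-size effect the abstract describes for cluster analysis; it does not show the Simpson problem or the corrected index's behaviour reported there.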