A simple way to improve multivariate analyses of paleoecological data sets

Multivariate methods such as cluster analysis and ordination are basic to paleoecology, but the messy nature of fossil occurrence data often makes it difficult to recover clear patterns. A recently described faunal similarity index based on the Forbes coefficient improves results when its complement...

Full description

Bibliographic details
Published in: Paleobiology 2015-06, Vol.41 (3), p.377-386
Main author: Alroy, John
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 386
container_issue 3
container_start_page 377
container_title Paleobiology
container_volume 41
creator Alroy, John
description Multivariate methods such as cluster analysis and ordination are basic to paleoecology, but the messy nature of fossil occurrence data often makes it difficult to recover clear patterns. A recently described faunal similarity index based on the Forbes coefficient improves results when its complement is employed as a distance metric. This index involves adding terms to the Forbes equation and ignoring one of the counts it employs (that of species found in neither of the samples under consideration). Analyses of simulated data matrices demonstrate its advantages. These matrices include large and small samples from two partially overlapping species pools. In a cluster analysis, the widely used Dice coefficient and the Euclidean distance metric both create groupings that reflect sample size, the Simpson index suggests large differences that do not exist, and the corrected Forbes index creates groupings based strictly on true faunal overlap. In a principal coordinates analysis (PCoA) the Forbes index almost removes the sample-size signal but other approaches create a second axis strongly dominated by sample size. Meanwhile, species lists of late Pleistocene mammals from the United States capture biogeographic signals that standard ordination methods do recover, but the adjusted Forbes coefficient spaces the points out more sensibly. Finally, when biome-scale lists for living mammals are added to the data set and extinct species are removed, correspondence analysis misleadingly separates out the biome lists, and PCoA based on the Dice coefficient places them to the edge of the cloud of fossil assemblage data points. PCoA based on the Forbes index places them in more reasonable positions. Thus, only the adjusted Forbes index is able to recover true biological patterns. These results suggest that the index may be useful in analyzing not only paleontological data sets but any data set that includes species lists having highly variable lengths.
doi_str_mv 10.1017/pab.2014.21
format Article
fullrecord <record><control><sourceid>jstor_proqu</sourceid><recordid>TN_cdi_proquest_miscellaneous_1722170689</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><cupid>10_1017_pab_2014_21</cupid><jstor_id>44017937</jstor_id><sourcerecordid>44017937</sourcerecordid><originalsourceid>FETCH-LOGICAL-a447t-15ece77a47e2093b4be8f1ac514cc3bf2c85b933a9d52608395f8735fe0435683</originalsourceid><addsrcrecordid>eNp9kM1rGzEUxEVJoE7SU88FQS8pYR19rrTHENK0JZBLehZv5bdGRmu50jrB_33lOKSlhJ4eAz9m5g0hHzmbc8bN5Qb6uWBczQV_R2a8k7bRUvIjMmOsU42VRr4nJ6WsWNW6NTPy44qWMG4i0ifY0SnRKnJ6RDpu4xQeIQeYkMIa4q5goWmgG4iY0KeYlsFDpAuYgBacyhk5HiAW_PByT8nPrzcP19-au_vb79dXdw0oZaaGa_RoDCiDgnWyVz3agYPXXHkv-0F4q_tOSugWWrTMyk4P1kg9IFNSt1aekvODby36a4tlcmMoHmOENaZtcdwIwQ1rbVfRz_-gq7TN9Zk9xS23tuWiUhcHyudUSsbBbXIYIe8cZ26_q6u7uv2uTvBKfzrQqzKl_IoqVcGuDvzqtsRUfMC1x6eU4-JPdHVqHWNSPT_TvGTD2OewWOJfFd9M_3Lg-5DSGv_b9DdWv56K</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1718188612</pqid></control><display><type>article</type><title>A simple way to improve multivariate analyses of paleoecological data sets</title><source>Jstor Complete Legacy</source><source>Cambridge University Press Journals Complete</source><creator>Alroy, John</creator><creatorcontrib>Alroy, John</creatorcontrib><description>Multivariate methods such as cluster analysis and ordination are basic to paleoecology, but the messy nature of fossil occurrence data often makes it difficult to recover clear patterns. A recently described faunal similarity index based on the Forbes coefficient improves results when its complement is employed as a distance metric. This index involves adding terms to the Forbes equation and ignoring one of the counts it employs (that of species found in neither of the samples under consideration). Analyses of simulated data matrices demonstrate its advantages. 
These matrices include large and small samples from two partially overlapping species pools. In a cluster analysis, the widely used Dice coefficient and the Euclidean distance metric both create groupings that reflect sample size, the Simpson index suggests large differences that do not exist, and the corrected Forbes index creates groupings based strictly on true faunal overlap. In a principal coordinates analysis (PCoA) the Forbes index almost removes the sample-size signal but other approaches create a second axis strongly dominated by sample size. Meanwhile, species lists of late Pleistocene mammals from the United States capture biogeographic signals that standard ordination methods do recover, but the adjusted Forbes coefficient spaces the points out more sensibly. Finally, when biome-scale lists for living mammals are added to the data set and extinct species are removed, correspondence analysis misleadingly separates out the biome lists, and PCoA based on the Dice coefficient places them to the edge of the cloud of fossil assemblage data points. PCoA based on the Forbes index places them in more reasonable positions. Thus, only the adjusted Forbes index is able to recover true biological patterns. 
These results suggest that the index may be useful in analyzing not only paleontological data sets but any data set that includes species lists having highly variable lengths.</description><identifier>ISSN: 0094-8373</identifier><identifier>EISSN: 1938-5331</identifier><identifier>DOI: 10.1017/pab.2014.21</identifier><identifier>CODEN: PALBBM</identifier><language>eng</language><publisher>New York, USA: The Paleontological Society</publisher><subject>Animal ecology ; Biomes ; Cenozoic ; Chordata ; Cluster analysis ; Datasets ; Extinct species ; faunal studies ; FEATURED ARTICLE ; Forbes index ; Fossils ; Mammalia ; Mammals ; Multivariate analysis ; North America ; Ordination ; Paleoecology ; Paleontology ; Pleistocene ; principal coordinates analysis ; Quaternary ; statistical analysis ; Synecology ; Tetrapoda ; United States ; upper Pleistocene ; Vertebrata ; vertebrate</subject><ispartof>Paleobiology, 2015-06, Vol.41 (3), p.377-386</ispartof><rights>2015 The Paleontological Society. All rights reserved.</rights><rights>Copyright © 2015 The Paleontological Society. All rights reserved.</rights><rights>GeoRef, Copyright 2020, American Geosciences Institute. Reference includes data from GeoScienceWorld @Alexandria, VA @USA @United States. 
Abstract, Copyright, The Paleontological Society</rights><rights>Copyright © 2015 The Paleontological Society</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a447t-15ece77a47e2093b4be8f1ac514cc3bf2c85b933a9d52608395f8735fe0435683</citedby><cites>FETCH-LOGICAL-a447t-15ece77a47e2093b4be8f1ac514cc3bf2c85b933a9d52608395f8735fe0435683</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/44017937$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.cambridge.org/core/product/identifier/S0094837314000219/type/journal_article$$EHTML$$P50$$Gcambridge$$H</linktohtml><link.rule.ids>164,314,776,780,799,27901,27902,55603,57992,58225</link.rule.ids></links><search><creatorcontrib>Alroy, John</creatorcontrib><title>A simple way to improve multivariate analyses of paleoecological data sets</title><title>Paleobiology</title><addtitle>Paleobiology</addtitle><description>Multivariate methods such as cluster analysis and ordination are basic to paleoecology, but the messy nature of fossil occurrence data often makes it difficult to recover clear patterns. A recently described faunal similarity index based on the Forbes coefficient improves results when its complement is employed as a distance metric. This index involves adding terms to the Forbes equation and ignoring one of the counts it employs (that of species found in neither of the samples under consideration). Analyses of simulated data matrices demonstrate its advantages. These matrices include large and small samples from two partially overlapping species pools. 
In a cluster analysis, the widely used Dice coefficient and the Euclidean distance metric both create groupings that reflect sample size, the Simpson index suggests large differences that do not exist, and the corrected Forbes index creates groupings based strictly on true faunal overlap. In a principal coordinates analysis (PCoA) the Forbes index almost removes the sample-size signal but other approaches create a second axis strongly dominated by sample size. Meanwhile, species lists of late Pleistocene mammals from the United States capture biogeographic signals that standard ordination methods do recover, but the adjusted Forbes coefficient spaces the points out more sensibly. Finally, when biome-scale lists for living mammals are added to the data set and extinct species are removed, correspondence analysis misleadingly separates out the biome lists, and PCoA based on the Dice coefficient places them to the edge of the cloud of fossil assemblage data points. PCoA based on the Forbes index places them in more reasonable positions. Thus, only the adjusted Forbes index is able to recover true biological patterns. 
These results suggest that the index may be useful in analyzing not only paleontological data sets but any data set that includes species lists having highly variable lengths.</description><subject>Animal ecology</subject><subject>Biomes</subject><subject>Cenozoic</subject><subject>Chordata</subject><subject>Cluster analysis</subject><subject>Datasets</subject><subject>Extinct species</subject><subject>faunal studies</subject><subject>FEATURED ARTICLE</subject><subject>Forbes index</subject><subject>Fossils</subject><subject>Mammalia</subject><subject>Mammals</subject><subject>Multivariate analysis</subject><subject>North America</subject><subject>Ordination</subject><subject>Paleoecology</subject><subject>Paleontology</subject><subject>Pleistocene</subject><subject>principal coordinates analysis</subject><subject>Quaternary</subject><subject>statistical analysis</subject><subject>Synecology</subject><subject>Tetrapoda</subject><subject>United States</subject><subject>upper Pleistocene</subject><subject>Vertebrata</subject><subject>vertebrate</subject><issn>0094-8373</issn><issn>1938-5331</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNp9kM1rGzEUxEVJoE7SU88FQS8pYR19rrTHENK0JZBLehZv5bdGRmu50jrB_33lOKSlhJ4eAz9m5g0hHzmbc8bN5Qb6uWBczQV_R2a8k7bRUvIjMmOsU42VRr4nJ6WsWNW6NTPy44qWMG4i0ifY0SnRKnJ6RDpu4xQeIQeYkMIa4q5goWmgG4iY0KeYlsFDpAuYgBacyhk5HiAW_PByT8nPrzcP19-au_vb79dXdw0oZaaGa_RoDCiDgnWyVz3agYPXXHkv-0F4q_tOSugWWrTMyk4P1kg9IFNSt1aekvODby36a4tlcmMoHmOENaZtcdwIwQ1rbVfRz_-gq7TN9Zk9xS23tuWiUhcHyudUSsbBbXIYIe8cZ26_q6u7uv2uTvBKfzrQqzKl_IoqVcGuDvzqtsRUfMC1x6eU4-JPdHVqHWNSPT_TvGTD2OewWOJfFd9M_3Lg-5DSGv_b9DdWv56K</recordid><startdate>20150601</startdate><enddate>20150601</enddate><creator>Alroy, John</creator><general>The Paleontological Society</general><general>Cambridge University Press</general><general>Paleontological 
Society</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>4T-</scope><scope>4U-</scope><scope>7QL</scope><scope>7SN</scope><scope>7T7</scope><scope>7U9</scope><scope>88A</scope><scope>8AF</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FH</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>BKSAR</scope><scope>C1K</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>F1W</scope><scope>FR3</scope><scope>GNUQQ</scope><scope>H94</scope><scope>H95</scope><scope>HCIFZ</scope><scope>L.G</scope><scope>LK8</scope><scope>M7N</scope><scope>M7P</scope><scope>P64</scope><scope>PCBAR</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>S0X</scope></search><sort><creationdate>20150601</creationdate><title>A simple way to improve multivariate analyses of paleoecological data sets</title><author>Alroy, John</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a447t-15ece77a47e2093b4be8f1ac514cc3bf2c85b933a9d52608395f8735fe0435683</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Animal ecology</topic><topic>Biomes</topic><topic>Cenozoic</topic><topic>Chordata</topic><topic>Cluster analysis</topic><topic>Datasets</topic><topic>Extinct species</topic><topic>faunal studies</topic><topic>FEATURED ARTICLE</topic><topic>Forbes index</topic><topic>Fossils</topic><topic>Mammalia</topic><topic>Mammals</topic><topic>Multivariate analysis</topic><topic>North America</topic><topic>Ordination</topic><topic>Paleoecology</topic><topic>Paleontology</topic><topic>Pleistocene</topic><topic>principal coordinates analysis</topic><topic>Quaternary</topic><topic>statistical analysis</topic><topic>Synecology</topic><topic>Tetrapoda</topic><topic>United States</topic><topic>upper 
Pleistocene</topic><topic>Vertebrata</topic><topic>vertebrate</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Alroy, John</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Docstoc</collection><collection>University Readers</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Ecology Abstracts</collection><collection>Industrial and Applied Microbiology Abstracts (Microbiology A)</collection><collection>Virology and AIDS Abstracts</collection><collection>Biology Database (Alumni Edition)</collection><collection>STEM Database</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Natural Science Collection</collection><collection>Earth, Atmospheric &amp; Aquatic Science Collection</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ASFA: Aquatic Sciences and Fisheries Abstracts</collection><collection>Engineering Research Database</collection><collection>ProQuest Central Student</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) 1: Biological Sciences &amp; Living Resources</collection><collection>SciTech Premium 
Collection</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) Professional</collection><collection>ProQuest Biological Science Collection</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biological Science Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Earth, Atmospheric &amp; Aquatic Science Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>SIRS Editorial</collection><jtitle>Paleobiology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Alroy, John</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A simple way to improve multivariate analyses of paleoecological data sets</atitle><jtitle>Paleobiology</jtitle><addtitle>Paleobiology</addtitle><date>2015-06-01</date><risdate>2015</risdate><volume>41</volume><issue>3</issue><spage>377</spage><epage>386</epage><pages>377-386</pages><issn>0094-8373</issn><eissn>1938-5331</eissn><coden>PALBBM</coden><abstract>Multivariate methods such as cluster analysis and ordination are basic to paleoecology, but the messy nature of fossil occurrence data often makes it difficult to recover clear patterns. A recently described faunal similarity index based on the Forbes coefficient improves results when its complement is employed as a distance metric. This index involves adding terms to the Forbes equation and ignoring one of the counts it employs (that of species found in neither of the samples under consideration). Analyses of simulated data matrices demonstrate its advantages. These matrices include large and small samples from two partially overlapping species pools. 
In a cluster analysis, the widely used Dice coefficient and the Euclidean distance metric both create groupings that reflect sample size, the Simpson index suggests large differences that do not exist, and the corrected Forbes index creates groupings based strictly on true faunal overlap. In a principal coordinates analysis (PCoA) the Forbes index almost removes the sample-size signal but other approaches create a second axis strongly dominated by sample size. Meanwhile, species lists of late Pleistocene mammals from the United States capture biogeographic signals that standard ordination methods do recover, but the adjusted Forbes coefficient spaces the points out more sensibly. Finally, when biome-scale lists for living mammals are added to the data set and extinct species are removed, correspondence analysis misleadingly separates out the biome lists, and PCoA based on the Dice coefficient places them to the edge of the cloud of fossil assemblage data points. PCoA based on the Forbes index places them in more reasonable positions. Thus, only the adjusted Forbes index is able to recover true biological patterns. These results suggest that the index may be useful in analyzing not only paleontological data sets but any data set that includes species lists having highly variable lengths.</abstract><cop>New York, USA</cop><pub>The Paleontological Society</pub><doi>10.1017/pab.2014.21</doi><tpages>10</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0094-8373
ispartof Paleobiology, 2015-06, Vol.41 (3), p.377-386
issn 0094-8373
1938-5331
language eng
recordid cdi_proquest_miscellaneous_1722170689
source Jstor Complete Legacy; Cambridge University Press Journals Complete
subjects Animal ecology
Biomes
Cenozoic
Chordata
Cluster analysis
Datasets
Extinct species
faunal studies
FEATURED ARTICLE
Forbes index
Fossils
Mammalia
Mammals
Multivariate analysis
North America
Ordination
Paleoecology
Paleontology
Pleistocene
principal coordinates analysis
Quaternary
statistical analysis
Synecology
Tetrapoda
United States
upper Pleistocene
Vertebrata
vertebrate
title A simple way to improve multivariate analyses of paleoecological data sets
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T01%3A05%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20simple%20way%20to%20improve%20multivariate%20analyses%20of%20paleoecological%20data%20sets&rft.jtitle=Paleobiology&rft.au=Alroy,%20John&rft.date=2015-06-01&rft.volume=41&rft.issue=3&rft.spage=377&rft.epage=386&rft.pages=377-386&rft.issn=0094-8373&rft.eissn=1938-5331&rft.coden=PALBBM&rft_id=info:doi/10.1017/pab.2014.21&rft_dat=%3Cjstor_proqu%3E44017937%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1718188612&rft_id=info:pmid/&rft_cupid=10_1017_pab_2014_21&rft_jstor_id=44017937&rfr_iscdi=true
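
The description field above contrasts several similarity coefficients computed from presence-absence species lists. As a rough illustration only, the sketch below computes the standard textbook forms of the Dice, Simpson, and unadjusted Forbes coefficients for a small sample nested inside a large one. The adjusted Forbes index that the article recommends is not reproduced here, because this record does not give its equation. The function names, the species pool of 100, and the two sample lists are all hypothetical, and both samples are assumed to be non-empty.

```python
def counts(x, y, pool_size):
    """Return (a, b, c, d) for two species sets drawn from a pool.

    a: species in both samples, b: only in x, c: only in y,
    d: species in the pool found in neither sample.
    """
    a = len(x & y)
    b = len(x - y)
    c = len(y - x)
    d = pool_size - a - b - c
    return a, b, c, d


def dice(a, b, c):
    # Dice coefficient: shared species weighted twice.
    return 2 * a / (2 * a + b + c)


def simpson(a, b, c):
    # Simpson index: shared species relative to the smaller list.
    return a / (a + min(b, c))


def forbes(a, b, c, d):
    # Unadjusted Forbes coefficient: observed overlap relative to the
    # overlap expected by chance; it uses d, the count that the
    # adjusted index described in the abstract ignores.
    n = a + b + c + d
    return a * n / ((a + b) * (a + c))


# Hypothetical data: a large list of 80 species and a small list of 10,
# with the small list entirely contained in the large one.
large = set(range(80))
small = set(range(10))
a, b, c, d = counts(large, small, pool_size=100)

print(round(dice(a, b, c), 3))   # 0.222
print(simpson(a, b, c))          # 1.0
print(forbes(a, b, c, d))        # 1.25
```

In this toy case the Dice value is low even though every species in the small list also occurs in the large one, which illustrates the sample-size sensitivity the abstract attributes to that coefficient.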