Comparison of merged and non-merged similarity clustering analysis methods

Distribution data of 4638 species in seven geographic regions of Shanxi Province were examined as a small sample, of 7766 species in 14 geographic regions of Inner Mongolia as a medium sample, and of 16804 genera in 67 ecological regions of China as a large sample. Statistical analyses of the three...

Bibliographic Details
Published in: Sheng tai xue bao 2013-06, Vol.33 (11), p.3480-3487
Main authors: Liu, X, Shen, Q, Zhang, S, Yang, D, Ren, Y
Format: Article
Language: chi ; eng
Subjects:
Online access: Full text
container_end_page 3487
container_issue 11
container_start_page 3480
container_title Sheng tai xue bao
container_volume 33
creator Liu, X
Shen, Q
Zhang, S
Yang, D
Ren, Y
description Distribution data for 4638 species in seven geographic regions of Shanxi Province were examined as a small sample, for 7766 species in 14 geographic regions of Inner Mongolia as a medium sample, and for 16804 genera in 67 ecological regions of China as a large sample. The three data groups were analyzed separately using a traditional merged method (similarity clustering analysis, SCA) and a new non-merged method (multivariate similarity clustering analysis, MSCA). A critical comparison of the two methods demonstrates that the non-merged method can attain a result suitable for both logistics of biological statistics and geography, regardless of the scale of the data. The merged method (SCA) may achieve a result closely resembling that of the non-merged method when dealing with fewer geographic regions. However, as the number of geographic regions increases, the clustering structure produced by the merged method may change at a different level, to the point of a complete loss of functionality. Regardless of the magnitude of the difference between the results of the two kinds of clustering, their nature is totally different. In the non-merged method, the similarity coefficients are inherent, independent of each other, and exist simultaneously; the clustering result reflects the relationship and distance of all involved geographic regions; and all the coefficients are easily calculated in no strict order. In the merged method, however, every coefficient is considered to be founded upon, or to be the result of, clustering. The non-merged coefficient is the basis for the merged coefficient's emergence, which results from the non-merged coefficient's disappearance after merging. All of the calculations depend on input data and the deduced result is strictly in alphabetical order. It should be noted that the newest or final coefficients are worked out or generated, whereas the non-merged coefficients, as well as the involved geographic regions, have to be eliminated or discarded. The newest clustering coefficients are constantly generated and subsequently disappear with the circulation. MSCA, while acknowledging the value and huge contribution of SCA methods, can correct errors or inaccuracies caused by merging or descending order during clustering by the SCA method. It especially avoids some lost branches in the clustering result that are very important to the relationship, and cannot find any similarity level that requires indicatio
doi_str_mv 10.5846/stxb201203090319
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1770317426</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1770317426</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2669-4cb78586659fb310dd595668e68d0447844361471b9f90875a70331ed0ab98c13</originalsourceid><addsrcrecordid>eNqFkDtPwzAUhT2ARCnsjBlZAvfGjh8jqniqEgvMkRM7xSiJi28q0X-PUTuxMF0d3e-c4WPsCuGm1kLe0vzdVoAVcDDA0ZywBQJACYbzM3ZO9An5hdws2MsqjlubAsWpiH0x-rTxrrCTK6Y4lcdIYQxDhuZ90Q07mn0K0yZDdthToFyaP6KjC3ba24H85fEu2fvD_dvqqVy_Pj6v7tZlV0lpStG1Stdaytr0LUdwrja1lNpL7UAIpYXgEoXC1vQGtKqtAs7RO7Ct0R3yJbs-7G5T_Np5mpsxUOeHwU4-7qhBlQuoRCX_R0VlDGQ3VUbhgHYpEiXfN9sURpv2DULza7X5a5X_ADIha80</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1429901002</pqid></control><display><type>article</type><title>Comparison of merged and non-merged similarity clustering analysis methods</title><source>EZB-FREE-00999 freely available EZB journals</source><creator>Liu, X ; Shen, Q ; Zhang, S ; Yang, D ; Ren, Y</creator><creatorcontrib>Liu, X ; Shen, Q ; Zhang, S ; Yang, D ; Ren, Y</creatorcontrib><description>Distribution data of 4638 species in seven geographic regions of Shanxi Province were examined as a small sample, of 7766 species in 14 geographic regions of Inner Mongolia as a medium sample, and of 16804 genera in 67 ecological regions of China as a large sample. Statistical analyses of the three data groups were conducted separately, using a traditional merged method (similarity clustering analysis, SCA) and a new non-merged method (multivariate similarity clustering analysis (MSCA)). A critical comparison of the two methods demonstrates that the non-merged method can attain a result suitable for both logistics of biological statistics and geography, regardless of the scale of the data. 
The merged method (SCA) may achieve a result closely resembling that of the non-merged method when dealing with a fewer number of geographic regions. However, with an increased number of geographic regions, the clustering structure with the merged method may create a change at a different level -- so much as to cause a complete loss of functionality. Regardless of the magnitude of difference between results of the two kinds of clustering, their nature will be totally different. The non-merged method similarity coefficients are inherent, independent of each other and exist simultaneously, the clustering result reflects the relationship and distance of all involved geographic regions, and all the coefficients are easily calculated with no strict orders. In the merged method, however, every coefficient was considered to be founded upon or be the result of clustering. The non-merged coefficient is the basis for the merged coefficient's emergence, which is a result of the non-merged coefficient's disappearance after merging. All of the calculations depend on input data and the deduced result is strictly in alphabetical order. It should be noted that the newest or final coefficients were worked out or generated, whereas the non-merged coefficients as well as the involved geographic regions had to be eliminated or discarded. The newest clustering coefficients were constantly generated, subsequently disappearing with the circulation. MSCA, in agreement with the value and huge contribution by SCA methods, can correct errors or inaccuracy that caused by merging or descending order during clustering by SCA method. It especially avoids some lost branches in the clustering result that are very important to the relationship, and cannot find any similarity level that requires indication in some detail. In summary, the MSCA method can solve many of the problems of the SCA method. The clustering achieves greater accuracy, which makes the results fit ecological reality. 
Also, our modified MSCA method can easily perform macroscopic clustering analysis of ecosystem data, which has never been completely accomplished before.</description><identifier>ISSN: 1000-0933</identifier><identifier>DOI: 10.5846/stxb201203090319</identifier><language>chi ; eng</language><subject>Clustering ; Coefficients ; Ecology ; Merging ; Samples ; Similarity ; Statistical analysis ; Statistical methods</subject><ispartof>Sheng tai xue bao, 2013-06, Vol.33 (11), p.3480-3487</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2669-4cb78586659fb310dd595668e68d0447844361471b9f90875a70331ed0ab98c13</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Liu, X</creatorcontrib><creatorcontrib>Shen, Q</creatorcontrib><creatorcontrib>Zhang, S</creatorcontrib><creatorcontrib>Yang, D</creatorcontrib><creatorcontrib>Ren, Y</creatorcontrib><title>Comparison of merged and non-merged similarity clustering analysis methods</title><title>Sheng tai xue bao</title><description>Distribution data of 4638 species in seven geographic regions of Shanxi Province were examined as a small sample, of 7766 species in 14 geographic regions of Inner Mongolia as a medium sample, and of 16804 genera in 67 ecological regions of China as a large sample. Statistical analyses of the three data groups were conducted separately, using a traditional merged method (similarity clustering analysis, SCA) and a new non-merged method (multivariate similarity clustering analysis (MSCA)). A critical comparison of the two methods demonstrates that the non-merged method can attain a result suitable for both logistics of biological statistics and geography, regardless of the scale of the data. 
The merged method (SCA) may achieve a result closely resembling that of the non-merged method when dealing with a fewer number of geographic regions. However, with an increased number of geographic regions, the clustering structure with the merged method may create a change at a different level -- so much as to cause a complete loss of functionality. Regardless of the magnitude of difference between results of the two kinds of clustering, their nature will be totally different. The non-merged method similarity coefficients are inherent, independent of each other and exist simultaneously, the clustering result reflects the relationship and distance of all involved geographic regions, and all the coefficients are easily calculated with no strict orders. In the merged method, however, every coefficient was considered to be founded upon or be the result of clustering. The non-merged coefficient is the basis for the merged coefficient's emergence, which is a result of the non-merged coefficient's disappearance after merging. All of the calculations depend on input data and the deduced result is strictly in alphabetical order. It should be noted that the newest or final coefficients were worked out or generated, whereas the non-merged coefficients as well as the involved geographic regions had to be eliminated or discarded. The newest clustering coefficients were constantly generated, subsequently disappearing with the circulation. MSCA, in agreement with the value and huge contribution by SCA methods, can correct errors or inaccuracy that caused by merging or descending order during clustering by SCA method. It especially avoids some lost branches in the clustering result that are very important to the relationship, and cannot find any similarity level that requires indication in some detail. In summary, the MSCA method can solve many of the problems of the SCA method. The clustering achieves greater accuracy, which makes the results fit ecological reality. 
Also, our modified MSCA method can easily perform macroscopic clustering analysis of ecosystem data, which has never been completely accomplished before.</description><subject>Clustering</subject><subject>Coefficients</subject><subject>Ecology</subject><subject>Merging</subject><subject>Samples</subject><subject>Similarity</subject><subject>Statistical analysis</subject><subject>Statistical methods</subject><issn>1000-0933</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><recordid>eNqFkDtPwzAUhT2ARCnsjBlZAvfGjh8jqniqEgvMkRM7xSiJi28q0X-PUTuxMF0d3e-c4WPsCuGm1kLe0vzdVoAVcDDA0ZywBQJACYbzM3ZO9An5hdws2MsqjlubAsWpiH0x-rTxrrCTK6Y4lcdIYQxDhuZ90Q07mn0K0yZDdthToFyaP6KjC3ba24H85fEu2fvD_dvqqVy_Pj6v7tZlV0lpStG1Stdaytr0LUdwrja1lNpL7UAIpYXgEoXC1vQGtKqtAs7RO7Ct0R3yJbs-7G5T_Np5mpsxUOeHwU4-7qhBlQuoRCX_R0VlDGQ3VUbhgHYpEiXfN9sURpv2DULza7X5a5X_ADIha80</recordid><startdate>20130601</startdate><enddate>20130601</enddate><creator>Liu, X</creator><creator>Shen, Q</creator><creator>Zhang, S</creator><creator>Yang, D</creator><creator>Ren, Y</creator><scope>AAYXX</scope><scope>CITATION</scope><scope>8FD</scope><scope>FR3</scope><scope>KR7</scope></search><sort><creationdate>20130601</creationdate><title>Comparison of merged and non-merged similarity clustering analysis methods</title><author>Liu, X ; Shen, Q ; Zhang, S ; Yang, D ; Ren, Y</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2669-4cb78586659fb310dd595668e68d0447844361471b9f90875a70331ed0ab98c13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>chi ; eng</language><creationdate>2013</creationdate><topic>Clustering</topic><topic>Coefficients</topic><topic>Ecology</topic><topic>Merging</topic><topic>Samples</topic><topic>Similarity</topic><topic>Statistical analysis</topic><topic>Statistical methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Liu, 
X</creatorcontrib><creatorcontrib>Shen, Q</creatorcontrib><creatorcontrib>Zhang, S</creatorcontrib><creatorcontrib>Yang, D</creatorcontrib><creatorcontrib>Ren, Y</creatorcontrib><collection>CrossRef</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Civil Engineering Abstracts</collection><jtitle>Sheng tai xue bao</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Liu, X</au><au>Shen, Q</au><au>Zhang, S</au><au>Yang, D</au><au>Ren, Y</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Comparison of merged and non-merged similarity clustering analysis methods</atitle><jtitle>Sheng tai xue bao</jtitle><date>2013-06-01</date><risdate>2013</risdate><volume>33</volume><issue>11</issue><spage>3480</spage><epage>3487</epage><pages>3480-3487</pages><issn>1000-0933</issn><abstract>Distribution data of 4638 species in seven geographic regions of Shanxi Province were examined as a small sample, of 7766 species in 14 geographic regions of Inner Mongolia as a medium sample, and of 16804 genera in 67 ecological regions of China as a large sample. Statistical analyses of the three data groups were conducted separately, using a traditional merged method (similarity clustering analysis, SCA) and a new non-merged method (multivariate similarity clustering analysis (MSCA)). A critical comparison of the two methods demonstrates that the non-merged method can attain a result suitable for both logistics of biological statistics and geography, regardless of the scale of the data. The merged method (SCA) may achieve a result closely resembling that of the non-merged method when dealing with a fewer number of geographic regions. 
However, with an increased number of geographic regions, the clustering structure with the merged method may create a change at a different level -- so much as to cause a complete loss of functionality. Regardless of the magnitude of difference between results of the two kinds of clustering, their nature will be totally different. The non-merged method similarity coefficients are inherent, independent of each other and exist simultaneously, the clustering result reflects the relationship and distance of all involved geographic regions, and all the coefficients are easily calculated with no strict orders. In the merged method, however, every coefficient was considered to be founded upon or be the result of clustering. The non-merged coefficient is the basis for the merged coefficient's emergence, which is a result of the non-merged coefficient's disappearance after merging. All of the calculations depend on input data and the deduced result is strictly in alphabetical order. It should be noted that the newest or final coefficients were worked out or generated, whereas the non-merged coefficients as well as the involved geographic regions had to be eliminated or discarded. The newest clustering coefficients were constantly generated, subsequently disappearing with the circulation. MSCA, in agreement with the value and huge contribution by SCA methods, can correct errors or inaccuracy that caused by merging or descending order during clustering by SCA method. It especially avoids some lost branches in the clustering result that are very important to the relationship, and cannot find any similarity level that requires indication in some detail. In summary, the MSCA method can solve many of the problems of the SCA method. The clustering achieves greater accuracy, which makes the results fit ecological reality. 
Also, our modified MSCA method can easily perform macroscopic clustering analysis of ecosystem data, which has never been completely accomplished before.</abstract><doi>10.5846/stxb201203090319</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1000-0933
ispartof Sheng tai xue bao, 2013-06, Vol.33 (11), p.3480-3487
issn 1000-0933
language chi ; eng
recordid cdi_proquest_miscellaneous_1770317426
source EZB-FREE-00999 freely available EZB journals
subjects Clustering
Coefficients
Ecology
Merging
Samples
Similarity
Statistical analysis
Statistical methods
title Comparison of merged and non-merged similarity clustering analysis methods
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-15T09%3A04%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Comparison%20of%20merged%20and%20non-merged%20similarity%20clustering%20analysis%20methods&rft.jtitle=Sheng%20tai%20xue%20bao&rft.au=Liu,%20X&rft.date=2013-06-01&rft.volume=33&rft.issue=11&rft.spage=3480&rft.epage=3487&rft.pages=3480-3487&rft.issn=1000-0933&rft_id=info:doi/10.5846/stxb201203090319&rft_dat=%3Cproquest_cross%3E1770317426%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1429901002&rft_id=info:pmid/&rfr_iscdi=true