Analysis of variance for genomic sequences in unbalanced designs
Published in: | Brazilian journal of probability and statistics 2007-12, Vol.21 (2), p.203-223 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Online access: | Full text |
Abstract: | In the study of genetic divergence among organisms, generally the analysis is done directly from the DNA molecule. Therefore, a possible outcome is categorical, being one out of four categories (looking at the nucleotide level). Light and Margolin (1971) developed an analysis of variance for categorical data (CATANOVA) and Pinheiro et al. (2000) employed a similar measure of variation and extended the CATANOVA procedure taking into account several positions in the DNA sequence for balanced designs. Here we consider a methodology for multiple category data with a different number of sample units (i.e., sequences) in each group, that is, the sampling design is unbalanced. In order to test the null hypothesis of homogeneity among groups, the asymptotic distribution of the test statistic is derived and its power is evaluated. An application to real data is illustrated using resampling methods to generate the empirical distribution of the test statistic and a simulation study is performed to evaluate the asymptotic behavior of the distribution of the test statistic. |
---|---|
ISSN: | 0103-0752 2317-6199 |
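
As a rough illustration of the kind of computation the abstract refers to, the Python sketch below implements the single-site CATANOVA decomposition of Light and Margolin (1971) for groups of unequal size, together with a permutation test of homogeneity. It is not the multi-position statistic derived in the article. The function names (`gini_variation`, `catanova`, `permutation_pvalue`), the default of four categories, and the toy data are hypothetical, and the permutation scheme is one common resampling choice assumed here, not necessarily the one the authors used.

```python
import random
from collections import Counter


def gini_variation(labels):
    """Light-Margolin variation of one sample: n/2 - (sum of squared counts)/(2n)."""
    n = len(labels)  # assumes a non-empty sample
    counts = Counter(labels)
    return n / 2 - sum(c * c for c in counts.values()) / (2 * n)


def catanova(groups, n_categories=4):
    """Decompose total variation into within- and between-group parts.

    groups: list of non-empty lists of category labels (e.g. nucleotides at
    one site); group sizes may differ (unbalanced design).
    n_categories: number of possible categories I (assumed 4 for nucleotides).
    """
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    tss = gini_variation(pooled)                   # total variation
    wss = sum(gini_variation(g) for g in groups)   # within-group variation
    bss = tss - wss                                # between-group variation
    r2 = bss / tss if tss > 0 else 0.0
    # Light-Margolin statistic; its asymptotic null reference is chi-square
    # with (I - 1)(G - 1) degrees of freedom, G being the number of groups.
    c = (n - 1) * (n_categories - 1) * r2
    return {"tss": tss, "wss": wss, "bss": bss, "r2": r2, "c": c}


def permutation_pvalue(groups, n_perm=2000, seed=0):
    """Permutation p-value for BSS: reshuffle labels, keep group sizes fixed."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [x for g in groups for x in g]
    observed = catanova(groups)["bss"]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        # Small tolerance guards against floating-point ties.
        if catanova(shuffled)["bss"] >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (made up): two groups of unequal size at a single site.
separated = [list("AAAA"), list("CCCCCC")]
result = catanova(separated)
p_value = permutation_pvalue(separated)
```

Because the permutation step only reassigns labels while holding the group sizes fixed, it applies to unbalanced designs without modification; the chi-square reference for `c` is an asymptotic approximation and may be poor for samples as small as the toy data.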