In silico screening for phenotype-associated expressed sequences
Main Authors: | , , , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | The present invention provides methods for determining whether a nucleic acid sequence is a marker for a phenotype or cell type of interest, comprising: (a) providing a database of expressed sequence tags (ESTs) from a species; (b) placing said ESTs in groups, termed clusters, based on homology of the ESTs within each cluster; (c) determining for each cluster the total number of ESTs within said cluster; (d) ordering said clusters sequentially based on the number of ESTs in each cluster; (e) dividing said ordered clusters into subranges based on the number of ESTs per cluster; (f) determining, for each cluster in each subrange obtained from step (e), the number of ESTs within said cluster that are expressed in said cell type of interest; (g) calculating, according to a normal distribution, the number of clusters in each subrange expected to contain a predetermined threshold percentage of ESTs expressed in said cell type of interest, wherein said threshold percentage is from about 10% to about 100%; (h) determining the number of clusters in each subrange observed to contain said predetermined threshold percentage of ESTs expressed in said cell type of interest; and (i) identifying subranges in which the observed number of clusters meeting said predetermined threshold percentage is greater than the number of clusters expected to meet it according to the normal distribution; wherein, if the percentage of ESTs expressed in said cell type of interest in an identified cluster is equal to or greater than said predetermined threshold percentage, the cluster contains a nucleic acid that is a marker for the cell type of interest. |
---|
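
The screening procedure summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how the normally distributed expectation is computed, so this sketch assumes a normal approximation to a binomial model in which every EST has the same background probability of coming from the cell type of interest. The function name `find_marker_clusters`, the `bin_edges` subrange boundaries, and the toy data are all hypothetical.

```python
from collections import defaultdict
from statistics import NormalDist


def find_marker_clusters(clusters, cell_type, threshold=0.5,
                         bin_edges=(1, 5, 20, 100)):
    """clusters maps a cluster id to a list of cell-type labels, one per EST.

    bin_edges are hypothetical lower bounds of the cluster-size subranges.
    """
    # Drop empty clusters, then order the rest by number of ESTs.
    ordered = sorted(((cid, labels) for cid, labels in clusters.items() if labels),
                     key=lambda item: len(item[1]))
    total = sum(len(labels) for _, labels in ordered)
    if total == 0:
        return []

    # Assumed background probability that an EST comes from the cell type.
    p = sum(labels.count(cell_type) for _, labels in ordered) / total

    def subrange(size):
        # Index of the largest bin edge that does not exceed the cluster size.
        return max(i for i, edge in enumerate(bin_edges) if size >= edge)

    def prob_meets_threshold(n):
        # Normal approximation to Binomial(n, p); an assumption of this sketch.
        mean, sigma = n * p, (n * p * (1 - p)) ** 0.5
        cutoff = threshold * n
        if sigma == 0:
            return 1.0 if mean >= cutoff else 0.0
        return 1.0 - NormalDist(mean, sigma).cdf(cutoff)

    expected = defaultdict(float)  # expected qualifying clusters per subrange
    observed = defaultdict(list)   # clusters actually meeting the threshold
    for cid, labels in ordered:
        n = len(labels)
        k = subrange(n)
        expected[k] += prob_meets_threshold(n)
        if labels.count(cell_type) / n >= threshold:
            observed[k].append(cid)

    # Keep qualifying clusters only from subranges where observed > expected.
    markers = []
    for k, cids in observed.items():
        if len(cids) > expected[k]:
            markers.extend(cids)
    return markers


# Toy data (invented for illustration): labels are the source cell types.
toy = {
    "c1": ["liver"] * 8 + ["brain"] * 2,
    "c2": ["brain"] * 10,
    "c3": ["brain"] * 9 + ["liver"],
    "c4": ["brain"] * 3,
    "c5": ["liver", "brain", "brain"],
}
print(find_marker_clusters(toy, "liver", threshold=0.5))
```

In this toy run only cluster `c1` is both at or above the 50% threshold and located in a size subrange where more clusters qualify than the assumed background model predicts. A real screen would derive the clusters and cell-type labels from an EST database and would need subrange boundaries suited to its cluster-size distribution.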