IN SILICO SCREENING FOR PHENOTYPE-ASSOCIATED EXPRESSED SEQUENCES

Bibliographic details
Main authors: KOZLOV, ANDREY PETROVICH; LOBASHEV, ANDREY VLADIMIROVICH; BARANOVA, ANNA VJACHESLAVOVNA; YANKOVSKY, NIKOLAY KAZIMIROVICH; KRUKOVSKAYA, LARISA LEONIDOVNA
Format: Patent
Language: English; French
Description
Summary: The present invention provides methods for determining whether a nucleic acid sequence is a marker for a phenotype or cell type of interest, which comprise: (a) providing a database of expressed sequence tag sequences (EST's) from the species; (b) placing said EST's in groups termed clusters based on homology of EST's within each cluster; (c) determining for each cluster the total number of EST's within said cluster; (d) ordering said clusters sequentially based on the number of EST's in each cluster; (e) dividing said ordered clusters into subranges based on the number of EST's per cluster; (f) determining for each cluster subrange obtained from step (e) the number of EST's within said cluster which are expressed in said predetermined cell type of interest; (g) calculating according to a normal distribution the number of clusters in each subrange expected to contain a predetermined threshold percentage of EST's expressed in said cell type of interest, wherein said threshold percentage is a percentage from about 10% to about 100%; (h) determining the number of clusters in each subrange observed to contain said predetermined threshold percentage of EST's expressed in said predetermined cell type; and (i) identifying subranges having an observed number of clusters that meet said predetermined threshold percentage greater than the number of clusters expected to meet said predetermined threshold percentage for the subrange according to normal distribution; wherein if the percentage of EST's expressed in said cell type of interest in a cluster identified is equal to or greater than said predetermined threshold percentage, the cluster contains a nucleic acid that is a marker for the cell type of interest.
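The screening procedure in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the summary does not say how the normal distribution is parameterized, so the sketch assumes a normal approximation to the binomial, with the background rate taken as the overall fraction of ESTs from the cell type of interest. The names `screen_clusters`, `normal_sf`, and `lower_edges`, the subrange boundaries, and the toy data are all hypothetical, and clustering by homology is assumed to have been done already.

```python
import math
from collections import defaultdict


def normal_sf(x, mean, sd):
    """Upper-tail probability P(X >= x) for X ~ Normal(mean, sd)."""
    if sd == 0:
        return 1.0 if mean >= x else 0.0
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))


def screen_clusters(clusters, threshold=0.9, lower_edges=(1, 5, 10, 50, 100)):
    """clusters maps a cluster id to a list of booleans, one per EST
    (True = EST comes from the cell type of interest).
    lower_edges are hypothetical subrange boundaries; the smallest edge
    must not exceed the smallest cluster size."""
    # Count ESTs per cluster and order clusters by size (empty clusters skipped).
    sized = sorted(
        ((cid, len(flags), sum(flags)) for cid, flags in clusters.items() if flags),
        key=lambda rec: rec[1],
    )
    total = sum(n for _, n, _ in sized)
    # ASSUMPTION: background rate = overall fraction of ESTs from the cell type.
    p = sum(k for _, _, k in sized) / total

    # Divide the ordered clusters into subranges by cluster size.
    groups = defaultdict(list)
    for cid, n, k in sized:
        edge = max(e for e in lower_edges if e <= n)
        groups[edge].append((cid, n, k))

    subranges, markers = {}, []
    for edge, members in sorted(groups.items()):
        # Observed: clusters whose fraction of cell-type ESTs meets the threshold.
        hits = [cid for cid, n, k in members if k / n >= threshold]
        # Expected (ASSUMPTION): sum of normal-approximation tail probabilities.
        expected = sum(
            normal_sf(threshold * n, n * p, math.sqrt(n * p * (1 - p)))
            for _, n, _ in members
        )
        enriched = len(hits) > expected
        subranges[edge] = {
            "observed": len(hits),
            "expected": expected,
            "enriched": enriched,
        }
        # Clusters meeting the threshold in an enriched subrange are candidates.
        if enriched:
            markers.extend(hits)
    return {"background_rate": p, "subranges": subranges, "markers": markers}


# Toy data: four clusters of 10 ESTs and two small clusters.
demo = {
    "c1": [True] * 10,
    "c2": [False] * 10,
    "c3": [False] * 10,
    "c4": [True] + [False] * 9,
    "c5": [False] * 2,
    "c6": [False] * 3,
}
result = screen_clusters(demo, threshold=0.9, lower_edges=(1, 5))
print(result["markers"])
```

The expected count is computed per cluster and summed over each subrange, so clusters of different sizes within one subrange contribute their own tail probabilities. A real analysis would likely need a continuity correction or an exact binomial test for small clusters, where the normal approximation is poor.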
The invention relates to methods for determining whether a nucleic acid sequence is a marker for a phenotype or cell type of interest, which comprise providing a database of expressed sequence tag sequences (EST's) from the species, placing these EST's in groups called clusters based on homology of the EST's within each cluster, determining for each cluster the total number of EST's in that cluster, ordering these clusters sequentially according to the number of EST's in each cluster, dividing the ordered clusters into subranges according to the number of EST's per cluster, determining for each cluster subrange obtained in step (e) the number of EST's in the cluster that are expressed in the predetermined cell type of interest,