Absence/presence calling in microarray-based CGH experiments with non-model organisms

Bibliographic details
Published in: Nucleic Acids Research 2014-06, Vol. 42 (11), p. e94
Main authors: Jonker, Martijs J, de Leeuw, Wim C, Marinković, Marino, Wittink, Floyd R A, Rauwerda, Han, Bruning, Oskar, Ensink, Wim A, Fluit, Ad C, Boel, C H, Jong, Mark de, Breit, Timo M
Format: Article
Language: English
Online access: Full text
Description
Summary: Structural variations in genomes are commonly studied by (micro)array-based comparative genomic hybridization (CGH). The data analysis methods to infer copy number variation in model organisms (human, mouse) are established. In principle, the procedures are based on signal ratios between test and reference samples and on the order of the probe targets in the genome. These procedures are less applicable to experiments with non-model organisms, which frequently comprise non-sequenced genomes with an unknown order of probe targets. We therefore present an additional analysis approach, which does not depend on the structural information of a reference genome and quantifies the presence or absence of a probe target in an unknown genome. The principle is that intensity values of target probes are compared with the intensities of negative-control probes and positive-control probes from a control hybridization to determine whether a probe target is absent or present. In a test analyzing the genome content of a known bacterial strain, Staphylococcus aureus MRSA252, this approach proved successful, as demonstrated by receiver operating characteristic (ROC) area under the curve (AUC) values larger than 0.9995. We show its usability in various applications, such as comparing genome content and validating next-generation sequencing reads from eukaryotic non-model organisms.
ISSN: 0305-1048, 1362-4962
DOI: 10.1093/nar/gku343
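
The principle stated in the summary (comparing each target probe's intensity with negative-control and positive-control intensities) can be illustrated with a small Python sketch. This is not the authors' published procedure, whose details are not given in this record. It assumes, purely for illustration, that log2 intensities of both control sets are roughly normal with a shared variance and that absence and presence are equally likely a priori, which yields a logistic presence score. The function names `fit_controls`, `presence_score`, and `roc_auc`, and all intensity values, are hypothetical.

```python
import math
from statistics import mean, variance


def fit_controls(negative, positive):
    """Summarize control intensities on the log2 scale.

    Assumption (not from the source): both control sets are roughly
    normal in log2 space and share one pooled variance.
    Each list needs at least two values.
    """
    neg = [math.log2(v) for v in negative]
    pos = [math.log2(v) for v in positive]
    n_neg, n_pos = len(neg), len(pos)
    pooled_var = (
        (n_neg - 1) * variance(neg) + (n_pos - 1) * variance(pos)
    ) / (n_neg + n_pos - 2)
    return mean(neg), mean(pos), pooled_var


def presence_score(intensity, mu_neg, mu_pos, pooled_var):
    """Score in [0, 1]; values near 1 suggest the probe target is present.

    With equal priors and a shared variance, the posterior probability
    of "present" is a logistic function of the log2 intensity.
    """
    x = math.log2(intensity)
    z = (mu_pos - mu_neg) * (x - (mu_pos + mu_neg) / 2.0) / pooled_var
    z = max(-50.0, min(50.0, z))  # clip to keep math.exp in range
    return 1.0 / (1.0 + math.exp(-z))


def roc_auc(scores_present, scores_absent):
    """ROC AUC as the fraction of (present, absent) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_present:
        for a in scores_absent:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_present) * len(scores_absent))


if __name__ == "__main__":
    # Made-up control intensities from a hypothetical control hybridization.
    negative_controls = [100, 120, 90, 110]
    positive_controls = [4000, 5200, 4500, 6000]
    params = fit_controls(negative_controls, positive_controls)
    for target in (105, 700, 5000):
        print(target, presence_score(target, *params))
```

For a strain with known genome content, as in the MRSA252 test mentioned in the summary, scores of probes known to be present and known to be absent could be passed to `roc_auc` to check how well the score separates the two groups.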