Construction of a de novo assembly pipeline using multiple transcriptome data sets from Cypripedium macranthos (Orchidaceae)

Bibliographic details
Published in: PloS one 2023-06, Vol.18 (6), p.e0286804-e0286804
Main authors: Kambara, Kota; Fujino, Kaien; Shimura, Hanako
Format: Article
Language: English
Description
Abstract: The family Orchidaceae contains more species than any other monocotyledonous family and has distinctive characteristics, such as seed germination induced by mycorrhizal fungi and flower morphology co-adapted with pollinators. Genomes have been decoded for only a few horticultural orchid species, and little genetic information is available. For species lacking a sequenced genome, gene sequences are generally predicted by de novo assembly of transcriptome data. Here, we devised a de novo assembly pipeline for transcriptome data from the wild orchid Cypripedium (lady slipper orchid) in Japan that mixes multiple data sets and integrates assemblies to create a more complete and less redundant contig set. Among the assemblies generated by combining various assemblers, the combination of Trinity and IDBA-Tran yielded a good assembly, with higher mapping rates and higher percentages of contigs with BLAST hits and of complete BUSCOs. Using this contig set as a reference, we analyzed differential gene expression between protocorms grown aseptically and protocorms grown with mycorrhizal fungi to detect genes whose expression is required for the mycorrhizal interaction. The pipeline proposed in this study can construct a highly reliable contig set with little redundancy even when multiple transcriptome data sets are mixed, and can provide a reference suitable for differentially expressed gene (DEG) analysis and other downstream RNA-seq analyses.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0286804
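
The abstract describes integrating assemblies from several assemblers into a contig set with less redundancy. The following is only a toy sketch of that general idea, not the authors' pipeline: it pools contigs from two hypothetical assemblies and drops any contig whose sequence, or reverse complement, is fully contained in a longer contig that has already been kept. The function names, the `asmN|` name prefix, and the example sequences are all assumptions made for illustration, and the quadratic substring search is suitable only for tiny inputs; real transcriptome data would call for dedicated clustering tools.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def merge_nonredundant(*assemblies: dict[str, str]) -> dict[str, str]:
    """Pool contigs from several assemblies and remove contained duplicates.

    Hypothetical helper: each assembly is a mapping of contig name to
    sequence. A contig is discarded when it, or its reverse complement,
    is a substring of a longer contig that was already kept.
    """
    pooled = []
    for idx, assembly in enumerate(assemblies):
        for name, seq in assembly.items():
            # Prefix names so contigs from different assemblies cannot collide.
            pooled.append((f"asm{idx}|{name}", seq.upper()))

    # Longest first, so shorter contained contigs are the ones discarded;
    # ties are broken by name to keep the result deterministic.
    pooled.sort(key=lambda item: (-len(item[1]), item[0]))

    kept: dict[str, str] = {}
    for name, seq in pooled:
        rc = reverse_complement(seq)
        if any(seq in other or rc in other for other in kept.values()):
            continue
        kept[name] = seq
    return kept


# Made-up example data standing in for the output of two assemblers.
assembly_a = {"c1": "ATGGCGTACGTTAGC", "c2": "GGGTTTAAACCC"}
assembly_b = {
    "k1": "GCGTACGTT",         # contained in c1
    "k2": "TTTTCCCCGGGGAAAA",  # unique
    "k3": "GCTAACGTACGCCAT",   # reverse complement of c1
}

merged = merge_nonredundant(assembly_a, assembly_b)
for contig_name in sorted(merged):
    print(contig_name, len(merged[contig_name]))
```

In this example, `k1` and `k3` are dropped because `c1` already represents them, while `c2` and `k2` are retained. Quality measures such as mapping rate, BLAST hit percentage, and BUSCO completeness, which the abstract uses to compare assemblies, would be computed on the merged set with the respective external tools.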