Data for: Detection and diversity of Phytophthora species from declining Quercus suber stands using both DNA metabarcoding and soil baiting techniques


Bibliographic details
Main authors: Seddaiu, Salvatore; Riddell, Carolyn; Piras, Giovanni; Ruiu, Pino Angelo; Sarais, Luca; Mello, Antonietta; Brandano, Andrea; Cock, Peter J. A.; Green, Sarah; Scanu, Bruno
Format: Dataset
Language: English
Description
Summary: This dataset on Zenodo accompanies the manuscript Salvatore et al. (2024), Detection and diversity of Phytophthora species from declining Quercus suber stands using both DNA metabarcoding and soil baiting techniques.

There are two files:

    metadata.tsv - plain text table as tab-separated variables
    raw_data.tar.gz - compressed archive of 56 paired raw FASTQ files

This represents a subset of one complete Illumina Nano MiSeq plate run at the James Hutton Institute, which also contained a small number of unrelated samples using the same protocol.

To repeat the analysis described in the paper, first install THAPBI PICT. See https://github.com/peterjc/thapbi-pict/ for instructions. At the time of the paper, v1.0.16 was the current release.

Next, unpack the archive into a folder of paired gzipped FASTQ files. There is no need to decompress the FASTQ files themselves:

    $ tar -zxvf raw_data.tar.gz
    $ ls -1 raw_data/

If you wish, verify the checksums to confirm the data integrity:

    $ cd raw_data/
    $ md5sum -c MD5SUM.txt
    $ cd ..

Set up output directories:

    $ mkdir -p intermediate/ summary/

Run the THAPBI PICT pipeline:

    $ thapbi_pict pipeline -m 1s3g -f 0 -a 15 -i raw_data/ \
        -s intermediate/ -o summary/sardinia_20240912_v1.0.16 \
        -t metadata.tsv -u -x 8 -c 4,5,3,2,7,6

The options here are as follows:

    -m - use the 1s3g classifier (see methods)
    -f - set to zero to disable the fractional abundance threshold
    -a - set a lower absolute abundance threshold
    -i - location of the input raw data
    -s - optional location to store intermediate files
    -o - output stem for reports
    -t - filename for tab-separated-variable metadata
    -u - show unsequenced samples defined in the metadata
    -x - which metadata column contains Illumina FASTQ filename stems
    -c - which metadata columns to include in the report

The -d option is left unset, so the default ITS1 database provided with the tool is used.
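Because -x points the pipeline at a metadata column of FASTQ filename stems, it can be worth confirming before a run that each listed stem has files on disk. The following is a minimal sketch only: check_stems is a hypothetical helper, not part of THAPBI PICT, and it assumes that metadata.tsv has a single header line, that each cell holds one stem without spaces, and that the files are named <stem>*.fastq.gz. None of these assumptions is confirmed by the dataset description.

```shell
#!/bin/sh
# Pre-flight check: does every FASTQ stem listed in a metadata column have
# matching files on disk? Hypothetical helper, not part of THAPBI PICT.
# Assumptions: one header line; one stem per cell, with no spaces; files
# are named <stem>*.fastq.gz (for example <stem>_R1_001.fastq.gz).
check_stems() {
    metadata=$1   # tab-separated metadata table, e.g. metadata.tsv
    column=$2     # 1-based column holding the stems, e.g. 8 (as given to -x)
    fastq_dir=$3  # folder of gzipped FASTQ files, e.g. raw_data
    missing=0
    # Skip the header line, take the chosen column, and drop empty cells
    # (samples listed in the metadata but never sequenced).
    for stem in $(tail -n +2 "$metadata" | cut -f "$column" | grep -v '^$'); do
        # ls exits non-zero when the glob matches no file.
        if ! ls "$fastq_dir/$stem"*.fastq.gz >/dev/null 2>&1; then
            echo "no FASTQ files for stem: $stem"
            missing=$((missing + 1))
        fi
    done
    echo "$missing stem(s) without FASTQ files"
    # Return success only if nothing was missing.
    [ "$missing" -eq 0 ]
}
```

With the layout used above, the call would be check_stems metadata.tsv 8 raw_data. Empty cells are skipped rather than reported, since the -u option implies the metadata may list samples that were never sequenced.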
We are NOT taking advantage of the negative controls to automatically set a blanket minimum abundance, as Control-Plate-3-Mix-3-Dry-P3-c_S56_L001 sadly has over 3000 Phytophthora reads.

The pipeline takes under a minute to run, and classifies most of the samples. Opening the output file summary/sardinia_20240912_v1.0.16.ITS1.samples.1s3g.xlsx in Excel or similar should show you a table resembling Table 3 in the paper, without the baiting results, but with one row per sequencing sample, and additional columns with per-sample per-species read counts etc. The similarly named reads file has one row per unique amplicon sequence variant (ASV), and columns for ea...
DOI:10.5281/zenodo.13753737