ESTprep: preprocessing cDNA sequence reads


Bibliographic details
Published in: Bioinformatics 2003-07, Vol.19 (11), p.1318-1324
Main authors: Scheetz, Todd E., Trivedi, Nishank, Roberts, Chad A., Kucaba, Tamara, Berger, Brian, Robinson, Natalie L., Birkett, Clayton L., Gavin, Allen J., O’Leary, Brian, Braun, Terry A., Bonaldo, Maria F., Robinson, John P., Sheffield, Val C., Soares, Marcelo B., Casavant, Thomas L.
Format: Article
Language: English
Description
Summary:
Motivation: High data accuracy is essential to large-scale gene discovery projects. The data should not only be trustworthy but also correctly annotated for the various features they contain. Sequence errors are inherent in single-pass sequences such as ESTs obtained from automated sequencing. These errors further complicate the automated identification of EST-related sequencing. A tool is required to prepare the data prior to advanced annotation processing and submission to public databases.
Results: This paper describes ESTprep, a program designed to preprocess expressed sequence tag (EST) sequences. It identifies the location of features present in ESTs and allows a sequence to pass only if it meets various quality criteria. Use of ESTprep has resulted in substantial improvement in accurate EST feature identification and in the fidelity of results submitted to GenBank.
Availability: The program is freely available for download from http://genome.uiowa.edu/pubsoft/software.html
Contact: tscheetz@eng.uiowa.edu
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/btg159