MAFin: Motif Detection in Multiple Alignment Files

Motivation: Genome and Proteome Alignments, represented by the Multiple Alignment File (MAF) format, have become a standard approach in the field of comparative genomics and proteomics. However, current approaches lack a direct method for motif detection within MAF files. To address this gap, we pre...

Detailed description

Bibliographic details
Published in: arXiv.org 2024-10
Main authors: Patsakis, Michail; Provatas, Kimonas; Baltoumas, Fotis A; Chantzi, Nikol; Mouratidis, Ioannis; Pavlopoulos, Georgios A; Georgakopoulos-Soares, Ilias
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Patsakis, Michail
Provatas, Kimonas
Baltoumas, Fotis A
Chantzi, Nikol
Mouratidis, Ioannis
Pavlopoulos, Georgios A
Georgakopoulos-Soares, Ilias
description Motivation: Genome and proteome alignments, represented in the Multiple Alignment File (MAF) format, have become a standard approach in comparative genomics and proteomics. However, current approaches lack a direct method for motif detection within MAF files. To address this gap, we present MAFin, a novel tool that enables efficient motif detection and conservation analysis in MAF files, streamlining genomic and proteomic research. Results: We developed MAFin, the first motif detection tool for Multiple Alignment Format files. MAFin enables the multithreaded search of conserved motifs using three approaches: 1) user-specified k-mers, 2) regular expressions, in which case one or more patterns are searched, and 3) predefined Position Weight Matrices. Once a motif has been found, MAFin detects its instances and calculates their conservation across the aligned sequences. MAFin also calculates a conservation percentage, which describes the conservation level of each motif across the aligned sequences, based on the number of matches relative to the length of the motif. A set of statistics enables the interpretation of each motif's conservation level, and the detected motifs are exported in JSON and CSV files for downstream analyses. Availability: MAFin is released as a Python package under the GPL license as a multi-platform application and is available at: https://github.com/Georgakopoulos-Soares-lab/MAFin. Contact: izg5139@psu.edu
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3117170460</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3117170460</sourcerecordid><originalsourceid>FETCH-proquest_journals_31171704603</originalsourceid><addsrcrecordid>eNqNyr0KwjAUQOEgCBbtO1xwLuSnbcStqMGlm3sRuZVbYlKbm_fXwQdwOsN3VqLQxqjqUGu9EWVKk5RSt1Y3jSmE7jtH4Qh9ZBrhjIwPphiAAvTZM80eofP0DC8MDI48pp1Yj3efsPx1K_bucjtdq3mJ74yJhynmJXxpMEpZZWXdSvPf9QFt1jLr</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3117170460</pqid></control><display><type>article</type><title>MAFin: Motif Detection in Multiple Alignment Files</title><source>Free E- Journals</source><creator>Patsakis, Michail ; Provatas, Kimonas ; Baltoumas, Fotis A ; Chantzi, Nikol ; Mouratidis, Ioannis ; Pavlopoulos, Georgios A ; Georgakopoulos-Soares, Ilias</creator><creatorcontrib>Patsakis, Michail ; Provatas, Kimonas ; Baltoumas, Fotis A ; Chantzi, Nikol ; Mouratidis, Ioannis ; Pavlopoulos, Georgios A ; Georgakopoulos-Soares, Ilias</creatorcontrib><description>Motivation: Genome and Proteome Alignments, represented by the Multiple Alignment File (MAF) format, have become a standard approach in the field of comparative genomics and proteomics. However, current approaches lack a direct method for motif detection within MAF files. To address this gap, we present MAFin, a novel tool that enables efficient motif detection and conservation analysis in MAF files, streamlining genomic and proteomic research. Results: We developed MAFin, the first motif detection tool for Multiple Alignment Format files. MAFin enables the multithreaded search of conserved motifs using three approaches: 1) by using user-specified k-mers to search the sequences. 2) with regular expressions, in which case one or more patterns are searched, and 3) with predefined Position Weight Matrices. 
Once the motif has been found, MAFin detects the motif instances and calculates the conservation across the aligned sequences. MAFin also calculates a conservation percentage, which provides information about the conservation levels of each motif across the aligned sequences, based on the number of matches relative to the length of the motif. A set of statistics enable the interpretation of each motif's conservation level, and the detected motifs are exported in JSON and CSV files for downstream analyses. Availability: MAFin is released as a Python package under the GPL license as a multi-platform application and is available at: https://github.com/Georgakopoulos-Soares-lab/MAFin. Contact: izg5139@psu.edu</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Alignment ; Availability ; Conservation ; Format ; Proteomics ; Sequences</subject><ispartof>arXiv.org, 2024-10</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by-sa/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>781,785</link.rule.ids></links><search><creatorcontrib>Patsakis, Michail</creatorcontrib><creatorcontrib>Provatas, Kimonas</creatorcontrib><creatorcontrib>Baltoumas, Fotis A</creatorcontrib><creatorcontrib>Chantzi, Nikol</creatorcontrib><creatorcontrib>Mouratidis, Ioannis</creatorcontrib><creatorcontrib>Pavlopoulos, Georgios A</creatorcontrib><creatorcontrib>Georgakopoulos-Soares, Ilias</creatorcontrib><title>MAFin: Motif Detection in Multiple Alignment Files</title><title>arXiv.org</title><description>Motivation: Genome and Proteome Alignments, represented by the Multiple Alignment File (MAF) format, have become a standard approach in the field of comparative genomics and proteomics. However, current approaches lack a direct method for motif detection within MAF files. To address this gap, we present MAFin, a novel tool that enables efficient motif detection and conservation analysis in MAF files, streamlining genomic and proteomic research. Results: We developed MAFin, the first motif detection tool for Multiple Alignment Format files. MAFin enables the multithreaded search of conserved motifs using three approaches: 1) by using user-specified k-mers to search the sequences. 2) with regular expressions, in which case one or more patterns are searched, and 3) with predefined Position Weight Matrices. Once the motif has been found, MAFin detects the motif instances and calculates the conservation across the aligned sequences. 
MAFin also calculates a conservation percentage, which provides information about the conservation levels of each motif across the aligned sequences, based on the number of matches relative to the length of the motif. A set of statistics enable the interpretation of each motif's conservation level, and the detected motifs are exported in JSON and CSV files for downstream analyses. Availability: MAFin is released as a Python package under the GPL license as a multi-platform application and is available at: https://github.com/Georgakopoulos-Soares-lab/MAFin. Contact: izg5139@psu.edu</description><subject>Alignment</subject><subject>Availability</subject><subject>Conservation</subject><subject>Format</subject><subject>Proteomics</subject><subject>Sequences</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNyr0KwjAUQOEgCBbtO1xwLuSnbcStqMGlm3sRuZVbYlKbm_fXwQdwOsN3VqLQxqjqUGu9EWVKk5RSt1Y3jSmE7jtH4Qh9ZBrhjIwPphiAAvTZM80eofP0DC8MDI48pp1Yj3efsPx1K_bucjtdq3mJ74yJhynmJXxpMEpZZWXdSvPf9QFt1jLr</recordid><startdate>20241014</startdate><enddate>20241014</enddate><creator>Patsakis, Michail</creator><creator>Provatas, Kimonas</creator><creator>Baltoumas, Fotis A</creator><creator>Chantzi, Nikol</creator><creator>Mouratidis, Ioannis</creator><creator>Pavlopoulos, Georgios A</creator><creator>Georgakopoulos-Soares, Ilias</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20241014</creationdate><title>MAFin: Motif Detection in Multiple Alignment Files</title><author>Patsakis, Michail ; Provatas, Kimonas ; Baltoumas, Fotis A ; Chantzi, Nikol ; Mouratidis, Ioannis ; Pavlopoulos, Georgios A ; Georgakopoulos-Soares, Ilias</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_31171704603</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Alignment</topic><topic>Availability</topic><topic>Conservation</topic><topic>Format</topic><topic>Proteomics</topic><topic>Sequences</topic><toplevel>online_resources</toplevel><creatorcontrib>Patsakis, Michail</creatorcontrib><creatorcontrib>Provatas, Kimonas</creatorcontrib><creatorcontrib>Baltoumas, Fotis A</creatorcontrib><creatorcontrib>Chantzi, Nikol</creatorcontrib><creatorcontrib>Mouratidis, Ioannis</creatorcontrib><creatorcontrib>Pavlopoulos, Georgios A</creatorcontrib><creatorcontrib>Georgakopoulos-Soares, Ilias</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central 
Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>ProQuest Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Patsakis, Michail</au><au>Provatas, Kimonas</au><au>Baltoumas, Fotis A</au><au>Chantzi, Nikol</au><au>Mouratidis, Ioannis</au><au>Pavlopoulos, Georgios A</au><au>Georgakopoulos-Soares, Ilias</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>MAFin: Motif Detection in Multiple Alignment Files</atitle><jtitle>arXiv.org</jtitle><date>2024-10-14</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Motivation: Genome and Proteome Alignments, represented by the Multiple Alignment File (MAF) format, have become a standard approach in the field of comparative genomics and proteomics. However, current approaches lack a direct method for motif detection within MAF files. To address this gap, we present MAFin, a novel tool that enables efficient motif detection and conservation analysis in MAF files, streamlining genomic and proteomic research. Results: We developed MAFin, the first motif detection tool for Multiple Alignment Format files. MAFin enables the multithreaded search of conserved motifs using three approaches: 1) by using user-specified k-mers to search the sequences. 2) with regular expressions, in which case one or more patterns are searched, and 3) with predefined Position Weight Matrices. 
Once the motif has been found, MAFin detects the motif instances and calculates the conservation across the aligned sequences. MAFin also calculates a conservation percentage, which provides information about the conservation levels of each motif across the aligned sequences, based on the number of matches relative to the length of the motif. A set of statistics enable the interpretation of each motif's conservation level, and the detected motifs are exported in JSON and CSV files for downstream analyses. Availability: MAFin is released as a Python package under the GPL license as a multi-platform application and is available at: https://github.com/Georgakopoulos-Soares-lab/MAFin. Contact: izg5139@psu.edu</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-10
issn 2331-8422
language eng
recordid cdi_proquest_journals_3117170460
source Free E- Journals
subjects Alignment
Availability
Conservation
Format
Proteomics
Sequences
title MAFin: Motif Detection in Multiple Alignment Files
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T19%3A19%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=MAFin:%20Motif%20Detection%20in%20Multiple%20Alignment%20Files&rft.jtitle=arXiv.org&rft.au=Patsakis,%20Michail&rft.date=2024-10-14&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3117170460%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3117170460&rft_id=info:pmid/&rfr_iscdi=true
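The description above defines the conservation percentage as the number of matches relative to the length of the motif. The following is a minimal sketch of that idea, not MAFin's actual code or API: the function names `find_motif_hits` and `conservation_percent`, the treatment of gaps as mismatches, and the toy alignment rows are all assumptions made for illustration. A k-mer is handled as a literal regular expression; Position Weight Matrix search is not covered.

```python
import re


def find_motif_hits(ref_aligned, pattern):
    """Search the ungapped reference row and return, for each hit,
    the alignment columns it occupies (hypothetical helper)."""
    # Columns of the alignment that hold a residue (not a gap) in the reference.
    cols = [i for i, ch in enumerate(ref_aligned) if ch != "-"]
    ungapped = ref_aligned.replace("-", "")
    hits = []
    # re.finditer reports non-overlapping matches only.
    for m in re.finditer(pattern, ungapped, flags=re.IGNORECASE):
        if m.end() > m.start():  # ignore empty matches
            hits.append(cols[m.start():m.end()])
    return hits


def conservation_percent(ref_aligned, other_aligned, columns):
    """Matches relative to motif length, as a percentage.
    Assumption: a gap in the other row counts as a mismatch."""
    matches = sum(
        1 for c in columns
        if ref_aligned[c].upper() == other_aligned[c].upper()
    )
    return 100.0 * matches / len(columns)


# Toy alignment block: three rows of equal length, '-' marks a gap.
ref = "ACG-TACGT"
others = {"species_b": "ACGATACCT", "species_c": "AC--TTCGT"}

kmer = "GTAC"
for columns in find_motif_hits(ref, re.escape(kmer)):
    for name, row in others.items():
        print(name, columns, conservation_percent(ref, row, columns))
# species_b [2, 4, 5, 6] 100.0
# species_c [2, 4, 5, 6] 50.0
```

Searching the ungapped reference and mapping hits back to alignment columns keeps gap characters from splitting a motif; whether MAFin resolves gaps this way is not stated in the record.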