Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC

Abstract. Background: High-quality clinical data and biological specimens are key for medical research and personalized medicine. The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological res...

Detailed description

Bibliographic details
Published in: Applied clinical informatics 2019-08, Vol.10 (4), p.679-692
Main authors: Mate, Sebastian, Kampf, Marvin, Rödle, Wolfgang, Kraus, Stefan, Proynova, Rumyana, Silander, Kaisa, Ebert, Lars, Lablans, Martin, Schüttler, Christina, Knell, Christian, Eklund, Niina, Hummel, Michael, Holub, Petr, Prokosch, Hans-Ulrich
Format: Article
Language: English
Online access: Full text
container_end_page 692
container_issue 4
container_start_page 679
container_title Applied clinical informatics
container_volume 10
creator Mate, Sebastian
Kampf, Marvin
Rödle, Wolfgang
Kraus, Stefan
Proynova, Rumyana
Silander, Kaisa
Ebert, Lars
Lablans, Martin
Schüttler, Christina
Knell, Christian
Eklund, Niina
Hummel, Michael
Holub, Petr
Prokosch, Hans-Ulrich
description Abstract. Background: High-quality clinical data and biological specimens are key for medical research and personalized medicine. The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological resources. The accompanying ADOPT BBMRI-ERIC project kick-started BBMRI-ERIC by collecting colorectal cancer data from European biobanks. Objectives: To transform these data into a common representation, a uniform approach for data integration and harmonization had to be developed. This article describes the design and the implementation of a toolset for this task. Methods: Based on the semantics of a metadata repository, we developed a lexical bag-of-words matcher, capable of semiautomatically mapping local biobank terms to the central ADOPT BBMRI-ERIC terminology. Its algorithm supports fuzzy matching, utilization of synonyms, and sentiment tagging. To process the anonymized instance data based on these mappings, we also developed a data transformation application. Results: The implementation was used to process the data from 10 European biobanks. The lexical matcher automatically and correctly mapped 78.48% of the 1,492 local biobank terms, and human experts were able to complete the remaining mappings. We used the expert-curated mappings to successfully process 147,608 data records from 3,415 patients. Conclusion: A generic harmonization approach was created and successfully used for cross-institutional data harmonization across 10 European biobanks. The software tools were made available as open source.
doi_str_mv 10.1055/s-0039-1695793
format Article
fullrecord <record><control><sourceid>pubmed_cross</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6739205</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>31509880</sourcerecordid><originalsourceid>FETCH-LOGICAL-c395t-bdedf79ab9798d06b07e0d352a698076b9a81112647671a62a4e022833e7cb9c3</originalsourceid><addsrcrecordid>eNp1kE1PAjEQhhujEYJcPZr9A8V2S78uRr4UEgyE4Lnp7nalCC1pFxP99S4BiR6cy0wy7zyTPADcYtTBiNL7CBEiEmImKZfkAjSxYBIikvLLX3MDtGNco7oow0Lwa9AgmCIpBGqCx7l2cLQPfme0S4a60slYh6139ktX1ruk9CHpW59p9x4T65LecDZfJv3-y2ICR4vJ4AZclXoTTfvUW-D1abQcjOF09jwZ9KYwJ5JWMCtMUXKpM8mlKBDLEDeoIDTVTArEWSa1wBinrMsZx5qlumtQmgpCDM8zmZMWeDhyd_tsa4rcuCrojdoFu9XhU3lt1d-Nsyv15j8U40SmiNaAzhGQBx9jMOX5FiN10KmiOuhUJ531wd3vj-f4j7w6AI-BamXN1qi13wdXO_gP-A0j_Xx3</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC</title><source>MEDLINE</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><creator>Mate, Sebastian ; Kampf, Marvin ; Rödle, Wolfgang ; Kraus, Stefan ; Proynova, Rumyana ; Silander, Kaisa ; Ebert, Lars ; Lablans, Martin ; Schüttler, Christina ; Knell, Christian ; Eklund, Niina ; Hummel, Michael ; Holub, Petr ; Prokosch, Hans-Ulrich</creator><creatorcontrib>Mate, Sebastian ; Kampf, Marvin ; Rödle, Wolfgang ; Kraus, Stefan ; Proynova, Rumyana ; Silander, Kaisa ; Ebert, Lars ; Lablans, Martin ; Schüttler, Christina ; Knell, Christian ; Eklund, Niina ; Hummel, Michael ; Holub, Petr ; Prokosch, Hans-Ulrich</creatorcontrib><description>Abstract Background  High-quality clinical data and biological specimens are key for medical research and personalized medicine. 
The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological resources. The accompanying ADOPT BBMRI-ERIC project kick-started BBMRI-ERIC by collecting colorectal cancer data from European biobanks. Objectives  To transform these data into a common representation, a uniform approach for data integration and harmonization had to be developed. This article describes the design and the implementation of a toolset for this task. Methods  Based on the semantics of a metadata repository, we developed a lexical bag-of-words matcher, capable of semiautomatically mapping local biobank terms to the central ADOPT BBMRI-ERIC terminology. Its algorithm supports fuzzy matching, utilization of synonyms, and sentiment tagging. To process the anonymized instance data based on these mappings, we also developed a data transformation application. Results  The implementation was used to process the data from 10 European biobanks. The lexical matcher automatically and correctly mapped 78.48% of the 1,492 local biobank terms, and human experts were able to complete the remaining mappings. We used the expert-curated mappings to successfully process 147,608 data records from 3,415 patients. Conclusion  A generic harmonization approach was created and successfully used for cross-institutional data harmonization across 10 European biobanks. 
The software tools were made available as open source.</description><identifier>ISSN: 1869-0327</identifier><identifier>EISSN: 1869-0327</identifier><identifier>DOI: 10.1055/s-0039-1695793</identifier><identifier>PMID: 31509880</identifier><language>eng</language><publisher>Stuttgart · New York: Georg Thieme Verlag KG</publisher><subject>Biological Specimen Banks - standards ; Colorectal Neoplasms ; Europe ; Humans ; Reference Standards ; Research Article</subject><ispartof>Applied clinical informatics, 2019-08, Vol.10 (4), p.679-692</ispartof><rights>Georg Thieme Verlag KG Stuttgart · New York.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c395t-bdedf79ab9798d06b07e0d352a698076b9a81112647671a62a4e022833e7cb9c3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6739205/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6739205/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31509880$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Mate, Sebastian</creatorcontrib><creatorcontrib>Kampf, Marvin</creatorcontrib><creatorcontrib>Rödle, Wolfgang</creatorcontrib><creatorcontrib>Kraus, Stefan</creatorcontrib><creatorcontrib>Proynova, Rumyana</creatorcontrib><creatorcontrib>Silander, Kaisa</creatorcontrib><creatorcontrib>Ebert, Lars</creatorcontrib><creatorcontrib>Lablans, Martin</creatorcontrib><creatorcontrib>Schüttler, Christina</creatorcontrib><creatorcontrib>Knell, Christian</creatorcontrib><creatorcontrib>Eklund, 
Niina</creatorcontrib><creatorcontrib>Hummel, Michael</creatorcontrib><creatorcontrib>Holub, Petr</creatorcontrib><creatorcontrib>Prokosch, Hans-Ulrich</creatorcontrib><title>Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC</title><title>Applied clinical informatics</title><addtitle>Appl Clin Inform</addtitle><description>Abstract Background  High-quality clinical data and biological specimens are key for medical research and personalized medicine. The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological resources. The accompanying ADOPT BBMRI-ERIC project kick-started BBMRI-ERIC by collecting colorectal cancer data from European biobanks. Objectives  To transform these data into a common representation, a uniform approach for data integration and harmonization had to be developed. This article describes the design and the implementation of a toolset for this task. Methods  Based on the semantics of a metadata repository, we developed a lexical bag-of-words matcher, capable of semiautomatically mapping local biobank terms to the central ADOPT BBMRI-ERIC terminology. Its algorithm supports fuzzy matching, utilization of synonyms, and sentiment tagging. To process the anonymized instance data based on these mappings, we also developed a data transformation application. Results  The implementation was used to process the data from 10 European biobanks. The lexical matcher automatically and correctly mapped 78.48% of the 1,492 local biobank terms, and human experts were able to complete the remaining mappings. We used the expert-curated mappings to successfully process 147,608 data records from 3,415 patients. Conclusion  A generic harmonization approach was created and successfully used for cross-institutional data harmonization across 10 European biobanks. 
The software tools were made available as open source.</description><subject>Biological Specimen Banks - standards</subject><subject>Colorectal Neoplasms</subject><subject>Europe</subject><subject>Humans</subject><subject>Reference Standards</subject><subject>Research Article</subject><issn>1869-0327</issn><issn>1869-0327</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp1kE1PAjEQhhujEYJcPZr9A8V2S78uRr4UEgyE4Lnp7nalCC1pFxP99S4BiR6cy0wy7zyTPADcYtTBiNL7CBEiEmImKZfkAjSxYBIikvLLX3MDtGNco7oow0Lwa9AgmCIpBGqCx7l2cLQPfme0S4a60slYh6139ktX1ruk9CHpW59p9x4T65LecDZfJv3-y2ICR4vJ4AZclXoTTfvUW-D1abQcjOF09jwZ9KYwJ5JWMCtMUXKpM8mlKBDLEDeoIDTVTArEWSa1wBinrMsZx5qlumtQmgpCDM8zmZMWeDhyd_tsa4rcuCrojdoFu9XhU3lt1d-Nsyv15j8U40SmiNaAzhGQBx9jMOX5FiN10KmiOuhUJ531wd3vj-f4j7w6AI-BamXN1qi13wdXO_gP-A0j_Xx3</recordid><startdate>20190801</startdate><enddate>20190801</enddate><creator>Mate, Sebastian</creator><creator>Kampf, Marvin</creator><creator>Rödle, Wolfgang</creator><creator>Kraus, Stefan</creator><creator>Proynova, Rumyana</creator><creator>Silander, Kaisa</creator><creator>Ebert, Lars</creator><creator>Lablans, Martin</creator><creator>Schüttler, Christina</creator><creator>Knell, Christian</creator><creator>Eklund, Niina</creator><creator>Hummel, Michael</creator><creator>Holub, Petr</creator><creator>Prokosch, Hans-Ulrich</creator><general>Georg Thieme Verlag KG</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>5PM</scope></search><sort><creationdate>20190801</creationdate><title>Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC</title><author>Mate, Sebastian ; Kampf, Marvin ; Rödle, Wolfgang ; Kraus, Stefan ; Proynova, Rumyana ; Silander, Kaisa ; Ebert, Lars ; Lablans, Martin ; Schüttler, Christina ; Knell, Christian ; Eklund, Niina ; Hummel, Michael 
; Holub, Petr ; Prokosch, Hans-Ulrich</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c395t-bdedf79ab9798d06b07e0d352a698076b9a81112647671a62a4e022833e7cb9c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Biological Specimen Banks - standards</topic><topic>Colorectal Neoplasms</topic><topic>Europe</topic><topic>Humans</topic><topic>Reference Standards</topic><topic>Research Article</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mate, Sebastian</creatorcontrib><creatorcontrib>Kampf, Marvin</creatorcontrib><creatorcontrib>Rödle, Wolfgang</creatorcontrib><creatorcontrib>Kraus, Stefan</creatorcontrib><creatorcontrib>Proynova, Rumyana</creatorcontrib><creatorcontrib>Silander, Kaisa</creatorcontrib><creatorcontrib>Ebert, Lars</creatorcontrib><creatorcontrib>Lablans, Martin</creatorcontrib><creatorcontrib>Schüttler, Christina</creatorcontrib><creatorcontrib>Knell, Christian</creatorcontrib><creatorcontrib>Eklund, Niina</creatorcontrib><creatorcontrib>Hummel, Michael</creatorcontrib><creatorcontrib>Holub, Petr</creatorcontrib><creatorcontrib>Prokosch, Hans-Ulrich</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Applied clinical informatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mate, Sebastian</au><au>Kampf, Marvin</au><au>Rödle, Wolfgang</au><au>Kraus, Stefan</au><au>Proynova, Rumyana</au><au>Silander, Kaisa</au><au>Ebert, Lars</au><au>Lablans, Martin</au><au>Schüttler, Christina</au><au>Knell, Christian</au><au>Eklund, Niina</au><au>Hummel, 
Michael</au><au>Holub, Petr</au><au>Prokosch, Hans-Ulrich</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC</atitle><jtitle>Applied clinical informatics</jtitle><addtitle>Appl Clin Inform</addtitle><date>2019-08-01</date><risdate>2019</risdate><volume>10</volume><issue>4</issue><spage>679</spage><epage>692</epage><pages>679-692</pages><issn>1869-0327</issn><eissn>1869-0327</eissn><abstract>Abstract Background  High-quality clinical data and biological specimens are key for medical research and personalized medicine. The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological resources. The accompanying ADOPT BBMRI-ERIC project kick-started BBMRI-ERIC by collecting colorectal cancer data from European biobanks. Objectives  To transform these data into a common representation, a uniform approach for data integration and harmonization had to be developed. This article describes the design and the implementation of a toolset for this task. Methods  Based on the semantics of a metadata repository, we developed a lexical bag-of-words matcher, capable of semiautomatically mapping local biobank terms to the central ADOPT BBMRI-ERIC terminology. Its algorithm supports fuzzy matching, utilization of synonyms, and sentiment tagging. To process the anonymized instance data based on these mappings, we also developed a data transformation application. Results  The implementation was used to process the data from 10 European biobanks. The lexical matcher automatically and correctly mapped 78.48% of the 1,492 local biobank terms, and human experts were able to complete the remaining mappings. We used the expert-curated mappings to successfully process 147,608 data records from 3,415 patients. 
Conclusion  A generic harmonization approach was created and successfully used for cross-institutional data harmonization across 10 European biobanks. The software tools were made available as open source.</abstract><cop>Stuttgart · New York</cop><pub>Georg Thieme Verlag KG</pub><pmid>31509880</pmid><doi>10.1055/s-0039-1695793</doi><tpages>14</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1869-0327
ispartof Applied clinical informatics, 2019-08, Vol.10 (4), p.679-692
issn 1869-0327
1869-0327
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6739205
source MEDLINE; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central
subjects Biological Specimen Banks - standards
Colorectal Neoplasms
Europe
Humans
Reference Standards
Research Article
title Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T13%3A26%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pubmed_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Pan-European%20Data%20Harmonization%20for%20Biobanks%20in%20ADOPT%20BBMRI-ERIC&rft.jtitle=Applied%20clinical%20informatics&rft.au=Mate,%20Sebastian&rft.date=2019-08-01&rft.volume=10&rft.issue=4&rft.spage=679&rft.epage=692&rft.pages=679-692&rft.issn=1869-0327&rft.eissn=1869-0327&rft_id=info:doi/10.1055/s-0039-1695793&rft_dat=%3Cpubmed_cross%3E31509880%3C/pubmed_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/31509880&rfr_iscdi=true
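
The Methods part of the abstract mentions a lexical bag-of-words matcher with fuzzy matching and synonym support. The following is a minimal sketch of that general idea, not the authors' implementation: the synonym table, the scoring formula, the 0.8 threshold, and all function names are assumptions made for illustration, and the sentiment tagging mentioned in the abstract is omitted because the record does not describe it. Fuzzy word comparison uses `difflib.SequenceMatcher` from the Python standard library.

```python
import re
from difflib import SequenceMatcher

# Hypothetical synonym table (assumption): maps a local word to a canonical word.
SYNONYMS = {"sex": "gender", "neoplasm": "tumor"}


def tokenize(term):
    """Lowercase a term, split it into a bag of words, and apply synonyms."""
    words = re.findall(r"[a-z0-9]+", term.lower())
    return [SYNONYMS.get(word, word) for word in words]


def word_similarity(a, b):
    """Fuzzy similarity of two words in [0, 1]; tolerates small typos."""
    return SequenceMatcher(None, a, b).ratio()


def bag_score(local_term, central_term):
    """Symmetric bag-of-words score: each word is paired with its best
    fuzzy match in the other bag, and both directions are averaged."""
    local, central = tokenize(local_term), tokenize(central_term)
    if not local or not central:
        return 0.0
    forward = sum(max(word_similarity(w, c) for c in central) for w in local)
    backward = sum(max(word_similarity(c, w) for w in local) for c in central)
    return (forward / len(local) + backward / len(central)) / 2


def suggest_mapping(local_term, central_terms, threshold=0.8):
    """Return (best central term, score), or (None, score) when the best
    score is below the threshold and a human expert should decide."""
    best = max(central_terms, key=lambda c: bag_score(local_term, c))
    score = bag_score(local_term, best)
    return (best, score) if score >= threshold else (None, score)


if __name__ == "__main__":
    central = ["Patient gender", "Date of diagnosis", "Tumor stage"]
    for local in ["Sex of patient", "Pateint gender", "xyz"]:
        print(local, "->", suggest_mapping(local, central))
```

Returning `None` below the threshold mirrors the semiautomatic workflow the abstract describes, in which terms the matcher cannot map are left for human experts to complete.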