Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC
Published in: | Applied clinical informatics 2019-08, Vol.10 (4), p.679-692 |
---|---|
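Main authors: | Mate, Sebastian; Kampf, Marvin; Rödle, Wolfgang; Kraus, Stefan; Proynova, Rumyana; Silander, Kaisa; Ebert, Lars; Lablans, Martin; Schüttler, Christina; Knell, Christian; Eklund, Niina; Hummel, Michael; Holub, Petr; Prokosch, Hans-Ulrich |
Format: | Article |
Language: | eng |
Subjects: | Biological Specimen Banks - standards; Colorectal Neoplasms; Europe; Humans; Reference Standards; Research Article |
Online access: | Full text |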
container_end_page | 692 |
---|---|
container_issue | 4 |
container_start_page | 679 |
container_title | Applied clinical informatics |
container_volume | 10 |
creator | Mate, Sebastian; Kampf, Marvin; Rödle, Wolfgang; Kraus, Stefan; Proynova, Rumyana; Silander, Kaisa; Ebert, Lars; Lablans, Martin; Schüttler, Christina; Knell, Christian; Eklund, Niina; Hummel, Michael; Holub, Petr; Prokosch, Hans-Ulrich |
description | Abstract
Background
High-quality clinical data and biological specimens are key for medical research and personalized medicine. The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological resources. The accompanying ADOPT BBMRI-ERIC project kick-started BBMRI-ERIC by collecting colorectal cancer data from European biobanks.
Objectives
To transform these data into a common representation, a uniform approach for data integration and harmonization had to be developed. This article describes the design and the implementation of a toolset for this task.
Methods
Based on the semantics of a metadata repository, we developed a lexical bag-of-words matcher, capable of semiautomatically mapping local biobank terms to the central ADOPT BBMRI-ERIC terminology. Its algorithm supports fuzzy matching, utilization of synonyms, and sentiment tagging. To process the anonymized instance data based on these mappings, we also developed a data transformation application.
Results
The implementation was used to process the data from 10 European biobanks. The lexical matcher automatically and correctly mapped 78.48% of the 1,492 local biobank terms, and human experts were able to complete the remaining mappings. We used the expert-curated mappings to successfully process 147,608 data records from 3,415 patients.
Conclusion
A generic harmonization approach was created and successfully used for cross-institutional data harmonization across 10 European biobanks. The software tools were made available as open source. |
doi_str_mv | 10.1055/s-0039-1695793 |
format | Article |
fullrecord | <record><control><sourceid>pubmed_cross</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6739205</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>31509880</sourcerecordid><originalsourceid>FETCH-LOGICAL-c395t-bdedf79ab9798d06b07e0d352a698076b9a81112647671a62a4e022833e7cb9c3</originalsourceid><addsrcrecordid>eNp1kE1PAjEQhhujEYJcPZr9A8V2S78uRr4UEgyE4Lnp7nalCC1pFxP99S4BiR6cy0wy7zyTPADcYtTBiNL7CBEiEmImKZfkAjSxYBIikvLLX3MDtGNco7oow0Lwa9AgmCIpBGqCx7l2cLQPfme0S4a60slYh6139ktX1ruk9CHpW59p9x4T65LecDZfJv3-y2ICR4vJ4AZclXoTTfvUW-D1abQcjOF09jwZ9KYwJ5JWMCtMUXKpM8mlKBDLEDeoIDTVTArEWSa1wBinrMsZx5qlumtQmgpCDM8zmZMWeDhyd_tsa4rcuCrojdoFu9XhU3lt1d-Nsyv15j8U40SmiNaAzhGQBx9jMOX5FiN10KmiOuhUJ531wd3vj-f4j7w6AI-BamXN1qi13wdXO_gP-A0j_Xx3</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC</title><source>MEDLINE</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><creator>Mate, Sebastian ; Kampf, Marvin ; Rödle, Wolfgang ; Kraus, Stefan ; Proynova, Rumyana ; Silander, Kaisa ; Ebert, Lars ; Lablans, Martin ; Schüttler, Christina ; Knell, Christian ; Eklund, Niina ; Hummel, Michael ; Holub, Petr ; Prokosch, Hans-Ulrich</creator><creatorcontrib>Mate, Sebastian ; Kampf, Marvin ; Rödle, Wolfgang ; Kraus, Stefan ; Proynova, Rumyana ; Silander, Kaisa ; Ebert, Lars ; Lablans, Martin ; Schüttler, Christina ; Knell, Christian ; Eklund, Niina ; Hummel, Michael ; Holub, Petr ; Prokosch, Hans-Ulrich</creatorcontrib><description>Abstract
Background
High-quality clinical data and biological specimens are key for medical research and personalized medicine. The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological resources. The accompanying ADOPT BBMRI-ERIC project kick-started BBMRI-ERIC by collecting colorectal cancer data from European biobanks.
Objectives
To transform these data into a common representation, a uniform approach for data integration and harmonization had to be developed. This article describes the design and the implementation of a toolset for this task.
Methods
Based on the semantics of a metadata repository, we developed a lexical bag-of-words matcher, capable of semiautomatically mapping local biobank terms to the central ADOPT BBMRI-ERIC terminology. Its algorithm supports fuzzy matching, utilization of synonyms, and sentiment tagging. To process the anonymized instance data based on these mappings, we also developed a data transformation application.
Results
The implementation was used to process the data from 10 European biobanks. The lexical matcher automatically and correctly mapped 78.48% of the 1,492 local biobank terms, and human experts were able to complete the remaining mappings. We used the expert-curated mappings to successfully process 147,608 data records from 3,415 patients.
Conclusion
A generic harmonization approach was created and successfully used for cross-institutional data harmonization across 10 European biobanks. The software tools were made available as open source.</description><identifier>ISSN: 1869-0327</identifier><identifier>EISSN: 1869-0327</identifier><identifier>DOI: 10.1055/s-0039-1695793</identifier><identifier>PMID: 31509880</identifier><language>eng</language><publisher>Stuttgart · New York: Georg Thieme Verlag KG</publisher><subject>Biological Specimen Banks - standards ; Colorectal Neoplasms ; Europe ; Humans ; Reference Standards ; Research Article</subject><ispartof>Applied clinical informatics, 2019-08, Vol.10 (4), p.679-692</ispartof><rights>Georg Thieme Verlag KG Stuttgart · New York.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c395t-bdedf79ab9798d06b07e0d352a698076b9a81112647671a62a4e022833e7cb9c3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6739205/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6739205/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31509880$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Mate, Sebastian</creatorcontrib><creatorcontrib>Kampf, Marvin</creatorcontrib><creatorcontrib>Rödle, Wolfgang</creatorcontrib><creatorcontrib>Kraus, Stefan</creatorcontrib><creatorcontrib>Proynova, Rumyana</creatorcontrib><creatorcontrib>Silander, Kaisa</creatorcontrib><creatorcontrib>Ebert, Lars</creatorcontrib><creatorcontrib>Lablans, Martin</creatorcontrib><creatorcontrib>Schüttler, 
Christina</creatorcontrib><creatorcontrib>Knell, Christian</creatorcontrib><creatorcontrib>Eklund, Niina</creatorcontrib><creatorcontrib>Hummel, Michael</creatorcontrib><creatorcontrib>Holub, Petr</creatorcontrib><creatorcontrib>Prokosch, Hans-Ulrich</creatorcontrib><title>Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC</title><title>Applied clinical informatics</title><addtitle>Appl Clin Inform</addtitle><description>Abstract
Background
High-quality clinical data and biological specimens are key for medical research and personalized medicine. The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological resources. The accompanying ADOPT BBMRI-ERIC project kick-started BBMRI-ERIC by collecting colorectal cancer data from European biobanks.
Objectives
To transform these data into a common representation, a uniform approach for data integration and harmonization had to be developed. This article describes the design and the implementation of a toolset for this task.
Methods
Based on the semantics of a metadata repository, we developed a lexical bag-of-words matcher, capable of semiautomatically mapping local biobank terms to the central ADOPT BBMRI-ERIC terminology. Its algorithm supports fuzzy matching, utilization of synonyms, and sentiment tagging. To process the anonymized instance data based on these mappings, we also developed a data transformation application.
Results
The implementation was used to process the data from 10 European biobanks. The lexical matcher automatically and correctly mapped 78.48% of the 1,492 local biobank terms, and human experts were able to complete the remaining mappings. We used the expert-curated mappings to successfully process 147,608 data records from 3,415 patients.
Conclusion
A generic harmonization approach was created and successfully used for cross-institutional data harmonization across 10 European biobanks. The software tools were made available as open source.</description><subject>Biological Specimen Banks - standards</subject><subject>Colorectal Neoplasms</subject><subject>Europe</subject><subject>Humans</subject><subject>Reference Standards</subject><subject>Research Article</subject><issn>1869-0327</issn><issn>1869-0327</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp1kE1PAjEQhhujEYJcPZr9A8V2S78uRr4UEgyE4Lnp7nalCC1pFxP99S4BiR6cy0wy7zyTPADcYtTBiNL7CBEiEmImKZfkAjSxYBIikvLLX3MDtGNco7oow0Lwa9AgmCIpBGqCx7l2cLQPfme0S4a60slYh6139ktX1ruk9CHpW59p9x4T65LecDZfJv3-y2ICR4vJ4AZclXoTTfvUW-D1abQcjOF09jwZ9KYwJ5JWMCtMUXKpM8mlKBDLEDeoIDTVTArEWSa1wBinrMsZx5qlumtQmgpCDM8zmZMWeDhyd_tsa4rcuCrojdoFu9XhU3lt1d-Nsyv15j8U40SmiNaAzhGQBx9jMOX5FiN10KmiOuhUJ531wd3vj-f4j7w6AI-BamXN1qi13wdXO_gP-A0j_Xx3</recordid><startdate>20190801</startdate><enddate>20190801</enddate><creator>Mate, Sebastian</creator><creator>Kampf, Marvin</creator><creator>Rödle, Wolfgang</creator><creator>Kraus, Stefan</creator><creator>Proynova, Rumyana</creator><creator>Silander, Kaisa</creator><creator>Ebert, Lars</creator><creator>Lablans, Martin</creator><creator>Schüttler, Christina</creator><creator>Knell, Christian</creator><creator>Eklund, Niina</creator><creator>Hummel, Michael</creator><creator>Holub, Petr</creator><creator>Prokosch, Hans-Ulrich</creator><general>Georg Thieme Verlag KG</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>5PM</scope></search><sort><creationdate>20190801</creationdate><title>Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC</title><author>Mate, Sebastian ; Kampf, Marvin ; Rödle, Wolfgang ; Kraus, Stefan ; 
Proynova, Rumyana ; Silander, Kaisa ; Ebert, Lars ; Lablans, Martin ; Schüttler, Christina ; Knell, Christian ; Eklund, Niina ; Hummel, Michael ; Holub, Petr ; Prokosch, Hans-Ulrich</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c395t-bdedf79ab9798d06b07e0d352a698076b9a81112647671a62a4e022833e7cb9c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Biological Specimen Banks - standards</topic><topic>Colorectal Neoplasms</topic><topic>Europe</topic><topic>Humans</topic><topic>Reference Standards</topic><topic>Research Article</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mate, Sebastian</creatorcontrib><creatorcontrib>Kampf, Marvin</creatorcontrib><creatorcontrib>Rödle, Wolfgang</creatorcontrib><creatorcontrib>Kraus, Stefan</creatorcontrib><creatorcontrib>Proynova, Rumyana</creatorcontrib><creatorcontrib>Silander, Kaisa</creatorcontrib><creatorcontrib>Ebert, Lars</creatorcontrib><creatorcontrib>Lablans, Martin</creatorcontrib><creatorcontrib>Schüttler, Christina</creatorcontrib><creatorcontrib>Knell, Christian</creatorcontrib><creatorcontrib>Eklund, Niina</creatorcontrib><creatorcontrib>Hummel, Michael</creatorcontrib><creatorcontrib>Holub, Petr</creatorcontrib><creatorcontrib>Prokosch, Hans-Ulrich</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Applied clinical informatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mate, Sebastian</au><au>Kampf, Marvin</au><au>Rödle, Wolfgang</au><au>Kraus, Stefan</au><au>Proynova, Rumyana</au><au>Silander, 
Kaisa</au><au>Ebert, Lars</au><au>Lablans, Martin</au><au>Schüttler, Christina</au><au>Knell, Christian</au><au>Eklund, Niina</au><au>Hummel, Michael</au><au>Holub, Petr</au><au>Prokosch, Hans-Ulrich</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC</atitle><jtitle>Applied clinical informatics</jtitle><addtitle>Appl Clin Inform</addtitle><date>2019-08-01</date><risdate>2019</risdate><volume>10</volume><issue>4</issue><spage>679</spage><epage>692</epage><pages>679-692</pages><issn>1869-0327</issn><eissn>1869-0327</eissn><abstract>Abstract
Background
High-quality clinical data and biological specimens are key for medical research and personalized medicine. The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological resources. The accompanying ADOPT BBMRI-ERIC project kick-started BBMRI-ERIC by collecting colorectal cancer data from European biobanks.
Objectives
To transform these data into a common representation, a uniform approach for data integration and harmonization had to be developed. This article describes the design and the implementation of a toolset for this task.
Methods
Based on the semantics of a metadata repository, we developed a lexical bag-of-words matcher, capable of semiautomatically mapping local biobank terms to the central ADOPT BBMRI-ERIC terminology. Its algorithm supports fuzzy matching, utilization of synonyms, and sentiment tagging. To process the anonymized instance data based on these mappings, we also developed a data transformation application.
Results
The implementation was used to process the data from 10 European biobanks. The lexical matcher automatically and correctly mapped 78.48% of the 1,492 local biobank terms, and human experts were able to complete the remaining mappings. We used the expert-curated mappings to successfully process 147,608 data records from 3,415 patients.
Conclusion
A generic harmonization approach was created and successfully used for cross-institutional data harmonization across 10 European biobanks. The software tools were made available as open source.</abstract><cop>Stuttgart · New York</cop><pub>Georg Thieme Verlag KG</pub><pmid>31509880</pmid><doi>10.1055/s-0039-1695793</doi><tpages>14</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1869-0327 |
ispartof | Applied clinical informatics, 2019-08, Vol.10 (4), p.679-692 |
issn | 1869-0327 1869-0327 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6739205 |
source | MEDLINE; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central |
subjects | Biological Specimen Banks - standards; Colorectal Neoplasms; Europe; Humans; Reference Standards; Research Article |
title | Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T13%3A26%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pubmed_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Pan-European%20Data%20Harmonization%20for%20Biobanks%20in%20ADOPT%20BBMRI-ERIC&rft.jtitle=Applied%20clinical%20informatics&rft.au=Mate,%20Sebastian&rft.date=2019-08-01&rft.volume=10&rft.issue=4&rft.spage=679&rft.epage=692&rft.pages=679-692&rft.issn=1869-0327&rft.eissn=1869-0327&rft_id=info:doi/10.1055/s-0039-1695793&rft_dat=%3Cpubmed_cross%3E31509880%3C/pubmed_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/31509880&rfr_iscdi=true |
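
The Methods section of the abstract describes a lexical bag-of-words matcher that maps local biobank terms to a central terminology using fuzzy matching and synonyms. The record does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' implementation. The synonym table, the example terms, the scoring formula, the 0.8 threshold, and all function names are assumptions made for illustration; fuzzy matching is approximated with `difflib.SequenceMatcher` from the Python standard library, and the sentiment tagging mentioned in the abstract is left out.

```python
from difflib import SequenceMatcher

# Hypothetical synonym table; not taken from the ADOPT BBMRI-ERIC terminology.
SYNONYMS = {"sex": "gender", "tumour": "tumor"}


def tokens(term):
    """Lowercase a term, split it into words, and replace known synonyms."""
    words = term.lower().replace("_", " ").replace("-", " ").split()
    return [SYNONYMS.get(word, word) for word in words]


def word_similarity(a, b):
    """Fuzzy similarity of two words in the range [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def bag_score(local, central):
    """Average, over the words of both terms, of each word's best fuzzy
    match in the other term's bag of words. Word order is ignored."""
    a, b = tokens(local), tokens(central)
    if not a or not b:
        return 0.0
    forward = sum(max(word_similarity(x, y) for y in b) for x in a)
    backward = sum(max(word_similarity(x, y) for x in a) for y in b)
    return (forward + backward) / (len(a) + len(b))


def best_match(local, central_terms, threshold=0.8):
    """Return (central_term, score) for the best-scoring central term,
    or (None, score) if the score is below the threshold, in which case
    the term would be left for a human expert to map."""
    score, term = max((bag_score(local, c), c) for c in central_terms)
    return (term if score >= threshold else None), score


# Hypothetical central terminology entries, for illustration only.
CENTRAL = ["Gender", "Date of diagnosis", "Tumor location"]

for local_term in ["sex", "gendr", "diagnosis_date", "Tumour-Location", "xyz"]:
    print(local_term, "->", best_match(local_term, CENTRAL))
```

In this sketch, a term whose best score stays below the threshold is returned as `None` rather than forced onto the closest candidate, which mirrors the semiautomatic workflow in the abstract where human experts completed the mappings the matcher could not resolve.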