  • 2015-06-10 Journal Article, Research Paper
    [["dc.bibliographiccitation.firstpage","344"],["dc.bibliographiccitation.issue","2"],["dc.bibliographiccitation.journal","Metabolites"],["dc.bibliographiccitation.lastpage","363"],["dc.bibliographiccitation.volume","5"],["dc.contributor.author","Hauschild, Anne-Christin"],["dc.contributor.author","Frisch, Tobias"],["dc.contributor.author","Baumbach, Jörg Ingo"],["dc.contributor.author","Baumbach, Jan"],["dc.date.accessioned","2021-09-17T08:41:42Z"],["dc.date.available","2021-09-17T08:41:42Z"],["dc.date.issued","2015-06-10"],["dc.description.abstract","Computational breath analysis is a growing research area aiming at identifying volatile organic compounds (VOCs) in human breath to assist medical diagnostics of the next generation. While inexpensive and non-invasive bioanalytical technologies for metabolite detection in exhaled air and bacterial/fungal vapor exist and the first studies on the power of supervised machine learning methods for profiling of the resulting data were conducted, we lack methods to extract hidden data features emerging from confounding factors. Here, we present Carotta, a new cluster analysis framework dedicated to uncovering such hidden substructures by sophisticated unsupervised statistical learning methods. We study the power of transitivity clustering and hierarchical clustering to identify groups of VOCs with similar expression behavior over most patient breath samples and/or groups of patients with a similar VOC intensity pattern. This enables the discovery of dependencies between metabolites. On the one hand, this allows us to eliminate the effect of potential confounding factors hindering disease classification, such as smoking. On the other hand, we may also identify VOCs associated with disease subtypes or concomitant diseases. Carotta is an open source software with an intuitive graphical user interface promoting data handling, analysis and visualization. The back-end is designed to be modular, allowing for easy extensions with plugins in the future, such as new clustering methods and statistics. It does not require much prior knowledge or technical skills to operate. We demonstrate its power and applicability by means of one artificial dataset. We also apply Carotta exemplarily to a real-world example dataset on chronic obstructive pulmonary disease (COPD). While the artificial data are utilized as a proof of concept, we will demonstrate how Carotta finds candidate markers in our real dataset associated with confounders rather than the primary disease (COPD) and bronchial carcinoma (BC). Carotta is publicly available at http://carotta.compbio.sdu.dk [1]."],["dc.identifier.doi","10.3390/metabo5020344"],["dc.identifier.pmid","26065494"],["dc.identifier.uri","https://resolver.sub.uni-goettingen.de/purl?gro-2/89616"],["dc.language.iso","en"],["dc.relation.issn","2218-1989"],["dc.title","Carotta: Revealing Hidden Confounder Markers in Metabolic Breath Profiles"],["dc.type","journal_article"],["dc.type.internalPublication","no"],["dc.type.subtype","original_ja"],["dspace.entity.type","Publication"]]
  • 2022 Journal Article
    [["dc.bibliographiccitation.firstpage","2278"],["dc.bibliographiccitation.issue","8"],["dc.bibliographiccitation.journal","Bioinformatics"],["dc.bibliographiccitation.lastpage","2286"],["dc.bibliographiccitation.volume","38"],["dc.contributor.author","Hauschild, Anne-Christin"],["dc.contributor.author","Lemanczyk, Marta"],["dc.contributor.author","Matschinske, Julian"],["dc.contributor.author","Frisch, Tobias"],["dc.contributor.author","Zolotareva, Olga"],["dc.contributor.author","Holzinger, Andreas"],["dc.contributor.author","Baumbach, Jan"],["dc.contributor.author","Heider, Dominik"],["dc.contributor.editor","Wren, Jonathan"],["dc.date.accessioned","2022-06-08T07:59:06Z"],["dc.date.available","2022-06-08T07:59:06Z"],["dc.date.issued","2022"],["dc.description.abstract","Motivation: Limited data access has hindered the field of precision medicine from exploring its full potential, e.g. concerning machine learning and privacy and data protection rules. Our study evaluates the efficacy of federated Random Forests (FRF) models, focusing particularly on the heterogeneity within and between datasets. We addressed three common challenges: (i) number of parties, (ii) sizes of datasets and (iii) imbalanced phenotypes, evaluated on five biomedical datasets. Results: The FRF outperformed the average local models and performed comparably to the data-centralized models trained on the entire data. With an increasing number of models and decreasing dataset size, the performance of local models decreases drastically. The FRF, however, do not decrease significantly. When combining datasets of different sizes, the FRF vastly improve compared to the average local models. We demonstrate that the FRF remain more robust and outperform the local models by analyzing different class-imbalances. Our results support that FRF overcome boundaries of clinical research and enable collaborations across institutes without violating privacy or legal regulations. Clinicians benefit from a vast collection of unbiased data aggregated from different geographic locations, demographics and other varying factors. They can build more generalizable models to make better clinical decisions, which will have relevance, especially for patients in rural areas and rare or geographically uncommon diseases, enabling personalized treatment. In combination with secure multi-party computation, federated learning has the power to revolutionize clinical practice by increasing the accuracy and robustness of healthcare AI and thus paving the way for precision medicine. Availability and implementation: The implementation of the federated random forests can be found at https://featurecloud.ai/. Supplementary information: Supplementary data are available at Bioinformatics online."],["dc.identifier.doi","10.1093/bioinformatics/btac065"],["dc.identifier.uri","https://resolver.sub.uni-goettingen.de/purl?gro-2/110631"],["dc.language.iso","en"],["dc.notes.intern","DOI-Import GROB-575"],["dc.relation.eissn","1460-2059"],["dc.relation.issn","1367-4803"],["dc.title","Federated Random Forests can improve local performance of predictive models for various healthcare applications"],["dc.type","journal_article"],["dc.type.internalPublication","unknown"],["dspace.entity.type","Publication"]]