The Importance of Taxonomic Classification Software and Machine Learning Algorithms for the Prediction of Colorectal Cancer

Sebastian Mølvang Dall, Thomas Yssing Michaelsen, Mads Albertsen

Research output: Contribution to conference without publisher/journal › Poster › Research


Abstract

Colorectal cancer (CRC) is the development of cancer in the colon or rectum. It represents a rising global burden, ranking third in incidence and second in mortality. The global burden of CRC is estimated to increase by 60% by 2040, to 2.9 million new cases and 1.5 million deaths. Because of this burden, several countries, including Denmark, have implemented national screening programs for early detection of CRC. The method implemented in Denmark, iFOBT (immunochemical fecal occult blood test), measures hemoglobin in stool; if the test is positive, the patient is invited to a colonoscopy for final diagnosis. However, iFOBT has a high false-positive rate (FPR) of 45%, which resulted in 9,800 unnecessary colonoscopies in 2019, amounting to approximately 43 million DKK and 7,350 hours.
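
To put the 2019 totals in per-procedure terms, the figures quoted above can be divided out. This is only a back-of-the-envelope sketch: the variable names are illustrative, and the inputs are the rounded values from the abstract, so the results are approximate.

```python
# Figures quoted in the abstract for Denmark, 2019 (rounded in the source).
unnecessary_colonoscopies = 9_800
total_cost_dkk = 43_000_000  # "approximately 43 million DKK"
total_hours = 7_350

# Average cost and time attributable to each unnecessary colonoscopy.
cost_per_procedure_dkk = total_cost_dkk / unnecessary_colonoscopies
minutes_per_procedure = total_hours / unnecessary_colonoscopies * 60

print(f"{cost_per_procedure_dkk:.0f} DKK per colonoscopy")
print(f"{minutes_per_procedure:.0f} minutes per colonoscopy")
```

On these numbers, each unnecessary colonoscopy corresponds to roughly 4,400 DKK and 45 minutes.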

In recent years, the combination of machine learning algorithms (MLAs) and shotgun metagenomics has established strong associations between the gut microbiota and cancer status in patients, representing a potential new tool for CRC screening. In this thesis, the impact of three taxonomic classification tools (MetaPhlAn3, Kraken2, Kaiju) and four MLAs (Neural Net, XGBoost, Random Forest, LASSO) on CRC prediction was tested. Kraken2 gave the best prediction of CRC, significantly better than Kaiju. Furthermore, XGBoost and Random Forest performed better on average than the other MLAs. In addition, with Kraken2, CRC prediction could be achieved with as few as 100,000 reads.
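
The read-depth result implies that samples were reduced to a fixed number of reads before taxonomic classification. The abstract does not say how this was done, so the following is only a minimal sketch, assuming uniform random subsampling without replacement of single-end FASTQ records. The function names `fastq_records` and `subsample` are hypothetical and do not come from the poster or from any of the tools named above.

```python
import random
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

FastqRecord = Tuple[str, ...]  # (header, sequence, separator, quality)


def fastq_records(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Group an iterable of FASTQ lines into 4-line records."""
    it = iter(lines)
    while True:
        record = tuple(line.rstrip("\n") for line in islice(it, 4))
        if len(record) < 4:  # end of input (or a truncated final record)
            return
        yield record


def subsample(records: Iterable[FastqRecord], n_reads: int,
              seed: int = 0) -> List[FastqRecord]:
    """Reservoir sampling: keep n_reads records chosen uniformly at random.

    Only n_reads records are held in memory, so the input can be streamed.
    If the input has fewer than n_reads records, all of them are returned.
    """
    rng = random.Random(seed)
    reservoir: List[FastqRecord] = []
    for i, record in enumerate(records):
        if i < n_reads:
            reservoir.append(record)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n_reads:
                reservoir[j] = record
    return reservoir


# Toy input: ten synthetic reads, reduced to three.
toy_lines = []
for k in range(10):
    toy_lines += [f"@read{k}", "ACGT", "+", "IIII"]

sample = subsample(fastq_records(toy_lines), n_reads=3, seed=42)
print([record[0] for record in sample])
```

In practice `n_reads` would be set to the depth under test (for example 100,000), and the subsampled file would then be passed to the taxonomic classifier. Paired-end data would need the same records drawn from both mate files.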
Translated title of the contribution: Vigtigheden af taksonomisk klassificeringssoftware og machine learning algoritmer for prædiktion af kolorektal cancer
Original language: English
Publication date: 15 Nov 2021
Publication status: Published - 15 Nov 2021
Event: Danish Microbiological Society congress 2021 - Marmorhallen, Frederiksberg, Copenhagen, Denmark
Duration: 15 Nov 2021 → 15 Nov 2021
https://dms.dk/congress

Conference

Conference: Danish Microbiological Society congress 2021
Location: Marmorhallen, Frederiksberg
Country/Territory: Denmark
City: Copenhagen
Period: 15/11/2021 → 15/11/2021
