Prediction of Acute Leukemia Types using R

Authors

  • Rubeena Rustum, Dr. P. Varaprasada Rao

Abstract

Leukemia is the most prevalent blood cancer and is commonly found in young children. Blood cancers are also known as hematologic cancers. Here, a linear SVM with SCAD regularization was trained on microarray gene expression data in order to predict the class of acute leukemia, AML or ALL, for unseen patients. The SVM correctly classified 33 of the 34 patients in the test set. Furthermore, it selected the genes with accession numbers M27891_at, M96326_rna1_at, and Y00787_s_at as being positively associated with AML onset and negatively associated with ALL onset. This association appeared consistently in bootstrap resampling simulations, and it may offer medical researchers insight into acute leukemia onset. These results support the conventional wisdom that sparse linear methods can yield accurate and interpretable classifiers for microarray data. However, the model trained here would need to be validated on a larger dataset before its use as a diagnostic tool could be statistically justified.
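For readers unfamiliar with SCAD regularization, the following is a minimal sketch of the penalty and of an objective it could be combined with for a linear SVM. It is not the authors' R code: it is written in plain Python for illustration, the function names are hypothetical, the penalty follows the standard piecewise form with the commonly used default a = 3.7, and the objective (mean hinge loss plus the summed penalty) is an assumed formulation that the abstract does not specify.

```python
def scad_penalty(w, lam, a=3.7):
    """SCAD penalty for a single coefficient w (assumes lam > 0 and a > 2)."""
    t = abs(w)
    if t <= lam:
        # Small coefficients: behaves like the L1 (lasso) penalty.
        return lam * t
    if t <= a * lam:
        # Medium coefficients: quadratic blend that flattens out.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return (a + 1) * lam * lam / 2


def hinge_loss(w, b, x, y):
    """Hinge loss of a linear classifier on one sample, with label y in {-1, +1}."""
    margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
    return max(0.0, 1.0 - margin)


def scad_svm_objective(w, b, X, y, lam, a=3.7):
    """Assumed objective: mean hinge loss plus the SCAD penalty summed over weights."""
    data_term = sum(hinge_loss(w, b, xi, yi) for xi, yi in zip(X, y)) / len(X)
    penalty = sum(scad_penalty(wj, lam, a) for wj in w)
    return data_term + penalty


# Toy, made-up data: two samples with two "genes" each (not from the study).
X = [[2.0, 5.0], [-2.0, 1.0]]
y = [1, -1]
value = scad_svm_objective([1.0, 0.0], 0.0, X, y, lam=0.5)
```

Because the penalty is flat for large coefficients and L1-like near zero, minimizing such an objective tends to set most gene weights exactly to zero while leaving strongly predictive ones nearly unshrunk, which is one way to read the abstract's remark about sparse, interpretable classifiers.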

Published

2020-02-29

Section

Articles