ABSTRACT
miRNAs are small noncoding RNA molecules that are mainly responsible for post-transcriptional control of gene expression. Machine learning is increasingly used in breast tumor classification and diagnosis. In this paper, we compared the performance of different machine learning methods, namely Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), for miRNA identification in breast cancer patients. Each algorithm was evaluated on accuracy and logistic loss, and LightGBM performed better in several respects. hsa-mir-139 was identified as an important target for breast cancer classification. As a powerful tool, LightGBM can be used to identify and classify miRNA targets in breast cancer.
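The two evaluation criteria named in the abstract, accuracy and logistic loss, can be illustrated with a minimal sketch. The labels and predicted probabilities below are hypothetical toy values, not the study's data, and the names `model_a` and `model_b` are placeholders rather than any of the compared classifiers. The sketch only shows why logistic loss can separate two models that have the same accuracy.

```python
import math

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary logistic loss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

# Hypothetical labels (1 = tumor, 0 = normal) and predicted probabilities.
y_true = [1, 0, 1, 1, 0]
model_a = [0.90, 0.20, 0.60, 0.70, 0.40]
model_b = [0.99, 0.01, 0.51, 0.55, 0.49]

for name, probs in (("model_a", model_a), ("model_b", model_b)):
    print(name, accuracy(y_true, probs), round(log_loss(y_true, probs), 4))
```

In this toy case both models classify every sample correctly, yet `model_a` has the lower logistic loss because `model_b` places several predictions close to the 0.5 decision boundary.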
Index Terms
- LightGBM: An Effective miRNA Classification Method in Breast Cancer Patients