Chinese Journal of Oncology Prevention and Treatment, 2022, Vol. 14, Issue (1): 70-75. doi: 10.3969/j.issn.1674-5671.2022.01.12

Predicting the pathological complete response of breast cancer patients to neoadjuvant chemotherapy based on machine learning

  

  • Online: 2022-02-25 Published: 2022-03-11

Abstract: Objective To develop a machine learning model that uses the clinical and pathological data in a breast cancer electronic medical record system to predict pathological complete response (pCR) after neoadjuvant chemotherapy (NAC). Methods Clinical information on breast cancer patients who received NAC and curative surgery at Qingdao Municipal Hospital from January 2015 to December 2020 was retrospectively collected. The patients were randomly divided into a training set and a validation set at a ratio of 7:3. Five machine learning models were built on the training set: logistic regression (LR), artificial neural network (ANN), naive Bayes (NB), random forest (RF), and XGBoost. The area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, and specificity were used to evaluate the predictive ability of each model. Results A total of 742 patients were included in the analysis, 533 in the training set and 209 in the validation set. After feature engineering, features such as age, CA-15-3, ER status, PR status, HER2 status, Ki-67, T stage, N stage, and NAC regimen were selected to construct the prediction model. Among the five machine learning models, XGBoost performed best, with AUCs of 0.850 and 0.834 in the training set and the validation set, respectively. Conclusions The XGBoost model built on pre-treatment clinical and pathological features performs well in predicting pCR in breast cancer patients after NAC and provides a basis for formulating subsequent treatment strategies.
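The evaluation protocol named in the Methods (a random 7:3 split, then AUC, accuracy, sensitivity, and specificity) can be illustrated with a minimal standard-library sketch. This is not the authors' code. The function names, the fixed random seed, and the 0.5 classification threshold are assumptions made for illustration, and the labels and scores in the usage lines are made-up toy values, not study data.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle records and split them roughly 7:3 (hypothetical helper)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at an assumed cut-off of 0.5."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate (pCR correctly found)
        "specificity": tn / (tn + fp),  # true negative rate (non-pCR correctly found)
    }


# Toy usage: 1 = pCR, 0 = no pCR; scores stand in for any model's predicted probability.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2, 0.7, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_metrics = threshold_metrics(toy_labels, toy_scores)
```

Because AUC depends only on how cases are ranked, it needs no threshold, whereas sensitivity and specificity change with the cut-off chosen. The abstract does not state which cut-off the authors used.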

Key words: Breast cancer, Neoadjuvant chemotherapy, Complete pathological response, Machine learning, XGBoost

CLC Number: 

  • R737.9