Implementation of a machine learning application in preoperative risk assessment for hip repair surgery



This study aims to develop a machine learning-based application in a real-world medical domain to assist anesthesiologists in assessing the risk of complications in patients after a hip surgery.


Data from adult patients who underwent hip repair surgery at Chi-Mei Medical Center and its 2 branch hospitals from January 1, 2013, to March 31, 2020, were analyzed. Patients with incomplete data were excluded. A total of 22 features were included in the algorithms, including demographics, comorbidities, and major preoperative laboratory data from the database. The primary outcome was a composite of adverse events (in-hospital mortality, acute myocardial infarction, stroke, respiratory, hepatic and renal failure, and sepsis). Secondary outcomes were intensive care unit (ICU) admission and prolonged length of stay (PLOS). The data obtained were imported into 7 machine learning algorithms to predict the risk of adverse outcomes. Seventy percent of the data were randomly selected for training, leaving 30% for testing. The performances of the models were evaluated by the area under the receiver operating characteristic curve (AUROC). The optimal algorithm with the highest AUROC was used to build a web-based application, then integrated into the hospital information system (HIS) for clinical use.


Data from 4,448 patients were analyzed; 102 (2.3%), 160 (3.6%), and 401 (9.0%) patients had primary composite adverse outcomes, ICU admission, and PLOS, respectively. Our optimal model outperformed ASA-PS (AUROC compared by the DeLong test) in predicting the primary composite outcomes (0.810 vs. 0.629, p < 0.01), ICU admission (0.835 vs. 0.692, p < 0.01), and PLOS (0.832 vs. 0.618, p < 0.01).


The hospital-specific machine learning model outperformed the ASA-PS in risk assessment. This web-based application gained high satisfaction from anesthesiologists after online use.



A comprehensive preoperative evaluation can enhance the quality of patient care and is associated with a reduced mortality rate [1,2,3]. In preoperative evaluation clinics, the anesthesiologist must evaluate the patient's medical history and laboratory data, determine the patient’s physical status, and draw up a preoperative management plan in a limited time. After the initial assessment, the anesthesiologist discusses these risks with the patient and the surgical team [2]. In the case of emergency or urgent surgeries, all these tasks must be achieved with high efficiency [4].

The incidence of hip fractures is gradually increasing because the aging population is continually growing, and it is causing a heavy burden on society [5, 6]. Almost all patients with hip fractures would require surgical treatment, and anesthesia intervention is inevitable. These patients are mostly geriatric and have many comorbidities.

Several estimation tools have been used to help physicians assess operative risks [7,8,9]. These models range from simple scoring systems, such as the American Society of Anesthesiologists Physical Status (ASA-PS) [10], to more complex calculators, such as the Surgical Risk Preoperative Assessment System (SURPAS) and the American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP®) Surgical Risk Calculator [11,12,13]. Although the former is easy to use, it ignores many important parameters, such as sex, age, comorbidities, and laboratory data; the latter includes more parameters but requires tedious manual work. For instance, the ACS-NSQIP, an open-access web-based tool that is gaining worldwide acceptance, involves manually entering 19 to 21 patient-specific variables [13,14,15]. In a busy clinical environment that demands efficiency, an automated preoperative evaluation system that incorporates multiple parameters is needed.

We aimed to develop a machine learning-based application that can assist anesthesiologists in assessing specific adverse outcomes for patients required to undergo hip repair surgery. We hypothesized that a machine learning algorithm that includes variables such as patient demographics, comorbidities, laboratory data, and the anesthesiologist’s initial assessment would outperform ASA-PS scoring in risk assessment. With the assistance of this data-driven application, anesthesiologists will be able to evaluate patients effectively and inform them precisely of the operative risks, supporting shared decision-making in a real-world medical domain.


Study design

We established a multidisciplinary team including anesthesiologists, data scientists, and information engineers for this retrospective study. Data were extracted from the Chi-Mei Medical Center’s hospital information systems (HIS) database to build the AI prediction models. We deleted the medical record number and all types of personal identification of each patient to protect their privacy.

All methods were carried out per relevant guidelines and regulations of Chi Mei Medical Center. The construction of the database was approved by the institutional review board (Serial No. 10906–008). Informed consent was waived because of the retrospective design of the study, which only involved secondary analysis of existing data and had no direct patient contact. After machine learning model training and performance testing, the optimal models were deployed into the existing HIS to assist anesthesiologists in performing preoperative risk assessment for patients with hip fractures (Fig. 1).

Fig. 1 Flow chart of study

Study setting and patient selection

Adult patients aged 20 years and above who underwent surgical treatment of a hip fracture at Chi Mei Medical Center and its 2 branch hospitals in Tainan City, Taiwan, from January 1, 2013, to March 31, 2020, were selected based on current procedural terminology (CPT) codes. The CPT codes for enrollment included hip fixation codes (27235, 27236, 27244, and 27245), hemiarthroplasty (27125), and total hip arthroplasty (27130) with an admission diagnosis of hip fracture (ICD-9 codes 820.x or ICD-10 codes S70-S79). Patients with incomplete perioperative data, those whose body weight was 30 kg or below, and those whose height was 100 cm or below were excluded from the study. A total of 5,301 patients were initially reviewed for this study, and 4,448 patients were included after the exclusion criteria were applied.

Feature variables

Using a method similar to that of previous studies [16,17,18], established clinical importance and clinical expert opinion were used to select 22 preoperative variables from the HIS dataset as inputs to the algorithm for calculating the risk of adverse events of interest. The feature variables retrieved from the HIS database include (1) patient demographics (e.g., age, sex, body mass index, and smoking status); (2) preoperative comorbidities (e.g., heart diseases such as coronary artery disease, congestive heart failure, and old myocardial infarction; previous cerebral stroke; dialysis use; and presence of COPD); (3) laboratory values (e.g., serum sodium, white blood cell count, hematocrit, platelet count, creatinine, blood urea nitrogen, albumin, and prothrombin time); and (4) operative and anesthetic variables (e.g., ASA-PS status, mode of anesthesia, and the anticipated arterial line and central venous pressure monitoring). These features were integrated into the algorithm for machine learning. To develop the models, the patients were randomly split into a training cohort (70%) and a testing cohort (30%). This separation helped ensure that the test set was kept completely independent from the training set.
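The 70/30 random split described above can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' code: the function name `random_split` and the seed are assumptions, and the study itself may have used a library routine instead.

```python
import random


def random_split(n_patients, test_fraction=0.30, seed=2020):
    """Randomly partition patient indices into training and testing sets.

    The seed is an arbitrary illustrative value; it only makes the
    shuffle reproducible.
    """
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    n_test = round(n_patients * test_fraction)
    test_idx = sorted(indices[:n_test])
    train_idx = sorted(indices[n_test:])
    return train_idx, test_idx


# 4,448 is the analyzed cohort size reported in the study.
train_idx, test_idx = random_split(4448)
print(len(train_idx), len(test_idx))  # 3114 1334
```

Because the two index lists are disjoint, no patient in the testing cohort can influence model fitting, which is the independence property the text relies on.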

Study outcomes

This study’s primary outcome was a composite of postoperative adverse events, including (1) in-hospital mortality and death within 48 h after discharge (discharge death code); (2) acute stroke (ICD-9-CM codes 430 to 436 and 997.02, and ICD-10-CM codes I609, I619, I6789); (3) acute myocardial infarction (AMI) (ICD-9-CM code 410, and ICD-10-CM codes I21 or I23); (4) acute respiratory failure (ICD-9-CM codes 518.81 to 518.82, 518.84, and 518.5, and ICD-10-CM codes J96); (5) sepsis (ICD-9-CM codes 038, and ICD-10-CM codes R65); (6) acute liver failure (ICD-9-CM codes 570, and ICD-10-CM codes K7200); and (7) acute renal failure (ICD-9-CM codes 584.9) or a dialysis code (ICD-9-CM codes 584). Moreover, postoperative intensive care unit (ICU) admission and prolonged length of stay (PLOS) were set as secondary outcomes.

Machine learning algorithms

The models were trained with 7 machine learning algorithms: (1) logistic regression, (2) random forest, (3) k nearest neighbor (KNN), (4) support vector machines (SVM), (5) light gradient boosting machine (light GBM), (6) eXtreme gradient boosting (XGBoost), and (7) a deep learning multilayer perceptron (MLP). To address the issue of class imbalance in the training cohort, the synthetic minority oversampling technique was utilized. The Python programming language with the scikit-learn machine learning library was used for model building. Grid searching with 5-fold cross-validation was performed to tune the hyperparameters of each algorithm and obtain the optimal models. The goal of the algorithms is to predict the primary and secondary outcomes.
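To make the oversampling step concrete, the following is a dependency-free sketch of the core idea behind the synthetic minority oversampling technique: each synthetic row is an interpolation between a minority-class row and one of its nearest minority-class neighbours. The function name `smote`, the default `k`, and the toy rows are assumptions for illustration; the study does not state which implementation or settings were used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between a
    sampled row and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class rows (hypothetical age and hemoglobin values).
minority = [[70.0, 9.5], [82.0, 8.7], [76.0, 10.1], [88.0, 9.0]]
synthetic = smote(minority, n_new=6, k=3)
print(len(synthetic))  # 6
```

In line with the text, oversampling of this kind would be applied to the training cohort only, so that the testing cohort keeps its original event rate.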

Model performance

Each model was used to predict the test set. The specificity, sensitivity, accuracy, and area under the receiver operating characteristic curve (AUROC) were calculated and the models’ predictive performances were compared based on the AUROC value.
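The four reported metrics can be computed as in the sketch below. It assumes binary labels (1 = event), a 0.5 probability cut-off for the confusion-matrix metrics, and the rank-based definition of AUROC (the probability that a randomly chosen event receives a higher score than a randomly chosen non-event, with ties counted as one half). The cut-off and the toy data are illustrative assumptions, not values taken from the study.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def auroc(y_true, scores):
    """Rank-based AUROC: share of (event, non-event) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with an assumed 0.5 decision threshold.
y_true = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = confusion_metrics(y_true, y_pred)
auc = auroc(y_true, scores)
print(metrics, round(auc, 3))
```

Unlike sensitivity, specificity, and accuracy, AUROC does not depend on the chosen cut-off, which is one reason it is a convenient single criterion for comparing models.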

Implementation of web-service application to HIS

The optimal algorithm with the highest AUROC was used to build a web-based application, then integrated into the HIS for pre-anesthetic patient evaluation.

Anesthesiologist satisfaction score after AI-assisted risk assessment

After each AI-assisted risk assessment was completed, the system automatically asked its users to grade their level of satisfaction from 0 (most dissatisfied) to 5 (highly satisfied). The first month of online use served as the benchmark against which changes in the satisfaction of anesthesiologists were compared during the study period from July 2020 to April 2021.

Incidence of adverse outcomes before and after web-based application deployment in HIS

We deployed the AI-assisted risk assessment application online beginning July 1, 2020. To assess whether the developed application improved the medical outcomes, we compared the incidences of primary composite adverse outcomes, ICU admission, and PLOS before (July 2019 to April 2020) and after (July 2020 to April 2021) AI-assisted risk assessment.

Statistical analysis

Descriptive statistical analysis of the data was performed using SPSS 13.0 for Windows (SPSS, Inc., IL, USA). Continuous variables were summarized as means and standard deviations or medians and ranges. Categorical variables were summarized as counts and percentages. The models’ predictive performances were compared with each other and with conventional ASA-PS risk stratification based on the AUROC value using the DeLong test [19]. A series of one-way analyses of variance was conducted to examine the differences in the satisfaction score among the five-month groups. Afterward, Tukey's honestly significant difference post hoc test was performed to detect the intergroup differences. The level of significance was set at a p-value less than 0.01.
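For readers unfamiliar with the DeLong test [19], the sketch below shows how two AUROCs measured on the same patients can be compared. It follows the nonparametric structural-components approach: per-patient placement values give each AUROC, and their variances and covariance give a z statistic. The function names and the toy scores are assumptions; this is an illustrative implementation, not the software used in the study.

```python
import math


def _placements(pos, neg):
    """Structural components: mean pairwise win score per event / non-event."""
    def psi(x, y):
        return 1.0 if x > y else 0.5 if x == y else 0.0
    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _cov(a, b):
    """Sample covariance (n - 1 denominator)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(y_true, scores_a, scores_b):
    """Return (auc_a, auc_b, z, two-sided p) for two correlated ROC curves."""
    pos_idx = [i for i, t in enumerate(y_true) if t == 1]
    neg_idx = [i for i, t in enumerate(y_true) if t == 0]
    m, n = len(pos_idx), len(neg_idx)
    a10, a01 = _placements([scores_a[i] for i in pos_idx],
                           [scores_a[i] for i in neg_idx])
    b10, b01 = _placements([scores_b[i] for i in pos_idx],
                           [scores_b[i] for i in neg_idx])
    auc_a, auc_b = sum(a10) / m, sum(b10) / m
    # Variance of the AUROC difference, including the covariance term.
    var = ((_cov(a10, a10) + _cov(b10, b10) - 2 * _cov(a10, b10)) / m
           + (_cov(a01, a01) + _cov(b01, b01) - 2 * _cov(a01, b01)) / n)
    if var <= 0:
        raise ValueError("variance of the AUROC difference is zero")
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p


# Toy data: a continuous model score versus an ordinal ASA-PS-like grade.
y_true = [1, 1, 1, 0, 0, 0, 0]
model_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
asa_grades = [3, 2, 2, 3, 2, 2, 1]
auc_model, auc_asa, z, p = delong_test(y_true, model_scores, asa_grades)
print(round(auc_model, 3), round(auc_asa, 3))
```

The covariance term matters because both scores are evaluated on the same patients; ignoring it would treat the two AUROCs as independent.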



From January 1, 2013, to March 31, 2020, a total of 5,301 adult patients who had hip fractures and received hip repair surgery under GA or NA were identified. After exclusions, data from 4,448 patients were analyzed. Of these, 3,114 patients (70%) were randomly allocated for training the machine learning models, and 1,334 patients (30%) formed the testing cohort (Fig. 1).

Patient demographics and characteristics of the training and testing data sets are summarized in Table 1. The mean age of the patients was 65.3 years, and most were female (57.6%). Approximately 70.8% of them were stratified as ASA-PS 3 status. The event rates were 2.3% (N = 102), 3.6% (N = 160), and 9.0% (N = 401) for composite primary adverse events, ICU admission, and PLOS, respectively. As shown in Table 1, patients with primary composite adverse events and ICU admission were older, were mostly male, had anemia and longer prothrombin time and activated partial thromboplastin time (aPTT), and had comorbidities such as chronic respiratory diseases, cancer, heart disease, dementia, or advanced chronic kidney disease (stage 4 and 5).

Table 1 Demographic data of study patients

Correlation analysis (Table 2) identified the correlation between each feature and each outcome. For composite primary adverse outcomes, the most relevant features were anticipated intraoperative arterial line and central venous catheter monitoring, preoperative hemoglobin, and respiratory comorbidity; for ICU admission, the most relevant features were ASA-PS status, arterial line and central venous pressure monitoring, and preoperative hemoglobin; and for PLOS, the key factors included emergency surgery, ASA-PS, central venous monitoring, ALT, eGFR, PT, and comorbid respiratory disease.

Table 2 The correlation coefficients between each feature and each outcome

Prediction of primary composite adverse outcomes

In the machine learning prediction of primary composite adverse outcomes such as in-hospital mortality, mortality within 48 h after discharge, and major organ injury, the sensitivity by logistic regression, SVM, and lightGBM all reached 0.710; the SVM had the highest specificity (0.716) followed by KNN (0.711) (Table 3). All models, except KNN, achieved high AUROCs which were between 0.734 (XGBoost, 95% C.I.: 0.636 ~ 0.831) and 0.794 (Logistic regression, 95% C.I.: 0.718 ~ 0.869) (Appendix 1).

Table 3 Predictive performance of machine learning algorithms on primary composite adverse outcomes*

Prediction of postoperative ICU admission

As shown in Table 4, in the prediction of ICU admission, MLP (0.812) and logistic regression (0.792) had the highest sensitivity. Further, logistic regression (0.791), lightGBM (0.769), and the random forest (0.760) had the highest specificity. Moreover, logistic regression, lightGBM, and the random forest had the highest accuracy (between 0.760 and 0.791). Except for KNN and SVM, all models had AUROCs between 0.825 (XGBoost, 95% C.I.: 0.772 ~ 0.878) and 0.856 (Logistic regression, 95% C.I.: 0.804 ~ 0.908) (Appendix 2).

Table 4 Predictive performance of machine learning algorithms on ICU admission*

Prediction of prolonged length-of-stay (PLOS)

The results of model prediction on PLOS are shown in Table 5. The random forest and lightGBM had higher sensitivity (0.783 and 0.767, respectively) and specificity (0.783 and 0.774, respectively) than other algorithms. Moreover, the random forest had the best performance with the highest AUROC (0.854; 95% C.I.: 0.818 ~ 0.890) (Table 5 and Appendix 3).

Table 5 Predictive performance of machine learning algorithms on prolonged length-of-stay*

The performance of machine learning models and ASA-PS

Based on the results, an AI web-based application was constructed using logistic regression for adverse outcomes and ICU admission, and the random forest for PLOS. Figure 2 shows a snapshot of the AI web service application for predicting adverse outcomes in the pre-anesthetic visit clinic.

Fig. 2 A snapshot of the web-based application in the hospital information system

The results in Table 6 indicate that the machine learning AI web-based application had significantly higher AUROCs (DeLong test, P < 0.001) than ASA-PS stratification for primary composite adverse outcomes (0.776 vs. 0.629), ICU admission (0.844 vs. 0.629), and PLOS (0.854 vs. 0.618).

Table 6 Comparison of AI models with ASA-PS for primary composite adverse outcomes, ICU admission and prolonged length of hospital stay

The incidences of adverse outcomes before and after AI-assisted application deployment

Table 7 demonstrates the demographics and incidences of adverse outcomes in 545 and 500 patients before and after implementing the online web-based application. There was no statistically significant decrease in the incidence of primary composite adverse events (3.3 vs. 1.6%, p = 0.117) or ICU admission (4.4 vs. 2.4%, p = 0.109) after the web-based application was initially employed for clinical use.
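The before/after comparison can be illustrated with a 2 × 2 chi-square test. The text does not name the test used, so the Yates-corrected chi-square below is an assumption, as are the event counts, which are back-calculated from the reported percentages (18/545 and 8/500 for the primary composite outcome; 24/545 and 12/500 for ICU admission).

```python
import math


def yates_chi_square(events_a, total_a, events_b, total_b):
    """Yates-corrected chi-square test for a 2 x 2 table (1 degree of freedom)."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    n = total_a + total_b
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 d.f.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Counts are assumptions back-calculated from the reported percentages.
chi2_primary, p_primary = yates_chi_square(18, 545, 8, 500)
chi2_icu, p_icu = yates_chi_square(24, 545, 12, 500)
print(round(p_primary, 3), round(p_icu, 3))
```

Under these assumptions the p-values come out at roughly 0.117 and 0.109, in line with the reported values, which suggests (but does not prove) that a continuity-corrected chi-square test was used.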

Table 7 The incidences of major adverse outcomes before and after AI web-based application online use

Anesthesiologists' satisfaction scores for the web-based application

The AI Assist Application was launched on July 1, 2020, and by April 30, 2021, a total of 500 patients were evaluated under the assistance of AI. Figure 3 illustrates that the satisfaction score rose from 3.21 ± 0.51 (1st month online) to 4.70 ± 0.56 (10th month). The score was significantly higher starting in the 4th month after the application was launched (P < 0.01).

Fig. 3 Anesthesiologists’ satisfaction ratings of the web-based application since its implementation


In this retrospective study using the HIS database, machine learning methods were applied in our hospital-specific real-world medical domain to assist anesthesiologists in their preoperative risk assessment for patients required to undergo hip fracture repair surgery in terms of primary composite adverse outcomes (mortality and major organ injuries), the need for ICU admission and PLOS. The newly developed AI assist application was found to have significantly higher sensitivity, specificity, accuracy, and performance (AUROC) than that of the ASA-PS, the traditional and most widely used risk stratification method. The major strength of this study is its successful integration of the AI-assisted app into the hospital’s HIS system. The novel contribution of this study is that the machine learning algorithm empowered the ASA-PS scoring, allowing more specific prognostic assessments for patients undergoing hip surgery. Moreover, this online app is user-friendly and received high satisfaction scores from anesthesiologists who used it.

Machine learning can simultaneously deal with numerous variables by building statistical models that account for outliers and nonlinear interactions among variables [20, 21]. To use our application, the anesthesiologist, after evaluating the patient, first inputs the ASA-PS followed by the mode of anesthesia and whether an arterial line or a central venous catheter is anticipated. Then, the AI assist application automatically captures 22 features, which are important independent risk factors [22, 23], from the HIS. All the anesthesiologist has to do is click the calculation button. The application then calculates the risk scores for in-hospital mortality, ICU admission, and PLOS. The results of the calculation are displayed on the computer screen.
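The workflow described above (manual inputs merged with features captured from the HIS, then passed to a fitted model) can be sketched as follows for a logistic regression model, which the Results report as the deployed algorithm for adverse outcomes and ICU admission. Every name and number in this sketch is hypothetical: the feature names, weights, and intercept are placeholders chosen for illustration and are not the fitted coefficients of the study's model, which uses 22 features.

```python
import math

# Placeholder coefficients for illustration only; NOT the study's fitted model.
HYPOTHETICAL_WEIGHTS = {
    "asa_ps": 0.8,
    "arterial_line": 0.6,
    "age_decades": 0.3,
    "hemoglobin_g_dl": -0.2,
}
HYPOTHETICAL_INTERCEPT = -6.0


def predict_risk(manual_inputs, his_features):
    """Merge anesthesiologist inputs with HIS features and apply the
    logistic function to the weighted sum."""
    features = {**his_features, **manual_inputs}
    missing = set(HYPOTHETICAL_WEIGHTS) - set(features)
    if missing:
        raise KeyError(f"missing features: {sorted(missing)}")
    logit = HYPOTHETICAL_INTERCEPT + sum(
        weight * features[name] for name, weight in HYPOTHETICAL_WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical patient: values captured from the HIS plus manual entries.
his_features = {"age_decades": 8.2, "hemoglobin_g_dl": 9.4}
manual_inputs = {"asa_ps": 3, "arterial_line": 1}
risk = predict_risk(manual_inputs, his_features)
print(f"predicted risk: {risk:.1%}")
```

Raising an error on missing features, rather than silently substituting a default, mirrors the study's choice to exclude patients with incomplete data.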

The results of this study demonstrated that modern AI computer systems can not only collect and display data but can also play an active role in assisting physicians with their risk assessments, allowing them to make a shared decision with the patient or their family members.

Previous machine learning applications in hip fracture have demonstrated high potential for automated detection of hip fractures on radiographs and hip fracture risk prediction [24,25,26]. Further, recent studies have tried to build a precise model for mortality of hip fracture surgery [27]. This current research not only developed an AI-based risk predictive model that performed better than ASA-PS in terms of risk assessment for hip fracture surgery but also successfully incorporated the application into the hospital’s existing HIS, allowing it to be used in daily practice. It is important for hospitals to establish a more reliable and available model of risk assessment for patients who need to undergo hip fracture surgery. A precise model could help improve physicians’ shared decision-making with their patients and assist in evaluating the need for critical care monitoring after surgery. Machine learning techniques can integrate a large amount of data already captured in the HIS, which offers a prediction model with better predictive performance and facilitates automation.

This research does not suggest the discontinuation of the ASA-PS system nor does it refute the need for human intelligence; instead, it aims to add a machine-learning algorithm to facilitate efficiency in preoperative risk assessment. Anesthesiologists need to judge the patient’s ASA physical status and decide whether an arterial line or CVC is anticipated perioperatively. The AI assist application takes the above information, in conjunction with patient data captured from the HIS, to calculate the risk of adverse events.

Machine learning algorithms have been shown to assess the risks associated with anesthesia and surgery more accurately [28, 29]. A study published by Ehlers et al. [28] applied a Naïve Bayes algorithm to an insurance claims database to predict the risk of postoperative complications and showed superiority to Charlson's comorbidity index. In the present study, more variables were adopted, and the data were used for the training of 7 algorithms. A recent study by Li et al. [30] reported AI prediction using the random forest algorithm to predict 1-year mortality after hip repair surgery. They collected data from 1,330 and 744 patients to train and validate the AI algorithm, respectively. In the present study, data from 3,114 patients for training and 1,334 for validation were included. Because the sample size is greater, the statistical power is stronger. Moreover, in addition to mortality, the risks of ICU admission and PLOS were also estimated.

Aside from the widely accepted ASA-PS system, other pre-anesthetic risk stratification tools are used in hospitals, such as the ACS-NSQIP, an open-access online tool based on the logistic model [31]. Our AI application shares some similarities with ACS-NSQIP. Both tools have some variables in common, such as age, sex, ASA, emergency surgery, and CPT procedure code. The ACS-NSQIP is calculated based on logistic regression, while our AI application used seven algorithms, including logistic regression, for machine learning. After comparing the AUROC of these algorithms, the best algorithm was selected to build the AI prediction tool. Moreover, although the ACS-NSQIP is now open-access, it has not yet been built into the Chi Mei Medical Center HIS. Therefore, the present study did not include it as a reference comparison. Further comparative studies are worth conducting in the future.

Although anesthesiologists have affirmed the online application of this study's app, we still have not observed a significant effect on reducing the incidence of adverse outcomes, ICU admission, and PLOS. It may require a longer observation time and a larger population of patients to justify the efficacy of this web-based application.


Some limitations are inherent in retrospective machine learning projects using hospital-specific databases. First, the accuracy of prediction algorithms at specific hospitals may be limited by hospital-specific factors. However, the methodology could theoretically be generalized to similar hospitals with similar patient races or under similar health insurance systems. Second, this study was dependent on the correctness of the ICD-9- or ICD-10-CM coding while identifying study cases, comorbidities, and complications. These codes were reviewed and validated by auditors of medical records for the insurance system to ensure the accuracy of the claims; however, there is still the possibility of miscoding and misclassifying some diseases and conditions. Third, the study’s data were extracted from a single medical institution and its 2 branch hospitals; thus, an underlying referral bias might have existed. Therefore, to obtain a more generalizable result, external validation using patient cohorts from other institutions is required. Fourth, since this is a retrospective study, further prospective validation is needed to demonstrate predictive capability. Fifth, postoperative mortality in the study’s model was captured only from in-hospital complications and in-hospital death or death within 48 h after discharge. Our models demonstrated relatively short-term mortality or adverse events because some patients, especially those who had complications or were dissatisfied with the surgical service, might have transferred to other hospitals without a referral. Therefore, the study endpoints were limited to the in-hospital period. Sixth, non-geriatric hip fractures may have different pathophysiologic mechanisms from geriatric hip fractures and may require different assessment tools. In our preliminary analysis, we subgrouped study patients into those above 50 (n = 3551) and below 50 years (n = 897); however, the number of non-geriatric patients was insufficient to support machine learning. Therefore, the current research cannot meet this need.


The AI assist application developed using a machine learning algorithm was found to be helpful for anesthesiologists in evaluating the risks associated with hip surgery more efficiently and accurately than the traditional ASA-PS stratification method. Although this study may be limited by hospital-specific factors, it could still be generalized to hospitals with similar patient races and comparable health insurance systems. Moreover, this web-based application gained a high satisfaction score from anesthesiologists, which implies an urgent need for automated artificial intelligence assistance in preoperative risk assessment.

Availability of data and materials

The data that support the findings of this study are available from the corresponding author on reasonable request.



SVM: Support vector machine

KNN: K nearest neighbor

light GBM: Light gradient boosting machine

MLP: Multilayer perceptron

XGBoost: Extreme gradient boosting

AUROC: Area under the receiver operating characteristic curve

CI: Confidence interval

BMI: Body mass index

ASA-PS: American Society of Anesthesiologists physical status

GA: General anesthesia

CVC: Central venous catheter

ALT: Alanine aminotransferase

eGFR: Estimated glomerular filtration rate

CKD: Chronic kidney disease


  1. Nidadavolu LS, et al. Preoperative Evaluation of the Frail Patient. Anesth Analg. 2020;130(6):1493–503.

    Article  Google Scholar 

  2. Tassler A, Kaye R. Preoperative Assessment of Risk Factors. Otolaryngol Clin North Am. 2016;49(3):517–29.

    Article  Google Scholar 

  3. Blitz JD, et al. Preoperative Evaluation Clinic Visit Is Associated with Decreased Risk of In-hospital Postoperative Mortality. Anesthesiology. 2016;125(2):280–94.

    Article  Google Scholar 

  4. Committee on S, et al. Practice advisory for preanesthesia evaluation: an updated report by the American Society of Anesthesiologists Task Force on Preanesthesia Evaluation. Anesthesiol. 2012;116(3):522–38.

    Article  Google Scholar 

  5. Clement RC, et al. Economic viability of geriatric hip fracture centers. Orthopedics. 2013;36(12):e1509–14.

    Article  Google Scholar 

  6. Mohd-Tahir NA, Li SC. Economic burden of osteoporosis-related hip fracture in Asia: a systematic review. Osteoporos Int. 2017;28(7):2035–44.

    Article  Google Scholar 

  7. Maxwell MJ, Moran CG, Moppett IK. Development and validation of a preoperative scoring system to predict 30 day mortality in patients undergoing hip fracture surgery. Br J Anaesth. 2008;101(4):511–7.

    CAS  Article  Google Scholar 

  8. Endo A, et al. Prediction Model of In-Hospital Mortality After Hip Fracture Surgery. J Orthop Trauma. 2018;32(1):34–8.

    Article  Google Scholar 

  9. Jiang HX, et al. Development and initial validation of a risk score for predicting in-hospital and 1-year mortality in patients with hip fractures. J Bone Miner Res. 2005;20(3):494–500.

    CAS  Article  Google Scholar 

  10. Bjorgul K, Novicoff WM, Saleh KJ. American Society of Anesthesiologist Physical Status score may be used as a comorbidity index in hip fracture surgery. J Arthroplasty. 2010;25(6 Suppl):134–7.

    Article  Google Scholar 

  11. Bronsert MR, et al. The value of the “Surgical Risk Preoperative Assessment System” (SURPAS) in preoperative consultation for elective surgery: a pilot study. Patient Saf Surg. 2020;14:31.

    Article  Google Scholar 

  12. Trickey AW, Ding Q, Harris AHS. How Accurate Are the Surgical Risk Preoperative Assessment System (SURPAS) Universal Calculators in Total Joint Arthroplasty? Clin Orthop Relat Res. 2020;478(2):241–51.

    Article  Google Scholar 

  13. Meguid RA, et al. Surgical Risk Preoperative Assessment System (SURPAS): III. Accurate Preoperative Prediction of 8 Adverse Outcomes Using 8 Predictor Variables. Ann Surg. 2016;264(1):23–31.

    Article  Google Scholar 

  14. Meguid RA, et al. Surgical Risk Preoperative Assessment System (SURPAS): II. Parsimonious Risk Models for Postoperative Adverse Outcomes Addressing Need for Laboratory Variables and Surgeon Specialty-specific Models. Ann Surg. 2016;264(1):10–22.

    Article  Google Scholar 

  15. Scotton G, et al. Is the ACS-NSQIP Risk Calculator Accurate in Predicting Adverse Postoperative Outcomes in the Emergency Setting? An Italian Single-center Preliminary Study. World J Surg. 2020;44(11):3710–9.

    Article  Google Scholar 

  16. Zhang PI, et al. Real-time AI prediction for major adverse cardiac events in emergency department patients with chest pain. Scand J Trauma Resusc Emerg Med. 2020;28(1):93.

    Article  Google Scholar 

  17. Boehm O, Baumgarten G, Hoeft A. Preoperative patient assessment: Identifying patients at high risk. Best Pract Res Clin Anaesthesiol. 2016;30(2):131–43.

    CAS  Article  Google Scholar 

  18. Meguid RA, et al. Surgical Risk Preoperative Assessment System (SURPAS): I. Parsimonious, Clinically Meaningful Groups of Postoperative Complications by Factor Analysis. Ann Surg. 2016;263(6):1042–8.

    Article  Google Scholar 

  19. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–45.

    CAS  Article  Google Scholar 

  20. Zou J, Han Y, So SS. Overview of artificial neural networks. Methods Mol Biol. 2008;458:15–23.

    PubMed  Google Scholar 

  21. Deo RC. Machine Learning in Medicine. Circulation. 2015;132(20):1920–30.

    Article  Google Scholar 

  22. Cher EWL, et al. Comorbidity as the dominant predictor of mortality after hip fracture surgeries. Osteoporos Int. 2019;30(12):2477–83.

    Article  Google Scholar 

  23. Chang W, et al. Preventable risk factors of mortality after hip fracture surgery: Systematic review and meta-analysis. Int J Surg. 2018;52:320–8.

    Article  Google Scholar 

  24. Cheng CT, et al. Application of a deep learning algorithm for detection and visualization of hip fractures on plain pelvic radiographs. Eur Radiol. 2019;29(10):5469–77.

    Article  Google Scholar 

  25. Engels A, et al. Osteoporotic hip fracture prediction from risk factors available in administrative claims data - A machine learning approach. PLoS One. 2020;15(5):e0232969.

    CAS  Article  Google Scholar 

  26. Kruse C, Eiken P, Vestergaard P. Machine Learning Principles Can Improve Hip Fracture Prediction. Calcif Tissue Int. 2017;100(4):348–60.

    CAS  Article  Google Scholar 

  27. Chen CY, et al. Artificial Neural Network and Cox Regression Models for Predicting Mortality after Hip Fracture Surgery: A Population-Based Comparison. Medicina (Kaunas). 2020;56(5):243.

  28. Ehlers AP, et al. Improved Risk Prediction Following Surgery Using Machine Learning Algorithms. EGEMS (Wash DC). 2017;5(2):3.

  29. Gabriel RA, et al. A Predictive Model for Determining Patients Not Requiring Prolonged Hospital Length of Stay After Elective Primary Total Hip Arthroplasty. Anesth Analg. 2019;129(1):43–50.

  30. Li Y, et al. A novel machine learning algorithm, Bayesian networks model, to predict the high-risk patients with cardiac surgery-associated acute kidney injury. Clin Cardiol. 2020;43(7):752–61.

  31. Cohen ME, et al. An Examination of American College of Surgeons NSQIP Surgical Risk Calculator Accuracy. J Am Coll Surg. 2017;224(5):787–795.e1.


Acknowledgements

We sincerely thank the information department of the Chi Mei Medical Center for their technical assistance.


Funding

This study was supported by a grant from the Chi Mei Medical Center (grant number CMFHR109102).

Author information

Authors and Affiliations



Contributions

YYL reviewed the literature, designed the study, and drafted the manuscript. JJW, SHH, CLK, and JYC reviewed the literature and interpreted the results. JJW revised the manuscript and provided administrative and technical support. CCC and CFL conceived and helped design the study, coordinated and interpreted the results, performed the statistical analyses, and revised the manuscript. All authors have read and approved the final version of this manuscript.

Corresponding authors

Correspondence to Chung-Feng Liu or Chin-Chen Chu.

Ethics declarations

Ethics approval and consent to participate

All methods were performed in accordance with relevant guidelines and regulations (Declaration of Helsinki). The construction of the database was approved by the institutional review board (Serial No. 10906–008). The need for informed consent was waived by the ethics committee/Institutional Review Board of Chi Mei Medical Center, because of the retrospective nature of the study.

Consent for publication

Not applicable.

Competing interests

All authors declare that there are no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Appendix 1.

ROC curves for each machine learning model, evaluated on the validation datasets, for prediction of the risk of adverse events.

Additional file 2. Appendix 2.

ROC curves for each machine learning model, evaluated on the validation datasets, for prediction of intensive care unit admission.

Additional file 3. Appendix 3.

ROC curves for each machine learning model, evaluated on the validation datasets, for prediction of prolonged hospital stay.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit the Creative Commons website. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Li, YY., Wang, JJ., Huang, SH. et al. Implementation of a machine learning application in preoperative risk assessment for hip repair surgery. BMC Anesthesiol 22, 116 (2022).


Keywords

  • Hip surgery
  • Risk assessment
  • Machine learning